The correct way to select cells? #4

Open
WhirlFirst opened this issue Nov 26, 2022 · 16 comments
Comments

@WhirlFirst

I am confused: is this the right way to get the selected index?
In your code, you define selected_idx as:

selected_idx = label_r.loc[:,select_drug]!=na

and you use it as:

data = data_r.loc[selected_idx.index,:]

but selected_idx (the output of line 109) still contains the full index, as shown here:
[image]

This means that in line 118 selected_idx.index still holds all the IDs, so this code fails to filter out the rows with na. I suspect this also causes the error mentioned in other issues such as #3. Could you please recheck your code and results? Thanks.
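
To illustrate the point above, here is a minimal sketch with made-up data. The frames, the drug name, and the "NA" sentinel are hypothetical stand-ins, not values taken from the repository:

```python
import pandas as pd

# Hypothetical stand-ins for label_r / data_r; all values are made up.
na = "NA"
select_drug = "drug_a"
label_r = pd.DataFrame(
    {select_drug: ["sensitive", "NA", "resistant", "NA"]},
    index=["c1", "c2", "c3", "c4"],
    dtype=object,
)
data_r = pd.DataFrame({"gene_x": [0.1, 0.2, 0.3, 0.4]}, index=label_r.index)

# The comparison returns a boolean Series with one entry per row.
selected_idx = label_r.loc[:, select_drug] != na

# .index of that mask lists every row label, for True and False entries alike,
# so indexing with it selects all rows and filters nothing.
unfiltered = data_r.loc[selected_idx.index, :]

print(selected_idx.tolist())  # [True, False, True, False]
print(len(unfiltered))        # 4
```

The mask itself is correct; the problem is only that its `.index` is used instead of its values.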

@juychen
Collaborator

juychen commented Nov 28, 2022

Hi, I tried to fix this issue. Could you please test whether it is okay now?

@WhirlFirst
Author

Sorry, why did you delete this code? I think the NA data still cannot be filtered.

#label_r=label_r.fillna(na)

@juychen
Collaborator

juychen commented Nov 28, 2022

> Sorry, why did you delete this code? I think the NA data still cannot be filtered.
>
> #label_r=label_r.fillna(na)

selected_idx = label_r.loc[:,select_drug]!=na

I am trying to filter out the na values at L109.

@WhirlFirst
Author

I have shown above that you cannot filter out na values at L109, because selected_idx still contains the full index.
[image]

Please recheck your code carefully.

@WhirlFirst
Author

selected_idx is a boolean vector indicating, for each index, whether the value is NA, but its index still covers all the data (length = 1280). So it is impossible to filter out NA values by using its index. It would help if you changed it to
selected_idx = label_r[label_r.loc[:,select_drug]!=na]
Please tell me if you would like me to clarify this in Chinese.
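
A minimal sketch of the suggested change, again with hypothetical data (the frames, drug name, and "NA" sentinel are assumptions for illustration). It also shows an equivalent alternative that passes the boolean mask straight to `.loc`:

```python
import pandas as pd

# Hypothetical stand-ins for label_r / data_r; all values are made up.
na = "NA"
select_drug = "drug_a"
label_r = pd.DataFrame(
    {select_drug: ["sensitive", "NA", "resistant", "NA"]},
    index=["c1", "c2", "c3", "c4"],
    dtype=object,
)
data_r = pd.DataFrame({"gene_x": [0.1, 0.2, 0.3, 0.4]}, index=label_r.index)

mask = label_r.loc[:, select_drug] != na

# Option 1 (the change suggested above): filter label_r first,
# so that its index only contains the rows that were kept.
selected_idx = label_r[mask]
data_1 = data_r.loc[selected_idx.index, :]

# Option 2: use the boolean mask itself as the row selector.
# This relies on label_r and data_r sharing the same index.
data_2 = data_r.loc[mask, :]

print(list(data_1.index))  # ['c1', 'c3']
```

Both options keep only the rows whose label differs from the sentinel.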

@juychen
Collaborator

juychen commented Nov 28, 2022

> selected_idx is a boolean vector indicating, for each index, whether the value is NA, but its index still covers all the data (length = 1280). So it is impossible to filter out NA values by using its index. It would help if you changed it to
> selected_idx = label_r[label_r.loc[:,select_drug]!=na]
> Please tell me if you would like me to clarify this in Chinese.

I see. I tried to fix it in the previous commit.

@WhirlFirst
Author

So why did you comment out this line?

#label_r=label_r.fillna(na)

I think it has to be used together with
selected_idx = label_r.loc[:,select_drug]!=na
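
A small sketch of why the two lines may need to go together. It assumes that `na` is a string sentinel and that missing labels are stored as NaN, neither of which is confirmed here; the data is made up:

```python
import numpy as np
import pandas as pd

# Assumptions: `na` is a string sentinel and missing labels are NaN.
na = "NA"
select_drug = "drug_a"
label_r = pd.DataFrame(
    {select_drug: ["sensitive", np.nan, "resistant"]},
    index=["c1", "c2", "c3"],
    dtype=object,
)

# Without fillna: NaN != "NA" evaluates to True, so the missing row is kept.
mask_raw = label_r.loc[:, select_drug] != na

# With fillna: NaN is replaced by the sentinel first, so the comparison drops it.
label_filled = label_r.fillna(na)
mask_filled = label_filled.loc[:, select_drug] != na

# Alternative that needs no sentinel at all.
mask_notna = label_r.loc[:, select_drug].notna()

print(mask_raw.tolist())     # [True, True, True]
print(mask_filled.tolist())  # [True, False, True]
```

Under these assumptions, commenting out the fillna line makes the `!=na` comparison a no-op for missing values.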

Is this really the code you used for your experiments? How were you able to get results with these bugs in place?

WhirlFirst reopened this Dec 1, 2022

@WhirlFirst
Author

@juychen @Wang-Cankun @PegasusAM @OSU-BMBL-admin Could you please give some explanation about these bugs in your project code?

@juychen
Collaborator

juychen commented Dec 1, 2022

Some problems may have been introduced when merging from the development branch. We are looking into the issue now and will fix it soon.

@WhirlFirst
Author

Hi @juychen, have you released the final version of your code? I also have a question: why did you comment out this line?

#label_r=label_r.fillna(na)

Has the bug in the code been resolved? Thanks for the adata results you provided. Could you provide a bash file that produces your adata file from the current code? This is an important step in reproducing the results.

@juychen
Collaborator

juychen commented Dec 23, 2022

> Hi @juychen, have you released the final version of your code? I also have a question: why did you comment out this line?
>
> #label_r=label_r.fillna(na)
>
> Has the bug in the code been resolved? Thanks for the adata results you provided. Could you provide a bash file that produces your adata file from the current code? This is an important step in reproducing the results.

Hi, have you tried cloning the repository, downloading the data into a new directory, and re-running the code? Are there still any bugs?

@WhirlFirst
Author

WhirlFirst commented Dec 23, 2022

Yes, I cloned the repository and I still get the bug. @juychen

Traceback (most recent call last):
  File "bulkmodel.py", line 369, in <module>
    run_main(args)
  File "bulkmodel.py", line 264, in run_main
    optimizer,loss_function,epochs,exp_lr_scheduler,load=load_model,save_path=preditor_path)
  File "/nfs_beijing/minsheng/scbig/bioinfoDownStream/scDEAL1222/trainers.py", line 400, in train_predictor_model
    loss = loss_function(output, y)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 1165, in forward
    label_smoothing=self.label_smoothing)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py", line 2996, in cross_entropy
    return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
IndexError: Target 2 is out of bounds.

I think it is caused by commenting out this line:
label_r=label_r.fillna(na)
so that this line cannot work:

selected_idx = label_r.loc[:,select_drug]!=na
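
For context, PyTorch's cross_entropy expects class-index targets in the range [0, C), where C is the number of output classes, so "Target 2 is out of bounds" points to a label value of 2 reaching a model with fewer than three outputs. One possible cause, which is only a guess here, is an unfiltered NA label being encoded as an extra class. A minimal, framework-free sketch of a check that could be run on the labels before training (the helper name and the sample labels are hypothetical):

```python
def out_of_range_targets(targets, n_classes):
    """Return the sorted distinct target values that fall outside [0, n_classes)."""
    return sorted({t for t in targets if not 0 <= t < n_classes})


# Hypothetical encoded labels: a binary task where a third code slipped in.
targets = [0, 1, 2, 1, 0]
bad = out_of_range_targets(targets, n_classes=2)

if bad:
    print(f"Unexpected target values for a 2-class model: {bad}")
```

An empty result means every label is a valid class index for the given number of classes.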

@juychen
Collaborator

juychen commented Jan 1, 2023

> Yes, I cloned the repository and I still get the bug. @juychen
>
> Traceback (most recent call last):
>   File "bulkmodel.py", line 369, in <module>
>     run_main(args)
>   File "bulkmodel.py", line 264, in run_main
>     optimizer,loss_function,epochs,exp_lr_scheduler,load=load_model,save_path=preditor_path)
>   File "/nfs_beijing/minsheng/scbig/bioinfoDownStream/scDEAL1222/trainers.py", line 400, in train_predictor_model
>     loss = loss_function(output, y)
>   File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
>     return forward_call(*input, **kwargs)
>   File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 1165, in forward
>     label_smoothing=self.label_smoothing)
>   File "/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py", line 2996, in cross_entropy
>     return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
> IndexError: Target 2 is out of bounds.
>
> I think it is caused by commenting out this line:
> label_r=label_r.fillna(na)
> so that this line cannot work:
>
> selected_idx = label_r.loc[:,select_drug]!=na

Hi, I have fixed the corresponding bug by adding the fillna code. I have also edited the code to improve usability, including selection of the CPU device and creation of the result folders if they do not exist. The updates are now pushed to the main branch, and it should work correctly. I tested the setup from scratch with the following procedure: create a new folder, clone the code from GitHub, download the data, and run the code following the command line. You can clean up the previous version and start from scratch to see whether it works.

@WhirlFirst
Author

Thanks, @juychen! Could you reproduce the same h5ad results (https://portland-my.sharepoint.com/:u:/g/personal/junyichen8-c_my_cityu_edu_hk/EYru-LaQC1tHlFZSnf1RA_cBjXwIafy-iDsajEWjh8xcjA?e=2sE61e) using the new code?

@SZ-qing

SZ-qing commented Mar 7, 2023

I reproduced three datasets based on the parameters in the author's adata file and found that the final results were very poor. I think the author needs to double-check the code and data.

@LCGaoZzz

LCGaoZzz commented Apr 8, 2024

> I reproduced three datasets based on the parameters in the author's adata file and found that the final results were very poor. I think the author needs to double-check the code and data.

Yes, I've noticed that the results were indeed poor and far from what's described in the article. This certainly necessitates a thorough review of the code and data by the authors to ensure accuracy.


4 participants