The following datasets are not complete in terms of sample size. #28

Open
Dr3753 opened this issue Jun 2, 2024 · 5 comments

Comments

@Dr3753

Dr3753 commented Jun 2, 2024

GREIN is a fantastic tool for exploring RNA-seq data, and I greatly appreciate it. However, certain datasets include only 20 samples each, which does not match the sample sizes listed in GEO. This looks like a bug. Could you re-process the following datasets?
GSE184941
GSE190504
GSE180280
GSE183947
GSE189757

GSE146009
GSE162960
GSE165255
GSE183984
GSE107422

GSE179746
GSE158420
GSE171415
GSE142441
GSE172356

GSE181273
GSE133626
GSE147493
GSE179252
GSE184336

GSE113255
GSE126304
GSE127165
GSE142083
GSE173855

GSE112026
GSE179351
Thank you very much!

@Mario-Medvedovic
Member

Thank you for pointing this out. We were aware that there were occasional cases when this happened, but did not know there were so many of them. We will re-run these.

I am curious: how did you compile the list? Is this an exhaustive list of datasets with this problem, or are these just datasets that you were interested in and that happened to be problematic?

@Dr3753
Author

Dr3753 commented Jun 5, 2024

@Mario-Medvedovic Thank you for your reply! I am an oncologist, and I am systematically searching for RNA-seq data of tumors. My search criteria are: 1) human solid tumor tissue, 2) bulk RNA-seq, and 3) data from 2009 onwards. I checked the number of samples available for each of the 171 included studies and compiled the list above. Although I only need four of them, I thought it would be appropriate to report all of them to you.
By the way, this platform is very helpful, and I have recommended it to my colleagues. They all agree.

@Mario-Medvedovic
Member

Thank you for the info. As I said, I will re-run these. I will also run a comprehensive check over all datasets. It is very gratifying to hear that somebody like yourself finds GREIN useful.

@Mario-Medvedovic
Member

All datasets in the list above have now been re-processed, and all except one (GSE184336) are available in GREIN. For GSE184336, our pipeline failed to extract the FASTQ files. We intend to troubleshoot this, but it may take a while.

@Dr3753
Author

Dr3753 commented Jun 11, 2024

@Mario-Medvedovic Thank you for the update on the reprocessing of the datasets. It was completed much faster than expected, and it has already helped a lot. There isn't much I can do to help, but I will continue to report any bugs I encounter in the future as a form of support. Thank you again!
