[BUG] inconsistency formatting behaviour between standardise and merge #72

Open
1 task done
jfy133 opened this issue Mar 9, 2023 · 4 comments
Labels
bug Something isn't working

Comments

@jfy133
Contributor

jfy133 commented Mar 9, 2023

Is there an existing issue for this?

  • I have searched the existing issues

Problem description

While writing the tutorial, I noticed that for standardise the output column header is named count, whereas in merge it is the file name.

I wonder if we should match the behaviour between the two, so that both merge and standardise use the same column header format.

However, I now wonder whether we could even just collapse the two commands into one... simply have standardise, which can accept one or more profiles (if more than one is provided, all are automatically merged...?). But then someone may wish to merge them themselves later on... so maybe it is safer as it is.

Code sample

Code run:

Traceback:

Environment

Anything else?

For example, if I merge the standardise output of one tool with the merge output of another tool:

taxonomy_id  2612_pe-ERR5766176-db_mOTU  2612_se-ERR5766180-db_mOTU  count
40518        20                          2                           NA
216816       1                           0                           NA
1680         6                           1                           NA
1262820      1                           0                           NA
74426        2                           1                           NA
1907654      1                           0                           NA
1852370      3                           1                           NA
39491        3                           0                           NA
33039        2                           0                           NA
39486        1                           0                           NA
39486 1 0 NA  

Where count came from running standardise on kraken output.
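
To make the mismatch concrete, here is a minimal sketch in plain Python of how a generic count column can end up next to sample-named columns. It assumes the two tables were combined with an outer join on taxonomy_id, which the issue does not state. The helper `outer_join`, the column name `kraken_profile`, and the kraken row (taxonomy ID 562, count 15) are made up for illustration and are not part of the tool.

```python
# Toy stand-in for `merge` output: one count column per sample
# (first three rows of the table above).
PE = "2612_pe-ERR5766176-db_mOTU"
SE = "2612_se-ERR5766180-db_mOTU"
merged = {
    40518: {PE: 20, SE: 2},
    216816: {PE: 1, SE: 0},
    1680: {PE: 6, SE: 1},
}

# Toy stand-in for `standardise` output of another profile: a generic
# `count` column. The taxonomy ID and count are invented for this sketch.
standardised = {562: {"count": 15}}


def outer_join(left, right):
    """Combine two {taxonomy_id: {column: value}} tables.

    Hypothetical helper: cells missing on either side become None (NA).
    """
    columns = []
    for table in (left, right):
        for row in table.values():
            for name in row:
                if name not in columns:
                    columns.append(name)
    joined = {}
    for tax_id in list(left) + [t for t in right if t not in left]:
        row = {**left.get(tax_id, {}), **right.get(tax_id, {})}
        joined[tax_id] = {name: row.get(name) for name in columns}
    return columns, joined


# Joining as-is leaves an anonymous `count` column beside the sample columns.
columns, combined = outer_join(merged, standardised)

# Renaming `count` to a sample name first gives consistent headers.
renamed = {
    tax_id: {"kraken_profile": row["count"]}
    for tax_id, row in standardised.items()
}
columns_named, combined_named = outer_join(merged, renamed)
```

In the first result the header no longer says which profile count belongs to; in the second every column is identified by a sample name.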

@jfy133 jfy133 added the bug Something isn't working label Mar 9, 2023
@Midnighter
Contributor

When using merge, each column represents one profile/sample. I don't see how in a wide table you would do this? Certainly in the long (tidy) format, you have only three columns: taxonomy_id, count, and sample.

@jfy133
Contributor Author

jfy133 commented Mar 9, 2023

But in standardise

image

It is wide by default, so it still works: you just rename count to the file name?

And if it's long, it's still just a single extra column holding the one sample name.
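
A rough sketch of the two shapes for a single profile, in plain Python. The function names `as_wide` and `as_long` are hypothetical and are not the tool's API; the counts are taken from the first sample column of the table in the issue description.

```python
def as_wide(profile, sample):
    """Wide: rename the generic `count` column to the sample name."""
    return [
        {"taxonomy_id": row["taxonomy_id"], sample: row["count"]}
        for row in profile
    ]


def as_long(profile, sample):
    """Long: keep `count` and add a `sample` column repeating the sample name."""
    return [{**row, "sample": sample} for row in profile]


# One standardised profile with the generic `count` header.
profile = [
    {"taxonomy_id": 40518, "count": 20},
    {"taxonomy_id": 1680, "count": 6},
]
sample = "2612_pe-ERR5766176-db_mOTU"

wide = as_wide(profile, sample)    # columns: taxonomy_id, <sample name>
long = as_long(profile, sample)    # columns: taxonomy_id, count, sample
```

Either way the single profile carries its sample name, so the output would line up with what merge produces.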

@Midnighter
Contributor

Oh, that's what you mean. I guess in this case wide and long are actually the same 😆 We could offer the wide/long option, though, and then change the output accordingly. Do you think that would be better/more consistent?

@jfy133
Contributor Author

jfy133 commented Mar 9, 2023

I think so, see the issue in the tutorial for an example in #66 :)
