Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quantifiable interfaces for results reading in FlashLFQ #793

Merged
merged 16 commits into from
Sep 6, 2024

Conversation

mzhastings
Copy link
Contributor

@mzhastings mzhastings commented Aug 19, 2024

Overall: created interfaces that enable external result files to be usable by FlashLFQ through the IdentificationAdapter

Interfaces

  • IQuantifiableRecord defines the information needed to create the identification object usable by FlashLFQ
  • IQuantifiableResultFile outlines behavior to turn results into an IEnumerable of IQuantifiableRecords and to create the dictionary linking file names from the external result files to their local file paths which are used to make the identification object

Implementation

  • MsFraggerPsm implements IQuantifiableRecord, creating the list of protein group information including the protein accession, gene name, and organism associated with that protein
  • MSFraggerPsmFile implements IQuantifiableResultFile, creating a dictionary that links the full file path of MS data to the file name associated with the protein

Classes

  • IdentificationAdapter contains a method that takes in a IQuantifiableResultFile and outputs an identification

Copy link

codecov bot commented Aug 20, 2024

Codecov Report

Attention: Patch coverage is 89.77273% with 9 lines in your changes missing coverage. Please review.

Project coverage is 75.52%. Comparing base (71828ee) to head (0171ba8).

Files Patch % Lines
...nalResults/IndividualResultRecords/MsFraggerPsm.cs 77.41% 0 Missing and 7 partials ⚠️
...rs/ExternalResults/ResultFiles/MsFraggerPsmFile.cs 89.47% 1 Missing and 1 partial ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #793      +/-   ##
==========================================
+ Coverage   75.48%   75.52%   +0.04%     
==========================================
  Files         201      202       +1     
  Lines       30857    30945      +88     
  Branches     3112     3129      +17     
==========================================
+ Hits        23292    23371      +79     
- Misses       7039     7040       +1     
- Partials      526      534       +8     
Files Coverage Δ
mzLib/FlashLFQ/ResultsReading/MzLibExtensions.cs 100.00% <100.00%> (ø)
...rs/ExternalResults/ResultFiles/MsFraggerPsmFile.cs 94.73% <89.47%> (-5.27%) ⬇️
...nalResults/IndividualResultRecords/MsFraggerPsm.cs 91.02% <77.41%> (-8.98%) ⬇️

Copy link
Contributor

@nbollis nbollis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An excellent display of object oriented programming. Best first time PR that I have seen to date.
I have a few suggestions, but nothing major. Comments were main stylistic and simple things to improve readability and usability.
Excess white space can be removed at the end of many files.

string proteinAccessions;
string geneName;
string organism;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comments here!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically, you should mention that what happens here is also done in UsefulProteomicsDatabases/ProteinDatabaseLoader. i.e., you're parsing a fasta header. What you have is fine for now, but ideally, a future refactor would create a method for parsing fasta headers that is shared by Readers and UsefulProteomicsDatabases

@Alexander-Sol
Copy link
Contributor

There should be more comments at the level of whole classes/interfaces.

Like, the descriptions you have in the summary comment for this PR are great. If you just copy and paste those into the top of their respective interfaces/classes, that would be perfect

using System.Text;
using System.Threading.Tasks;

namespace FlashLFQ.ResultsReading
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change the namespace to just FlashLFQ

mzLib/FlashLFQ/ResultsReading/IdentificationAdapter.cs Outdated Show resolved Hide resolved
/// <summary>
/// MsFragger v22.0 output renames the header "PeptideProphet Probability" as just "Probability".
/// Headers are mutually exclusive, will not both occur in the same file.
/// As such, both instances need to be accounted for seperately as optional fields.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should delete this line (100)

Copy link
Contributor

@nbollis nbollis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, its almost there

trishorts
trishorts previously approved these changes Aug 23, 2024
@trishorts trishorts merged commit 7055d84 into smith-chem-wisc:master Sep 6, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants