Releases: bluenote-1577/sylph
v0.6.1
2024-04-09
- Made -u estimation with short-reads slightly more robust. See CHANGELOG.
- v0.6.0 has a conda install issue. Hopefully v0.6.1 fixes...
Commits
- d0d44f9: Update README.md (Jim Shaw)
- 62a32f1: Update README.md (Jim Shaw)
- ca28402: Update README.md (Jim Shaw)
- f57e486: Update README.md (Jim Shaw)
- 8158e6c: v0.6.1 initial - refactored some code and added automatic diversity detection for estimating identity (bluenote-1577)
- 5db2d41: Update README.md (Jim Shaw)
- 14d3580: Update README.md (Jim Shaw)
- cc39ba4: Update README.md (Jim Shaw)
- a931df9: Update README.md (Jim Shaw)
- 83409c2: Update README.md (Jim Shaw)
- 02232c7: v0.6.1 fixed bug with -I. pushing now to try and fix conda... (bluenote-1577)
- 41a4a50: Merge branch 'main' of https://github.com/bluenote-1577/sylph (bluenote-1577)
- 2c75d89: README (bluenote-1577)
v0.6.0
sylph v0.6.0 release: New output column, lazy raw paired fastq profiling: 2024-04-06
Major
- A new column called `kmers_reassigned` is now in the profile output. This states how many k-mers are lost due to reassignment for that particular genome.
- `-1`, `-2` options are now available for `sylph profile`. You can now do `sylph profile database.syldb -1 1.fq -2 2.fq ...`
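As a rough illustration of how the new column might be consumed downstream, here is a minimal Python sketch. It assumes the profile output is a tab-separated table with a header row and a `Genome_file` column next to `kmers_reassigned`; the example rows and their values are made up, and the helper name is hypothetical.

```python
import csv
import io

# Hypothetical excerpt of a profile table. Only the two column names matter
# here; the rows and values are invented for illustration.
EXAMPLE_PROFILE = (
    "Genome_file\tkmers_reassigned\n"
    "genomeA.fa\t0\n"
    "genomeB.fa\t152\n"
)


def kmers_reassigned_by_genome(handle):
    """Map each genome to its kmers_reassigned count from a TSV profile."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["Genome_file"]: int(row["kmers_reassigned"]) for row in reader}


counts = kmers_reassigned_by_genome(io.StringIO(EXAMPLE_PROFILE))

# Genomes that lost k-mers to reassignment, largest loss first.
flagged = sorted((g for g, n in counts.items() if n > 0),
                 key=counts.get, reverse=True)
print(flagged)
```

For a real run, the same function could be given an open file handle for the profile output instead of the in-memory example.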
v0.5.1
sylph v0.5.1 release: Memory improvement and bug fixes : Dec 27 2023
Major
- Scalable cuckoo filters are now used for read deduplication for memory savings.
- Deduplication algorithm improved. **v0.5.0 worked poorly on highly (>15%) duplicated read sets.**
- Shorter reads can now be sketched: down to 32 bp, instead of 63 bp previously.
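To give a feel for why a cuckoo filter saves memory over storing every read, here is a minimal Python sketch of a fixed-size cuckoo filter used for deduplication. It is not sylph's implementation: it is not the scalable variant, the class and function names are hypothetical, and using the whole read sequence as the key is an assumption made only for illustration.

```python
import hashlib
import random


class CuckooFilter:
    """Fixed-size cuckoo filter (not the scalable variant); illustration only."""

    def __init__(self, num_buckets=1024, bucket_size=4, max_kicks=500, seed=0):
        if num_buckets & (num_buckets - 1):
            raise ValueError("num_buckets must be a power of two")
        self.mask = num_buckets - 1
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]
        self.rng = random.Random(seed)

    def _hashes(self, item):
        # One digest supplies both the first bucket index and a 32-bit fingerprint.
        digest = hashlib.blake2b(item, digest_size=16).digest()
        index = int.from_bytes(digest[:8], "little") & self.mask
        fingerprint = int.from_bytes(digest[8:12], "little")
        return index, fingerprint

    def _alt_index(self, index, fingerprint):
        # Partial-key cuckoo hashing: XOR with a hash of the fingerprint, so
        # applying this twice returns the original index.
        digest = hashlib.blake2b(fingerprint.to_bytes(4, "little"),
                                 digest_size=8).digest()
        return (index ^ int.from_bytes(digest, "little")) & self.mask

    def contains(self, item):
        i1, fp = self._hashes(item)
        return fp in self.buckets[i1] or fp in self.buckets[self._alt_index(i1, fp)]

    def insert(self, item):
        i1, fp = self._hashes(item)
        i2 = self._alt_index(i1, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets are full: evict a random fingerprint and relocate it.
        i = self.rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = self.rng.randrange(self.bucket_size)
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt_index(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        return False  # full; a scalable filter would add capacity at this point


def deduplicate(reads, filt):
    """Keep the first occurrence of each read, using the filter as the 'seen' set."""
    kept = []
    for read in reads:
        key = read.encode("ascii")
        if filt.contains(key):
            continue  # probably seen before (false positives are possible)
        filt.insert(key)
        kept.append(read)
    return kept


reads = ["ACGTACGT", "TTGACCAA", "ACGTACGT", "GGGTTTAA", "TTGACCAA"]
unique = deduplicate(reads, CuckooFilter())
print(unique)
```

The filter stores only short fingerprints rather than full keys, which is where the memory saving comes from. The trade-off is a small false-positive rate, so a distinct read can occasionally be mistaken for a duplicate.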
v0.5.0
sylph v0.5.0 release: Big improvements on real illumina data : Dec 23 2023
Major
In previous versions, sylph underperformed on real Illumina datasets. See #5.
This is because many real Illumina datasets have a non-trivial number of duplicate reads, and duplicate reads mess up sylph's statistical model.
For the single and paired sketching options, a new deduplication routine has been added. This will be described in version 2 of our preprint.
This increases sketching memory by 3-4x but greatly increases performance on real datasets with more than 1-2% duplication, especially for low-abundance genomes.
For paired-end Illumina reads with non-trivial duplication (> 1%), sylph can now
- detect many more low-abundance species below 0.3x coverage
- give better coverage/abundance estimates for low-abundance species
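To check whether a read set falls in the duplication range discussed above, a quick estimate can be computed outside sylph. The following Python sketch counts exact-sequence duplicates in a 4-line-per-record FASTQ stream. Treating only identical sequences as duplicates is an assumption; sylph's own deduplication routine may use a different criterion. The function names and the example reads are hypothetical.

```python
import io


def fastq_sequences(handle):
    """Yield the sequence line of each 4-line FASTQ record."""
    for line_number, line in enumerate(handle):
        if line_number % 4 == 1:
            yield line.strip()


def duplication_rate(sequences):
    """Fraction of reads whose exact sequence was already seen earlier."""
    seen = set()
    total = duplicates = 0
    for seq in sequences:
        total += 1
        if seq in seen:
            duplicates += 1
        else:
            seen.add(seq)
    return duplicates / total if total else 0.0


# Made-up example: four reads, one of which repeats an earlier sequence.
EXAMPLE_FASTQ = (
    "@r1\nACGT\n+\nIIII\n"
    "@r2\nACGT\n+\nIIII\n"
    "@r3\nTTTT\n+\nIIII\n"
    "@r4\nGGCC\n+\nIIII\n"
)

rate = duplication_rate(fastq_sequences(io.StringIO(EXAMPLE_FASTQ)))
print(f"duplication rate: {rate:.0%}")
```

This keeps every distinct sequence in memory, so it suits spot checks on a subsample rather than full-size datasets.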
BREAKING
- Sequence sketches (sylsp) have changed formats. Sequences will need to be re-sketched.
- The `--read-length` option has been removed and incorporated into the sketches by default (suggested by @fplaza).
Other changes
v0.4.1
sylph v0.4.1 - getting ready for preprinting
MINOR
- A few minor changes to help texts and options. Also fixed versioning issue.
v0.4.0
sylph v0.4.0 release: major interface changes
BREAKING
- Renamed `sylph contain` to `sylph query`.
- Methods for sketching are drastically different now. E.g. we use `-g genome1.fa genome2.fa` for specifying genomes and `-r read1.fa read2.fq` for specifying reads when sketching.
Major
- `-u` or `--estimate-unknown` options are now present for estimating unknown organisms in the sample.
- When using `-u`, associated options `--read-seq-id` and `--read-len` are available for calculating true coverages with sylph, i.e., coverages concordant with read mapping.
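Read length and sequence identity matter here because k-mer-based coverage is systematically lower than read-mapping coverage. The Python sketch below uses the textbook relation between the two. It is not taken from sylph's source, so treat the formula, the default `k=31`, the function names, and passing identity as a fraction as assumptions.

```python
def expected_kmer_fraction(read_len, seq_id, k=31):
    """Expected fraction of base-level coverage that survives as k-mer coverage.

    Assumptions (not from the release notes): a read of length L contributes
    L - k + 1 k-mers, and a k-mer is error-free with probability seq_id ** k,
    where seq_id is a fraction such as 0.99.
    """
    if read_len < k:
        raise ValueError("read_len must be at least k")
    positional = (read_len - k + 1) / read_len
    error_free = seq_id ** k
    return positional * error_free


def base_coverage_from_kmer_coverage(kmer_cov, read_len, seq_id, k=31):
    """Rescale a k-mer coverage to an approximate read-mapping coverage."""
    return kmer_cov / expected_kmer_fraction(read_len, seq_id, k)


# With 150 bp error-free reads, only (150 - 31 + 1) / 150 = 80% of the base
# coverage shows up as 31-mer coverage.
print(expected_kmer_fraction(150, 1.0))
print(base_coverage_from_kmer_coverage(8.0, 150, 0.99))
```

Lower sequence identity shrinks the surviving fraction further, which is why both quantities are needed before a k-mer coverage can be compared with a mapping-based one.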
Minor
- Coverage calculation is slightly different now.
v0.3.0
sylph v0.3.0 release: first class support for pseudotax, now called "profile" - 2023-10-01
Continuing development of sylph taxonomic profiling.
BREAKING
- The `--pseudotax` option in the previous version is now a new command called `profile`.
- Databases are enabled for profiling by default.
- Changed file suffixes to `syldb` and `sylsp`.
Major
- Default parameter changes. --min-spacing is set to 30 now.
- Made profiling faster with some algorithmic tweaks.
- Coverage calculated slightly differently
- Many small software changes with respect to threading and outputs
v0.2.0
sylph v0.2.0 release: pseudotax improved - 2023-09-19
BREAKING
- Sylph's *.sylqueries are no longer compatible with older versions of sylph (< v0.2). Files will need to be resketched.
Major
- Fixed a major bug for the `--pseudotax` option that required redesigning file formats. Please use `--enable-pseudotax` when using `contain --pseudotax` from now on.
- The `--pseudotax` option gives relative abundances now. We are gaining some confidence that this approach gives a rough, but surprisingly decent taxonomic classification.
- Changed how `Eff_cov` is calculated. We just use the median coverage now, except when we apply coverage-adjustment.
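One reason a median is attractive for coverage is its robustness to a few inflated counts. The Python sketch below illustrates this on made-up k-mer multiplicities. It assumes "median coverage" means the median multiplicity of a genome's k-mers in the sample, and it does not reproduce the coverage-adjustment case; the function name is hypothetical.

```python
from statistics import mean, median


def median_coverage(kmer_multiplicities):
    """Median multiplicity of a genome's k-mers as a simple coverage estimate."""
    if not kmer_multiplicities:
        raise ValueError("no k-mer multiplicities given")
    return median(kmer_multiplicities)


# Made-up multiplicities: most k-mers are seen 2-4 times, and one k-mer from a
# repeat or shared region is seen 250 times.
multiplicities = [2, 3, 3, 3, 4, 3, 2, 250]

print(median_coverage(multiplicities))  # 3.0
print(mean(multiplicities))             # 33.75
```

The single outlier pulls the mean far above the typical value, while the median is unaffected.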
Minor
- Fixed command line ambiguity for sketching outputs. `-s` has been replaced with `-d` for `sylph sketch`.
- Sylph now outputs the results after processing every sample, instead of batching results.
v0.1.0
First major release of sylph. See CHANGELOG.md for information.