package torchtext #17129
Conversation
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR (…).
# upstream did not publish any tags, and a performance-critical
# fix appears after the last bump (for 0.0.3). This is what e.g.
# torchtext uses upstream (installing through git), so we add a ".1"
{% set version = "0.0.3.1" %}
Reason for this is https://github.com/pytorch/text/blob/v0.11.0-rc3/requirements.txt#L11, and not wanting to leave out a potentially very substantial performance improvement from jekbradbury/revtok#4 (the PR notes a 100-fold improvement by interning strings in some cases), which is in fact the only change after 0.0.3.
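A minimal sketch of how such an untagged-version override might look in a recipe. The commit SHA, URL and hash below are placeholders, not the real values from this PR:

```
# Hypothetical meta.yaml fragment: upstream never tagged anything past
# 0.0.3, so a ".1" micro version is appended to capture the post-0.0.3 fix.
{% set version = "0.0.3.1" %}
{% set commit = "<commit-sha-of-the-fix>" %}    # placeholder

package:
  name: revtok
  version: {{ version }}

source:
  # no tag exists for this version, so pin the git commit directly
  url: https://github.com/jekbradbury/revtok/archive/{{ commit }}.tar.gz
  sha256: <sha256-of-the-archive>               # placeholder
```

Since conda versions sort `0.0.3.1 > 0.0.3`, solvers will prefer the fixed build while any future upstream `0.0.4` would still supersede it.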
It might be better to patch, just so we reference the PyPI version, but OK.
You mean use 0.0.3 and carry a patch for jekbradbury/revtok#4? I don't mind either way. That repo looks pretty dead otherwise (last commit 4.5 years ago), and I don't expect anyone to start using micro versions even if it does get revived, which is why I chose this approach.
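For comparison, the patch-based alternative discussed above might look something like this sketch (the hash and patch file name are hypothetical):

```
# Hypothetical meta.yaml fragment: stay on the released 0.0.3 from PyPI
# and carry the fix from jekbradbury/revtok#4 as a local patch instead.
{% set version = "0.0.3" %}

source:
  url: https://pypi.io/packages/source/r/revtok/revtok-{{ version }}.tar.gz
  sha256: <sha256-of-the-sdist>             # placeholder
  patches:
    - patches/0001-intern-strings.patch     # hypothetical file name
```

The trade-off: this keeps the version aligned with PyPI, at the cost of maintaining the patch in the feedstock.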
@h-vetinari I have a question (I tried building torchtext and torchaudio, and failed; that's why I've been keeping an eye on your progress): can we perhaps have a clearer strategy for ecosystems such as pytorch and tensorflow? I think the current setup is really unproductive and isn't sustainable with this highly decentralized approach. I can give you a lot of practical examples of problems I ran into over the past few months dealing with pytorch and tensorflow from conda/conda-forge (and, with all the cuda stuff added in, it is a toxic mess).

I have been thinking it might be time to rethink this and have a strategy for "ecosystem building" rather than "package building", so that tensorflow and all its sprawling pieces can be built together, more or less, and the same for pytorch (we are still massively stuck on torchvision, which has a bug with consequences all over the place). This would especially make sense for collections of packages that tend to be released together and pinned together (which seems to be the case with tensorflow and pytorch). So maybe we need to rethink this instead of dealing with these packages as isolated items to be built individually?

Copying @hmaarrfk, who's been extremely helpful as I started getting involved with this recently.
Hey, thanks for taking an interest!
I don't think this is necessary. The hardest part is getting the packages built at all, but with the abstractions that conda offers, we have full control over things like rebuilding for ABI-relevant version updates, which is something that pip simply cannot do well, so we see all the terrifying workarounds upstream. In other words, IMO the infrastructure is already there to do just that kind of "ecosystem building", only distributed over several feedstocks.

It would of course help hugely if facebook/google would support conda-forge in the whole packaging effort; in both cases the main packages are a real pain to build and therefore a big bottleneck. facebook obviously has done a lot of its own work for the […].

Long story short: I think we don't need a big change in strategy, but more resources would help a lot, and especially a more aligned upstream maintainership.
Sorry I haven't had time to look at that. Will see how things go in the next few weeks.
torchtext is done on linux, osx is blocked on conda-forge/sentencepiece-feedstock#25 (which would also help improve linux); torchaudio is blocked on conda-forge/kaldi-feedstock#15.
I would suggest making two separate PRs. Seems like you ran out of space....
Nah, there's no point, revtok is tiny. And it's not gonna be a problem once the builds are per-python-version on the feedstock.
Fully agree on the upstream mess and resources. I hope you're right about strategy 😃
Hi friend! We really, really, really appreciate that you have taken the time to make a PR on […]. In an effort to maintain this repository and increase the signal-to-noise for open PRs, the maintainers of […]. If you'd like to keep it open, please comment/push and we will be happy to oblige! Note that very old PRs will likely need to be rebased on […].

Cheers and thank you for contributing to this community effort!
Not stale.
Hi friend! We really, really, really appreciate that you have taken the time to make a PR on […]. In an effort to maintain this repository and increase the signal-to-noise for open PRs, the maintainers of […]. If you'd like to keep it open, please comment/push and we will be happy to oblige! Note that very old PRs will likely need to be rebased on […].

Cheers and thank you for contributing to this community effort!
Not stale.
Hi again! About a month ago, we commented on this PR saying it would be closed in another month if it was still inactive. It has been a month, and so now it is being closed. Thank you so much for making it in the first place and contributing to the community project that is […].

Cheers and have a great day!
@conda-forge/staged-recipes (I also just found out that torchtext now depends on torchaudio, weeee)
I am trying to publish […]. Relevant issues: […]
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR (…).
*looks at when this PR was opened* haha, welcome to the club 🙃
I've invited you as a collaborator on the fork so you'll be able to push commits into this PR. For now I've fixed some obvious issues myself, now that I realized that this doesn't depend on torchaudio after all (I think I got caught out by a comment when I previously tried to rebase the patch that disables the compilation of the third_party libs).
@h-vetinari Thank you for adding me. I have limited knowledge of CMake and packaging third-party libraries into conda-forge recipes. Still learning. Will try my best to contribute. |
The error says: […]
Since the pytorch 1.13 migration isn't through yet (and torchtext 0.14 seems to need the newest pytorch API), I dropped back to 0.13.1, and this seems to be building OK now (🤞). Together with conda-forge/torchdata-feedstock#6, I'm planning to get to pytorch 1.13 / torchtext 0.14 ASAP after the feedstock is created.

PTAL @conda-forge/help-python-c
CC @conda-forge/pytorch-cpu @conda-forge/sentencepiece @xhochy @hmaarrfk @ngam
Awesome work patching their build process to match what we need.
@h-vetinari Thank you very much for your hard work. Excited to see this come through. So this will use […]?
For now yes. As soon as we have the feedstock (and conda-forge/torchdata-feedstock#6 merged), we can let the pytorch 1.13 migrator come through, and couple that with a bump to torchtext 0.14.0 (I now know already which patches are necessary).
Awesome! Thank you. It would be great to make it compatible with […].
As I said, this is what we're starting with here, so those consumers should be fine.
Thanks! TBH, ~98% of the work for this PR was getting a usable (shared) lib out of sentencepiece. Glad that seems to have been achieved now. 🙃
Hooray!! @h-vetinari THANK YOU SO MUCH!!
came up in conda-forge/pytorch-lightning-feedstock#76
Hopefully there is an end to all the pytorch flavours at some point 😅