Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build lib separately (shared on unix; stay static on win) #33

Merged
merged 20 commits into from
Dec 17, 2022

Conversation

h-vetinari
Copy link
Member

@h-vetinari h-vetinari commented Dec 10, 2022

Probably time to give up this hopeless battle for the shared windows libs

Closes #32
Closes #30
Closes #26
Closes #25

@conda-forge-linter
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@h-vetinari h-vetinari changed the title WIP: Build lib separately (shared on unix; stay static on win) Build lib separately (shared on unix; stay static on win) Dec 12, 2022
Copy link
Member Author

@h-vetinari h-vetinari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @xhochy

I finally cracked this one. 🥳
The main change to the previous attempts (#26, #30, #32) was that I gave up on shared libraries on windows - which, honestly, I feel I should have done months ago. Protobuf+windows+shared is just an insane effort (also seen in grpc, etc.), so I don't feel too bad about that.

This package comes with some binaries as well, which are currently placed with libsentencepiece (because I didn't want to do further CMake surgery), making it akin to libprotobuf carrying protoc. Are you strongly against that? If yes, how would you name an output for the binaries?

Would appreciate your review. 🙃

Likewise for you all, @isuruf @hmaarrfk @carterbox

Comment on lines +83 to +86
# binaries
{% for each_bin in ["decode", "encode", "export_vocab", "normalize", "train"] %}
- spm_{{ each_bin }} --help
{% endfor %}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See here for binaries

Comment on lines +410 to +413
-#define ABSL_FLAG(Type, name, defautl_value, help) \
- absl::Flag<Type> FLAGS_##name(#name, #Type, help, defautl_value);
+#define STPC_FLAG(Type, name, default_value, help) \
+ sentencepiece::Flag<Type> FLAGS_##name(#name, #Type, help, default_value);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be ways to avoid patch 6, but sentencepiece definitely does not use abseil in the way that's prescribed by abseil - it's already heavily blurring the lines of third_party with sentencepiece-specific glue code, but redefining the ABSL_FLAG macro was the last straw at the time that just made me rip out this whole piece and place it somewhere else. Perhaps there's a less nuclear way.

recipe/build-lib.sh Outdated Show resolved Hide resolved
recipe/patches/0004-use-C-17.patch Show resolved Hide resolved
@@ -0,0 +1,884 @@
From 9f225f6f83ccea1909974ee95da9b724d017dad3 Mon Sep 17 00:00:00 2001
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will you be able to maintain this patch "manually"? Seems like the changes here could be better persisted as a well-formed sed command.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm planning to get some of these changes upstream in some form...

@xhochy
Copy link
Member

xhochy commented Dec 12, 2022

Looks good in general. The comments were mostly there to hint at places where you could ease your maintenance.

@hmaarrfk
Copy link

given you got xonchy's attention, i think you are in good hands! Ill refrain from further comments.

@carterbox
Copy link
Member

This package comes with some binaries as well, which are currently placed with libsentencepiece (because I didn't want to do further CMake surgery)

In most cases, you can build the package without installing it to PREFIX. Then manually install components to PREFIX using a bash script for each recipe output. Editing CMake is typically only required if you are trying to separate libs into multiple outputs because you have to separate the CMake config files (but binaries don't have config files).

@h-vetinari
Copy link
Member Author

@carterbox
Thanks for taking a look. Yes, it would be possible to hard remove $PREFIX/bin/spm_* also in build-lib.sh, but that still leaves the question of how to name the output with only the binaries, and whether that's a net value add (i.e. the python bindings should probably still depend on whatever contains the binaries, otherwise people will be caught out by their absence when just installing sentencepiece).

@carterbox
Copy link
Member

how to name the output with only the binaries

It could be named spm or sentencepiece-spm. The python package becomes sentencepiece-python and sentencepiece is a metapackage that includes everything.

whether that's a net value add

Can't comment on value add. Not familiar enough with the sizes of the binaries or use patterns of this package.

@h-vetinari
Copy link
Member Author

It could be named spm or sentencepiece-spm. The python package becomes sentencepiece-python and sentencepiece is a metapackage that includes everything.

Not a bad idea IMO. Thoughts @xhochy?

@h-vetinari
Copy link
Member Author

@carterbox, I implemented your idea, PTAL. Still interested if you'd also have some thoughts on this approach @xhochy :)

recipe/build-lib.sh Show resolved Hide resolved
@carterbox
Copy link
Member

@conda-forge-admin, please rerender

recipe/meta.yaml Outdated Show resolved Hide resolved
@carterbox
Copy link
Member

OK. No more suggestions after that last one. Looks very good! Great job, @h-vetinari.

Co-authored-by: Daniel Ching <carterbox@users.noreply.github.com>
Copy link
Contributor

@setu4993 setu4993 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job, @h-vetinari! Thanks for being persistent.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Missing headers on OSX
6 participants