-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Dockerfile to include checked out version #338
Conversation
You do not want to do this for multiple reasons:
If you want to build a |
1. You can and should be able to build a container from an arbitrary branch
of source.
2. Agreed. The pr here is moving to just pip install .all. Not in
dev/editable mode
3. Sure we can remove the source as part of the build except for the
example/tutorial scripts with a multistage build.
4. Yes. Light containers are good. We could probably strip some
dependencies out too if we care so much about this. The Nemo source is
realistically a very small fraction of the container size.
On Fri, Feb 7, 2020 at 2:15 PM Jonathan DEKHTIAR ***@***.***> wrote:
You do not want to do this for multiple reason:
1.
You do not want to copy your local directory in the project. It makes
the container build process non reproducible.
2.
You do not want to install a package in development mode: pip -e
".[all]". It's a risky and non stable way to provide a package to a
user.
3.
If you need nemo source inside the container, for dev for instance,
you should mount the directory as a docker volume in the container and pip
--upgrade -e ".[all]"
4.
It makes the docker container more "heavy" therefore takes more place
on the hard-drive
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#338?email_source=notifications&email_token=AAQ4F2DZPKIAVOXQGRTKJRTRBWXOVA5CNFSM4KRHRAA2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOELEHDHY#issuecomment-583561631>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAQ4F2E6IWGYC6MWQKAHIO3RBWXOVANCNFSM4KRHRAAQ>
.
--
Ryan Leary
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Block pending discussion
Please have a look at the most recent update. I have attempted to address all feedback/concerns, while maintaining flexibility. To create a release container that is reproducible, you can use this dockerfile and build the image in the following way: RELEASE_TAG=v0.9.0
git clone git@github.com:NVIDIA/NeMo.git && cd NeMo
git checkout tags/$RELEASE_TAG
docker build --build-arg NEMO_VERSION=$(git describe --tags) -t nvcr.io/nvidia/nemo:$RELEASE_TAG The same process works for a public release, as does a nightly/master build, as does simply building an image that is based on a local branch that we want to push and test on another system. The (nemo) ➜ NeMo git:(fix-dockerfile) docker build -t this-will-fail-because-no-nemo-version .
[+] Building 2.7s (16/23)
[...]
=> CACHED [nemo-deps 8/8] RUN pip install --disable-pip-version-check --no-cache-dir -r requirements.txt 0.0s
=> ERROR [nemo 1/8] RUN /usr/bin/test -n "$NEMO_VERSION" && /bin/echo "export NEMO_VERSION=${NEMO_VERSION}" >> /root/.bashrc && /bin/echo "export BASE_IMAGE=nvcr.io/nvidia/pytorch:20.01-py3" >> /root/.bashrc 0.8s
[...] Within the container, at runtime, environment variables are set that specify the base image used as well as the commit/tag that was used. root@1992d9e98eea:/workspace/nemo# env | grep -E 'NEMO_VERSION|BASE_IMAGE'
NEMO_VERSION=v0.9.0-601-ga977f38f
BASE_IMAGE=nvcr.io/nvidia/pytorch:20.01-py3 Between the mandatory build argument and inclusion as a runtime environment variable, I believe this provides adequate traceability as to what code was used to create the image. I would argue, in fact, that it's more traceable than the proposed alternative in this thread, because you could, in theory, use an arbitrary version of the Dockerfile to generate an image for a different tag. While a copy of the source into Docker is done, it is done in a scratch layer and uses multi-stage builds + experimental Docker run mount to ensure the source is not included in the final image. Other modifications have been made within the image to ensure the lightest possible end container. The proposed Dockerfile generates an image that is ~200MB larger than the base NGC PyTorch container. The current Dockerfile generates an image that is over 2GB larger than the base NGC container. For development purposes, you can create an image that contains only the NeMo dependencies and then map the local source tree into the container at runtime by selecting the appropriate target: # build a dev image that doesnt have to change unless the dependencies do
docker build --build-arg NEMO_VERSION=dev --target nemo-deps -t nemo:20.01-py3-devel .
# checkout any branch you like
git checkout wip-add-new-feature
docker run -it --rm --gpus all -v $(pwd):/workspace/nemo nemo:20.01-py3-devel
pip install -e .[all] Any changes made in the source will be reflected both inside and outside the container, and NeMo can be reinstalled as needed. |
Signed-off-by: Ryan Leary <rleary@nvidia.com>
7534a71
to
a1ebf56
Compare
The recent change included in the PR to fix unit tests introduces a regression/bug in the Dockerfile. With the Dockerfile you should be able to build the currently checked out version (and any local changes).
The current Dockerfile doesn't support this -- basically mandates that a Docker image may only be made from a branch of the main repository.
Signed-off-by: Ryan Leary rleary@nvidia.com