Fix Dockerfile to include checked out version #338

ryanleary · 2020-02-07T02:51:43Z

The recent change included in the PR to fix unit tests introduces a regression/bug in the Dockerfile. With the Dockerfile you should be able to build the currently checked out version (and any local changes).

The current Dockerfile doesn't support this -- basically mandates that a Docker image may only be made from a branch of the main repository.

Signed-off-by: Ryan Leary rleary@nvidia.com

Dockerfile

DEKHTIARJonathan · 2020-02-07T19:15:54Z

You do not want to do this for multiple reasons:

You do not want to copy your local directory in the project. It makes the container build process non reproducible. Installing from PyPI or Github should be the only way acceptable for a release container.
You do not want to install a package in development mode: pip -e ".[all]". It's a risky and non stable way to provide a package to a user.
If you need nemo source inside the container, for dev for instance, you should mount the directory as a docker volume in the container and pip --upgrade -e ".[all]"
It makes the docker container more "heavy" therefore takes more place on the hard-drive.

If you want to build a nightly or latest container, then you directly pull the latest version from github

ryanleary · 2020-02-07T19:51:54Z

1. You can and should be able to build a container from an arbitrary branch of source. 2. Agreed. The pr here is moving to just pip install .all. Not in dev/editable mode 3. Sure we can remove the source as part of the build except for the example/tutorial scripts with a multistage build. 4. Yes. Light containers are good. We could probably strip some dependencies out too if we care so much about this. The Nemo source is realistically a very small fraction of the container size.

On Fri, Feb 7, 2020 at 2:15 PM Jonathan DEKHTIAR ***@***.***> wrote: You do not want to do this for multiple reason: 1. You do not want to copy your local directory in the project. It makes the container build process non reproducible. 2. You do not want to install a package in development mode: pip -e ".[all]". It's a risky and non stable way to provide a package to a user. 3. If you need nemo source inside the container, for dev for instance, you should mount the directory as a docker volume in the container and pip --upgrade -e ".[all]" 4. It makes the docker container more "heavy" therefore takes more place on the hard-drive — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#338?email_source=notifications&email_token=AAQ4F2DZPKIAVOXQGRTKJRTRBWXOVA5CNFSM4KRHRAA2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOELEHDHY#issuecomment-583561631>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAQ4F2E6IWGYC6MWQKAHIO3RBWXOVANCNFSM4KRHRAAQ> .

-- Ryan Leary

blisc

Block pending discussion

ryanleary · 2020-02-11T21:04:52Z

Please have a look at the most recent update. I have attempted to address all feedback/concerns, while maintaining flexibility.

To create a release container that is reproducible, you can use this dockerfile and build the image in the following way:

RELEASE_TAG=v0.9.0
git clone git@github.com:NVIDIA/NeMo.git && cd NeMo
git checkout tags/$RELEASE_TAG
docker build --build-arg NEMO_VERSION=$(git describe --tags) -t nvcr.io/nvidia/nemo:$RELEASE_TAG

The same process works for a public release, as does a nightly/master build, as does simply building an image that is based on a local branch that we want to push and test on another system. The NEMO_VERSION build arg is mandatory and the docker image build fails if it is not specified.

(nemo) ➜  NeMo git:(fix-dockerfile) docker build -t this-will-fail-because-no-nemo-version .
[+] Building 2.7s (16/23)
[...]
 => CACHED [nemo-deps 8/8] RUN pip install --disable-pip-version-check --no-cache-dir -r requirements.txt                                                                                                                                                                                                           0.0s
 => ERROR [nemo 1/8] RUN /usr/bin/test -n "$NEMO_VERSION" &&     /bin/echo "export NEMO_VERSION=${NEMO_VERSION}" >> /root/.bashrc &&     /bin/echo "export BASE_IMAGE=nvcr.io/nvidia/pytorch:20.01-py3" >> /root/.bashrc                                                                                            0.8s
[...]

Within the container, at runtime, environment variables are set that specify the base image used as well as the commit/tag that was used.

root@1992d9e98eea:/workspace/nemo# env | grep -E 'NEMO_VERSION|BASE_IMAGE'
NEMO_VERSION=v0.9.0-601-ga977f38f
BASE_IMAGE=nvcr.io/nvidia/pytorch:20.01-py3

Between the mandatory build argument and inclusion as a runtime environment variable, I believe this provides adequate traceability as to what code was used to create the image. I would argue, in fact, that it's more traceable than the proposed alternative in this thread, because you could, in theory, use an arbitrary version of the Dockerfile to generate an image for a different tag.

While a copy of the source into Docker is done, it is done in a scratch layer and uses multi-stage builds + experimental Docker run mount to ensure the source is not included in the final image. Other modifications have been made within the image to ensure the lightest possible end container. The proposed Dockerfile generates an image that is ~200MB larger than the base NGC PyTorch container. The current Dockerfile generates an image that is over 2GB larger than the base NGC container.

For development purposes, you can create an image that contains only the NeMo dependencies and then map the local source tree into the container at runtime by selecting the appropriate target:

# build a dev image that doesnt have to change unless the dependencies do
docker build --build-arg NEMO_VERSION=dev --target nemo-deps -t nemo:20.01-py3-devel .
# checkout any branch you like
git checkout wip-add-new-feature
docker run -it --rm --gpus all -v $(pwd):/workspace/nemo nemo:20.01-py3-devel
pip install -e .[all]

Any changes made in the source will be reflected both inside and outside the container, and NeMo can be reinstalled as needed.

Discussion has happened

Signed-off-by: Ryan Leary <rleary@nvidia.com>

blisc approved these changes Feb 7, 2020

View reviewed changes

blisc reviewed Feb 7, 2020

View reviewed changes

Dockerfile Show resolved Hide resolved

blisc self-requested a review February 10, 2020 23:13

blisc previously requested changes Feb 10, 2020

View reviewed changes

okuchaiev previously approved these changes Feb 11, 2020

View reviewed changes

ryanleary dismissed okuchaiev’s stale review via 9cb9aa8 February 11, 2020 20:47

Fix Dockerfile to include checked out version

a1ebf56

Signed-off-by: Ryan Leary <rleary@nvidia.com>

ryanleary force-pushed the fix-dockerfile branch from 7534a71 to a1ebf56 Compare February 11, 2020 21:34

okuchaiev approved these changes Feb 11, 2020

View reviewed changes

okuchaiev merged commit 53c5da4 into NVIDIA:master Feb 11, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix Dockerfile to include checked out version #338

Fix Dockerfile to include checked out version #338

ryanleary commented Feb 7, 2020

DEKHTIARJonathan commented Feb 7, 2020 •

edited

Loading

ryanleary commented Feb 7, 2020 via email

blisc left a comment

ryanleary commented Feb 11, 2020 •

edited

Loading

Fix Dockerfile to include checked out version #338

Fix Dockerfile to include checked out version #338

Conversation

ryanleary commented Feb 7, 2020

DEKHTIARJonathan commented Feb 7, 2020 • edited Loading

ryanleary commented Feb 7, 2020 via email

blisc left a comment

Choose a reason for hiding this comment

ryanleary commented Feb 11, 2020 • edited Loading

DEKHTIARJonathan commented Feb 7, 2020 •

edited

Loading

ryanleary commented Feb 11, 2020 •

edited

Loading