Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements to documentation and docker setups from my getting spun up #101

Closed
wants to merge 12 commits into from

Conversation

tfoote
Copy link
Collaborator

@tfoote tfoote commented Nov 30, 2021

This is an aggregate of my making progress on learning how things are working and trying to streamline things a bit.

The usage of rocker requires a new feature I just added to work around the upstream image setup: osrf/rocker#164

Copy link
Contributor

@chapulina chapulina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The usage of rocker requires a new feature I just added

Do we need to wait for a release before merging this PR?

README.md Outdated Show resolved Hide resolved
docker/build_and_run_docker.sh Outdated Show resolved Hide resolved
--security-opt seccomp=unconfined \
$DOCKER_OPTS \
mbari_lrauv_debug
rocker --nvidia --x11 --user --user-override-name=developer --user-preserve-home -- mbari_lrauv_tests tmuxinator start debug_and_plot -n debug_session -p src/lrauv/docker/tests/debug_integration_mux.yml
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll leave it up to @arjo129 to check if this covers his use case

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good News: The system launches correctly with the correct panes.

Bad News: I may be doing something wrong but when I run via rocker I get the following error:

[GUI] [Err] [Ogre2RenderEngine.cc:905]  Unable to create the rendering window: OGRE EXCEPTION(3:RenderingAPIException): OpenGL 3.3 is not supported. Please update your graphics card drivers. in GL3PlusRenderSyste
m::initialiseContext at /var/lib/jenkins/workspace/ogre-2.2-debbuilder/repo/RenderSystems/GL3Plus/src/OgreGL3PlusRenderSystem.cpp (line 3364)                                                                       
[GUI] [Err] [Ogre2RenderEngine.cc:913] Unable to create the rendering window after [11] attempts.                                                                                                                   
[GUI] [Err] [Ogre2RenderEngine.cc:834] Failed to create dummy render window.

Might be some (nvidia?) driver issue or a problem with the way I setup rocker 🤔 .
Also is the --nvidia really nessecary? What about scenarios where we aren't using an nvidia card. I don't think we really need cuda acceleration.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is your OS and graphics card and graphics card driver? It's possible if you're using a too new version

The --nvidia option is mounting the nvidia graphic card into docker to give it access to do hardware accelerated rendering. The cuda support is still in a pending PR.

This option means that you get " --gpus all " injected into your docker arguments and the nvidia runtime driver requirements are injected into the container using multi stage container builds. (instead of needing to explicitly extend from it at the top of the Dockerfile.) It's replacing

https://github.com/osrf/lrauv/pull/101/files#diff-c826ab2996e92237b4bb80b3cee5f11d231efb7ddb40884fa5bfd2bbe4dcd4b7L54

Before docker added the --gpus option for using Intel graphics cards you need to use --devices /dev/dri/card0 instead of the --runtime=nvidia But I think now having the nvidia drivers installed is just extra space.

docker/tests/Dockerfile Show resolved Hide resolved
@tfoote
Copy link
Collaborator Author

tfoote commented Dec 1, 2021

I have released rocker 0.2.7 to get the new command line argument

Copy link
Contributor

@chapulina chapulina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I just have minor comments and I think we should wait for @arjo129 's approval too.

README.md Outdated
@@ -24,7 +24,9 @@ Optionally, you may choose to build this repository using Docker, for
convenience.
Make sure you have a recent version of [Docker](https://docs.docker.com/) and
[nvidia-docker](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker)
installed. Next to get started simply run the following command.
installed.
And [rocker](https://github.com/osrf/rocker) available as debian package `python3-rocker` with the ROS packages.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: this project doesn't use ROS, so users landing here may not have ROS packages setup. They may setup it at later stages to get colcon, but up here on the README, it may not be clear how to set it up.

I always prefer to err on the side of explicitness in this situation to avoid confusing new users who may not be familiar with ROS packages. So I'd suggest either removing the part about "debian package python3-rocker with the ROS packages" and letting the rocker entry page explain how to set it up, or repeating the step-by-step here on the README.

cd docker
docker build -t mbari_lrauv -f empty_world/Dockerfile ..
rocker --nvidia --x11 --user mbari_lrauv bash
ign launch lrauv_world.ign
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The launch file was removed in #102 , we should use plain ign gazebo commands now

wget
wget \
&& rm -rf /var/lib/apt/lists/* \
&& apt-get -qq clean
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth it cleaning it here, considering we'll do an update below again?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this removes the contents of the apt cache from the layer, making the docker container much slimmer. Otherwise you just end up with outdated caches in every layer of the container.

Copy link
Member

@arjo129 arjo129 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for these changes.

I've been unable to get rocker working on my comp correctly. Could be some GPU setup issue. I'm worrying if we really need the --nvidia flag. There are some minor knits in regards to style.

# setup ignition environment
ENTRYPOINT ["/entrypoint.bash"]

CMD ign launch lrauv_world.ign
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor Knit: New line at EOF.

run: docker build -t mbari_lrauv . -f docker/tests/Dockerfile
- name: Run tests
run: docker run --rm mbari_lrauv
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor Knit: Newline at EOF.

--security-opt seccomp=unconfined \
$DOCKER_OPTS \
mbari_lrauv_debug
rocker --nvidia --x11 --user --user-override-name=developer --user-preserve-home -- mbari_lrauv_tests tmuxinator start debug_and_plot -n debug_session -p src/lrauv/docker/tests/debug_integration_mux.yml
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good News: The system launches correctly with the correct panes.

Bad News: I may be doing something wrong but when I run via rocker I get the following error:

[GUI] [Err] [Ogre2RenderEngine.cc:905]  Unable to create the rendering window: OGRE EXCEPTION(3:RenderingAPIException): OpenGL 3.3 is not supported. Please update your graphics card drivers. in GL3PlusRenderSyste
m::initialiseContext at /var/lib/jenkins/workspace/ogre-2.2-debbuilder/repo/RenderSystems/GL3Plus/src/OgreGL3PlusRenderSystem.cpp (line 3364)                                                                       
[GUI] [Err] [Ogre2RenderEngine.cc:913] Unable to create the rendering window after [11] attempts.                                                                                                                   
[GUI] [Err] [Ogre2RenderEngine.cc:834] Failed to create dummy render window.

Might be some (nvidia?) driver issue or a problem with the way I setup rocker 🤔 .
Also is the --nvidia really nessecary? What about scenarios where we aren't using an nvidia card. I don't think we really need cuda acceleration.

@tfoote
Copy link
Collaborator Author

tfoote commented Dec 2, 2021

Is it worth it cleaning it here, considering we'll do an update below again?

Yes, the standard practice when building dockerfiles is to clear the cache at every layer. This has two major reasons. First is that it keeps the image size down. Every file change in the system at the end of a step is saved into the layer. So if you do 3 apt-get updates in three consecutive layers you will get three copies of the cache in each layer assuming they're slightly different because they were pulled at different times. And secondly, because docker caches previous layers, you have no knowledge of how long ago the last update was so you are going to have to do another update. And the cost of the incremental update versus a fresh update is small, and gives you a much more consistent result.

@tfoote tfoote requested a review from arjo129 December 2, 2021 09:47
@mabelzhang
Copy link
Collaborator

mabelzhang commented Dec 3, 2021

Butting in with a nitpick request in my head for a while - can we rename this image so that it doesn't start with mbari? Technically, this is not mbari's image, so it might be called osrf_lrauv (name of this repo) or something other than MBARI LRAUV.

Confusion arises because we host a different, official image on MBARI DockerHub that has the MBARI binary in it: https://hub.docker.com/r/mbari/lrauv-ignition-sim When you pull it, it's called mbari/lrauv-ignition-sim. At the end of the day, I end up with a handful of images under my docker images, all called some combination of MBARI LRAUV (the official no-MBARI-source one which I push to DockerHub, the with-source one that's a dependency, and my personal development one. Not counting this one in the osrf repo).

The more we can disambiguate these names, the less confusing it will be for ourselves.

@tfoote
Copy link
Collaborator Author

tfoote commented Dec 6, 2021

Good idea @mabelzhang I've updated it with the osrf_lrauv to mirror the repo name for clarity.

@tfoote
Copy link
Collaborator Author

tfoote commented Dec 6, 2021

I worked with @arjo129 on slack and we isolated his crashing to be related to the integrated AMD card getting selected and his prime-select settings seemingly not going into the environment.

@tfoote tfoote self-assigned this Dec 7, 2021

# Clean up apt
RUN rm -rf /var/lib/apt/lists/* \
ignition-fortress\
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ignition-fortress\
ignition-garden\

wget
wget \
&& rm -rf /var/lib/apt/lists/* \
&& apt-get -qq clean

# Add Ignition's latest packages, which may be more up-to-date than the ones from the MBARI image
RUN /bin/sh -c 'echo "deb http://packages.osrfoundation.org/gazebo/ubuntu-stable `lsb_release -cs` main" > /etc/apt/sources.list.d/gazebo-stable.list' && \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to add the nightly repo now, for Garden

@caguero caguero mentioned this pull request Jan 19, 2022
39 tasks
@caguero caguero added this to the 2022 M1 milestone Jan 19, 2022
@caguero caguero mentioned this pull request Nov 2, 2021
40 tasks
Signed-off-by: Tully Foote <tfoote@osrfoundation.org>
Signed-off-by: Tully Foote <tfoote@osrfoundation.org>
Signed-off-by: Tully Foote <tfoote@osrfoundation.org>
Signed-off-by: Tully Foote <tfoote@osrfoundation.org>
Signed-off-by: Tully Foote <tfoote@osrfoundation.org>
Signed-off-by: Tully Foote <tfoote@osrfoundation.org>
Signed-off-by: Tully Foote <tfoote@osrfoundation.org>
Signed-off-by: Tully Foote <tfoote@osrfoundation.org>
Signed-off-by: Tully Foote <tfoote@osrfoundation.org>
Signed-off-by: Tully Foote <tfoote@osrfoundation.org>
Signed-off-by: Tully Foote <tfoote@osrfoundation.org>
Following up feedback in review

Signed-off-by: Tully Foote <tfoote@osrfoundation.org>
@tfoote
Copy link
Collaborator Author

tfoote commented Jan 28, 2022

I've rebased this for garden.

With the new rebuilds of the garden image with different UIDs it needs osrf/rocker#168

@hidmic
Copy link
Collaborator

hidmic commented Feb 17, 2022

I worked with @arjo129 on slack and we isolated his crashing to be related to the integrated AMD card getting selected and his prime-select settings seemingly not going into the environment.

Circling back here. I did not find anything particularly unusual in rocker. @arjo129 I'll assume that you've prime-select'ed nvidia. Could it be an issue with your LD_LIBRARY_PATH (NVIDIA/nvidia-docker#612)? Would you mind giving it a shot?

@hidmic
Copy link
Collaborator

hidmic commented Feb 18, 2022

By that I meant we try the fix proposed in NVIDIA/nvidia-docker#612 (comment).

@caguero
Copy link
Collaborator

caguero commented Jun 28, 2022

Closing this PR for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants