Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: broken is_docker check #8711

Merged
merged 2 commits into from
Jul 26, 2022
Merged

fix: broken is_docker check #8711

merged 2 commits into from
Jul 26, 2022

Conversation

maxstrobel
Copy link
Contributor

@maxstrobel maxstrobel commented Jul 25, 2022

Checking if /workspace exists is not a reliable method to check if
the process runs in a docker container.

Reusing the logic from the npm "is-docker" package to check if the
process runs in a container.
References: https://github.com/sindresorhus/is-docker/blob/main/index.js

Fixes #8710.

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Enhanced detection of Docker environments in YOLOv5 utility functions.

📊 Key Changes

  • Improved is_docker utility function to accurately determine if running inside a Docker container.
  • The method now checks for /.dockerenv file and inspects /proc/self/cgroup for the presence of "docker".

🎯 Purpose & Impact

  • The update ensures that YOLOv5 can more reliably detect Docker environments, which is crucial for certain setups and configurations.
  • Users running YOLOv5 in Docker containers may experience more tailored behavior or optimizations based on the improved environment detection. 🐳💡

@glenn-jocher
Copy link
Member

glenn-jocher commented Jul 25, 2022

@maxstrobel thanks for the PR! Have you tested this in the Docker image and it returns True?

It looks like I used the Path('/.dockerenv').exists() method originally but abandoned it for some reason as it's commented in the current function.

Could you merge the 3 functions into 1 since they are all similarly themed? Thanks!

@maxstrobel
Copy link
Contributor Author

maxstrobel commented Jul 26, 2022

@glenn-jocher

Yes, it worked in my tests.

I intentionally divided the logic into those functions. Python allows short circuiting on boolean operators (https://docs.python.org/3/library/stdtypes.html#boolean-operations-and-or-not) - this means we can evaluate first the cheap has_docker_env() and if this function returns True, we do not run the second function.
I'm afraid that if we squeeze everything in one function, the same becomes very unreadable and complicated to understand.

@glenn-jocher
Copy link
Member

@maxstrobel got it. How about dropping the two has_ functions the is_docker() function since they're not used anywhere else?

@maxstrobel
Copy link
Contributor Author

def is_docker() -> bool:
    """Check if the process runs inside a docker container."""
    result = Path('/.dockerenv').exists()
    try:
        with open("/proc/self/cgroup") as file:
            result = result or any("docker" in line for line in file)
    except OSError:
        pass
    return result

This could be a solution within one function, but I see with this solution some drawbacks:

  1. The file /proc/self/cgroup is always opened - before it was only opened when /.dockerenv does not exist.
  2. You can test the separate function easier.
  3. Less readable.

I'd propose to go with the 3 function approach - it's cleaner from a software architecture point of view. We could make the two has functions also private:

def _has_docker_env() -> bool:
    """Check if /.dockerenv exists."""
    return Path('/.dockerenv').exists()


def _has_docker_cgroup() -> bool:
    """Check if docker is in control groups."""
    try:
        with open("/proc/self/cgroup") as file:
            return any("docker" in line for line in file)
    except OSError:
        return False


def is_docker() -> bool:
    """Check if the process runs inside a docker container."""
    return _has_docker_env() or _has_docker_cgroup()

@glenn-jocher
Copy link
Member

glenn-jocher commented Jul 26, 2022

@maxstrobel how about this compromise? EDIT: nevermind, your earlier implementation is cleaner!

def is_docker() -> bool:
    """Check if the process runs inside a docker container."""

    def has_docker_cgroup() -> bool:
        """Check if docker is in control groups."""
        try:
            with open("/proc/self/cgroup") as file:
                return any("docker" in line for line in file)
        except OSError:
            return False

    return Path('/.dockerenv').exists() or has_docker_cgroup()

@glenn-jocher
Copy link
Member

glenn-jocher commented Jul 26, 2022

Or maybe: EDIT: nevermind, your earlier implementation is cleaner!

def is_docker() -> bool:
    """Check if the process runs inside a docker container."""
    if Path('/.dockerenv').exists():
        return True
    try:  # check if docker is in control groups
        with open("/proc/self/cgroup") as file:
            return any("docker" in line for line in file)
    except OSError:
        return False

@maxstrobel
Copy link
Contributor Author

@glenn-jocher
Now I'm a bit confused with the EDITs 😅

Which version should we take now?

@glenn-jocher
Copy link
Member

I took your short function and prevented the file opening script from running unless necessary:

def is_docker() -> bool:
    """Check if the process runs inside a docker container."""
    if Path('/.dockerenv').exists():
        return True
    try:  # check if docker is in control groups
        with open("/proc/self/cgroup") as file:
            return any("docker" in line for line in file)
    except OSError:
        return False

Checking if ``/workspace`` exists is not a reliable method to check if
the process runs in a docker container.

Reusing the logic from the npm "is-docker" package to check if the
process runs in a container.
References: https://github.com/sindresorhus/is-docker/blob/main/index.js

Fixes #8710.
@maxstrobel
Copy link
Contributor Author

@glenn-jocher

Got it! Seems like a good compromise. I updated the PR to the latest state.
Feel free to merge & close the ticket!

@glenn-jocher glenn-jocher merged commit 0b5ac22 into ultralytics:master Jul 26, 2022
@glenn-jocher
Copy link
Member

@maxstrobel PR is merged. Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐

@glenn-jocher
Copy link
Member

@maxstrobel just noticed an issue with this PR in #8694 (comment)

It looks like Google Colab is now being identified as a Docker environment. To reproduce:

from pathlib import Path
Path("/.dockerenv").exists()

This might be why I had the alternate implementation before and simply forgot about this problem.

@maxstrobel
Copy link
Contributor Author

maxstrobel commented Aug 1, 2022

@glenn-jocher - True, I can also confirm this issue.
I had a deeper look into Colab & it looks like the second part of the test (cgroups) would work.
I could remove the redundant part from the function & fix this behavior.

What do you think?

@glenn-jocher
Copy link
Member

@maxstrobel well, there's the question of whether Colab instances actually are running in Docker containers (or at least some sort of container). Maybe the function is working correctly.

If cgroups runs quickly though (and Colab is not actually running in a container) then it might make sense just use it directly.

@maxstrobel
Copy link
Contributor Author

@glenn-jocher Hard to say... I tried to find out what runs below Colab, but couldn't find good answers.

What I found out:

  1. There's an empty /.dockerenv file.
  2. No docker related things inside cgroup
    image

According to stackoverflow and a quick & dirty research, most people recommend checking the stuff under /proc as we do.

https://stackoverflow.com/questions/23513045/how-to-check-if-a-process-is-running-inside-docker-container
https://stackoverflow.com/questions/20010199/how-to-determine-if-a-process-runs-inside-lxc-docker

We could also specify an env variable in the Dockerfile, which we then can use to check if we are inside a container.

ctjanuhowski pushed a commit to ctjanuhowski/yolov5 that referenced this pull request Sep 8, 2022
Checking if ``/workspace`` exists is not a reliable method to check if
the process runs in a docker container.

Reusing the logic from the npm "is-docker" package to check if the
process runs in a container.
References: https://github.com/sindresorhus/is-docker/blob/main/index.js

Fixes ultralytics#8710.

Co-authored-by: Maximilian Strobel <Maximilian.Strobel@infineon.com>
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
@glenn-jocher
Copy link
Member

@maxstrobel based on your findings and the community suggestions, it seems like sticking with the current implementation that checks the /proc directory for the Docker environment is the most reliable approach. I'd recommend keeping the function as-is, and we can include a note in the documentation regarding the observed behavior in Google Colab. Thank you for the thorough investigation!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Defective is_docker
2 participants