[ML] Upgrade PyTorch to version 1.13.1 #2430

Merged (9 commits) on Jan 10, 2023
8 changes: 8 additions & 0 deletions CONTRIBUTING.md
@@ -92,6 +92,14 @@ are cloned in adjacent directories then you could run this in the `elasticsearch
./gradlew :x-pack:plugin:ml:qa:native-multi-node-tests:javaRestTest --tests "org.elasticsearch.xpack.ml.integration.MlJobIT" --include-build ../ml-cpp
```

Before using `--include-build` for the first time when building the C++ together with Elasticsearch
it's necessary to run this command in the `elasticsearch` directory to verify the extra components
used in the C++ build:

```
./gradlew --write-verification-metadata sha256 help --include-build ~/ml-cpp
```

## Pull Requests

Every change made to ml-cpp must be held to a high standard. Pull Requests are equally important, as they document the changes and decisions that have been made. `You Know, for Search` - a descriptive and relevant summary of the change helps people to find your PR later on.
5 changes: 4 additions & 1 deletion bin/pytorch_inference/CCmdLineParser.cc
@@ -12,6 +12,8 @@

#include <ver/CBuildInfo.h>

#include <torch/csrc/api/include/torch/version.h>

#include <boost/program_options.hpp>

#include <iostream>
@@ -85,7 +87,8 @@ bool CCmdLineParser::parse(int argc,
return false;
}
if (vm.count("version") > 0) {
std::cerr << ver::CBuildInfo::fullInfo() << std::endl;
std::cerr << "PyTorch Version " << TORCH_VERSION << std::endl
Member

👍

<< ver::CBuildInfo::fullInfo() << std::endl;
return false;
}
if (vm.count("modelid") > 0) {
63 changes: 51 additions & 12 deletions build-setup/linux.md
@@ -249,14 +249,53 @@ sudo ./cmake-3.23.2-Linux-x86_64.sh --skip-license --prefix=/usr/local/cmake

Please ensure `/usr/local/cmake/bin` is in your `PATH` environment variable.

### OpenSSL

Python 3.10 requires OpenSSL 1.1. No other version is acceptable.
Member

No OpenSSL 3.0?

OpenSSL 1.1.1 is LTS until 11th September 2023; hopefully we can update next time.

Contributor Author

Yes, next time we upgrade Python presumably a different version will be needed.

But it's important to remember that we only use OpenSSL to build Python and we only use Python to build PyTorch. It's not like we use them in every build or ship them. As long as the version we've used is secure on the day we build PyTorch it's good enough.


If the `openssl-devel` package for your distribution happens to be version 1.1 then you can skip this step. Otherwise, you need to build OpenSSL 1.1 from source.

Download `openssl-1.1.1q.tar.gz` from <https://www.openssl.org/source/old/1.1.1/openssl-1.1.1q.tar.gz>, then build as follows:

```
tar zxvf openssl-1.1.1q.tar.gz
cd openssl-1.1.1q
./Configure --prefix=/usr/local/gcc103 shared linux-`uname -m`
make
make install
```
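
To decide whether the source build above can be skipped, the installed version can be checked first. The following is a minimal sketch and not part of this PR: the function name `is_openssl_1_1` is hypothetical, and it assumes that the `openssl` command-line tool reports the same version as the installed `openssl-devel` package, which may not hold on every distribution.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when a version string, in the format printed
# by `openssl version` (for example "OpenSSL 1.1.1q  5 Jul 2022"), names a
# 1.1.x release.
is_openssl_1_1() {
    case "$1" in
        "OpenSSL 1.1."*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: inspect the system OpenSSL, if the command is present at all.
if command -v openssl >/dev/null 2>&1; then
    if is_openssl_1_1 "$(openssl version)"; then
        echo "System OpenSSL is 1.1.x: the source build can probably be skipped"
    else
        echo "System OpenSSL is not 1.1.x: build OpenSSL 1.1 from source"
    fi
fi
```

The check only looks at the version string, so it says nothing about whether the development headers are actually installed.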

### Python

PyTorch currently requires Python 3.6, 3.7 or 3.8, and version 3.7 appears to cause fewest problems in their test status matrix, so we use that. If your system does not have a requisite version of Python install it with a package manager or build the last 3.7 release from source by downloading `Python-3.7.9.tgz` from <https://www.python.org/ftp/python/3.7.9/Python-3.7.9.tgz> then extract and build:
PyTorch currently requires Python 3.7 or higher; we use version 3.10. If your system does not have a requisite version of Python, install it with a package manager or build the last 3.10 release from source by downloading `Python-3.10.9.tgz` from <https://www.python.org/ftp/python/3.10.9/Python-3.10.9.tgz>, then extract as follows:

```
tar xzf Python-3.10.9.tgz
cd Python-3.10.9
```

If the distribution you are building on uses OpenSSL 1.1 as its built in OpenSSL version then configure as follows:

```
tar -xzf Python-3.7.9.tgz
cd Python-3.7.9
./configure --prefix=/usr/local/gcc103 --enable-optimizations
```

If you had to build OpenSSL 1.1 yourself then on x86_64 configure like this:

```
sed -i -e 's~ssldir/lib~ssldir/lib64~' configure
./configure --prefix=/usr/local/gcc103 --enable-optimizations --with-openssl=/usr/local/gcc103 --with-openssl-rpath=/usr/local/gcc103/lib64
```

or on aarch64 configure like this:

```
./configure --prefix=/usr/local/gcc103 --enable-optimizations --with-openssl=/usr/local/gcc103 --with-openssl-rpath=/usr/local/gcc103/lib
```
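
The only difference between the two self-built OpenSSL cases above is the library directory passed to `--with-openssl-rpath`. As a rough sketch of that choice (the function name `python_configure_cmd` and the `PREFIX` variable are hypothetical and not part of the repository), the command can be derived from the architecture. The sketch only prints the `configure` command; it does not run it and does not apply the `sed` patch that the x86_64 case also needs.

```shell
#!/bin/sh
# Assumption: OpenSSL 1.1 was built from source into this prefix.
PREFIX=/usr/local/gcc103

# Hypothetical helper: prints the Python configure command for the given
# architecture, choosing lib64 on x86_64 and lib on aarch64.
python_configure_cmd() {
    arch="$1"
    case "$arch" in
        x86_64)  ssl_libdir="$PREFIX/lib64" ;;
        aarch64) ssl_libdir="$PREFIX/lib" ;;
        *) echo "unsupported architecture: $arch" >&2; return 1 ;;
    esac
    echo "./configure --prefix=$PREFIX --enable-optimizations --with-openssl=$PREFIX --with-openssl-rpath=$ssl_libdir"
}

# Usage: show the command for the current machine without running it.
python_configure_cmd "$(uname -m)" || true
```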

Finally, build as follows:

```
make
sudo make altinstall
```
@@ -280,24 +319,26 @@ Then copy the shared libraries to the system directory:
sudo cp /opt/intel/mkl/lib/intel64/libmkl*.so /usr/local/gcc103/lib
```

### PyTorch 1.11.0
### PyTorch 1.13.1

(This step requires a reasonable amount of memory. It failed on a machine with 8GB of RAM. It succeeded on a 16GB machine.)

PyTorch requires that certain Python modules are installed. Install these modules with `pip` using the same Python version you will build PyTorch with. If you followed the instructions above and built Python from source use `python3.7`:
PyTorch requires that certain Python modules are installed. Install these modules with `pip` using the same Python version you will build PyTorch with. If you followed the instructions above and built Python from source use `python3.10`:

```
sudo /usr/local/gcc103/bin/python3.7 -m pip install numpy ninja pyyaml setuptools cffi typing_extensions future six requests dataclasses
sudo /usr/local/gcc103/bin/python3.10 -m pip install numpy ninja pyyaml setuptools cffi typing_extensions future six requests dataclasses
```

For aarch64 the `ninja` module is not available, so use:

```
sudo /usr/local/gcc103/bin/python3.7 -m pip install numpy pyyaml setuptools cffi typing_extensions future six requests dataclasses
sudo /usr/local/gcc103/bin/python3.10 -m pip install numpy pyyaml setuptools cffi typing_extensions future six requests dataclasses
```

Then obtain the PyTorch code:

```
git clone --depth=1 --branch=v1.11.0 git@github.com:pytorch/pytorch.git
git clone --depth=1 --branch=v1.13.1 git@github.com:pytorch/pytorch.git
cd pytorch
git submodule sync
git submodule update --init --recursive
@@ -325,11 +366,9 @@ export USE_MKLDNN=ON
export USE_QNNPACK=OFF
export USE_PYTORCH_QNNPACK=OFF
[ $(uname -m) = x86_64 ] && export USE_XNNPACK=OFF
# Breakpad is undesirable as it causes libtorch_cpu to have an executable stack
export USE_BREAKPAD=OFF
export PYTORCH_BUILD_VERSION=1.11.0
export PYTORCH_BUILD_VERSION=1.13.1
export PYTORCH_BUILD_NUMBER=1
/usr/local/gcc103/bin/python3.7 setup.py install
/usr/local/gcc103/bin/python3.10 setup.py install
```

Once built copy headers and libraries to system directories:
35 changes: 14 additions & 21 deletions build-setup/macos.md
@@ -17,7 +17,7 @@ For example, you might create a `.bashrc` file in your home directory containing
```
umask 0002
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_121.jdk/Contents/Home
export PYTHONHOME=/Library/Frameworks/Python.framework/Versions/3.7
export PYTHONHOME=/Library/Frameworks/Python.framework/Versions/3.10
export PATH=$JAVA_HOME/bin:$PYTHONHOME/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
# Only required if building the C++ code directly using cmake - adjust depending on the location of your Git clone
export CPP_SRC_HOME=$HOME/ml-cpp
@@ -50,12 +50,13 @@ The above environment variables only need to be set when building tools on macOS

The first major piece of development software to install is Apple's development environment, Xcode, which can be downloaded from <https://developer.apple.com/download/> . You will need to register as a developer with Apple. Alternatively, you can get the latest version of Xcode from the App Store.

For C++17 Xcode 10 is required, and this requires macOS High Sierra or above. Therefore you must be running macOS High Sierra (10.13) or above to build the Machine Learning C++ code.
For C++17 Xcode 10 is required, and this requires macOS High Sierra or above. Therefore you must be running macOS High Sierra (10.13) or above to build the Machine Learning C++ code. The binary download of Python 3.10 requires macOS Big Sur or above, so to follow the instructions in this page to the letter you'll need macOS Big Sur (11) or above.

- If you are using High Sierra, you must install Xcode 10.1.x
- If you are using Mojave, you must install Xcode 11.3.x
- If you are using Catalina, you must install Xcode 12.4.x
- If you are using Big Sur or Monterey, you must install Xcode 13.1.x or above
- If you are using Big Sur, you must install Xcode 13.2.x
- If you are using Monterey or Ventura, you must install Xcode 14.2.x or above

Xcode is distributed as a `.xip` file; simply double click the `.xip` file to expand it, then drag `Xcode.app` to your `/Applications` directory.
(Older versions of Xcode can be downloaded from [here](https://developer.apple.com/download/more/), provided you are signed in with your Apple ID.)
@@ -120,26 +121,26 @@ sudo mkdir -p /usr/local/bin
sudo ln -s /Applications/CMake.app/Contents/bin/cmake /usr/local/bin/cmake
```

### Python 3.7
### Python 3.10

PyTorch currently requires Python 3.6, 3.7 or 3.8, and version 3.7 appears to cause fewest problems in their test status matrix, so we use that.
PyTorch currently requires Python 3.7 or higher; we use version 3.10.

Download the graphical installer for Python 3.7.9 from <https://www.python.org/ftp/python/3.7.9/python-3.7.9-macosx10.9.pkg>.
Download the graphical installer for Python 3.10.9 from <https://www.python.org/ftp/python/3.10.9/python-3.10.9-macosx11.pkg>.

Install using all the default options. When the installer completes a Finder window pops up. Double click the `Install Certificates.command` file in this folder to install the SSL certificates Python needs.

### PyTorch 1.11.0
### PyTorch 1.13.1

PyTorch requires that certain Python modules are installed. To install them:

```
sudo /Library/Frameworks/Python.framework/Versions/3.7/bin/pip3.7 install numpy ninja pyyaml setuptools cffi typing_extensions future six requests dataclasses
sudo /Library/Frameworks/Python.framework/Versions/3.10/bin/pip3.10 install numpy ninja pyyaml setuptools cffi typing_extensions future six requests dataclasses
```

Then obtain the PyTorch code:

```
git clone --depth=1 --branch=v1.11.0 https://github.com/pytorch/pytorch.git
git clone --depth=1 --branch=v1.13.1 https://github.com/pytorch/pytorch.git
cd pytorch
git submodule sync
git submodule update --init --recursive
@@ -153,9 +154,8 @@ a heuristic virus scanner looking for potentially dangerous function
calls in our shipped product will not encounter these functions that run
external processes.

Edit `tools/setup_helpers/cmake.py b/tools/setup_helpers/cmake.py` and
add `'DNNL_TARGET_ARCH'` to the list of environment variables that get
passed through to CMake (around line 267).
Edit `tools/setup_helpers/cmake.py` and add `"DNNL_TARGET_ARCH"` to the list
of environment variables that get passed through to CMake (around line 215).

Build as follows:

@@ -170,16 +170,9 @@ export USE_MKLDNN=ON
export USE_QNNPACK=OFF
export USE_PYTORCH_QNNPACK=OFF
[ $(uname -m) = x86_64 ] && export USE_XNNPACK=OFF
# TODO: avoid this by upgrading to Python 3.9 next time we rebuild
# dependencies. The build scripts misdetect the architecture on Apple
# M1 because we are using Python 3.7, which is x86_64 only on macOS,
# and the PyTorch build uses CMake and Ninja via Python. Python 3.9
# has a native arm64 build for macOS, so we should switch to that next
# time.
[ $(uname -m) != x86_64 ] && export CMAKE_OSX_ARCHITECTURES=`uname -m`
export PYTORCH_BUILD_VERSION=1.11.0
export PYTORCH_BUILD_VERSION=1.13.1
export PYTORCH_BUILD_NUMBER=1
/Library/Frameworks/Python.framework/Versions/3.7/bin/python3.7 setup.py install
/Library/Frameworks/Python.framework/Versions/3.10/bin/python3.10 setup.py install
```

Once built copy headers and libraries to system directories:
16 changes: 9 additions & 7 deletions build-setup/windows.md
@@ -177,13 +177,13 @@ lib -NOLOGO strptime.obj
copy strptime.lib C:\usr\local\lib
```

### Python 3.7
### Python 3.10

PyTorch currently requires Python 3.6, 3.7 or 3.8, and version 3.7 appears to cause fewest problems in their test status matrix, so we use that.
PyTorch currently requires Python 3.7 or higher; we use version 3.10.

Download the executable installer for Python 3.7.9 from <https://www.python.org/ftp/python/3.7.9/python-3.7.9-amd64.exe>.
Download the executable installer for Python 3.10.9 from <https://www.python.org/ftp/python/3.10.9/python-3.10.9-amd64.exe>.

Right click on the installer and "Run as administrator". (Note that elevating privileges during the install is not sufficient for the Python 3.7.9 installer; it needs to have elevated privileges when first run. Obviously this is bad practice, but that's the way it is in version 3.7.9.)
Right click on the installer and "Run as administrator". (Note that elevating privileges during the install is not sufficient for the Python 3.10.9 installer; it needs to have elevated privileges when first run. Obviously this is bad practice, but that's the way it is in version 3.10.9.)

On the first installer screen click "Customize installation". (Although "Install Now" seems like it would do the job, the "Install launcher for all users" option literally only installs the _launcher_ for all users, not Python itself.)

@@ -193,7 +193,9 @@ On the "Advanced Options" screen, check "Install for all users" and "Add Python

For the time being, do not take advantage of the option on the final installer screen to reconfigure the machine to allow paths longer than 260 characters. We still support Windows versions that do not have this option.

### PyTorch 1.11.0
### PyTorch 1.13.1

(This step requires a lot of memory. It failed on a machine with 12GB of RAM. It just about fitted on a 20GB machine. 32GB RAM is recommended.)

PyTorch requires that certain Python modules are installed. Start a command prompt "cmd.exe" using "Run as administrator". In it run:

@@ -207,7 +209,7 @@ Next, in a Git bash shell run:

```
cd /c/tools
git clone --depth=1 --branch=v1.11.0 git@github.com:pytorch/pytorch.git
git clone --depth=1 --branch=v1.13.1 git@github.com:pytorch/pytorch.git
cd pytorch
git submodule sync
git submodule update --init --recursive
@@ -258,7 +260,7 @@ set USE_QNNPACK=OFF
set USE_PYTORCH_QNNPACK=OFF
set USE_XNNPACK=OFF
set MSVC_Z7_OVERRIDE=OFF
set PYTORCH_BUILD_VERSION=1.11.0
set PYTORCH_BUILD_VERSION=1.13.1
set PYTORCH_BUILD_NUMBER=1
python setup.py install
```
2 changes: 1 addition & 1 deletion dev-tools/docker/build_linux_aarch64_cross_build_image.sh
@@ -22,7 +22,7 @@
HOST=docker.elastic.co
ACCOUNT=ml-dev
REPOSITORY=ml-linux-aarch64-cross-build
VERSION=9
VERSION=10

set -e

2 changes: 1 addition & 1 deletion dev-tools/docker/build_linux_aarch64_native_build_image.sh
@@ -34,7 +34,7 @@ sleep 5
HOST=docker.elastic.co
ACCOUNT=ml-dev
REPOSITORY=ml-linux-aarch64-native-build
VERSION=9
VERSION=10

set -e

2 changes: 1 addition & 1 deletion dev-tools/docker/build_linux_build_image.sh
@@ -34,7 +34,7 @@ sleep 5
HOST=docker.elastic.co
ACCOUNT=ml-dev
REPOSITORY=ml-linux-build
VERSION=24
VERSION=25

set -e

2 changes: 1 addition & 1 deletion dev-tools/docker/build_macosx_build_image.sh
@@ -22,7 +22,7 @@
HOST=docker.elastic.co
ACCOUNT=ml-dev
REPOSITORY=ml-macosx-build
VERSION=15
VERSION=16

set -e

2 changes: 1 addition & 1 deletion dev-tools/docker/linux_aarch64_cross_builder/Dockerfile
@@ -10,7 +10,7 @@
#

# Increment the version here when a new tools/3rd party components image is built
FROM docker.elastic.co/ml-dev/ml-linux-aarch64-cross-build:9
FROM docker.elastic.co/ml-dev/ml-linux-aarch64-cross-build:10

MAINTAINER David Roberts <dave.roberts@elastic.co>

2 changes: 1 addition & 1 deletion dev-tools/docker/linux_aarch64_cross_image/Dockerfile
@@ -24,7 +24,7 @@ RUN \
RUN \
mkdir -p /usr/local/sysroot-aarch64-linux-gnu/usr && \
cd /usr/local/sysroot-aarch64-linux-gnu/usr && \
wget --quiet -O - https://s3-eu-west-1.amazonaws.com/prelert-artifacts/dependencies/usr-aarch64-linux-gnu-9.tar.bz2 | tar jxf - && \
wget --quiet -O - https://s3-eu-west-1.amazonaws.com/prelert-artifacts/dependencies/usr-aarch64-linux-gnu-10.tar.bz2 | tar jxf - && \
cd .. && \
ln -s usr/lib lib && \
ln -s usr/lib64 lib64
2 changes: 1 addition & 1 deletion dev-tools/docker/linux_aarch64_native_builder/Dockerfile
@@ -10,7 +10,7 @@
#

# Increment the version here when a new tools/3rd party components image is built
FROM docker.elastic.co/ml-dev/ml-linux-aarch64-native-build:9
FROM docker.elastic.co/ml-dev/ml-linux-aarch64-native-build:10

MAINTAINER David Roberts <dave.roberts@elastic.co>
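
The image build scripts and the builder Dockerfiles changed above each carry the image version separately, so the two have to be bumped together. A minimal sketch of a consistency check follows, assuming the `VERSION=<number>` and `FROM <image>:<number>` line formats shown in this diff. The helper names are hypothetical and nothing like this exists in the repository as far as this PR shows.

```shell
#!/bin/sh
# Hypothetical helper: prints the number assigned to VERSION in a build script.
script_image_version() {
    sed -n 's/^VERSION=\([0-9][0-9]*\)$/\1/p' "$1" | head -n 1
}

# Hypothetical helper: prints the numeric tag of the first FROM line
# in a Dockerfile.
dockerfile_base_tag() {
    sed -n 's/^FROM .*:\([0-9][0-9]*\)$/\1/p' "$1" | head -n 1
}

# Succeeds only when a version was found in the script and it equals the
# base image tag in the Dockerfile.
versions_match() {
    script_version="$(script_image_version "$1")"
    base_tag="$(dockerfile_base_tag "$2")"
    [ -n "$script_version" ] && [ "$script_version" = "$base_tag" ]
}

# Example invocation from the repository root (paths taken from this diff):
#   versions_match dev-tools/docker/build_linux_aarch64_native_build_image.sh \
#                  dev-tools/docker/linux_aarch64_native_builder/Dockerfile
```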
