[R-package] Add GPU install options (fixes #3765) #3779

jameslamb · 2021-01-18T04:56:43Z

This PR offers a fix for #3765. In that issue, @szilard described some issues using CMake-based builds of the LightGBM R package with GPU support. Specifically, compilation failed because LightGBM couldn't find OpenCL and Boost.

Changes in this PR

This PR adds the following command-line options to build_r.R:

--boost-root
--boost-dir
--boost-include-dir
--boost-librarydir
--opencl-include-dir
--opencl-library

The approach it takes is similar to how the Python package handles these same arguments:

LightGBM/python-package/setup.py

Lines 131 to 144 in 706f2af

    
    if use_gpu:
        cmake_cmd.append("-DUSE_GPU=ON")
        if boost_root:
            cmake_cmd.append("-DBOOST_ROOT={0}".format(boost_root))
        if boost_dir:
            cmake_cmd.append("-DBoost_DIR={0}".format(boost_dir))
        if boost_include_dir:
            cmake_cmd.append("-DBoost_INCLUDE_DIR={0}".format(boost_include_dir))
        if boost_librarydir:
            cmake_cmd.append("-DBOOST_LIBRARYDIR={0}".format(boost_librarydir))
        if opencl_include_dir:
            cmake_cmd.append("-DOpenCL_INCLUDE_DIR={0}".format(opencl_include_dir))
        if opencl_library:
            cmake_cmd.append("-DOpenCL_LIBRARY={0}".format(opencl_library))

https://github.com/microsoft/LightGBM/blame/706f2af7badc26f6ec68729469ec6ec79a66d802/python-package/README.rst#L95-L111
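
As a rough illustration of the same pattern (not the code in build_r.R itself), the sketch below maps the six option names listed above to the CMake variables that appear in the setup.py excerpt. The helper name `gpu_cmake_args` and the dict-based input are assumptions made for this example.

```python
# Hypothetical helper: translate user-supplied GPU build options into CMake -D flags.
# The CMake variable names mirror the setup.py excerpt; nothing here is taken from build_r.R.
OPTION_TO_CMAKE_VAR = {
    "boost-root": "BOOST_ROOT",
    "boost-dir": "Boost_DIR",
    "boost-include-dir": "Boost_INCLUDE_DIR",
    "boost-librarydir": "BOOST_LIBRARYDIR",
    "opencl-include-dir": "OpenCL_INCLUDE_DIR",
    "opencl-library": "OpenCL_LIBRARY",
}


def gpu_cmake_args(options, use_gpu=True):
    """Return a list of CMake arguments for the options that were actually set."""
    if not use_gpu:
        return []
    args = ["-DUSE_GPU=ON"]
    for name, cmake_var in OPTION_TO_CMAKE_VAR.items():
        value = options.get(name)
        if value:  # unset options are not passed, so CMake uses its own search
            args.append("-D{0}={1}".format(cmake_var, value))
    return args


if __name__ == "__main__":
    print(gpu_cmake_args({
        "opencl-library": "/usr/local/cuda/lib64/libOpenCL.so",
        "opencl-include-dir": "/usr/local/cuda/include/",
    }))
```

Options that are not supplied are left out entirely rather than replaced with a default, so CMake falls back to its normal package search for them.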

How I tested this

I adapted @szilard's reproducible example from #3765.

Create a new directory and clone LightGBM into it

mkdir lgb-gpu-test
cd lgb-gpu-test
git clone --recursive https://github.com/microsoft/LightGBM.git
pushd LightGBM
    git fetch origin fix/r-gpu
    git checkout fix/r-gpu
popd

Write a file Dockerfile with the following content. The laptop I do GPU development on has CUDA 10.2 installed so I chose a CUDA 10.2 image, but I expect this would work for other versions.

Dockerfile

FROM nvidia/cuda:10.2-devel-ubuntu18.04

ENV DEBIAN_FRONTEND="noninteractive"

RUN apt-get update && \
    apt-get install \
    	-y \
    	software-properties-common \
    	apt-transport-https

RUN apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9 && \
    add-apt-repository 'deb [arch=amd64] https://cran.rstudio.com/bin/linux/ubuntu bionic-cran40/' && \
    apt-get update && \
    apt-get install -y \
    		r-base

RUN apt-get install -y \
	git \
	wget \
	libcurl4-openssl-dev \
	default-jdk-headless \
	libssl-dev \
	libxml2-dev \
	cmake

ENV MAKE="make -j$(nproc)"

RUN R -e 'install.packages(c("R6","data.table","jsonlite"), repos = "https://cran.rstudio.com/")'

RUN apt-get install -y \
		libboost-dev \
		libboost-system-dev \
		libboost-filesystem-dev \
		ocl-icd-opencl-dev \
		opencl-headers \
		clinfo

RUN mkdir -p /etc/OpenCL/vendors && \
    echo "libnvidia-opencl.so.1" > /etc/OpenCL/vendors/nvidia.icd   ## otherwise LightGBM segfaults at runtime (compiles fine without it)

COPY LightGBM /tmp/LightGBM

RUN cd /tmp/LightGBM && \
	git submodule init && \
	git submodule update --recursive && \
    Rscript build_r.R \
    	--use-gpu \
    	--opencl-library=/usr/lib/x86_64-linux-gnu/libOpenCL.so \
    	--boost-librarydir=/usr/lib/x86_64-linux-gnu

Build the image

docker build --no-cache -t test-lgb-gpu -f Dockerfile .

Run a container from that image. Open two terminals. From one, open an R session in the container. In the other, open a shell running nvidia-smi.

nvidia-docker run \
	-w /tmp/LightGBM \
	-it test-lgb-gpu \
	R

nvidia-docker run \
	-w /tmp/LightGBM \
	-it test-lgb-gpu \
	watch -n 3 nvidia-smi

In the R shell, install some packages and then run @szilard's test script from #3765 (comment).

Test R script

install.packages(c("ROCR", "curl"), repos = "https://cran.r-project.org")

library(data.table)
library(ROCR)
library(lightgbm)
library(Matrix)

set.seed(123)

d_train <- fread("https://s3.amazonaws.com/benchm-ml--main/train-1m.csv", showProgress=FALSE)
d_test <- fread("https://s3.amazonaws.com/benchm-ml--main/test.csv", showProgress=FALSE)

d_all <- rbind(d_train, d_test)
d_all$dep_delayed_15min <- ifelse(d_all$dep_delayed_15min=="Y",1,0)

d_all_wrules <- lgb.convert_with_rules(d_all)       
d_all <- d_all_wrules$data
cols_cats <- names(d_all_wrules$rules) 

d_train <- d_all[1:nrow(d_train)]
d_test <- d_all[(nrow(d_train)+1):(nrow(d_train)+nrow(d_test))]

p <- ncol(d_all)-1
dlgb_train <- lgb.Dataset(data = as.matrix(d_train[,1:p]), label = d_train$dep_delayed_15min, free_raw_data = FALSE)

md <- lgb.train(
	data = dlgb_train, 
	objective = "binary", 
	nrounds = 100, num_leaves = 512, learning_rate = 0.1, 
	categorical_feature = cols_cats,
	device = "gpu",
	verbose = 0
)

phat <- predict(md, data = as.matrix(d_test[,1:p]))
rocr_pred <- prediction(phat, d_test$dep_delayed_15min)
cat(performance(rocr_pred, "auc")@y.values[[1]],"\n")

Based on the output of nvidia-smi, I'm fairly sure that training is actually taking advantage of the GPU.

Notes for Reviewers

I think we should have a CI job for R + GPU, but that's outside the scope of this PR. ~~I'll add an issue and update here.~~ See #3780 ([R-package] Add an R GPU job in CI).
I chose not to hard-code any default values into build_r.R. I think this will make the script more stable, even if it means users need to do a little more configuration. Let me know if you disagree with this.

StrikerRUS · 2021-01-18T13:08:57Z

@jameslamb Please note my comment in #3765:

I'm not sure that the newer version from the Ubuntu PPA is better than the preinstalled native version from NVIDIA if you are really using NVIDIA cards for training.
#3765 (comment)

So I believe it's worth running the tests without

RUN apt-get install -y \
                ...
		ocl-icd-opencl-dev \
		opencl-headers \
                ...

before merging this PR. Could you please do this, since I guess you already have easy access to the environment you described in your opening comment?

jameslamb · 2021-01-18T15:33:41Z

I just tried this, and compilation failed

-- The C compiler identification is GNU 7.5.0
-- The CXX compiler identification is GNU 7.5.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- R version passed into FindLibR.cmake: 4.0.3
-- Found LibR: /usr/lib/R  
-- LIBR_EXECUTABLE: /usr/bin/R
-- LIBR_INCLUDE_DIRS: /usr/share/R/include
-- LIBR_CORE_LIBRARY: /usr/lib/R/lib/libR.so
-- Found OpenMP_C: -fopenmp (found version "4.5") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- Looking for CL_VERSION_2_2
-- Looking for CL_VERSION_2_2 - not found
-- Looking for CL_VERSION_2_1
-- Looking for CL_VERSION_2_1 - not found
-- Looking for CL_VERSION_2_0
-- Looking for CL_VERSION_2_0 - not found
-- Looking for CL_VERSION_1_2
-- Looking for CL_VERSION_1_2 - not found
-- Looking for CL_VERSION_1_1
-- Looking for CL_VERSION_1_1 - not found
-- Looking for CL_VERSION_1_0
-- Looking for CL_VERSION_1_0 - not found
CMake Error at /usr/share/cmake-3.10/Modules/FindPackageHandleStandardArgs.cmake:137 (message):
  Could NOT find OpenCL (missing: OpenCL_INCLUDE_DIR)
Call Stack (most recent call first):
  /usr/share/cmake-3.10/Modules/FindPackageHandleStandardArgs.cmake:378 (_FPHSA_FAILURE_MESSAGE)
  /usr/share/cmake-3.10/Modules/FindOpenCL.cmake:132 (find_package_handle_standard_args)
  CMakeLists.txt:138 (find_package)


-- Configuring incomplete, errors occurred!
See also "/tmp/RtmpH3RoRU/R.INSTALL9e1d4796b2/lightgbm/src/build/CMakeFiles/CMakeOutput.log".
See also "/tmp/RtmpH3RoRU/R.INSTALL9e1d4796b2/lightgbm/src/build/CMakeFiles/CMakeError.log".
Error in .run_shell_command("cmake", c(cmake_args, "..")) : 
  Command failed with exit code: 1
* removing '/usr/local/lib/R/site-library/lightgbm'
Error in .run_shell_command(install_cmd, install_args) : 
  Command failed with exit code: 1
Execution halted
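
The error above says CMake could not determine OpenCL_INCLUDE_DIR. One quick way to see which directories in the image could satisfy it is to look for the `CL/cl.h` header, which is the file CMake's FindOpenCL module searches for on Linux. The sketch below is only a debugging aid: the function name is hypothetical, and the two candidate paths are assumptions chosen because they are common header locations.

```python
import os


def find_opencl_include_dirs(candidates):
    """Return the candidate directories that contain the CL/cl.h header."""
    return [d for d in candidates if os.path.isfile(os.path.join(d, "CL", "cl.h"))]


if __name__ == "__main__":
    # Candidate paths are assumptions; adjust them for the image being debugged.
    for path in find_opencl_include_dirs(["/usr/local/cuda/include", "/usr/include"]):
        print("OpenCL headers found under:", path)
```

Any directory it reports is a plausible value for `--opencl-include-dir`.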

StrikerRUS · 2021-01-18T15:58:33Z

I just tried this, and compilation failed

OK, expected.

Please try passing -DOpenCL_LIBRARY=/usr/local/cuda/lib64/libOpenCL.so -DOpenCL_INCLUDE_DIR=/usr/local/cuda/include/ as script arguments, to make sure that this use case is also covered by the new arguments.

jameslamb · 2021-01-18T18:25:43Z

Building like this, with the installation of ocl-icd-opencl-dev and opencl-headers removed, succeeded 🎉

Rscript build_r.R \
    	--use-gpu \
    	--opencl-library=/usr/local/cuda/lib64/libOpenCL.so \
    	--opencl-include-dir=/usr/local/cuda/include/

full Dockerfile

FROM nvidia/cuda:10.2-devel-ubuntu18.04

ENV DEBIAN_FRONTEND="noninteractive"

RUN apt-get update && \
    apt-get install \
    	-y \
    	software-properties-common \
    	apt-transport-https

RUN apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9 && \
    add-apt-repository 'deb [arch=amd64] https://cran.rstudio.com/bin/linux/ubuntu bionic-cran40/' && \
    apt-get update && \
    apt-get install -y \
    		r-base

RUN apt-get install -y \
	git \
	wget \
	libcurl4-openssl-dev \
	default-jdk-headless \
	libssl-dev \
	libxml2-dev \
	cmake

ENV MAKE="make -j$(nproc)"

RUN R -e 'install.packages(c("R6","data.table","jsonlite"), repos = "https://cran.rstudio.com/")'

RUN apt-get install -y \
		libboost-dev \
		libboost-system-dev \
		libboost-filesystem-dev \
		clinfo

RUN mkdir -p /etc/OpenCL/vendors && \
    echo "libnvidia-opencl.so.1" > /etc/OpenCL/vendors/nvidia.icd   ## otherwise LightGBM segfaults at runtime (compiles fine without it)

COPY LightGBM /tmp/LightGBM

RUN cd /tmp/LightGBM && \
	git submodule init && \
	git submodule update --recursive && \
    Rscript build_r.R \
    	--use-gpu \
    	--opencl-library=/usr/local/cuda/lib64/libOpenCL.so \
    	--opencl-include-dir=/usr/local/cuda/include/

Build logs

-- Found OpenMP: TRUE (found version "4.5")  
-- Looking for CL_VERSION_2_2
-- Looking for CL_VERSION_2_2 - not found
-- Looking for CL_VERSION_2_1
-- Looking for CL_VERSION_2_1 - not found
-- Looking for CL_VERSION_2_0
-- Looking for CL_VERSION_2_0 - not found
-- Looking for CL_VERSION_1_2
-- Looking for CL_VERSION_1_2 - found
-- Found OpenCL: /usr/local/cuda/lib64/libOpenCL.so (found version "1.2") 
-- OpenCL include directory: /usr/local/cuda/include
-- Boost version: 1.65.1
-- Found the following Boost libraries:
--   filesystem
--   system
-- Performing Test MM_PREFETCH
-- Performing Test MM_PREFETCH - Success
-- Using _mm_prefetch
-- Performing Test MM_MALLOC
-- Performing Test MM_MALLOC - Success
-- Using _mm_malloc
-- Configuring done
-- Generating done
...
[100%] Built target _lightgbm
Found library file: /tmp/RtmpaLCutl/R.INSTALL9e5da08f48/lightgbm/src/lib_lightgbm.so to move to /usr/local/lib/R/site-library/00LOCK-lightgbm/00new/lightgbm/libs
Removing 'build/' directory
** R
** data
** demo
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
*** copying figures
** building package indices
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (lightgbm)

I ran the test script again with verbose = 1, and the following log lines confirm that the GPU was used:

[LightGBM] [Info] Number of positive: 192982, number of negative: 807018
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 1095
[LightGBM] [Info] Number of data points in the train set: 1000000, number of used features: 8
[LightGBM] [Info] Using GPU Device: GeForce RTX 2070 with Max-Q Design, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 8 dense feature groups (7.63 MB) transferred to GPU in 0.020953 secs. 0 sparse feature groups
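
For a scripted check instead of reading the log by eye, the lines above can be scanned for the two GPU markers they contain. This is a minimal sketch: the function name is hypothetical, and the marker strings are copied from this particular log, so they may differ in other LightGBM versions.

```python
import re


def gpu_device_from_log(log_text):
    """Return the GPU device name reported in a LightGBM log, or None if absent."""
    if "This is the GPU trainer" not in log_text:
        return None
    match = re.search(r"Using GPU Device: (.+?), Vendor:", log_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = (
        "[LightGBM] [Info] This is the GPU trainer!!\n"
        "[LightGBM] [Info] Using GPU Device: GeForce RTX 2070 with Max-Q Design, "
        "Vendor: NVIDIA Corporation\n"
    )
    print(gpu_device_from_log(sample))
```

A `None` result means one of the markers was missing, which would suggest that training did not run on the GPU device.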

StrikerRUS

LGTM except for one typo! But I haven't dug deep into the R code.

jameslamb · 2021-01-19T00:08:47Z

... I haven't dug deep into R code.

I think that since we have so many CI jobs for R + CMake, and since this PR didn't change any CI scripts or tests, we can be pretty confident that the changes to build_r.R didn't break the experience of building the CPU package with Rscript build_r.R. All of the non-GPU command-line options are covered by tests:

Rscript build_r.R --use-mingw:

LightGBM/.ci/test_r_package_windows.ps1

Line 130 in c871496

$env:BUILD_R_FLAGS = "c('--skip-install', '--use-mingw')"

Rscript build_r.R --use-msys2:

LightGBM/.ci/test_r_package_windows.ps1

Line 133 in c871496

$env:BUILD_R_FLAGS = "c('--skip-install', '--use-msys2')"

Rscript build_r.R --skip-install:

LightGBM/.ci/test_r_package_windows.ps1

Lines 130 to 135 in c871496

  $env:BUILD_R_FLAGS = "c('--skip-install', '--use-mingw')"
} elseif ($env:TOOLCHAIN -eq "MSYS") {
  Write-Output "Telling R to use MSYS"
  $env:BUILD_R_FLAGS = "c('--skip-install', '--use-msys2')"
} elseif ($env:TOOLCHAIN -eq "MSVC") {
  $env:BUILD_R_FLAGS = "'--skip-install'"

LightGBM/.ci/test_r_package.sh

Line 113 in c871496

Rscript build_r.R --skip-install || exit -1

github-actions · 2023-08-24T02:18:41Z

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

jameslamb added 2 commits January 17, 2021 22:15

[R-package] Add GPU install options (fixes #3765)

21b038f

whitespace

6b74b07

jameslamb added the feature label Jan 18, 2021

jameslamb requested review from Laurae2, guolinke and StrikerRUS January 18, 2021 04:56

jameslamb mentioned this pull request Jan 18, 2021

R package install with GPU support fails #3765

Closed

linting

7b58db7

Merge branch 'master' into fix/r-gpu

a2ed9fe

jameslamb added 2 commits January 18, 2021 12:12

Merge branch 'master' into fix/r-gpu

ac207f2

Merge branch 'fix/r-gpu' of https://github.com/microsoft/LightGBM into fix/r-gpu

e56fa2e

StrikerRUS approved these changes Jan 18, 2021

Update build_r.R

e519e93

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

jameslamb merged commit 8593f85 into master Jan 19, 2021

jameslamb deleted the fix/r-gpu branch January 19, 2021 02:08

jameslamb mentioned this pull request Apr 9, 2022

[R-package] [gpu] Unable to properly pass boost directories to Rbuild on Windows 10 #5135

Open

github-actions bot locked as resolved and limited conversation to collaborators Aug 24, 2023
