Python GIL overhaul (#517)
* Development updates (deeplearning4j#9053)

* RL4J: Add generic update rule (#502)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Shyrma reduce (#481)

* - start working on improving cpu legacy code for reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving legacy loops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - still working on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - testing speed run of new reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - working on improvement of default loop for reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - update signatures of stuff which calls reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - make corrections in cuda reduce kernels

Signed-off-by: Yurii <iuriish@yahoo.com>

* - change loop for default case in broadcast legacy ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment some shape stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment unnecessary prints in RNGtests

Signed-off-by: Yurii <iuriish@yahoo.com>

* - finish resolving conflicts after merging master

Signed-off-by: Yurii <iuriish@yahoo.com>

* - get rid of some compilation errors in the cuda code

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor changes

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further search for bug causing crash on java test

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add scalar case in reduce_ ... exec stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor corrections in NativeOps.cu

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add switch to scalar case in execReduceXD functions

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors' old shape in ConstantShapeHelper::createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct cuda mirrorPad

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors' old shape in cuda createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>

* Add support for CUDA 11.0 (#492)

* Add support for CUDA 11.0

* libnd4j tweaks for CUDA 11

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* bindings update, again?

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update versions of JavaCPP Presets for FFmpeg, OpenBLAS, and NumPy

* update API to match CUDA 8

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update version of JavaCPP Presets for CPython

* C++ updated for cuDNN 8.0

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* 128-bit alignment for workspaces

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* change seed in 1 test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Fix dependency duplication in python4j-parent pom

* Fix group id in python4j-numpy

* few tests tweaked

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Remove macosx-x86_64-gpu from nd4j-tests-tensorflow

* few minor tweaks for IndexReduce

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one test removed

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

Co-authored-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* RL4J: Add SyncTrainer and AgentLearnerBuilder for a few algorithms (#504)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* Removed dead code (deeplearning4j#9057)

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* performance improvement (deeplearning4j#9055)

* performance improvement

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* revert some changes

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Development updates (deeplearning4j#9064)

* Update versions of JavaCPP Presets for OpenCV, FFmpeg, and MKL

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Cherry pick rl4j changes from most recent KonduitAI/deeplearning4j PR

* Update cherry pick again from last master revision.

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: dariuszzbyrad <dariusz.zbyrad@gmail.com>
7 people authored Aug 18, 2020
1 parent 72f5c18 commit 354f398
Showing 24 changed files with 1,627 additions and 1,626 deletions.
3 changes: 2 additions & 1 deletion pom.xml
@@ -28,7 +28,7 @@
<packaging>pom</packaging>

<name>deeplearning4j</name>
<description>Deeplearning4j Monorepo</description>
<url>http://deeplearning4j.org/</url>

<licenses>
@@ -299,6 +299,7 @@
<numpy.javacpp.version>${numpy.version}-${javacpp-presets.version}</numpy.javacpp.version>

<openblas.version>0.3.10</openblas.version>

<mkl.version>2020.2</mkl.version>
<opencv.version>4.4.0</opencv.version>
<ffmpeg.version>4.3.1</ffmpeg.version>
AdvantageActorCritic.java
@@ -1,110 +1,110 @@
/*******************************************************************************
 * Copyright (c) 2020 Konduit K.K.
 *
 * This program and the accompanying materials are made available under the
 * terms of the Apache License, Version 2.0 which is available at
 * https://www.apache.org/licenses/LICENSE-2.0.
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
 * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
 * License for the specific language governing permissions and limitations
 * under the License.
 *
 * SPDX-License-Identifier: Apache-2.0
 ******************************************************************************/
package org.deeplearning4j.rl4j.agent.learning.algorithm;

import lombok.Builder;
import lombok.Data;
import lombok.NonNull;
import lombok.experimental.SuperBuilder;
import org.deeplearning4j.rl4j.agent.learning.update.FeaturesLabels;
import org.deeplearning4j.rl4j.agent.learning.update.Gradients;
import org.deeplearning4j.rl4j.experience.StateActionPair;
import org.deeplearning4j.rl4j.helper.INDArrayHelper;
import org.deeplearning4j.rl4j.network.CommonLabelNames;
import org.deeplearning4j.rl4j.network.CommonOutputNames;
import org.deeplearning4j.rl4j.network.ITrainableNeuralNet;
import org.deeplearning4j.rl4j.observation.Observation;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

import java.util.List;

// TODO: Add support for RNN
/**
 * This is the "Algorithm S3 Asynchronous advantage actor-critic" of <i>Asynchronous Methods for Deep Reinforcement Learning</i>
 * @see <a href="https://arxiv.org/pdf/1602.01783.pdf">Asynchronous Methods for Deep Reinforcement Learning on arXiv</a>, page 14
 * <p/>
 * Note: The output of threadCurrent must contain a channel named "value".
 */
public class AdvantageActorCritic implements IUpdateAlgorithm<Gradients, StateActionPair<Integer>> {

    private final ITrainableNeuralNet threadCurrent;

    private final int actionSpaceSize;
    private final double gamma;

    public AdvantageActorCritic(@NonNull ITrainableNeuralNet threadCurrent,
                                int actionSpaceSize,
                                @NonNull Configuration configuration) {
        this.threadCurrent = threadCurrent;
        this.actionSpaceSize = actionSpaceSize;
        gamma = configuration.getGamma();
    }

    @Override
    public Gradients compute(List<StateActionPair<Integer>> trainingBatch) {
        int size = trainingBatch.size();

        INDArray features = INDArrayHelper.createBatchForShape(size, trainingBatch.get(0).getObservation().getData().shape());
        INDArray values = Nd4j.create(size, 1);
        INDArray policy = Nd4j.zeros(size, actionSpaceSize);

        StateActionPair<Integer> stateActionPair = trainingBatch.get(size - 1);
        double value;
        if (stateActionPair.isTerminal()) {
            value = 0;
        } else {
            INDArray valueOutput = threadCurrent.output(stateActionPair.getObservation()).get(CommonOutputNames.ActorCritic.Value);
            value = valueOutput.getDouble(0);
        }

        for (int i = size - 1; i >= 0; --i) {
            stateActionPair = trainingBatch.get(i);

            Observation observation = stateActionPair.getObservation();

            features.putRow(i, observation.getData());

            value = stateActionPair.getReward() + gamma * value;

            // the critic
            values.putScalar(i, value);

            // the actor
            double expectedV = threadCurrent.output(observation)
                    .get(CommonOutputNames.ActorCritic.Value)
                    .getDouble(0);
            double advantage = value - expectedV;
            policy.putScalar(i, stateActionPair.getAction(), advantage);
        }

        FeaturesLabels featuresLabels = new FeaturesLabels(features);
        featuresLabels.putLabels(CommonLabelNames.ActorCritic.Value, values);
        featuresLabels.putLabels(CommonLabelNames.ActorCritic.Policy, policy);

        return threadCurrent.computeGradients(featuresLabels);
    }

    @SuperBuilder
    @Data
    public static class Configuration {
        /**
         * The discount factor (default is 0.99)
         */
        @Builder.Default
        double gamma = 0.99;
    }
}
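
For orientation, here is a minimal usage sketch of the algorithm above. It assumes network (an ITrainableNeuralNet whose output map exposes the "value" channel), actionSpaceSize, and experience (a batch of StateActionPair collected by the agent) already exist; all three are hypothetical placeholders, not part of this diff.

// Build the configuration via the Lombok-generated builder.
AdvantageActorCritic.Configuration config = AdvantageActorCritic.Configuration.builder()
        .gamma(0.99) // discount factor; 0.99 is also the default above
        .build();

AdvantageActorCritic algorithm = new AdvantageActorCritic(network, actionSpaceSize, config);

// compute() walks the batch backwards, accumulating the return
// R(i) = reward(i) + gamma * R(i+1) as the critic label, and placing the
// advantage R(i) - V(observation(i)) at the chosen action's index as the actor label.
Gradients gradients = algorithm.compute(experience);
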
NeuralNetUpdaterConfiguration.java
@@ -1,18 +1,18 @@
package org.deeplearning4j.rl4j.agent.learning.update.updater;

import lombok.Builder;
import lombok.Data;
import lombok.experimental.SuperBuilder;

@SuperBuilder
@Data
/**
 * The configuration for neural network updaters
 */
public class NeuralNetUpdaterConfiguration {
    /**
     * Will synchronize the target network every <i>targetUpdateFrequency</i> updates (default: no update)
     */
    @Builder.Default
    int targetUpdateFrequency = Integer.MAX_VALUE;
}
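
As a small illustration, the @SuperBuilder annotation on this class yields a Lombok-generated builder; the value below is illustrative only, not taken from this diff.

// Sync the target network every 500 updates; leaving the default
// (Integer.MAX_VALUE) effectively disables target synchronization.
NeuralNetUpdaterConfiguration updaterConfig = NeuralNetUpdaterConfiguration.builder()
        .targetUpdateFrequency(500)
        .build();
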
AsyncGradientsNeuralNetUpdater.java
@@ -1,43 +1,43 @@
/*******************************************************************************
 * Copyright (c) 2020 Konduit K.K.
 *
 * This program and the accompanying materials are made available under the
 * terms of the Apache License, Version 2.0 which is available at
 * https://www.apache.org/licenses/LICENSE-2.0.
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
 * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
 * License for the specific language governing permissions and limitations
 * under the License.
 *
 * SPDX-License-Identifier: Apache-2.0
 ******************************************************************************/
package org.deeplearning4j.rl4j.agent.learning.update.updater.async;

import org.deeplearning4j.rl4j.agent.learning.update.Gradients;
import org.deeplearning4j.rl4j.agent.learning.update.updater.INeuralNetUpdater;
import org.deeplearning4j.rl4j.network.ITrainableNeuralNet;

/**
 * A {@link INeuralNetUpdater} that updates a neural network and syncs a target network at defined intervals
 */
public class AsyncGradientsNeuralNetUpdater extends BaseAsyncNeuralNetUpdater<Gradients> {
    /**
     * @param threadCurrent The thread-current network
     * @param sharedNetworksUpdateHandler An instance shared among all threads that updates the shared networks
     */
    public AsyncGradientsNeuralNetUpdater(ITrainableNeuralNet threadCurrent,
                                          AsyncSharedNetworksUpdateHandler sharedNetworksUpdateHandler) {
        super(threadCurrent, sharedNetworksUpdateHandler);
    }

    /**
     * Perform the necessary updates to the networks.
     * @param gradients A {@link Gradients} that will be used to update the network.
     */
    @Override
    public void update(Gradients gradients) {
        updateAndSync(gradients);
    }
}
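
To show where this updater fits, a hedged sketch follows; threadNetwork, sharedHandler, algorithm, and batch are assumed to exist already (hypothetical placeholders, not part of this diff).

// Each worker thread owns one updater; the shared AsyncSharedNetworksUpdateHandler
// instance applies incoming gradients to the shared networks for all threads.
AsyncGradientsNeuralNetUpdater updater =
        new AsyncGradientsNeuralNetUpdater(threadNetwork, sharedHandler);

Gradients gradients = algorithm.compute(batch); // e.g. the AdvantageActorCritic above
updater.update(gradients);                      // delegates to updateAndSync()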