Python GIL overhaul (#517)
* Development updates (deeplearning4j#9053)

* RL4J: Add generic update rule (#502)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

* Shyrma reduce (#481)

* - start working on improving cpu legacy code for reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving legacy loops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - still working on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further work on improving reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - testing speed run of new reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - working on improvement of default loop for reduce op

Signed-off-by: Yurii <iuriish@yahoo.com>

* - update signatures of stuff which calls reduce ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - make corrections in cuda reduce kernels

Signed-off-by: Yurii <iuriish@yahoo.com>

* - change loop for default case in broadcast legacy ops

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment some shape stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - comment unnecessary prints in RNGtests

Signed-off-by: Yurii <iuriish@yahoo.com>

* - finish resolving conflicts after merging master

Signed-off-by: Yurii <iuriish@yahoo.com>

* - get rid of some compilation errors in the cuda code

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor changes

Signed-off-by: Yurii <iuriish@yahoo.com>

* - further search for bug causing crash on java test

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add scalar case in reduce_ ... exec stuff

Signed-off-by: Yurii <iuriish@yahoo.com>

* - minor corrections in NativeOps.cu

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add switch to scalar case in execReduceXD functions

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors' old shape in ConstantShapeHelper::createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

* - correct cuda mirrorPad

Signed-off-by: Yurii <iuriish@yahoo.com>

* - add support for vectors' old shape in cuda createShapeInfoWithNoUnitiesForReduce

Signed-off-by: Yurii <iuriish@yahoo.com>

Co-authored-by: raver119 <raver119@gmail.com>

* Add support for CUDA 11.0 (#492)

* Add support for CUDA 11.0

* libnd4j tweaks for CUDA 11

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* bindings update, again?

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update versions of JavaCPP Presets for FFmpeg, OpenBLAS, and NumPy

* update API to match CUDA 8

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* * Update version of JavaCPP Presets for CPython

* C++ updated for cuDNN 8.0

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one more test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* 128-bit alignment for workspaces

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* change seed in 1 test

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Fix dependency duplication in python4j-parent pom

* Fix group id in python4j-numpy

* few tests tweaked

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* Remove macosx-x86_64-gpu from nd4j-tests-tensorflow

* few minor tweaks for IndexReduce

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

* one test removed

Signed-off-by: raver119@gmail.com <raver119@gmail.com>

Co-authored-by: raver119@gmail.com <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* RL4J: Add SyncTrainer and AgentLearnerBuilder for a few algorithms (#504)

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>

Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>

* Removed dead code (deeplearning4j#9057)

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* performance improvement (deeplearning4j#9055)

* performance improvement

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* revert some changes

Signed-off-by: Dariusz Zbyrad <dariusz.zbyrad@gmail.com>

* Development updates (deeplearning4j#9064)

* Update versions of JavaCPP Presets for OpenCV, FFmpeg, and MKL

Signed-off-by: Samuel Audet <samuel.audet@gmail.com>

* Cherry pick rl4j changes from most recent KonduitAI/deeplearning4j PR

* Update cherry pick again from last master revision.

Co-authored-by: Samuel Audet <samuel.audet@gmail.com>
Co-authored-by: Alexandre Boulanger <44292157+aboulang2002@users.noreply.github.com>
Co-authored-by: Yurii Shyrma <iuriish@yahoo.com>
Co-authored-by: raver119 <raver119@gmail.com>
Co-authored-by: Serhii Shepel <9946053+sshepel@users.noreply.github.com>
Co-authored-by: dariuszzbyrad <dariusz.zbyrad@gmail.com>
7 people authored Aug 18, 2020
1 parent 72f5c18 commit 354f398
Showing 24 changed files with 1,627 additions and 1,626 deletions.
3 changes: 2 additions & 1 deletion pom.xml
@@ -28,7 +28,7 @@
<packaging>pom</packaging>

<name>deeplearning4j</name>
<description>Deeplearning4j Monorepo</description>
<url>http://deeplearning4j.org/</url>

<licenses>
@@ -299,6 +299,7 @@
<numpy.javacpp.version>${numpy.version}-${javacpp-presets.version}</numpy.javacpp.version>

<openblas.version>0.3.10</openblas.version>

<mkl.version>2020.2</mkl.version>
<opencv.version>4.4.0</opencv.version>
<ffmpeg.version>4.3.1</ffmpeg.version>
AdvantageActorCritic.java
@@ -1,110 +1,110 @@
/*******************************************************************************
 * Copyright (c) 2020 Konduit K.K.
 *
 * This program and the accompanying materials are made available under the
 * terms of the Apache License, Version 2.0 which is available at
 * https://www.apache.org/licenses/LICENSE-2.0.
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
 * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
 * License for the specific language governing permissions and limitations
 * under the License.
 *
 * SPDX-License-Identifier: Apache-2.0
 ******************************************************************************/
package org.deeplearning4j.rl4j.agent.learning.algorithm;

import lombok.Builder;
import lombok.Data;
import lombok.NonNull;
import lombok.experimental.SuperBuilder;
import org.deeplearning4j.rl4j.agent.learning.update.FeaturesLabels;
import org.deeplearning4j.rl4j.agent.learning.update.Gradients;
import org.deeplearning4j.rl4j.experience.StateActionPair;
import org.deeplearning4j.rl4j.helper.INDArrayHelper;
import org.deeplearning4j.rl4j.network.CommonLabelNames;
import org.deeplearning4j.rl4j.network.CommonOutputNames;
import org.deeplearning4j.rl4j.network.ITrainableNeuralNet;
import org.deeplearning4j.rl4j.observation.Observation;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

import java.util.List;

// TODO: Add support for RNN
/**
 * This is the "Algorithm S3 Asynchronous advantage actor-critic" of <i>Asynchronous Methods for Deep Reinforcement Learning</i>
 * @see <a href="https://arxiv.org/pdf/1602.01783.pdf">Asynchronous Methods for Deep Reinforcement Learning on arXiv</a>, page 14
 * <p/>
 * Note: The output of threadCurrent must contain a channel named "value".
 */
public class AdvantageActorCritic implements IUpdateAlgorithm<Gradients, StateActionPair<Integer>> {

    private final ITrainableNeuralNet threadCurrent;

    private final int actionSpaceSize;
    private final double gamma;

    public AdvantageActorCritic(@NonNull ITrainableNeuralNet threadCurrent,
                                int actionSpaceSize,
                                @NonNull Configuration configuration) {
        this.threadCurrent = threadCurrent;
        this.actionSpaceSize = actionSpaceSize;
        gamma = configuration.getGamma();
    }

    @Override
    public Gradients compute(List<StateActionPair<Integer>> trainingBatch) {
        int size = trainingBatch.size();

        INDArray features = INDArrayHelper.createBatchForShape(size, trainingBatch.get(0).getObservation().getData().shape());
        INDArray values = Nd4j.create(size, 1);
        INDArray policy = Nd4j.zeros(size, actionSpaceSize);

        StateActionPair<Integer> stateActionPair = trainingBatch.get(size - 1);
        double value;
        if (stateActionPair.isTerminal()) {
            value = 0;
        } else {
            INDArray valueOutput = threadCurrent.output(stateActionPair.getObservation()).get(CommonOutputNames.ActorCritic.Value);
            value = valueOutput.getDouble(0);
        }

        for (int i = size - 1; i >= 0; --i) {
            stateActionPair = trainingBatch.get(i);

            Observation observation = stateActionPair.getObservation();

            features.putRow(i, observation.getData());

            value = stateActionPair.getReward() + gamma * value;

            // the critic
            values.putScalar(i, value);

            // the actor
            double expectedV = threadCurrent.output(observation)
                    .get(CommonOutputNames.ActorCritic.Value)
                    .getDouble(0);
            double advantage = value - expectedV;
            policy.putScalar(i, stateActionPair.getAction(), advantage);
        }

        FeaturesLabels featuresLabels = new FeaturesLabels(features);
        featuresLabels.putLabels(CommonLabelNames.ActorCritic.Value, values);
        featuresLabels.putLabels(CommonLabelNames.ActorCritic.Policy, policy);

        return threadCurrent.computeGradients(featuresLabels);
    }

    @SuperBuilder
    @Data
    public static class Configuration {
        /**
         * The discount factor (default is 0.99)
         */
        @Builder.Default
        double gamma = 0.99;
    }
}
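
For orientation, here is a minimal usage sketch of the algorithm above. It assumes network (an ITrainableNeuralNet whose output map exposes the "value" channel), actionSpaceSize, and experience (a batch of StateActionPair collected by the agent) already exist; all three are hypothetical placeholders, not part of this diff.

// Build the configuration via the Lombok-generated builder.
AdvantageActorCritic.Configuration config = AdvantageActorCritic.Configuration.builder()
        .gamma(0.99) // discount factor; 0.99 is also the default above
        .build();

AdvantageActorCritic algorithm = new AdvantageActorCritic(network, actionSpaceSize, config);

// compute() walks the batch backwards, accumulating the return
// R(i) = reward(i) + gamma * R(i+1) as the critic label, and placing the
// advantage R(i) - V(observation(i)) at the chosen action's index as the actor label.
Gradients gradients = algorithm.compute(experience);
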
NeuralNetUpdaterConfiguration.java
@@ -1,18 +1,18 @@
package org.deeplearning4j.rl4j.agent.learning.update.updater;

import lombok.Builder;
import lombok.Data;
import lombok.experimental.SuperBuilder;

@SuperBuilder
@Data
/**
 * The configuration for neural network updaters
 */
public class NeuralNetUpdaterConfiguration {
    /**
     * Will synchronize the target network every <i>targetUpdateFrequency</i> updates (default: no update)
     */
    @Builder.Default
    int targetUpdateFrequency = Integer.MAX_VALUE;
}
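
As a small illustration, the @SuperBuilder annotation on this class yields a Lombok-generated builder; the value below is illustrative only, not taken from this diff.

// Sync the target network every 500 updates; leaving the default
// (Integer.MAX_VALUE) effectively disables target synchronization.
NeuralNetUpdaterConfiguration updaterConfig = NeuralNetUpdaterConfiguration.builder()
        .targetUpdateFrequency(500)
        .build();
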
AsyncGradientsNeuralNetUpdater.java
@@ -1,43 +1,43 @@
/*******************************************************************************
 * Copyright (c) 2020 Konduit K.K.
 *
 * This program and the accompanying materials are made available under the
 * terms of the Apache License, Version 2.0 which is available at
 * https://www.apache.org/licenses/LICENSE-2.0.
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
 * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
 * License for the specific language governing permissions and limitations
 * under the License.
 *
 * SPDX-License-Identifier: Apache-2.0
 ******************************************************************************/
package org.deeplearning4j.rl4j.agent.learning.update.updater.async;

import org.deeplearning4j.rl4j.agent.learning.update.Gradients;
import org.deeplearning4j.rl4j.agent.learning.update.updater.INeuralNetUpdater;
import org.deeplearning4j.rl4j.network.ITrainableNeuralNet;

/**
 * A {@link INeuralNetUpdater} that updates a neural network and syncs a target network at defined intervals
 */
public class AsyncGradientsNeuralNetUpdater extends BaseAsyncNeuralNetUpdater<Gradients> {
    /**
     * @param threadCurrent The thread-current network
     * @param sharedNetworksUpdateHandler An instance shared among all threads that updates the shared networks
     */
    public AsyncGradientsNeuralNetUpdater(ITrainableNeuralNet threadCurrent,
                                          AsyncSharedNetworksUpdateHandler sharedNetworksUpdateHandler) {
        super(threadCurrent, sharedNetworksUpdateHandler);
    }

    /**
     * Perform the necessary updates to the networks.
     * @param gradients A {@link Gradients} that will be used to update the network.
     */
    @Override
    public void update(Gradients gradients) {
        updateAndSync(gradients);
    }
}
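
To show where this updater fits, a hedged sketch follows; threadNetwork, sharedHandler, algorithm, and batch are assumed to exist already (hypothetical placeholders, not part of this diff).

// Each worker thread owns one updater; the shared AsyncSharedNetworksUpdateHandler
// instance applies incoming gradients to the shared networks for all threads.
AsyncGradientsNeuralNetUpdater updater =
        new AsyncGradientsNeuralNetUpdater(threadNetwork, sharedHandler);

Gradients gradients = algorithm.compute(batch); // e.g. the AdvantageActorCritic above
updater.update(gradients);                      // delegates to updateAndSync()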