Skorch emulator #89

Merged: marcosfelt merged 25 commits into master from skorch-emulator on Feb 19, 2021

@marcosfelt (Collaborator) commented on Feb 16, 2021

Reference Issues/PRs

What does this implement/fix? Explain your changes.

This PR is a complete rewrite of the ExperimentalEmulator. The current implementation has not given good results in new training runs and is very difficult to debug, primarily because of the complexity of the training loops and the way the code is organized.

This PR instead uses skorch, which provides a scikit-learn API for PyTorch. That makes all of scikit-learn's preprocessing functionality available, and I make heavy use of it here.

The overall API has a similar feel to the old one. You create a domain, load in a dataset, instantiate the emulator, and start training. That's it.

from summit.benchmarks import (
    ExperimentalEmulator,
    ANNRegressor,
    ReizmanSuzukiEmulator,
)
from summit.utils.dataset import DataSet
import matplotlib.pyplot as plt

domain = ReizmanSuzukiEmulator.setup_domain()
ds = DataSet.read_csv("data/reizman_suzuki_case_1.csv")
exp = ExperimentalEmulator(
    "test_reizman",
    domain,
    dataset=ds,
    regressor=ANNRegressor,
)
exp.train(max_epochs=1000, cv_folds=5, random_state=100)
exp.parity_plot(include_test=True)

Running that code gives this result:
[Image: parity plot produced by exp.parity_plot(include_test=True)]

One nice feature is the ability to run a cross-validated grid search to select hyperparameters:

params = {
    "regressor__net__max_epochs": [200, 500, 1000]
}
res = exp.train(cv_folds=5, random_state=100, search_params=params)
new_params = exp.predictor.get_params()

print(f"Selected number of epochs: {new_params['regressor__net__max_epochs']}")
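To make the idea behind a cross-validated grid search concrete, here is a minimal, dependency-free sketch. It is not Summit's or scikit-learn's implementation: the toy one-parameter model, the `alpha` hyperparameter (standing in for something like `max_epochs`), and all function names are assumptions made for illustration only.

```python
from statistics import mean


def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, test_idx) pairs for contiguous, unshuffled folds."""
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


def fit_slope(xs, ys, alpha):
    """Hypothetical toy model: slope of y = w * x, shrunk towards 0 by alpha."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)


def cv_score(xs, ys, alpha, n_folds):
    """Mean squared error on the held-out fold, averaged over all folds."""
    fold_errors = []
    for train_idx, test_idx in k_fold_indices(len(xs), n_folds):
        w = fit_slope([xs[i] for i in train_idx], [ys[i] for i in train_idx], alpha)
        fold_errors.append(mean([(ys[i] - w * xs[i]) ** 2 for i in test_idx]))
    return mean(fold_errors)


def grid_search(xs, ys, grid, n_folds=5):
    """Score every candidate and return the one with the lowest CV error."""
    scores = {alpha: cv_score(xs, ys, alpha, n_folds) for alpha in grid}
    best = min(scores, key=scores.get)
    return best, scores


# Noise-free data on the line y = 2x, so no shrinkage should win.
xs = [float(i) for i in range(1, 11)]
ys = [2.0 * x for x in xs]
best_alpha, scores = grid_search(xs, ys, grid=[0.0, 1.0, 10.0], n_folds=5)
print(f"Selected alpha: {best_alpha}")
```

Each candidate is scored only on data the model did not see during fitting, which is what lets the search compare hyperparameters without touching a separate test set.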

Some additional features:

  • Automatic ensembling of models trained using cross-validation
  • Bayesian neural networks put on a deprecation path, since that code causes problems and basic ANNs already perform quite well
  • A RegressorRegistry, which allows regressors to be instantiated when they are pulled from disk
  • A script at scripts/train_emulator for training the emulators included with the package and generating a report
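As a rough picture of the first item in the list, ensembling cross-validation models can be as simple as keeping the model fitted on each fold and averaging their predictions. The sketch below assumes plain averaging and a toy one-parameter model; whether Summit combines its fold models in exactly this way is not stated in this PR, and every name here is hypothetical.

```python
from statistics import mean


class SlopeModel:
    """Hypothetical stand-in for a trained regressor: y = w * x."""

    def __init__(self, w):
        self.w = w

    def predict(self, x):
        return self.w * x


def fit_slope(xs, ys):
    """Least-squares slope for a line through the origin."""
    return SlopeModel(sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs))


def train_fold_models(xs, ys, n_folds):
    """Fit one model per fold, each on the samples outside that fold."""
    models = []
    for fold in range(n_folds):
        train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % n_folds != fold]
        models.append(fit_slope([x for x, _ in train], [y for _, y in train]))
    return models


class AveragingEnsemble:
    """Predict with the mean of the fold models' predictions (assumed scheme)."""

    def __init__(self, models):
        self.models = models

    def predict(self, x):
        return mean([model.predict(x) for model in self.models])


# Slightly noisy samples around y = 3x (made-up illustration data).
xs = [float(i) for i in range(1, 11)]
ys = [3.1, 5.8, 9.2, 11.9, 15.1, 18.0, 20.8, 24.1, 27.0, 30.2]

fold_models = train_fold_models(xs, ys, n_folds=5)
ensemble = AveragingEnsemble(fold_models)
print(f"Ensemble prediction at x=2.0: {ensemble.predict(2.0):.3f}")
```

Because every fold model sees a slightly different subset of the data, the averaged prediction is less sensitive to any single split than one model would be.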

This PR also refactors some code and removes dead code.
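The registry idea mentioned in the feature list can be sketched as a mapping from a stored class name back to the class, so that a saved file only needs to record the name and constructor parameters. The interface of the actual RegressorRegistry is not shown in this PR, so `ToyRegressorRegistry`, `LinearRegressor`, `save`, and `load` below are all hypothetical names used purely for illustration.

```python
import json


class ToyRegressorRegistry:
    """Hypothetical registry: maps class names to regressor classes."""

    def __init__(self):
        self._regressors = {}

    def register(self, cls):
        """Record a class under its name; returns it so it works as a decorator."""
        self._regressors[cls.__name__] = cls
        return cls

    def __getitem__(self, name):
        try:
            return self._regressors[name]
        except KeyError:
            raise KeyError(f"Unknown regressor {name!r}; register it before loading") from None


registry = ToyRegressorRegistry()


@registry.register
class LinearRegressor:
    """Toy regressor whose full state is its constructor parameters."""

    def __init__(self, slope=1.0, intercept=0.0):
        self.slope = slope
        self.intercept = intercept

    def predict(self, x):
        return self.slope * x + self.intercept

    def to_dict(self):
        params = {"slope": self.slope, "intercept": self.intercept}
        return {"name": type(self).__name__, "params": params}


def save(regressor):
    """Serialize the class name plus parameters (a string stands in for a file)."""
    return json.dumps(regressor.to_dict())


def load(payload, registry):
    """Look the class up by its stored name and rebuild the instance."""
    data = json.loads(payload)
    return registry[data["name"]](**data["params"])


restored = load(save(LinearRegressor(slope=2.0, intercept=1.0)), registry)
print(restored.predict(3.0))
```

Storing the name rather than pickling the class keeps the saved file readable, at the cost of requiring every custom regressor to be registered before loading.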

@marcosfelt marcosfelt merged commit be54b50 into master Feb 19, 2021
@marcosfelt marcosfelt deleted the skorch-emulator branch April 17, 2021 13:16

Successfully merging this pull request may close these issues.

ExperimentalEmulator save location