Delete compute #12

Closed
wants to merge 11 commits into from
3 changes: 2 additions & 1 deletion README.md
@@ -13,7 +13,8 @@ This model is then compared to an Azure AutoML run.


## Summary
**In 1-2 sentences, explain the problem statement: e.g "This dataset contains data about... we seek to predict..."**
The dataset contains data about the direct marketing campaigns of a banking institution.
Our goal is to predict whether a client will subscribe to a term deposit.

**In 1-2 sentences, explain the solution: e.g. "The best performing model was a ..."**

1 change: 1 addition & 0 deletions model.pkl
Binary file not shown.
Binary file added outputs/automl_model.onnx
Binary file not shown.
Binary file added outputs/model.pkl
Binary file not shown.
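Reviewer note: a minimal sketch of how the added outputs/automl_model.onnx could be inspected outside Azure ML. It assumes onnxruntime is installed and is not part of this PR.

```python
# Sketch only: open the ONNX model added in this PR and list its expected inputs.
# AutoML ONNX models take the raw, named feature columns, so real scoring would
# need those columns supplied by name; onnxruntime is assumed to be installed.
import onnxruntime as rt

sess = rt.InferenceSession("outputs/automl_model.onnx")
for inp in sess.get_inputs():
    print(inp.name, inp.type, inp.shape)
```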
20 changes: 13 additions & 7 deletions train.py
@@ -8,7 +8,9 @@
from sklearn.preprocessing import OneHotEncoder
import pandas as pd
from azureml.core.run import Run
from azureml.core import Workspace, Dataset
from azureml.data.dataset_factory import TabularDatasetFactory
import joblib

def clean_data(data):
    # Dict for cleaning data
@@ -42,31 +44,35 @@ def main():
    parser = argparse.ArgumentParser()

    parser.add_argument('--C', type=float, default=1.0, help="Inverse of regularization strength. Smaller values cause stronger regularization")
    parser.add_argument('--max_iter', type=int, default=100, help="Maximum number of iterations to converge")
    parser.add_argument('--max_iter', type=int, default=1000, help="Maximum number of iterations to converge")

    args = parser.parse_args()

    run = Run.get_context()

    run.log("Regularization Strength:", np.float(args.C))
    run.log("Max iterations:", np.int(args.max_iter))
    run.log("Regularization Strength:", float(args.C))
    run.log("Max iterations:", int(args.max_iter))

    # TODO: Create TabularDataset using TabularDatasetFactory
    # Data is located at:
    # "https://automlsamplenotebookdata.blob.core.windows.net/automl-sample-notebook-data/bankmarketing_train.csv"
    url = "https://automlsamplenotebookdata.blob.core.windows.net/automl-sample-notebook-data/bankmarketing_train.csv"

    ds = ### YOUR CODE HERE ###

    ds = TabularDatasetFactory.from_delimited_files(url)

    x, y = clean_data(ds)

    # TODO: Split data into train and test sets.

    ### YOUR CODE HERE ###a
    x_train, x_test, y_train, y_test = train_test_split(x, y)

    model = LogisticRegression(C=args.C, max_iter=args.max_iter).fit(x_train, y_train)

    os.makedirs('outputs', exist_ok=True)
    joblib.dump(model, "outputs/model.pkl")

    accuracy = model.score(x_test, y_test)
    run.log("Accuracy", np.float(accuracy))
    run.log("Accuracy", float(accuracy))

if __name__ == '__main__':
    main()
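Reviewer note: with the new joblib.dump call, the trained estimator can be reloaded outside the run. A minimal sketch, assuming a hypothetical holdout CSV that has already been passed through the same clean_data() encoding used in train.py:

```python
# Sketch only: reload the LogisticRegression model that train.py now saves and
# score held-out rows. "bankmarketing_holdout_clean.csv" is a hypothetical file
# whose columns already match the one-hot encoding produced by clean_data().
import joblib
import pandas as pd

model = joblib.load("outputs/model.pkl")
holdout = pd.read_csv("bankmarketing_holdout_clean.csv")
print(model.predict(holdout.head()))
```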
783 changes: 525 additions & 258 deletions udacity-project.ipynb

Large diffs are not rendered by default.

512 changes: 512 additions & 0 deletions udacity-project.ipynb.amltmp

Large diffs are not rendered by default.