Variation between current and earlier versions #180
Experiment 1:
Counts (using
The trawler count is pretty close to its old value; the others are not so close. Results of the run are here:
Details of run (on branch
Copy training file down to instance then:
Old school features. These features are the same as those used for the paper except for bug fixes and using run details (on branch old-school-features): The first couple of runs had (a) stability issues and then (b) a learning rate that was too low. This is with branch:
./deploy_cloudml.py --env dev --model prod.vessel_characterization --job_name old_ald_school --config_file deploy_characterization.yaml
Something was off here in the MMSI lists, and I had to regenerate them manually to avoid crashes.
==== Also for fishing
Simpler features (branch simpler-features). Reduce the features further to make things simpler, then check performance. The first try, using reported course, fared poorly, so I retried with implied course. That also fared poorly; it was using the raw rather than the logged feature, and I suspected that was the problem. [Update] Using raw features was the primary problem; removing them allowed the model to train. It seems to be not quite as good, though.
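For reference, a minimal sketch of what "logged" versus "raw" can mean for a feature, assuming a non-negative, heavy-tailed input such as speed. The function name `log_feature` and the choice of `log1p` are illustrative assumptions, not the transform actually used on the branch.

```python
import math


def log_feature(value):
    """Compress a non-negative raw feature with log(1 + x).

    Hypothetical helper: the exact transform used in the model is not
    stated in this issue. log1p keeps 0 mapped to 0 and shrinks large
    values, which tends to make training more stable than raw inputs.
    """
    if value < 0:
        raise ValueError("log_feature expects a non-negative value")
    return math.log1p(value)


# Raw values spanning several orders of magnitude end up in a narrow range.
raw_values = [0.0, 1.0, 10.0, 1000.0]
logged_values = [log_feature(v) for v in raw_values]
```

The logged values stay within a small range even when the raw values differ by orders of magnitude, which is one plausible reason a model trains with logged features but not with raw ones.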
[The above versions weren't committed, as none worked well except the last, which was only OK.] The main issue with features is how to generate them rapidly, so try with only simple-to-generate features:
There is significant variation in the classification of vessels since the earlier release.
This shows up globally, but not on the test set. Probably because...
This is primarily in Chinese vessels, which tend to be in an area of poor coverage
and for which we have limited ground truth.
Two possible explanations:
Change in training data. We added more data since then and the training / test sets were
recomputed as a result.
Change in features.
Started using cos/sin trick to deal with cyclic parameters (unlikely to matter)
Added sin(lat) so that we can look at seasonal data (could lead to overfitting).
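
As an illustration of the cos/sin trick mentioned above, here is a minimal sketch that maps a cyclic value onto the unit circle so that values on either side of the wrap-around point stay close together. The names `encode_cyclic` and `cyclic_distance` are hypothetical, and the example uses course in degrees as an assumed cyclic parameter; the project's actual feature code may differ.

```python
import math


def encode_cyclic(value, period):
    """Return (cos, sin) of a cyclic value with the given period.

    Hypothetical helper: e.g. a course in degrees uses period=360.
    """
    angle = 2.0 * math.pi * (value % period) / period
    return math.cos(angle), math.sin(angle)


def cyclic_distance(a, b, period):
    """Euclidean distance between two cyclic values after encoding."""
    cos_a, sin_a = encode_cyclic(a, period)
    cos_b, sin_b = encode_cyclic(b, period)
    return math.hypot(cos_a - cos_b, sin_a - sin_b)


# 359 and 1 degrees are far apart as raw numbers but close once encoded.
near = cyclic_distance(359.0, 1.0, 360.0)
far = cyclic_distance(359.0, 180.0, 360.0)
```

With the raw value, 359 and 1 differ by 358; after encoding, they are nearly identical, which is why this change on its own seems unlikely to explain the variation.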