Training a network in python #135
Comments
A script that converts numpy arrays into leveldb is the easiest solution.
// 1. Convert the original data format into datum
bool ReadImageToDatum(const string& filename, const int label,
    const int height, const int width, Datum* datum) {
  // Load as a 3-channel image and check the result before any resize,
  // because cv::resize fails on an empty image.
  cv::Mat cv_img_origin = cv::imread(filename, CV_LOAD_IMAGE_COLOR);
  if (!cv_img_origin.data) {
    LOG(ERROR) << "Could not open or find file " << filename;
    return false;
  }
  cv::Mat cv_img;
  if (height > 0 && width > 0) {
    // cv::Size takes (width, height), in that order.
    cv::resize(cv_img_origin, cv_img, cv::Size(width, height));
  } else {
    cv_img = cv_img_origin;
  }
  datum->set_channels(3);
  datum->set_height(cv_img.rows);
  datum->set_width(cv_img.cols);
  datum->set_label(label);
  datum->clear_data();
  datum->clear_float_data();
  // Store pixels channel by channel (channel, row, column order).
  string* datum_string = datum->mutable_data();
  for (int c = 0; c < 3; ++c) {
    for (int h = 0; h < cv_img.rows; ++h) {
      for (int w = 0; w < cv_img.cols; ++w) {
        datum_string->push_back(static_cast<char>(cv_img.at<cv::Vec3b>(h, w)[c]));
      }
    }
  }
  return true;
}
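// 2. Save the datum
// key_cstr holds the record's key and is built by the caller (not shown here).
leveldb::WriteBatch* batch = new leveldb::WriteBatch();
string value;
datum.SerializeToString(&value);
batch->Put(string(key_cstr), value);
// Commit with Write(leveldb::WriteOptions(), batch) on an open leveldb::DB, then delete batch.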
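The nested loop in step 1 reorders the image from OpenCV's interleaved layout (row, column, channel) into channel-major order (channel, row, column) before the bytes go into the datum. A minimal standalone sketch of that reordering, without OpenCV or Caffe, might look like the following. `PackChannelMajor` is a hypothetical helper introduced only for illustration; it is not part of Caffe.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Caffe): repack an interleaved image buffer
// (row, column, channel order) into a channel-major byte string, mirroring
// the loop that fills the datum's data field in step 1.
std::string PackChannelMajor(const std::vector<unsigned char>& interleaved,
                             int rows, int cols, int channels) {
  // The buffer must hold exactly rows * cols * channels bytes.
  assert(interleaved.size() ==
         static_cast<std::size_t>(rows) * cols * channels);
  std::string packed;
  packed.reserve(interleaved.size());
  for (int c = 0; c < channels; ++c) {
    for (int h = 0; h < rows; ++h) {
      for (int w = 0; w < cols; ++w) {
        // Index of channel c of pixel (h, w) in the interleaved buffer.
        const std::size_t src =
            (static_cast<std::size_t>(h) * cols + w) * channels + c;
        packed.push_back(static_cast<char>(interleaved[src]));
      }
    }
  }
  return packed;
}
```

For a 1x2 image with three channels stored as the bytes 1, 2, 3, 4, 5, 6, the packed result is 1, 4, 2, 5, 3, 6. Since `cv::imread` returns color images in BGR order, channel 0 of the stored datum is blue.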
Converting numpy arrays to leveldb is of course a first workaround. Writing out a file every time I want to train a model is too cumbersome in the long run, though, because I want to parametrize preprocessing, so it's not a one-off job. But anyway, thanks for the pointers.
Great proposal, and we would appreciate a PR for this!
Thanks for pointing me to the HDF5DataLayer. I'll create a PR as soon as I have some time to work on this.
#286 takes steps toward solving this, and we will follow up with numpy array / in-memory inputs for training.
It would be nice to be able to train a network directly within python. That is, directly invoke caffe on a numpy array instead of writing it out to a leveldb file first. I know that you can pass numpy arrays to pretrained networks, but I'm not aware that training is already possible.