Refactor DataLayer into DataSource and DataProcessing layers #148
Comments
Would you like to create the first DataSourceLayer for the HDF5DataLayer in #147?
For backward compatibility, and to avoid breaking existing user code, we probably need to keep both versions for a while and deprecate the old one after an announced maintenance period. Shall we set a version release schedule from now on?
@kloudkl we're still in a "break it now so it sets right" stage of development, so we aren't ready for versioned releases, but we'll do our best to make the transition comfortable.
Let me clarify. I did not mean that #147 should be refactored before being merged. Being able to switch easily among different data sources while reusing the same processing layer would greatly reduce code redundancy, and it is a feature that many users would embrace wholeheartedly. What I asked for is a concrete example of the design that @sergeyk was thinking about.
Not everything has to be a separate layer, considering the significant success of #128. The DataProcessing steps fit more naturally in a processing pipeline, which should stay inside the refactored data layer to avoid redundant memory copies. Similarly, the data source should be a plain field of the data layer, to avoid creating too many kinds of layers. The data source would be defined in the proto and instantiated by a data source factory. The major methods of the DataSource base class that I can think of are has_next_batch, next_batch, load and save. Any better ideas?
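To make the proposal in this comment concrete, here is a minimal sketch of a data source held as a plain field of the data layer and created by a factory. It is written in Python for brevity and is not Caffe code. The four method names come from the comment; everything else (InMemoryDataSource, create_data_source, the registry, the transform argument, and all signatures) is a hypothetical illustration.

```python
from abc import ABC, abstractmethod


class DataSource(ABC):
    """Hypothetical base class; the method names follow the comment above."""

    @abstractmethod
    def load(self, path): ...

    @abstractmethod
    def save(self, path): ...

    @abstractmethod
    def has_next_batch(self): ...

    @abstractmethod
    def next_batch(self, batch_size): ...


class InMemoryDataSource(DataSource):
    """Illustrative source that keeps its records in a Python list."""

    def __init__(self, records=None):
        self._records = list(records or [])
        self._pos = 0

    def load(self, path):
        # Read one record per line and rewind to the start.
        with open(path) as f:
            self._records = [line.rstrip("\n") for line in f]
        self._pos = 0

    def save(self, path):
        # Write one record per line.
        with open(path, "w") as f:
            for record in self._records:
                f.write(f"{record}\n")

    def has_next_batch(self):
        return self._pos < len(self._records)

    def next_batch(self, batch_size):
        # The final batch may be shorter than batch_size.
        batch = self._records[self._pos:self._pos + batch_size]
        self._pos += len(batch)
        return batch


# Hypothetical registry standing in for "defined in the proto".
_SOURCES = {"memory": InMemoryDataSource}


def create_data_source(kind, **kwargs):
    """Factory: map a configured source name to a DataSource instance."""
    if kind not in _SOURCES:
        raise ValueError(f"unknown data source: {kind!r}")
    return _SOURCES[kind](**kwargs)


class DataLayer:
    """Data layer that owns its source and runs processing in place."""

    def __init__(self, source, transform):
        self.source = source
        self.transform = transform

    def forward(self, batch_size):
        return [self.transform(x) for x in self.source.next_batch(batch_size)]
```

Under these assumptions, swapping the data source only changes the name passed to the factory, while the processing step stays inside the one data layer.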
A series of recent collaborative efforts to refactor the data layers has completed this task.
Right now, DataLayer does image-specific processing of LevelDB data.
It should be separated into a LevelDBDataSourceLayer and an ImageProcessingLayer.
That way, other data sources can more easily be added: HDF5, image directories, CSV, etc.
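The split described here can be illustrated with a small Python sketch, assuming a source layer that only emits raw records and a processing layer that only transforms them. This is not Caffe code: the class names ListSourceLayer, CsvSourceLayer and ScaleProcessingLayer, the forward signatures, and the run helper are all hypothetical, and a CSV string stands in for any new data source.

```python
import csv
import io


class ListSourceLayer:
    """Hypothetical source layer: emits rows held in memory."""

    def __init__(self, rows):
        self.rows = rows

    def forward(self):
        return [[float(v) for v in row] for row in self.rows]


class CsvSourceLayer:
    """Hypothetical source layer: parses rows from CSV text."""

    def __init__(self, text):
        self.text = text

    def forward(self):
        reader = csv.reader(io.StringIO(self.text))
        # Skip blank lines; convert every field to float.
        return [[float(v) for v in row] for row in reader if row]


class ScaleProcessingLayer:
    """Hypothetical processing layer: knows nothing about the source."""

    def __init__(self, scale):
        self.scale = scale

    def forward(self, batch):
        return [[v * self.scale for v in row] for row in batch]


def run(source, processing):
    # Any source layer can feed the same processing layer.
    return processing.forward(source.forward())
```

With this shape, adding a new source means writing one more class with a forward method; the processing layer is reused unchanged.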