Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adapt the data interface from CXXNET authored by @tqchen & @antinucleon #407

Closed
wants to merge 9 commits into from
Closed

Adapt the data interface from CXXNET authored by @tqchen & @antinucleon #407

wants to merge 9 commits into from

Conversation

kloudkl
Copy link
Contributor

@kloudkl kloudkl commented May 12, 2014

One of the flexible features of CXXNET is that the data interface is easy to extend and relatively independent of the preprocessing steps. The design is a natural solution to partly solve #148.

The commit bcf933b literally translates the codes of CXXNET into the Caffe terminology. There are concerns about the copyright and the license that must be addressed. I would like to reach out to @tqchen and @antinucleon to get their permissions to adapt their codes. I believe the Apache License Version 2.0 under which CXXNET is licensed is compatible with the BSD 2-Clause license under which Caffe is licensed.

The API is ready for code review. The concrete implementations will follow when the API is stabilized.

@Yangqing
Copy link
Member

I will leave it to Evan and Sergey to decide... Personally I don't like adding more complication to things, it seems that we already have a good solution in terms of data layers that Sergey has, so this seems a little bit redundant and overcomplicated.

@shelhamer
Copy link
Member

Sergey and I will review this after the NIPS deadline 06/06 at the latest. A re-design of data layers for inheritance and modularity is worthy of consideration, so we will take a look at your proposal!

@shelhamer
Copy link
Member

My stance on the data layers is that it is best to separate the data source from the data transformations, and even to separate the data and the labels. For instance, one might use the same image collection for different tasks with different ground truth such as image classification, detection, or segmentation.

Having the transformations should be a simplification as data layers will be reduced to their IO.

@Yangqing Sergey and I will talk this one over. Of course splitting of the data layers in general is an idea apart from this concrete proposal. Do you see taking apart the data layers as over-complicated in itself? Thanks for pointing this PR out to us.

@kloudkl
Copy link
Contributor Author

kloudkl commented Jun 9, 2014

I will further simplify the implementation for easier extension and add tests for the existing functionality this week.

@Yangqing
Copy link
Member

Yangqing commented Jun 9, 2014

My thought on having the data layer separated is to have the following:

(1) The data layer will have a member variable "next_batch_", which is a vector of Blobs that holds the next batch. The reason we have a vector of blobs is because a data layer may want to produce multiple output blobs.

(2) The data layer has an abstract function GetNextBatch() that will be implemented by specific methods to fill prefetched data into next_batch_.

(3) The data layer has a Forward() function that basically copies next_batch_ to its outputs.

(4) The GetNextBatch() will be called in the child thread, just like what we are doing for the current data layer.

For one to write a custom data layer, s/he simply needs to initialize things properly in the constructor, and then implement the GetNextBatch() function to fill in the values for the next batch.

@shelhamer @sergeyk - should you desire, I can refactor the current data layer according to the plan above.

Taking apart data transforms and the prefetching behavior is pretty good - for example, subtracting the mean does not cost too much time and could well be done outside the data layer.

No offense to anyone, but having three additional classes DataBatch, DataFetcher, and DataIterator is too complicated while the same task could be done just in one abstract class... I'd personally like fellow graduate students to understand the code quickly rather than needing to dig through a lot of classes before they can write a simple piece of algorithm, and in that regard will sacrifice modularity and scalability (if any) a little bit.

@kloudkl
Copy link
Contributor Author

kloudkl commented Jun 11, 2014

I'm glad that there is a much simpler design. Once it's implemented, this can be closed.

@sergeyk
Copy link
Contributor

sergeyk commented Jun 16, 2014

@Yangqing your refactor would be much appreciated

@kloudkl
Copy link
Contributor Author

kloudkl commented Aug 27, 2014

We don't need this PR any more. Related work is done in #710, #954 and #963.

@kloudkl kloudkl closed this Aug 27, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants