Images layer: A data provider layer directly from images #120

sguada · 2014-02-17T18:36:13Z

This data layer reads a file whose lines have the form "path/image_filenames label" and provides data and label tops in the same way data_layer does. It doesn't require the images to be in a leveldb, nor does it require them to be resized in advance (although resizing in advance is recommended for speed).

Some speed test comparisons still have to be done.
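
For readers unfamiliar with the list-file format described above, here is a minimal sketch of how such a file could be parsed. It is not the code from this PR: `ReadImageList` is a hypothetical helper, and it assumes one whitespace-separated "path label" pair per line, an integer label, and paths without spaces.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Parses lines of the form "<image path> <integer label>".
// Hypothetical helper for illustration only; not the layer's actual code.
std::vector<std::pair<std::string, int> > ReadImageList(std::istream& in) {
  std::vector<std::pair<std::string, int> > entries;
  std::string line;
  while (std::getline(in, line)) {
    std::istringstream fields(line);
    std::string path;
    int label;
    // Skip blank or malformed lines instead of aborting.
    if (fields >> path >> label) {
      entries.push_back(std::make_pair(path, label));
    }
  }
  return entries;
}
```

Passing a `std::ifstream` opened on the list file yields the (path, label) pairs in file order; a layer would then load and optionally resize each image as it is needed.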

Yangqing · 2014-02-17T20:30:39Z

This is just a random thought, but would it be good to merge data_layer and input_layer by allowing an input format selection? There is some code that could be reused, such as the threaded prefetching.

jeffdonahue · 2014-02-17T20:39:44Z

Agreed with Yangqing - this seems like it might only be a 10 or 20 line change to DataLayer with an input format selector param; this is a lot of code to duplicate imo.
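
To make the suggestion concrete, here is one possible shape for a format selector, sketched with stand-in classes. Nothing here is Caffe's actual API: `Record`, `RecordSource`, `DatabaseSource`, `ImageListSource`, `InputFormat`, `MakeSource` and `FillBatch` are all hypothetical names, and the database source is a stub that fabricates records rather than reading a leveldb. The point is only that the format-specific code shrinks to "fetch the next record" while batch assembly (and, by extension, prefetching) is shared.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// One example: a payload (encoded bytes or an image path) and its label.
struct Record {
  std::string payload;
  int label;
};

// The format-specific part: how to fetch the next record.
class RecordSource {
 public:
  virtual ~RecordSource() {}
  virtual Record Next() = 0;
};

// Stub for a database-backed source; a real one would wrap a database cursor.
class DatabaseSource : public RecordSource {
 public:
  DatabaseSource() : count_(0) {}
  Record Next() override {
    Record r;
    r.payload = "record-" + std::to_string(count_);
    r.label = count_ % 2;
    ++count_;
    return r;
  }
 private:
  int count_;
};

// Cycles over (path, label) entries read from an image list file.
class ImageListSource : public RecordSource {
 public:
  explicit ImageListSource(const std::vector<Record>& entries)
      : entries_(entries), pos_(0) {
    if (entries_.empty()) throw std::invalid_argument("empty image list");
  }
  Record Next() override {
    Record r = entries_[pos_];
    pos_ = (pos_ + 1) % entries_.size();  // wrap around at the end of the list
    return r;
  }
 private:
  std::vector<Record> entries_;
  std::size_t pos_;
};

enum class InputFormat { kDatabase, kImageList };

// The selector: the only place that knows which formats exist.
std::unique_ptr<RecordSource> MakeSource(InputFormat format,
                                         const std::vector<Record>& entries) {
  if (format == InputFormat::kImageList) {
    return std::unique_ptr<RecordSource>(new ImageListSource(entries));
  }
  return std::unique_ptr<RecordSource>(new DatabaseSource());
}

// The shared part: batch assembly does not depend on the input format.
std::vector<Record> FillBatch(RecordSource& source, std::size_t batch_size) {
  std::vector<Record> batch;
  batch.reserve(batch_size);
  for (std::size_t i = 0; i < batch_size; ++i) batch.push_back(source.Next());
  return batch;
}
```

With this split, a prefetch thread would only ever call `FillBatch`, so adding a new input format would mean adding one `RecordSource` subclass and one branch in `MakeSource`.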

kloudkl · 2014-02-18T02:49:05Z

Learning from some recent contributions, including mine, that ended up involving significant code refactoring or even reversion after the initial pull requests, I feel strongly that contributors should first create issues for the features or bug fixes they want. Only after exchanging thoughts about the most suitable designs or algorithms with the project owners and other contributors should a contributor begin investing a lot of time in actual development. This avoids too many sunk costs along the way.

sguada · 2014-02-18T02:55:36Z

@Yangqing @jeffdonahue I like your idea. Initially I wanted to get the layer working, and reusing as much code as possible from Data_layer seemed right.

@kloudkl I think refactoring the code after it is working is a good idea, so I don't feel it was a waste of time. It let me better understand the differences and similarities between data_layer and images_layer, and now I feel I can probably extract the common parts and separate the differences.
However, getting the opinions of other contributors and project owners is always useful.

Yangqing · 2014-02-18T02:59:01Z

Welcome to the realm of research code, where 90% of the code is sunken.
For example, maybe some of us still remember this:

https://github.com/Yangqing/iceberk/

which is now at the bottom of the ocean filled with coffee.

shelhamer · 2014-02-18T03:10:39Z

I'm currently drafting development and contributing guides; please join the discussion at #101.

@kloudkl: it is certainly important for discussion and avoiding duplication of effort that people make their suggestions known and claim their contributions. However, to avoid double issues (issue + PR) and fragmenting conversations, I propose a natural way to do this with PRs when possible.

@sguada: discussion is definitely helpful, and I think issues + PRs are the place to do it in public.

@Yangqing reminds us of the truth, as ever.

kloudkl · 2014-02-18T03:54:50Z

I skimmed through the iceberk project and found that DeCAF, the evolutionary origin of Caffe, borrowed some key data structures and algorithms from it. In this sense, it was the testbed for what is now a very mature deep network engine, so it is not sunken but reborn.

sguada · 2014-02-18T23:02:55Z

I have done some performance tests with a Titan card, and it seems that images_layer is approximately twice as slow as data_layer (9.04 vs. 4.22 seconds for the data forward step), which translates to about 8% slower in the whole forward-backward pass (62.95 vs. 58.06 seconds in total). That is not too bad considering that each image has to be read from a different jpg file.

# Using images_layer batchsize=50 100 repetitions

E0218 13:39:18.331570 31627 net_speed_benchmark.cpp:66] *** Benchmark begins ***
E0218 13:39:27.378861 31627 net_speed_benchmark.cpp:74] data    forward: 9.04 seconds.
E0218 13:39:30.105782 31627 net_speed_benchmark.cpp:74] conv1   forward: 2.8 seconds.
...
E0218 13:39:46.968508 31627 net_speed_benchmark.cpp:77] Forward pass: 28.68 seconds.
E0218 13:40:21.253057 31627 net_speed_benchmark.cpp:88] Backward pass: 34.27 seconds.
E0218 13:40:21.253075 31627 net_speed_benchmark.cpp:89] Total Time: 62.95 seconds.
E0218 13:40:21.253084 31627 net_speed_benchmark.cpp:90] *** Benchmark ends ***

# Using data_layer batchsize=50 100 repetitions

E0218 13:41:56.200095 31873 net_speed_benchmark.cpp:66] *** Benchmark begins ***
E0218 13:42:01.953526 31873 net_speed_benchmark.cpp:74] data    forward: 4.22 seconds.
E0218 13:42:04.677783 31873 net_speed_benchmark.cpp:74] conv1   forward: 2.75 seconds.
...
E0218 13:42:21.526362 31873 net_speed_benchmark.cpp:77] Forward pass: 23.81 seconds.
E0218 13:42:55.801816 31873 net_speed_benchmark.cpp:88] Backward pass: 34.25 seconds.
E0218 13:42:55.801826 31873 net_speed_benchmark.cpp:89] Total Time: 58.06 seconds.
E0218 13:42:56.253084 31873 net_speed_benchmark.cpp:90] *** Benchmark ends ***

sergeyk · 2014-02-24T23:27:33Z

@sguada will work further on this w.r.t. #148, probably won't be done until after March 7.

shelhamer · 2014-03-13T18:39:53Z

@sguada thanks. Let's not forget to refactor the data layers at some point though.

sguada added 7 commits February 17, 2014 10:32

Draft for Input_layer copied from Data_layer

54cd881

Added input_layer to set of layers and to factory

617c016

Fixed input_layer to pass tests, added cat image to data to perform the tests

e8e3a1b

Added the option to resize_image to resize images using cv::resize while reading them

8b65e51

Renamed input_layer to images_layer

ccf4eb5

Fixed typos to pass test_images_layer

fcaf5ad

Enforce that new_height and new_width are both 0 or both > 0

587eeab
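
The constraint named in the last commit above ("both 0 or both > 0") can be sketched as a small predicate. This is an illustration rather than the committed code: `ResizeParamsValid` is a hypothetical name, and reading (0, 0) as "keep the original image size" is an assumption based on resizing being optional in this layer.

```cpp
// Returns true when the requested output size is consistent.
// Assumption: (0, 0) means "do not resize"; two positive values mean
// "resize to new_height x new_width". Mixed or negative values are rejected.
bool ResizeParamsValid(int new_height, int new_width) {
  return (new_height == 0 && new_width == 0) ||
         (new_height > 0 && new_width > 0);
}
```

A layer could evaluate this once during setup, so that a half-specified size such as (256, 0) fails early instead of reaching the resize call.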

sguada mentioned this pull request Feb 17, 2014

Reshape layer #108

Closed

shelhamer added the interface label Feb 18, 2014

sergeyk assigned sguada Feb 24, 2014

shelhamer added the work-in-progress label Feb 25, 2014

kloudkl mentioned this pull request Mar 9, 2014

Add memory data layer to pass data directly into the network #196

Closed

shelhamer added a commit that referenced this pull request Mar 13, 2014

Merge pull request #120 from sguada/images_layer

b1765ce

Images layer: A data provider layer directly from images

shelhamer merged commit b1765ce into BVLC:master Mar 13, 2014

This was referenced Mar 17, 2014

HDF5DataLayer source now takes list of filenames, loads one at a time. #203

Merged

Feature extraction, feature binarization and image retrieval examples #161

Merged

kloudkl mentioned this pull request Mar 23, 2014

Feed the ImageDataLayer with OpenCV images directly from memory #251

Closed

shelhamer removed the work in progress label Mar 23, 2014

mitmul pushed a commit to mitmul/caffe that referenced this pull request Sep 30, 2014

Merge pull request BVLC#120 from sguada/images_layer

b376f7d

Images layer: A data provider layer directly from images

DCurro mentioned this pull request Apr 13, 2016

Matcaffe crashes on any CHECK(...), or on failing to parse prototxt #3986

Open
