
More of a question than an issue.. what IDE/env is used by committers to do SparkNet development? #64

Open
javadba opened this issue Feb 13, 2016 · 37 comments

javadba commented Feb 13, 2016

(Note: please move this "issue" to a more appropriate thread/location. I can't find any Gitter channel or mailing list; if one exists, email me the info at javadba at gmail.com.)

I am joining a project performing distributed GPU-based work on SparkNet. I am a former contributor to Spark (SQL and MLlib), so I have strong underpinnings in the core Spark stack.

It is crucial to productivity to have an IDE that at least permits traversing the codebase. Support for breakpoints would be a big plus.

IntelliJ is my preferred IDE but is not playing well with SparkNet. Here is one of the issues I am battling:

http://stackoverflow.com/questions/35385304/unable-to-import-sbt-project-into-intellij-using-custom-sbt-launch-jar

So my question here is: what is the development environment and process for the core SparkNet committers? I want to be in sync with that so I can be fully invested as a likely near-future contributor.

Thanks
Stephen Boesch


javadba commented Feb 13, 2016

FYI, I have created a Maven pom.xml that emulates the build.sbt and successfully builds the project. I am using it successfully within IntelliJ, given that the build.sbt is not accepted there. A PR for the pom.xml is forthcoming.

@robertnishihara

Thanks Stephen, that's excellent! We've been using SBT so far because it's very simple, and we've been doing most of the development in Atom. We haven't defined a formal development environment/process yet, but we're open to suggestions about that.


javadba commented Feb 13, 2016

#65


javadba commented Feb 13, 2016

Atom is a good alternative; apropos, I have been trying to get up to speed with it just these past two weeks. Are you using Ensime? What other plugins? If you have links (to save you the time of writing it up yourself) to the stuff you use, that's even better.


javadba commented Feb 13, 2016

BTW, I am also going to be trying out Scala 2.11 and Spark 1.6.0 / Spark 2.0 within the week, and will send a PR or create a bug report if it's not possible.


javadba commented Feb 13, 2016

It's cool to see fast turnaround on your comments. That's a nice warm fuzzy for starting on this project. Have you considered gitter?

@robertnishihara

Thanks a lot! Definitely let us know if there are problems with the newer versions of Scala and Spark! We haven't quite figured out where the best place to have these discussions is, but Gitter is definitely an option.

We've been using Atom without any plugins, but Ensime looks like it'd be useful. Do you use it?


javadba commented Feb 13, 2016

Re: Ensime. I have not used it personally, but my fanatical Scala friend does and has found it useful. Will circle back in a couple of weeks after leveraging (/stealing) his tricks of the trade there.

@robertnishihara

Sounds good!


javadba commented Feb 14, 2016

Unrelated: how do I build libccaffe? Here is what I did:

sbt package assembly

Then

spark-submit --class apps.CifarApp target/scala-2.10/sparknet-assembly-0.1-SNAPSHOT.jar 1

However, libccaffe.so is required, and it has not been built:

Caused by: java.lang.UnsatisfiedLinkError: Can't load library: /git/SparkNet/build/libccaffe.so

How do I build that? I cannot find any script or command to do so.

@robertnishihara

Right now, we've been doing the following:

cd SparkNet
mkdir build
cd build
cmake ../libccaffe
make -j 30

Let me know if that works or doesn't work for you.


javadba commented Feb 14, 2016

OK, thanks. Is there documentation (that I missed) that should be followed?


@robertnishihara

It's in the README under "setup". We probably should structure the README in a more intuitive way.


javadba commented Feb 14, 2016

My bad. I did see it earlier; it got lost in the shuffle (I'm setting up a bunch of stuff).


javadba commented Feb 15, 2016

Unrelated question: what is the relationship between the Caffe embedded in SparkNet at https://github.com/SparkNet/caffe and the standalone GitHub repo at https://github.com/BVLC/caffe? Is the first one a snapshot of a specific version of the standalone repo? Are there any SparkNet-specific changes?

@robertnishihara

The Caffe embedded in SparkNet is a snapshot of Caffe; we squashed it here: 192ba1f.

We made some small changes, mostly related to multi-GPU support, testing, and reading data from Java.

However, one of our big priorities right now is switching from our own Caffe wrapper to javacpp-presets (https://github.com/bytedeco/javacpp-presets; see #60). This will make things much easier to develop/deploy/maintain, and we've been working on it in the javacpp+dataframes branch. The main thing blocking us is a bug in javacpp-presets (see bytedeco/javacpp-presets#147). We haven't had a chance to debug it yet, but it's a key thing to do. One piece of low-hanging fruit is to figure out where in javacpp the bug was introduced.

To see the bug, look at the attached example. Calling sbt run from ExampleGPU/ should work, but if you uncomment the line Caffe.set_mode(Caffe.GPU), the app will crash. Searching through the versions (and potentially commits) of javacpp-presets, or maybe javacpp, in build.sbt could give us an idea of where the bug was introduced. Any progress/ideas in that direction would be extremely helpful. If you're interested, we can also get in touch with the javacpp maintainers.
ExampleGPU.zip


javadba commented Feb 15, 2016

Thanks, Robert. I am in a bit over my head at the moment while wrestling with building on my Mac. I am getting partway through the Caffe tests (make runtest) and likewise partway through the SparkNet tests (sbt test). I have not reported anything about my challenges yet because they may be configuration errors on my part. I will circle back one way or another after getting more traction with the codebase and with the environment setup requirements.


javadba commented Feb 15, 2016

BTW, by using the Maven pom.xml with the PR for "specify master on command line", I have been able to launch the CifarApp within IntelliJ, including nice stuff like setting breakpoints. It is failing rather soon due to my inability (so far) to properly build libccaffe.


javadba commented Feb 15, 2016

(screenshot attached: 2016-02-14, 11:48 PM)

@robertnishihara

Ah, some of the tests may not be working right now. We haven't been as diligent about that as we should have been. They mostly require small fixes, but I'd suggest not spending much time on that (you seem to be going through the apps, which is a better entry point).

What errors are you getting with building libccaffe? Caffe is a bit easier to build on Linux, but we've seen a number of errors, so maybe we can help. Fortunately, this kind of difficulty will go away once we switch to using JavaCPP.


javadba commented Feb 15, 2016

Thanks. I am about to launch a full rebuild, this time with cuDNN enabled, and go through the online docs once again to see if I missed something obvious. Will get back tomorrow afternoon on that.


javadba commented Feb 15, 2016

Here is the core of my challenges on Mac: should we be using ATLAS or OpenBLAS? I have seen references to both of them. I am on El Capitan and currently using ATLAS, due to some comments about Caffe from other projects.

The error I get on a bunch of files when building libccaffe is:

fatal error: 'cblas.h' file not found

Here is a full example

/git/sparknet/libccaffe/../caffe/include/caffe/util/mkl_alternate.hpp:11:10: fatal error: 'cblas.h' file not found

If there is a specific guide for building libccaffe on Mac that works best for this project, would you please point me to it? Thanks!
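For reference, one way the missing cblas.h is often worked around on OS X (a sketch, not project guidance; the Homebrew tap name is as of early 2016, and the /usr/local/opt paths assume a default Homebrew install) is to install OpenBLAS and point the compiler at its headers before rerunning the cmake/make steps:

```shell
# OpenBLAS provides cblas.h; in early 2016 it lived in the science tap
brew tap homebrew/science
brew install openblas

# openblas is keg-only, so expose its headers and libs to the build
export CPATH="/usr/local/opt/openblas/include:$CPATH"
export LIBRARY_PATH="/usr/local/opt/openblas/lib:$LIBRARY_PATH"

# then rebuild libccaffe as described above
cd SparkNet && mkdir -p build && cd build
cmake ../libccaffe && make -j 4
```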

@robertnishihara

I think the real solution is to switch to JavaCPP, so people won't have to build libccaffe. Philipp and I will try to get this working tomorrow (at least without the GPU mode) so you don't have to worry about libccaffe.

Then we can remove the copy of Caffe from our repository and make things much cleaner.


javadba commented Feb 15, 2016

That's great! Thanks for the awesome responsiveness.


javadba commented Feb 17, 2016

A bit different topic. I tried the javacpp+dataframes branch. I am running on Mac. Do you have any suggestions on how to approach the non-Linux build?

[ERROR] Failed to execute goal on project sparknet: Could not resolve dependencies for project org.amplab:sparknet:jar:1.0: The following artifacts could not be resolved: org.bytedeco.javacpp-presets:caffe:jar:macosx-x86_64:master-1.2-SNAPSHOT, org.bytedeco.javacpp-presets:opencv:jar:macosx-x86_64:3.1.0-1.2-SNAPSHOT, com.twelvemonkeys.imageio:imageio:jar:3.1.2, com.twelvemonkeys.imageio:common-lang:jar:3.1.2: Failure to find org.bytedeco.javacpp-presets:caffe:jar:macosx-x86_64:master-1.2-SNAPSHOT in https://repository.apache.org/content/repositories/snapshots was cached in the local repository, resolution will not be reattempted until the update interval of Maven snapshots repository has elapsed or updates are forced -> [Help 1]


javadba commented Feb 17, 2016

Do you guys have a local (laptop/desktop) standardized environment? E.g. a docker image?

@robertnishihara

These are both really good points.

  1. We've been focusing on the Linux build. We haven't done much testing on other operating systems, but that's an important thing to do.

  2. We also haven't made a Docker image, but that would be useful. However, we can run things locally on top of a local Spark cluster; see some of the later discussion in #52 ("Question about disabling GPU") about this.

@robertnishihara

Also, did you get the above error message when building with sbt or with maven? I actually put the files at http://www.eecs.berkeley.edu/~rkn/temp/org/bytedeco/ (this is not a permanent solution, but the current javacpp-presets release doesn't have some bug fixes that we need), and I added

resolvers += "javacpp" at "http://www.eecs.berkeley.edu/~rkn/temp/"

to the build.sbt.
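For a Maven build, the analogous change (a hedged sketch; the repository id is arbitrary, and the URL is the same one the sbt resolver uses) would be a `<repositories>` entry in the pom.xml with snapshots enabled:

```xml
<!-- Mirror of the sbt resolver; the id is arbitrary -->
<repositories>
  <repository>
    <id>javacpp-temp</id>
    <url>http://www.eecs.berkeley.edu/~rkn/temp/</url>
    <snapshots>
      <enabled>true</enabled>
    </snapshots>
  </repository>
</repositories>
```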


javadba commented Feb 17, 2016

Re: ~rkn/temp/ — I saw that, and had added it to the Maven build. I should have tried sbt as well; will do that now.

@pcmoritz

Another option (for now) is to use an older version of JavaCPP (version 1.1 should work). The downside is that the Solver won't work (it depends on a recent bugfix), but you could still construct networks and run forward and backward methods on your Mac.


javadba commented Feb 17, 2016

@pcmoritz Oh, that sounds like a quite reasonable workaround. Will update late Wednesday 2/17.


javadba commented Feb 17, 2016

OK, I tried out the sbt build on the javacpp+dataframes branch on Mac and it is fine. So there's something amiss in my pom.xml attempt.

The following is the result of doing spark-submit on Mac:

 java.lang.UnsatisfiedLinkError: no jniopencv_core in java.library.path

That is actually pretty good: it shows that the jars were built and submitted to the Spark scheduler. The app got launched on the Spark executor, but then there is a native library issue.
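One common workaround for this kind of UnsatisfiedLinkError (a sketch; the native-library directory is a placeholder, not a path from this project) is to put the directory containing the JavaCPP native libraries on java.library.path for both the driver and the executors:

```shell
# NATIVE_LIB_DIR is hypothetical: set it to wherever the JavaCPP
# natives (e.g. libjniopencv_core) were extracted or built.
NATIVE_LIB_DIR="$HOME/javacpp-natives"

spark-submit \
  --class apps.CifarApp \
  --driver-java-options "-Djava.library.path=$NATIVE_LIB_DIR" \
  --conf "spark.executor.extraJavaOptions=-Djava.library.path=$NATIVE_LIB_DIR" \
  target/scala-2.10/sparknet-assembly-0.1-SNAPSHOT.jar 1
```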

@pcmoritz

Awesome! Does running the JavaCPP caffe preset example (src/main/java/caffe.java from the README.md) work?


javadba commented Feb 17, 2016

There are complications; I'm trying to work through them.

Here is the present one: I recently upgraded to El Capitan and Xcode 7.5, and that is not playing well with OpenCV. 20 errors, mostly along the lines of the following:

/System/Library/Frameworks/Foundation.framework/Headers/NSObject.h:19:21: error: expected a type.

The following link seems to indicate Xcode 6.4 needs to be used. This is feeling a bit wobbly to me:

electron/electron#2315

Build failure with Xcode 7 and El Capitan

Builds successfully if you replace Xcode.app with Xcode 6 and repeat the bootstrapping and build process.

It is not completely clear what the fix should be. I am wading through various online posts on the topic.


javadba commented Feb 21, 2016

I have been incommunicado for four days, three of which were focused on setting up a GPU-enabled Linux environment. Several setups were attempted and one seems relatively promising: an NVIDIA 780-outfitted machine running Ubuntu 15.10. I have the Linux/NVIDIA side up but am unable to get Caffe working.

I have added a comment to an existing issue over there:

BVLC/caffe#2347

Here is my status as reflected in the comment on it:

Is there any more complete summary of how to do this build? After attempting to merge various bits and pieces from this issue as well as StackOverflow questions, I end up with the following all the same:

LD -o .build_debug/lib/libcaffe.so.1.0.0-rc3
/usr/bin/ld: cannot find -lhdf5_hl
/usr/bin/ld: cannot find -lhdf5
/usr/bin/ld: cannot find -lcudnn
/usr/bin/ld: cannot find -lcblas
/usr/bin/ld: cannot find -latlas
collect2: error: ld returned 1 exit status
Makefile:554: recipe for target '.build_debug/lib/libcaffe.so.1.0.0-rc3' failed
make: *** [.build_debug/lib/libcaffe.so.1.0.0-rc3] Error 1

I had used apt-get install for the relevant libraries, updated paths, and updated the Makefile.config. Quite a number of steps. There does not seem to be a firm consensus on the correct approach, so there was a certain degree of mix and match.

I am on Ubuntu 15.10 with NVIDIA CUDA.
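For the record, the `cannot find -l...` errors above usually map to missing dev packages (a sketch; the package names are standard Ubuntu ones, and the cuDNN and hdf5 paths are assumptions about typical installs, not project guidance):

```shell
# hdf5_hl/hdf5 and cblas/atlas come from these Ubuntu dev packages
sudo apt-get install libhdf5-serial-dev libatlas-base-dev

# libcudnn is not in apt; it comes from NVIDIA's cuDNN download, e.g.:
#   sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/

# On newer Ubuntu releases hdf5 lives in a multiarch subdirectory, so
# Makefile.config may also need it on LIBRARY_DIRS, e.g.:
#   LIBRARY_DIRS += /usr/lib/x86_64-linux-gnu/hdf5/serial
```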

@robertnishihara

Cool, I've mostly been working with Ubuntu 14.04, and that seems to work alright (note, we no longer need to build libccaffe with the latest version of the SparkNet codebase).

You've probably seen these instructions https://github.com/BVLC/caffe/wiki/Ubuntu-15.10-Installation-Guide. However, I haven't tried them out myself.

Does SparkNet compile/run on Ubuntu 15.10?

@pcmoritz

Hey, I suggest two solutions that might save you lots of headaches. If there isn't an important reason why you need Ubuntu 15.10, you could build on top of Ubuntu 14.04, which is very well supported. Or you could build on top of 15.10 and use the binaries we are providing. The preliminary ones being pulled by build.sbt might do the trick until we are done with more functional ones in the next couple of days (they are supposed to work across a wide variety of Linux distributions and to support CUDA and cuDNN; we also plan to create a document on how to build these jars).

If you really want to use 15.10, the Caffe maintainers are probably keen on getting that to work for their 1.0 release, so they might be able to help you. For us it is less of a priority right now because we expect most people will either use binaries we provide or build their own binaries that will work across a wide variety of different distributions, and then the build environment is less crucial.
