Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PERF: accelerating antsAtroposSegmentationImageFilter.hxx function GetPosteriorProbabilityImage using ITK ParallelizeImageRegion #1462

Merged
merged 1 commit into from
Dec 15, 2022

Conversation

br-cpvc
Copy link
Contributor

@br-cpvc br-cpvc commented Dec 7, 2022

This is a reimplementation of the optimizations initially proposed in PR #1452

With the help of the ITK discourse forum in post https://discourse.itk.org/t/how-to-use-itk-multi-threading-primitives-to-do-parallel-execution-of-loop-iteration/5531/ I have managed to rework my previous proof of concept implementation into a generic solution that uses the ITK multi-threading primitive ParallelizeImageRegion.

During my limited testing (using only two image inputs) I have seen the execution time being reduced to around 60-70% of the original implementations execution time. The execution times for the two tested image inputs: image 1 was reduced from 96 seconds to 60 seconds (see below) and image 2 from 272 seconds to 168 seconds.

I have validated the implementation using the following:

Tests where done on the cpu:

lscpu | grep "Model name"
Model name:                      Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz

Preparations:

wget https://openneuro.org/crn/datasets/ds004288/snapshots/1.0.1/files/sub-0303:anat:sub-0303_T1w.nii.gz

mkdir master
ANTSPATH=./bin_old/ time ./bin_old/Scripts/antsAtroposN4.sh -d 3 -u 0 -c 3 -m 3 -n 3 -a sub-0303:anat:sub-0303_T1w.nii.gz -x sub-0303:anat:sub-0303_T1w.nii.gz -o master/

using a compiled version of ANTs before the code changes:

mkdir old_sub
time ./bin_old/Atropos -d 3 -x sub-0303:anat:sub-0303_T1w.nii.gz -c [ 3,0.0 ] -a master/Segmentation0N4.nii.gz --verbose 1 -i PriorProbabilityImages[ 3,master/SegmentationPosteriors%d.nii.gz,0.0 ] -k Gaussian -m [ 0.1,1x1x1 ] -o [ old_sub/Segmentation.nii.gz,old_sub/SegmentationPosteriors%d.nii.gz ] -r 0 -p Socrates[ 1 ]

Elapsed time: 96.0797

md5sum old_sub/*
fc6148a95f77d073768cbd46a18e340f  old_sub/Segmentation.nii.gz
38517eecae8745acbffe034f7ba71e10  old_sub/SegmentationPosteriors1.nii.gz
8974621586bf2844f4ef372dfcae96c2  old_sub/SegmentationPosteriors2.nii.gz
6c79422e44b49225490eb42f615cd1e0  old_sub/SegmentationPosteriors3.nii.gz

using a compiled version of ANTs after the code changes:

mkdir new_opt_sub
time ./bin_new/Atropos -d 3 -x sub-0303:anat:sub-0303_T1w.nii.gz -c [ 3,0.0 ] -a master/Segmentation0N4.nii.gz --verbose 1 -i PriorProbabilityImages[ 3,master/SegmentationPosteriors%d.nii.gz,0.0 ] -k Gaussian -m [ 0.1,1x1x1 ] -o [ new_opt_sub/Segmentation.nii.gz,new_opt_sub/SegmentationPosteriors%d.nii.gz ] -r 0 -p Socrates[ 1 ]

Elapsed time: 59.7719

md5sum new_opt_sub/*
fc6148a95f77d073768cbd46a18e340f  new_opt_sub/Segmentation.nii.gz
38517eecae8745acbffe034f7ba71e10  new_opt_sub/SegmentationPosteriors1.nii.gz
8974621586bf2844f4ef372dfcae96c2  new_opt_sub/SegmentationPosteriors2.nii.gz
6c79422e44b49225490eb42f615cd1e0  new_opt_sub/SegmentationPosteriors3.nii.gz

The md5sums confirms that the new implementation produces the exact same output as the before the changes.

…riorProbabilityImage using ITK ParallelizeImageRegion
@ntustison
Copy link
Member

This is great. However, I would prefer to see a few more subjects processed where the output doesn't change before merging.

@br-cpvc
Copy link
Contributor Author

br-cpvc commented Dec 8, 2022

@ntustison Sure, good idea. I have tested 5 more images with the following:

wget https://openneuro.org/crn/datasets/ds004288/snapshots/1.0.1/files/sub-0010:anat:sub-0010_T1w.nii.gz https://openneuro.org/crn/datasets/ds004288/snapshots/1.0.1/files/sub-0086:anat:sub-0086_T1w.nii.gz https://openneuro.org/crn/datasets/ds004288/snapshots/1.0.1/files/sub-0101:anat:sub-0101_T1w.nii.gz https://openneuro.org/crn/datasets/ds004288/snapshots/1.0.1/files/sub-0109:anat:sub-0109_T1w.nii.gz https://openneuro.org/crn/datasets/ds004288/snapshots/1.0.1/files/sub-0114:anat:sub-0114_T1w.nii.gz

for sub in sub-0010 sub-0086 sub-0101 sub-0109 sub-0114; do \
export ver=old && mkdir -p $sub/$ver && echo "$sub $ver" && ./bin_$ver/Atropos -d 3 -x "${sub}:anat:${sub}_T1w.nii.gz" -c [ 3,0.0 ] -a "${sub}:anat:${sub}_T1w.nii.gz" -k Gaussian -m [ 0.1,1x1x1 ] -o $sub/$ver/Segmentation.nii.gz -r 0 -p Socrates[ 1 ] -i otsu[3] --verbose 1 | grep "Elapsed time:" && \
export ver=new && mkdir -p $sub/$ver && echo "$sub $ver" && ./bin_$ver/Atropos -d 3 -x "${sub}:anat:${sub}_T1w.nii.gz" -c [ 3,0.0 ] -a "${sub}:anat:${sub}_T1w.nii.gz" -k Gaussian -m [ 0.1,1x1x1 ] -o $sub/$ver/Segmentation.nii.gz -r 0 -p Socrates[ 1 ] -i otsu[3] --verbose 1 | grep "Elapsed time:" && \
diff -s $sub/old/Segmentation.nii.gz $sub/new/Segmentation.nii.gz && echo; done

Output:

Elapsed time: 78.2441
sub-0010 new
Elapsed time: 42.2994
Files sub-0010/old/Segmentation.nii.gz and sub-0010/new/Segmentation.nii.gz are identical

sub-0086 old
Elapsed time: 71.4629
sub-0086 new
Elapsed time: 38.4986
Files sub-0086/old/Segmentation.nii.gz and sub-0086/new/Segmentation.nii.gz are identical

sub-0101 old
Elapsed time: 73.7006
sub-0101 new
Elapsed time: 39.7073
Files sub-0101/old/Segmentation.nii.gz and sub-0101/new/Segmentation.nii.gz are identical

sub-0109 old
Elapsed time: 71.4932
sub-0109 new
Elapsed time: 38.3375
Files sub-0109/old/Segmentation.nii.gz and sub-0109/new/Segmentation.nii.gz are identical

sub-0114 old
Elapsed time: 69.5892
sub-0114 new
Elapsed time: 37.463
Files sub-0114/old/Segmentation.nii.gz and sub-0114/new/Segmentation.nii.gz are identical

To me the changes seems quite solid, maybe someone else should rerun my experiments or do further tests to if they get similar results.

@ntustison
Copy link
Member

Thanks @br-cpvc, that's good enough for me. @stnava, @cookpa --- any other issues with merging?

@cookpa
Copy link
Member

cookpa commented Dec 8, 2022

Can we check it remains consistent with prior probability images? That's probably the most common use case.

@ntustison
Copy link
Member

Yeah, that's a good idea. I'll try running a few subjects through the antsAtroposN4.sh pipeline.

@ntustison
Copy link
Member

I've been trying to run this in the background but am having difficulty getting identical results. I think I traced it to antsBrainExtraction.sh where there appears to be randomness even after setting -u 0. @cookpa , specifically, I'm looking here at the call to antsAI where I don't see the --random-seed option being used. Can you confirm that I'm seeing this correctly and that this would be an issue if one calls antsBrainExtraction.sh ... -u 0 ... ?

@cookpa
Copy link
Member

cookpa commented Dec 14, 2022

@ntustison are you running single threaded?

@ntustison
Copy link
Member

Yep, single-threaded.

@cookpa
Copy link
Member

cookpa commented Dec 14, 2022

OK, then maybe we need additional changes, I'll take look. Meanwhile, I'll build this PR and run some tests using AAN4 directly

@ntustison
Copy link
Member

That would be great. I started out with antsAtroposN4 which is what led me to looking internally at the components.

@ntustison
Copy link
Member

Okay, @cookpa , thanks for the reminder about setting the ANTS_RANDOM_SEED. It appears that I'm now getting the identical results I was expecting. I'll check after a couple more subjects and then we can merge this. Thanks again @br-cpvc . Regarding the randomization in antsBrainExtraction---do you want me to work on that or were you planning on addressing it?

@cookpa
Copy link
Member

cookpa commented Dec 15, 2022

I also get reproducible results, and with two threads, the execution time is reduced by about 20%.

@cookpa
Copy link
Member

cookpa commented Dec 15, 2022

Regarding the randomization in antsBrainExtraction---do you want me to work on that or were you planning on addressing it?

I replied in the issue, I'll make the change to make antsBrainExtraction.sh use a consistent random seed throughout if "-u 0" is passed. I think that's pretty backwards compatible because most users wouldn't call "-u 0" unless they want reproducible results

@ntustison
Copy link
Member

Thanks @cookpa . So I'll go ahead and merge this request given the reproducibility that both of us are getting. A final thanks to @br-cpvc for the contribution.

@ntustison ntustison merged commit cdabb0d into ANTsX:master Dec 15, 2022
@br-cpvc
Copy link
Contributor Author

br-cpvc commented Dec 16, 2022

Great. I'm happy to see that the changes has been merged :) Thank you all for the collaboration, it has been a pleasure working together on this change, and I'm happy to be able to contribute a little bit back to the project.

Thank you all for the excellent work you do, and thank you very much for continuously putting your effort into this great project through out all these years.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants