Benchmarks
We use the first and second images (transposed and multiplied by 32) from http://katrin.kit.edu/data/3dtracker/benchmark/synthetic/60_Luft_0_10_20_70.tif as benchmark images. The results obtained with Thorsten's program (as of Jul. 2014) are documented below.
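The preprocessing mentioned above (transpose, then multiply by 32) can be sketched as follows. This is only an illustration on a tiny nested list, not the code used for the benchmark: the function name `preprocess` and the sample `frame` are hypothetical, reading the TIFF pages is left out, and no clipping to the pixel range is applied. The two steps commute, so their order does not matter.

```python
def preprocess(image, scale=32):
    """Transpose a 2-D image (a list of rows) and multiply every pixel by `scale`."""
    # zip(*image) yields the columns of `image`, i.e. the rows of its transpose.
    return [[pixel * scale for pixel in column] for column in zip(*image)]


# Hypothetical 2x3 sample frame standing in for one page of the TIFF file.
frame = [[1, 2, 3],
         [4, 5, 6]]

for row in preprocess(frame):
    print(row)
```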
original image
- See attachments for image (1,2)
processed images
- See attachments for result (1,2) from Thorsten's program (by Aug. 2014)
circles found in images
Result Image 1 | Result Image 2
(70,1199) = 1 px | (73,1197) = 6 px
(74,481) = 2 px | (80,913) = 10 px
(80,913) = 10 px | (112,423) = 5 px
(112,423) = 5 px | (154,344) = 3 px
(157,743) = 4 px | (158,744) = 3 px
(160,337) = 11 px | (162,579) = 5 px
(163,578) = 5 px | (165,333) = 5 px
(173,330) = 11 px | (194,309) = 10 px
(191,301) = 3 px | (221,196) = 7 px
(221,196) = 8 px | (259,533) = 10 px
(259,534) = 10 px | (265,997) = 4 px
(264,997) = 5 px | (299,1144) = 10 px
(292,896) = 2 px | (321,413) = 11 px
(301,1141) = 8 px | (322,1042) = 2 px
(322,1042) = 2 px | (394,1011) = 3 px
(322,409) = 7 px | (410,450) = 3 px
(410,451) = 3 px | (443,903) = 6 px
(451,432) = 7 px | (445,747) = 3 px
(451,663) = 7 px | (450,432) = 8 px
(483,981) = 10 px | (452,663) = 7 px
(488,722) = 3 px | (486,981) = 10 px
(523,867) = 6 px | (522,871) = 10 px
(526,1171) = 10 px | (525,1171) = 10 px
(535,1107) = 10 px | (534,1107) = 10 px
(565,1012) = 6 px | (564,1015) = 6 px
(628,199) = 4 px | (626,201) = 3 px
(707,598) = 9 px | (707,597) = 8 px
(713,568) = 4 px | (712,568) = 4 px
(722,966) = 3 px | (741,774) = 4 px
(741,775) = 4 px | (76,482) = 10 px
(774,346) = 3 px | (797,1111) = 0 px
(798,1117) = 4 px | (878,344) = 3 px
(900,483) = 3 px | (918,674) = 6 px
(917,677) = 4 px |

Result Image 1.2 | Result Image 2.2
(75,483) = 10 px | (70,1201) = 2 px
(79,912) = 10 px | (76,482) = 10 px
(155,742) = 4 px | (79,912) = 10 px
(163,582) = 1 px | (157,734) = 0 px
(165,330) = 4 px | (163,582) = 1 px
(221,202) = 2 px | (165,334) = 6 px
(263,1001) = 2 px | (191,303) = 3 px
(300,1144) = 10 px | (220,195) = 6 px
(321,413) = 13 px | (265,995) = 1 px
(409,446) = 1 px | (299,1144) = 10 px
(447,437) = 11 px | (321,412) = 12 px
(449,657) = 10 px | (325,1045) = 4 px
(484,981) = 10 px | (410,451) = 0 px
(521,870) = 10 px | (450,430) = 6 px
(535,1107) = 10 px | (451,664) = 6 px
(564,1016) = 3 px | (484,981) = 10 px
(627,206) = 6 px | (525,1171) = 10 px
(704,601) = 11 px | (534,1107) = 10 px
(738,774) = 2 px | (564,1016) = 3 px
(799,1117) = 4 px | (626,201) = 0 px
| (707,598) = 9 px
| (714,571) = 1 px
| (737,774) = 3 px
| (797,1109) = 1 px
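To compare two of the lists above, for example to see which detections recur in both results, the entries can be parsed and paired by position. The following is a minimal sketch, assuming each entry reads as a position (two integers) followed by a size in px. The function names, the greedy nearest-neighbour pairing, and the 3-pixel tolerance `max_dist` are illustrative choices and are not taken from Thorsten's program.

```python
import math
import re

# Matches entries written as "(70,1199) = 1 px".
ENTRY = re.compile(r"\((\d+),(\d+)\)\s*=\s*(\d+)\s*px")


def parse_circles(text):
    """Return a list of (a, b, size) integer tuples found in `text`."""
    return [tuple(int(g) for g in m.groups()) for m in ENTRY.finditer(text)]


def distance(p, q):
    """Euclidean distance between the positions of two entries."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def match_circles(first, second, max_dist=3.0):
    """Greedily pair each entry of `first` with its nearest unused entry of `second`."""
    unused = list(second)
    pairs = []
    for a in first:
        best = min(unused, key=lambda b: distance(a, b), default=None)
        if best is not None and distance(a, best) <= max_dist:
            pairs.append((a, best))
            unused.remove(best)  # each entry of `second` is used at most once
    return pairs


# First four entries of the "Result Image 1" and "Result Image 2" columns.
image1 = parse_circles("(70,1199) = 1 px (74,481) = 2 px (80,913) = 10 px (112,423) = 5 px")
image2 = parse_circles("(73,1197) = 6 px (80,913) = 10 px (112,423) = 5 px (154,344) = 3 px")

for a, b in match_circles(image1, image2):
    print(a, "<->", b)
```

Greedy pairing is order-dependent; for dense detections an optimal assignment would be the safer choice.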
profile of GPU kernels
Profiling application: /usr/local/MATLAB/R2013b/bin/glnxa64/MATLAB -r auto3d_track_tif_average4_gpu(1, 1) -nodesktop
Profiling result:
Time(%)      Time  Calls       Avg       Min       Max  Name
 79.21%  583.413s    480  1.21544s  867.25ms  1.81715s  MSortFloat(float const *, int, int2, float*, int2, int2, float, float)
 13.61%  100.225s    984  101.86ms  19.782ms  195.21ms  LoadElementsF(float const *, int2, int2, int2, unsigned char const *, float*)
  7.14%  52.5967s    504  104.36ms  18.956ms  143.72ms  BitonicSortFloatWD(float const *, float*, int2, int, unsigned int, unsigned int, int2, int2, float, float)
  0.02%  175.51ms     24  7.3130ms  7.2417ms  7.5373ms  SortBitonicMedian(unsigned short const *, unsigned short*, int2, int, unsigned int, unsigned int, int2, int2, float)
  0.01%  69.727ms     84  830.08us  785.32us  1.9273ms  [CUDA memcpy DtoH]
  0.01%  57.600ms    166  346.98us     800ns  786.79us  [CUDA memcpy HtoD]
  0.01%  37.227ms     24  1.5511ms  1.4893ms  1.6385ms  LoadElements144(unsigned short const *, int2, int2, int2, unsigned short*)
  0.00%  647.71us      2  323.86us  321.73us  325.98us  FromBack(unsigned short const *, unsigned short const *, int2, double*)

nvprof matlab -nodesktop -r 'auto3d_track_tif_average4_gpu(1, 1)'  750.38s user 179.94s system 98% cpu 15:48.89 total
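The Time(%) column can be cross-checked from the Time column. The sketch below converts the duration strings of the profile (suffixes s, ms, us, ns) to seconds and recomputes each share. The helper `to_seconds` and the dictionary `totals` are illustrative names; the values are copied from the Time column above.

```python
UNIT_SECONDS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0}


def to_seconds(duration):
    """Convert a duration string such as '175.51ms' to seconds."""
    # Test the two-letter suffixes first so "ms" is not mistaken for "s".
    for unit in ("ns", "us", "ms", "s"):
        if duration.endswith(unit):
            return float(duration[:-len(unit)]) * UNIT_SECONDS[unit]
    raise ValueError(f"unknown time unit in {duration!r}")


# Total time per entry, copied from the Time column of the profile.
totals = {
    "MSortFloat": "583.413s",
    "LoadElementsF": "100.225s",
    "BitonicSortFloatWD": "52.5967s",
    "SortBitonicMedian": "175.51ms",
    "[CUDA memcpy DtoH]": "69.727ms",
    "[CUDA memcpy HtoD]": "57.600ms",
    "LoadElements144": "37.227ms",
    "FromBack": "647.71us",
}

total = sum(to_seconds(t) for t in totals.values())
for name, t in totals.items():
    print(f"{name:20s} {100 * to_seconds(t) / total:6.2f}%")
```

The three sorting and loading kernels for float data account for nearly all of the GPU time, and the recomputed shares agree with the reported percentages.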
profile of matlab code
Function Name            Calls   Total Time   Self Time*
Ordfilt (MEX-file)         656   6423.917 s   6423.917 s
selectDevice               658    940.299 s    940.299 s
multiSearch                 16    280.003 s      0.010 s
centerSearch               293    279.993 s      6.642 s
createProfile_advanced   35160    238.218 s    238.218 s
polyfit                  35160     35.113 s      5.410 s
Current results using UFO
In the attachments you can find some results using the UFO framework and the amount of time that was spent to compute each resulting image. You can use the following command to extract the data.
$ tar xfvz bench.tar.gz
Last modified on Oct 23, 2014, 11:20:11 AM
Attachments (9)
- ImageSrc1.tif (2.2 MB)
- ImageRes1.tif (2.2 MB)
- ImageSrc2.tif (2.2 MB)
- ImageRes2.tif (2.2 MB)
- ImageRes1.2.tif (2.7 MB)
- ImageRes2.2.tif (2.7 MB)
- ImageSrc1.2.tif (2.7 MB)
- ImageSrc2.2.tif (2.7 MB)
- bench.tar.gz (11.2 MB)