Opened 13 years ago

Closed 13 years ago

#141 closed defect (fixed)

Slow performance on ufosrv1

Reported by: Matthias Vogelgesang
Owned by: Suren A. Chilingaryan
Priority: major
Milestone: ufo-core-0.2
Component: Infrastructure
Version:
Keywords:
Cc: Suren A. Chilingaryan, Tomas Farago, David Haas, Tomy Rolo, Matthias Vogelgesang

Description

David and Tomas investigated the performance of the framework on the UFO server. Unfortunately, it is pretty disappointing. For a rather small data set, it takes about 580 seconds using a single GTX 580. This is more than an order of magnitude slower than the 12.5 seconds it takes on my desktop machine using a similar GTX 580.

I profiled a bit here and there but could not yet find the real source of the problem. It is unclear to me whether the problem lies with the server itself or with the software.

Attachments (0)

Change History (9)

comment:1 Changed 13 years ago by Matthias Vogelgesang

Component: ufo-core → Infrastructure
Owner: changed from Matthias Vogelgesang to Suren A. Chilingaryan
Status: new → assigned

I just checked the same data set on ipepdvcompute1 and it rushed through in 14.9 seconds. I have to assume that there's something wrong with the server hardware rather than the software.

comment:2 Changed 13 years ago by Suren A. Chilingaryan

Can you share the results of your profiling? Where is the time actually spent?

comment:3 Changed 13 years ago by Tomy Rolo

Cc: Tomy Rolo added

comment:4 Changed 13 years ago by Suren A. Chilingaryan

Cc: Matthias Vogelgesang added

Matthias, can you provide profiling information? I'd like to know which of the filters uses this time (i.e. whether it is I/O or computation). An oprofile log would be helpful.
Alternatively, provide the dataset and the script causing the problems and I'll do the testing myself.

comment:5 Changed 13 years ago by Matthias Vogelgesang

oprofile is not very helpful: it shows that most of the time is spent in the kernel, followed by calls to libOpenCL.so and the pthreads library. Anyway, we have to postpone the investigation for some days, because we need the server untouched at the TopoTomo beamline.

comment:6 Changed 13 years ago by Matthias Vogelgesang

Resolution: fixed
Status: assigned → closed

Suren's change that reverted the NVIDIA CUDA version back to 4.2 fixed the problem. Now it takes about 20 seconds to reconstruct.

comment:7 Changed 13 years ago by Matthias Vogelgesang

Resolution: fixed
Status: closed → reopened

I have to resurrect this ticket once again, and it should stay open until all (performance) problems are fixed. It is clear that startup with multiple GPUs is bad. However, even when restricting the number of GPUs to one, startup is extremely slow on ufosrv1. I measured context setup, program compilation, kernel creation, buffer creation and cleanup. On each machine I restricted the number of GPUs to one (except for the AMD machine) and enabled persistence mode:

Machine      Setup    Compilation  Kernel      Buffer      Cleanup
my desktop   0.06s    0.000210s    0.000007s   0.000007s   0.031s
ufosrv1      3.8s     0.000166s    0.000007s   0.000005s   3.6s
compute1     0.75s    0.000162s    0.00007s    0.000006s   0.3s
compute2     0.02s    0.09s        0.000015s   0.000035s   0.001371s
kepler       0.36s    0.000091s    0.000004s   0.000004s   0.00916s
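
For anyone who wants to repeat this kind of measurement, per-phase timings can be collected with a small harness along the following lines. This is only a sketch: the `phase` helper and the `timings` dictionary are made-up names, and the sleeps and the sum are placeholder workloads standing in for the real OpenCL calls, which are not shown in this ticket.

```python
import time
from contextlib import contextmanager

# Collected wall-clock durations in seconds, keyed by phase name.
timings = {}

@contextmanager
def phase(name):
    """Record the wall-clock time spent inside the with-block under `name`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = time.perf_counter() - start

# Placeholder workloads (assumption): in a real measurement each block would
# wrap the corresponding OpenCL step, e.g. context setup or program build.
with phase("setup"):
    time.sleep(0.02)
with phase("compilation"):
    sum(range(1000))
with phase("cleanup"):
    time.sleep(0.01)

# Print one line per phase, in the order the phases were run.
for name, seconds in timings.items():
    print(f"{name:<12s} {seconds:.6f}s")
```

Using a context manager keeps the timing code out of the measured code, so the same wrapper can be put around each phase without changing it.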

With all GPUs enabled the twins behave like this:

Machine      Setup    Compilation  Kernel      Buffer      Cleanup
ufosrv1      20.32s   0.000683s    0.000008s   0.000005s   19.06s
compute1     7.87s    0.000937s    0.000009s   0.000006s   1.75s
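
To make the comparison concrete, the numbers above can be reduced to the startup overhead (setup plus cleanup) per machine, relative to the desktop. The following is a rough sketch that only re-uses the measured values from the tables; the `overhead` helper and the dictionary names are chosen here for illustration.

```python
# Timings in seconds, copied from the measurements above:
# (setup, compilation, kernel, buffer, cleanup)
single_gpu = {
    "my desktop": (0.06, 0.000210, 0.000007, 0.000007, 0.031),
    "ufosrv1":    (3.8, 0.000166, 0.000007, 0.000005, 3.6),
    "compute1":   (0.75, 0.000162, 0.00007, 0.000006, 0.3),
    "compute2":   (0.02, 0.09, 0.000015, 0.000035, 0.001371),
    "kepler":     (0.36, 0.000091, 0.000004, 0.000004, 0.00916),
}
all_gpus = {
    "ufosrv1":  (20.32, 0.000683, 0.000008, 0.000005, 19.06),
    "compute1": (7.87, 0.000937, 0.000009, 0.000006, 1.75),
}

def overhead(row):
    """Return context setup plus cleanup time for one table row."""
    setup, _compilation, _kernel, _buffer, cleanup = row
    return setup + cleanup

# Single-GPU overhead per machine, absolute and relative to the desktop.
baseline = overhead(single_gpu["my desktop"])
for machine, row in single_gpu.items():
    total = overhead(row)
    print(f"{machine:<11s} {total:8.4f}s  {total / baseline:6.1f}x")

# Overhead with all GPUs enabled on the two twin machines.
for machine, row in all_gpus.items():
    print(f"{machine:<11s} {overhead(row):8.4f}s  (all GPUs)")
```

With one GPU, ufosrv1 spends about 7.4 s in setup and cleanup, roughly 80 times the 0.091 s of the desktop, while compilation, kernel creation and buffer creation are negligible on all NVIDIA machines.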

comment:8 Changed 13 years ago by Suren A. Chilingaryan

  • Updated the driver to 313.18. Seems to be a bit faster.
  • Fancy. I traced the ocl application with ltrace, and there are a lot of consecutive calls to the random number generator. I guess this is what the NVIDIA driver does for most of these 20 seconds.

comment:9 Changed 13 years ago by Matthias Vogelgesang

Resolution: fixed
Status: reopenedclosed

It's considerably better now in terms of startup time. Run-time performance could be a bit better (especially compared to my desktop) but I will close this ticket again.
