Changes between Version 9 and Version 10 of students


Timestamp:
Aug 14, 2014, 10:40:49 AM (10 years ago)
Author:
csa
Comment:

--

  • students

    v9 v10  
    7272 * Contact person: Suren A. Chilingaryan <csa@suren.me>
    7373 * [raw-attachment:1407-internship-drivers.pdf Detailed announcement]
    74  * Required Skills: Very good knowledge of the C/C++ programming language, acquaintance with POSIX standards, understanding of
    75 process synchronization. Prior experience in developing Linux kernel modules is a plus.
     74 * Required Skills: Very good knowledge of the C/C++ programming language, acquaintance with POSIX standards, understanding of process synchronization. Prior experience in developing Linux kernel modules is a plus.
    7675 * Linux kernel development, PCIe-based scientific electronics, DMA protocols
    7776
     
    8887
    8988== Web-based monitoring of large-scale data in scientific experiments ==
    90  * Contact person: Suren A. Chilingaryan csa@suren.me
     89 * Contact person: Suren A. Chilingaryan <csa@suren.me>
    9190 * [http://www.ipe.kit.edu/648_632.php Apply online]
    9291 * [raw-attachment:1301-adei-status-v2.pdf Detailed announcement]
     
    9897
    9998= Highlighted Master topics =
     99== Enhancing the quality of tomographic reconstruction by advanced iterative algorithms optimized for parallel architectures ==
     100 * Contact person: Suren A. Chilingaryan <csa@suren.me>
     101 * [raw-attachment:1407-master-art.pdf Detailed announcement]
     102 * Required Skills: Very good knowledge of the programming languages C and Python. Good knowledge of linear algebra and computer vision algorithms. Prior experience in parallel programming with CUDA, OpenCL, MPI, SIMD, OpenMP or Pthreads is a plus.
     103 * Experience Gained: Synchrotron imaging, compressive sensing theory, iterative image reconstruction, non-linear optimization, parallel programming, GPU programming.
    100104
     105X-ray microtomography is a powerful tool to analyze and understand internal, otherwise invisible mechanisms of small
     106animals. Resolution and duration of experiments with living objects are currently limited by radiation damage. The
     107compressed sensing theory has demonstrated the feasibility of recovering signals from undersampled data and, hence,
     108opens up the possibility of reducing radiation. These reconstruction techniques are computationally very
     109demanding and have therefore not been used for synchrotron experiments up to now.
     110
     111The master thesis will be performed within an international project that aims to develop novel instrumentation for
     112ultrafast imaging at synchrotron light sources. We expect the student to become familiar with the latest developments in the
     113field of compressive sensing theory and its application to tomographic image reconstruction. Advanced methods
     114described in the literature have to be evaluated using realistic sample datasets. Promising algorithms should be
     115implemented. To take advantage of the latest high-performance computing hardware, the selected algorithms have to be adapted for better mapping to massively parallel architectures. The implementation in OpenCL will be optimized for the latest GPU architectures from AMD and NVIDIA.
     116
     117== Enhancing the quality of tomographic reconstruction by advanced iterative algorithms optimized for parallel architectures ==
     118 * Contact person: Suren A. Chilingaryan <csa@suren.me>
     119 * [raw-attachment:1407-master-astor.pdf Detailed announcement]
     120 * Required Skills: Strong C and Python knowledge, numerical algorithms in image processing. Experience with parallel programming is a plus.
     121 * Experience Gained: Synchrotron Imaging, 4D Tomography, Image Segmentation, Optical Flow, Parallel programming, GPU programming.
     122
     123Recent developments in X-ray microtomography (SR-μCT) facilitate the investigation of internal morphology and structural changes in small living organisms in 4D (3D + time). In order to analyze internal dynamics, existing instrumentation records hundreds of 3D
     124volumes with high resolution within a few minutes. The first step in data analysis is segmentation of the functional units. Currently, this is a manual task requiring months of work by highly skilled biologists.
     125
     126The aim of this work is to develop algorithms for semi-automatic segmentation of 4D tomographic volumes and to implement them. One possible solution is to use the optical flow in sequences of 3D volumes to map manual segmentations of selected volumes to consecutive frames. The algorithms have to be optimized for the latest parallel computing architectures. The work is embedded in national and international collaborations for high data-rate processing and performed within an interdisciplinary team of computer scientists, synchrotron physicists, and biologists.
     127
     128== High-speed tracking of fluorescent nanoparticles in 3D and with subnanometer precision using Parallel Accelerators ==
     129 * Contact person: Suren A. Chilingaryan <csa@suren.me>
     130 * [raw-attachment:1307-tvt-v1.pdf Detailed announcement]
     131 * Required Skills: Good knowledge of the C/C++ programming language. Prior knowledge of parallel programming models is a plus.
     132 * Experience Gained: Parallel programming, GPU programming, Image processing
     133
     134In the field of organic and printed electronics (e.g. polymer solar cells, OLEDs or Li-Ion batteries) there is a growing demand for thin functional layers with highly homogeneous surface topology. If these layers are coated from the liquid phase, the coating and
     135drying steps affect the surface quality. During the drying process, Marangoni convection might occur, leading to surface inhomogeneities. To get a better understanding of the convection process, we apply μPIV using fluorescent nanoparticles to resolve
     136the respective flow field in the liquid phase. In the 3D case, a multifocal system is used to acquire images in different layers at
     137the same time.
     138
     139During the experiment, 4 GB of data is recorded every second by 5 high-speed cameras. It is a challenge to analyze such an amount of
     140data interactively and extract particle trajectories. As a first step, we expect the student to parallelize the data evaluation codes and
     141optimize them for the latest GPU architectures from AMD and NVIDIA. In the second stage, the codes should be modified to run in a
     142GPU-cluster environment.
     143
     144== Optimizing high-speed data transfer and processing of DAQ systems with NVIDIA's GPUDirect ==
     145 * Contact person: Suren A. Chilingaryan <csa@suren.me>
     146 * [raw-attachment:1407-master-gpudirect.pdf Detailed announcement]
     147 * Required Skills: Good knowledge of the C/C++ programming language as well as Linux kernel and driver development. Knowledge of parallel programming models is a plus.
     148 * Experience Gained: Parallel programming, GPU programming, RDMA data transfer mechanisms.
     149
     150Recent data acquisition systems are characterized by increasing data rates and the need for efficient online analysis and monitoring.
     151Conventional CPUs are no longer able to handle the increased computational demands of scientific processes. In the field of high
     152performance computing, GPUs with their modern and simple methods to utilize parallel processing make for an easily accessible
     153alternative to classical CPU computing. Unfortunately, the gap between the computational capabilities of GPU systems and the
     154throughput of system memory has grown tremendously and has become the main factor limiting performance. This is especially harmful for PCI-express (PCIe) based data acquisition systems using multiple GPU cards for data processing. Using standard
     155approaches to handle PCIe devices, the data will be copied into the system memory, sometimes multiple times, at each stage of the data
     156processing pipeline. For instance, the standard pipeline consisting of 3 stages (data readout from the frame-grabber card,
     157preprocessing on GPU, and dispatch to the remote server over a network or Infiniband interface) will include at least 4 copies in system
     158memory, and usually more, depending on the hardware and software configuration.
     159
     160Recently, NVIDIA introduced the GPUDirect for RDMA technology to relieve the load on the system memory. The GPUDirect/RDMA technology enables point-to-point transfers between PCIe devices and NVIDIA GPUs on the same bus, bypassing the system memory entirely. The alternative technology is called GPUDirect for Video and was developed by NVIDIA specially for high-speed frame grabbers. For the Diploma work, the student is expected to:
     161 * Compare GPUDirect/RDMA and GPUDirect/Video technologies,
     162 * Provide GPUDirect-enabled drivers for our FPGA-based data acquisition platform (FDAP),
     163 * Investigate whether a similar approach could be used to transfer data between FDAP and Infiniband adapters directly,
     164 * Evaluate the technology in terms of latency and throughput compared to the existing drivers,
     165 * Check if and how the GPUDirect technology can be used with the UFO parallel processing framework to split load across the nodes in a GPU cluster.
     166The performance benefit of the technology should be demonstrated for realistic scenarios like ultra high-speed X-ray
     167tomography.
    101168
    102169== Optimizing imaging algorithms to the latest parallel CPU and GPU architectures ==