UFO Server

System

  • Host name: ufosrv1.ka.fzk.de
  • Interfaces: eth0 (10 GBit), eth1 (upper-right socket)
    • eth0: dhcp
    • eth1: 141.52.111.135/22
    • Gateway: 141.52.111.208 (via dhcp)
    • Name server: 141.52.111.248 (via dhcp)
  • Running services
    • SSH on ports 22 and 24 (see the configuration sketch after this list)
    • NX server over SSH
    • VirtualGL (OpenGL forwarding)
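  • The interface and SSH settings above translate into roughly the following configuration (a sketch in openSUSE sysconfig style; the file contents are assumptions, the addresses and ports are taken from the list above):
    # /etc/sysconfig/network/ifcfg-eth1 -- static address on the second interface
    STARTMODE='auto'
    BOOTPROTO='static'
    IPADDR='141.52.111.135/22'
    
    # /etc/ssh/sshd_config -- sshd accepts multiple Port directives
    Port 22
    Port 24
    
    # apply with: ifup eth1 && rcsshd restart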

Hardware

  • Display connected to integrated video card (Matrox G200)
  • 2 x Xeon X5650 / Intel X58 / 96 GB DDR3
  • System drives: 2 x Hitachi 2TB SATA2
  • Areca ARC-1880 Raid Controller (x16 slot)
    • 16 x 2 TB Hitachi HUA722020ALA330 in external Areca Enclosure
    • 4 x 256 GB Crucial RealSSD C300
  • External PCIe 2.0 x16 (x16 slot)
    • External GPU box from One Stop Systems
    • 4 x NVIDIA GeForce GTX 580
  • 2 x NVIDIA GeForce GTX 580 (x16 slots)
  • Intel 82598EB 10GBit Ethernet (x4 slot)
  • Silicon Software microEnable IV Full CameraLink frame grabber (PCIe 1.0 x4 slot)
  • Free slots
    • PCI express: x4 and x1
    • Storage: 2 x SSD in the main server case
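  • The cards listed above can be cross-checked against the running system with standard tools (a sketch; the grep pattern is an assumption about how the devices identify themselves):
    lspci | grep -i -E 'nvidia|82598|areca|silicon'   # GPUs, 10GBit NIC, raid controller, frame grabber
    lspci -vv | grep 'LnkSta:'                        # negotiated PCIe link width per device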

Areca Raid Configuration

  • A single Areca ARC-1880 controller handles both the external storage box with SATA hard drives and the internal SSD cache. Only the pair of system hard drives is connected to the SATA controller integrated into the motherboard.
  • 16 x Hitachi 2TB SATA hard drives in the external enclosure are organized as Raid-6
  • 4 x Crucial SSD C300 in the server case are organized as Raid-0 (the expected volume sizes are checked in the sketch below)
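  • The resulting volume sizes can be sanity-checked from the raid levels (simple arithmetic, matching the partition tables below):
    # Raid-6 loses two disks to parity, Raid-0 concatenates all members
    echo "$(( (16 - 2) * 2 )) TB"    # 28 TB  -> /dev/sdc (hard-drive array)
    echo "$(( 4 * 256 )) GB"         # 1024 GB (~1 TB) -> /dev/sdd (SSD cache)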

Partitioning

  • Two system hard drives are connected to the internal SATA controller and mirrored as Raid-1 using Linux software RAID (see the mdadm sketch at the end of this section).
    • Devices: /dev/sda, /dev/sdb
    • Partitions: /boot (2GB ext2), / (256GB ext4), /home (ext4)
  • The Raid-6 array is split into two partitions (GPT partition table): a fast one and a standard one.
    • Device: /dev/sdc
    • The fast partition will be used to stream data from the camera and has to sustain a throughput of 850 MB/s. The data should be moved off as soon as possible, and only a single application is allowed to write to the disk (see the mkfs sketch at the end of this section).
      • Size: first 6TB of disk array
      • File system: non-journaled ext4
      • Mount point: /mnt/fast
    • The standard partition is for short-term data storage (before offloading to LSDF)
      • Size: 22 TB
      • File system: xfs (as shown in the partition table and fstab below)
      • Mount point: /mnt/raid
  • The SSD cache
    • Device: /dev/sdd
    • Size: 1 TB
    • File system: ext4
    • Mount point: /mnt/ssd
  • Partition table (/dev/sda & /dev/sdb)
    Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
    255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x0003874d
    
       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1   *        2048     4192255     2095104   fd  Linux raid autodetect
    /dev/sda2         4192256   541069311   268438528   fd  Linux raid autodetect
    /dev/sda3       541069312  3907028991  1682979840   fd  Linux raid autodetect
    
  • Partition table (/dev/sdc)
    Disk /dev/sdc: 28.0TB
    Sector size (logical/physical): 512B/512B
    Partition Table: gpt_sync_mbr
    
    Number  Start   End     Size    File system  Name     Flags
     1      1049kB  6597GB  6597GB               primary
     2      6597GB  28.0TB  21.4TB  xfs          primary
    
  • Partition table (/dev/sdd)
    Disk /dev/sdd: 1000.0 GB, 999998619648 bytes
    255 heads, 63 sectors/track, 121576 cylinders, total 1953122304 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x000b6bc3
    
       Device Boot      Start         End      Blocks   Id  System
    /dev/sdd1            2048  1953122303   976560128   83  Linux
    
  • Raid table:
    Personalities : [raid1] [raid0] [raid10] [raid6] [raid5] [raid4] 
    md2 : active raid1 sdb3[1] sda3[0]
          1682979704 blocks super 1.0 [2/2] [UU]
          bitmap: 3/13 pages [12KB], 65536KB chunk
    
    md0 : active raid1 sdb1[1] sda1[0]
          2095092 blocks super 1.0 [2/2] [UU]
          bitmap: 0/1 pages [0KB], 65536KB chunk
    
    md1 : active raid1 sda2[0] sdb2[1]
          268438392 blocks super 1.0 [2/2] [UU]
          bitmap: 0/3 pages [0KB], 65536KB chunk
    
  • mdadm.conf
    DEVICE containers partitions
    ARRAY /dev/md0 UUID=7c032686:e8861a19:9ccb43c3:8f25011e
    ARRAY /dev/md1 UUID=4a18bb5c:4b4b4490:929fdc08:99b65f2f
    ARRAY /dev/md2 UUID=8e7c863e:3a75af81:321862ae:d679602e
    
  • fstab
    /dev/disk/by-id/md-uuid-4a18bb5c:4b4b4490:929fdc08:99b65f2f /                    ext4       acl,user_xattr        1 1
    /dev/disk/by-id/md-uuid-7c032686:e8861a19:9ccb43c3:8f25011e /boot                ext2       acl,user_xattr        1 2
    /dev/disk/by-id/md-uuid-8e7c863e:3a75af81:321862ae:d679602e /home                ext4       acl,user_xattr        1 2
    /dev/disk/by-id/scsi-2001b4d2003077811-part2 /mnt/raid            xfs        defaults              1 2
    /dev/disk/by-id/scsi-2001b4d2064473251-part1 /mnt/ssd             ext4       acl,user_xattr        1 2
    proc                 /proc                proc       defaults              0 0
    sysfs                /sys                 sysfs      noauto                0 0
    debugfs              /sys/kernel/debug    debugfs    noauto                0 0
    usbfs                /proc/bus/usb        usbfs      noauto                0 0
    devpts               /dev/pts             devpts     mode=0620,gid=5       0 0
    
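  • If the system-disk mirrors ever have to be recreated from scratch, the arrays from the raid table above could be rebuilt along these lines (a sketch; metadata version, bitmaps and partition layout are taken from the output above, and mdadm --create destroys any existing data):
    mdadm --create /dev/md0 --metadata=1.0 --level=1 --raid-devices=2 --bitmap=internal /dev/sda1 /dev/sdb1
    mdadm --create /dev/md1 --metadata=1.0 --level=1 --raid-devices=2 --bitmap=internal /dev/sda2 /dev/sdb2
    mdadm --create /dev/md2 --metadata=1.0 --level=1 --raid-devices=2 --bitmap=internal /dev/sda3 /dev/sdb3
    # regenerate the UUID lines for mdadm.conf
    mdadm --detail --scan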

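  • The non-journaled ext4 for the fast partition can be created and mounted like this (a sketch; the device name comes from the partition table above, the label and mount options are assumptions):
    # dropping the journal avoids double writes while streaming from the camera
    mkfs.ext4 -O ^has_journal -L fast /dev/sdc1
    mount -o noatime /dev/sdc1 /mnt/fast
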
Software

  • openSUSE 12.1
    • Gnome Desktop
    • Console tools
    • Development: Kernel, Gnome, Python
    • Default compiler: gcc-4.3
    • Additional packages:
      zypper install bzr
      zypper install cmake
      zypper install freeglut-devel
      zypper install openmpi-devel
      zypper install fftw3-devel
      zypper install python-imaging
      zypper install python-numpy-devel
      
  • Downgrade to gcc 4.3 (the stock gcc-4.6 is not compatible with the latest CUDA toolkit)
    zypper ar http://download.opensuse.org/repositories/devel:/gcc/openSUSE_12.1/devel:gcc.repo
    zypper install gcc43 gcc43-c++ gcc43-locale
    # make the 4.3 binaries the default gcc/g++/... by symlinking them without the version suffix
    for name in `rpm -ql gcc43 gcc43-c++ | grep "/usr/bin"`; do ln -sf $name /usr/bin/`basename $name -4.3`; done
    
  • CUDA
    • Driver: 285.05.32
    • Toolkit: 4.1 (installed into /opt/cuda)
    • SDK: 4.1 (installed into /opt/cuda/sdk)
    • The SDK must be compiled (see the environment sketch at the end of this section for the toolkit paths)
      cd /opt/cuda/sdk
      make
      
    • Allow resetting failed GPUs. Add the following line to /etc/sudoers:
      ALL ALL=(ALL) NOPASSWD: /opt/cuda/sdk/C/bin/linux/release/deviceQuery
      
  • Install the FreeNX server to allow remote desktop access
    zypper ar http://download.opensuse.org/repositories/X11:/RemoteDesktop/openSUSE_12.1/X11:RemoteDesktop.repo
    zypper install FreeNX
    nxsetup
    
  • Install VirtualGL to provide OpenGL forwarding (a usage sketch is given at the end of this section)
  • PyHST
    • Installed to /opt/pyhst
    • Sources: pyhst/pyhst
      cd /opt
      bzr branch http://ufo.kit.edu/sources/csa/pyhst/
      cd pyhst
      cmake .
      make
      
    • Start script /opt/PyHST
      #!/bin/bash
      # Wrapper that sets up the library search paths for PyHST and runs it.
      
      PACKAGE_HOME=/opt/pyhst
      PACKAGE_SOURCE=${PACKAGE_HOME}
      CUDA_DIR=/opt/cuda.41
      
      # INSTALLATION_HOME is not set here, so these entries are effectively no-ops
      export PATH=${INSTALLATION_HOME}/bin:$PATH
      export LD_LIBRARY_PATH=${PACKAGE_SOURCE}:${INSTALLATION_HOME}/lib:$LD_LIBRARY_PATH
      export LDFLAGS="-L ${INSTALLATION_HOME}/lib"
      export CPPFLAGS="-I ${INSTALLATION_HOME}/include"
      
      # CUDA runtime and the PyHST CUDA module
      export LD_LIBRARY_PATH=$CUDA_DIR/lib64/:$LD_LIBRARY_PATH
      export LD_LIBRARY_PATH=${PACKAGE_SOURCE}/hst_cuda:$LD_LIBRARY_PATH
      
      export PYTHONPATH=${PACKAGE_SOURCE}:$PYTHONPATH
      PYTHON=python
      
      # "$@" preserves the quoting of the parameter file argument(s)
      ${PYTHON} ${PACKAGE_SOURCE}/PyHST.py "$@"
      
  • UFO Framework
    Matthias, please add the installation instructions and a list of all required dependencies here.
    
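  • Since the CUDA toolkit lives in /opt/cuda rather than /usr/local/cuda, the compiler and runtime paths have to be exported before building the SDK or PyHST. A minimal sketch, assuming a profile.d snippet is acceptable (the file name is hypothetical):
    # /etc/profile.d/cuda.sh -- hypothetical snippet, toolkit location from the CUDA item above
    export PATH=/opt/cuda/bin:$PATH
    export LD_LIBRARY_PATH=/opt/cuda/lib64:$LD_LIBRARY_PATH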

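  • With FreeNX and VirtualGL installed, OpenGL applications started inside an NX session can be redirected to the local GPUs (a usage sketch; vglrun is VirtualGL's wrapper, the display number of the local X server is an assumption):
    vglrun -d :0 glxgears   # renders on the server GPU, images are forwarded to the NX session
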
Usage

  • Reset failed GPU devices
    sudo /opt/cuda/sdk/C/bin/linux/release/deviceQuery
    
  • Check system status
    To be provided by the Tomsk students; a minimal manual check is sketched at the end of this section.
    
  • PyHST
    /opt/PyHST <parameter_file.par>
    
  • Framework
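  • Until the dedicated status tools are in place, a minimal manual check can be done with standard commands (a sketch, not the planned monitoring solution):
    cat /proc/mdstat                # software raid health
    df -h /mnt/raid /mnt/ssd        # free space on the data volumes
    nvidia-smi                      # GPU presence and temperatures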
