== Active Nodes ==
 * 192.168.11.1 - ipepdvcompute1.ka.fzk.de (compute: Fermi, storage)
 * 192.168.11.2 - ipecamera.ka.fzk.de (camera)
 * 192.168.11.3 - ipekatrinadei.ka.fzk.de (storage)
 * 192.168.11.4 - ipeusctcompute1.ka.fzk.de (compute: Fermi)
 * 192.168.11.5 - ipepdvcompute2.ka.fzk.de (master, compute: Kepler, Xeon Phi)
 * 192.168.11.6 - ipepdvkepler.ka.fzk.de (camera): moved to ANKA
 * 192.168.11.7 - ipepdvsrv2.ka.fzk.de (virtualization)
 * 192.168.11.8 - ipepdvcompute3.ka.fzk.de (compute: AMD)
 * 192.168.11.6x - detached student cluster nodes

== Remote Nodes ==
 * 192.168.11.180 - ipepdvdev1.ka.fzk.de
 * 192.168.11.117 - ipechilinga2.ka.fzk.de (csa)

== Installation ==
 * [wiki:InfinibandConfiguration12.x OpenSuSE 12.x]
 * [wiki:InfinibandConfiguration11.4 OpenSuSE 11.4]

== Diagnostic ==
 * Port information: ibstat
 * Hardware graph: iblinkinfo
 * Network diagnostic: ibdiagnet -ls 10 -lw 4x
 * Ping: ibping -S (server), ibping (client)

== Storage ==
 * Fast storage for camera streaming
   * First 8 TB on storage boxes attached to ipekatrinadei and ipepdvcompute1
   * Exposed over the ''iSER'' protocol using the ''tgt'' server
   * No access sharing, single user only
   * Both devices are allocated to the camera computer using ''open-iscsi''
   * A software RAID 0 is assembled from the devices
   * Formatted with ''XFS'' and mounted under ''/mnt/server'' (a sketch of these assembly steps is given at the end of this page)
 * Big and slow storage
   * ''glusterfs'' is used currently, but ''fhgfs'' is faster and may be a better option if it goes open source (there are some plans)
   * The OpenSuSE packages do not include RDMA support; the source RPM should be recompiled on a system with the Infiniband stack installed
   * Storage is mounted under ''/pdv''
     * ''/pdv/home/'' - clustered home folders
     * ''/pdv/data/'' - sample data sets
   * External computers may mount the storage over NFS:
{{{
ipepdvcompute1:/storage /pdv nfs defaults,_netdev,mountproto=tcp 0 0
}}}
 * [wiki:UfoClusterStorage Configuration]

== Management ==
 * ''ipepdvcompute1'' is a master node
 * If you have an account on ipepdvcompute1, you can convert it into a cluster account by creating an empty ''~/.pdvcluster'' folder. An hourly cron job will then:
   * Create accounts on all cluster nodes and synchronize the uids across them. If you want to mount the storage over NFS on your desktop, you still need to match your desktop uid to the uid used on ipepdvcompute1.
   * Create a cluster home in ''/pdv/home/''.
   * Replicate ''~/.ssh'' from ''ipepdvcompute1'' to all cluster nodes to allow public-key authentication.
     * The ''~/.ssh/'' from ipepdvcompute1 is re-replicated every hour, so you can add or change keys on ''ipepdvcompute1'' and they will be propagated.
 * ''/pdv/cluster/cluster_run.sh'' runs a command on all cluster nodes (a sketch of such a script is given at the end of this page)
   * You need to put or generate an ssh private key in ''~/.ssh'' on ipepdvcompute1 and append the corresponding public key to ''authorized_keys'':
     * Go to the ''.ssh'' folder: ''cd ~/.ssh''
     * Generate a key (press enter when asked for a password): ''ssh-keygen -t dsa''
     * Append the public key to authorized_keys: ''cat id_dsa.pub >> authorized_keys''
     * Wait up to one hour until the keys are propagated
   * Example: ''/pdv/cluster/cluster_run.sh head -n 1 /etc/issue''
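
For reference, a minimal sketch of what a wrapper like ''/pdv/cluster/cluster_run.sh'' could look like. The real script may work differently; the node list below is an assumption for illustration only, and relies on the propagated ''~/.ssh'' keys for password-less login.

{{{
#!/bin/bash
# Sketch only: run the given command on every cluster node over ssh.
# The node list is an ASSUMPTION; the real script may read it from a config file.
NODES="ipepdvcompute1 ipepdvcompute2 ipepdvcompute3 ipeusctcompute1"

for node in $NODES; do
    echo "=== $node ==="
    # BatchMode avoids interactive password prompts; requires working public-key auth
    ssh -o BatchMode=yes "$node" "$@"
done
}}}

Usage would then mirror the example above, e.g. ''cluster_run.sh head -n 1 /etc/issue'' to print the first line of /etc/issue on each node.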
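
Similarly, the fast-storage assembly from the Storage section roughly maps to the following initiator-side steps on the camera computer. This is a hedged sketch: the device paths are assumptions, the iSER iface configuration for open-iscsi is omitted, and the target-side ''tgt'' setup on the storage hosts is not shown; see [wiki:UfoClusterStorage Configuration] for the actual setup.

{{{
# Discover and log in to the targets exported by the storage hosts (open-iscsi)
iscsiadm -m discovery -t st -p ipekatrinadei
iscsiadm -m discovery -t st -p ipepdvcompute1
iscsiadm -m node -L all

# Assemble a software RAID 0 from the two attached devices (device names are assumed)
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdb /dev/sdc

# Format with XFS and mount under /mnt/server
mkfs.xfs /dev/md0
mount /dev/md0 /mnt/server
}}}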