Slurm versus Torque

Slurm is a very interesting queue system for high-performance computing: scalable and powerful. Several clusters worldwide use it. One main feature (a little odd, because this should not have to count as a main feature) is that Slurm is still under active development, unlike Torque…

The Slurm components:

GPU or not GPU

GPU programming for scientific computing can bring substantial gains in processing time, and there are libraries that help, such as BLAS and LAPACK. One option for using such routines on the GPU is cuBLAS, which makes it possible to run several functions on the GPU:

  1. cublas<t>geam()
  2. cublas<t>her()
  3. and all 152 standard BLAS functions.
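To make the first function in the list more concrete: cublas<t>geam computes C = alpha * op(A) + beta * op(B), where op() optionally transposes its argument and all matrices are stored column-major. The sketch below is a plain CPU reference of that operation in C, not the cuBLAS call itself. The names geam_ref and elem are hypothetical helpers made up for this illustration, and only the double-precision, real-valued case is covered.

```c
#include <stddef.h>

/* Hypothetical helper, not part of cuBLAS.
   Returns op(M)(i, j) for a column-major matrix M with leading dimension ld:
   M(i, j) when trans == 0, M(j, i) otherwise. */
static double elem(const double *M, size_t ld, int trans, size_t i, size_t j)
{
    return trans ? M[j + i * ld] : M[i + j * ld];
}

/* Hypothetical CPU reference for the operation cublas<t>geam performs:
   C = alpha * op(A) + beta * op(B), with C an m x n column-major matrix. */
void geam_ref(int transa, int transb, size_t m, size_t n,
              double alpha, const double *A, size_t lda,
              double beta, const double *B, size_t ldb,
              double *C, size_t ldc)
{
    for (size_t j = 0; j < n; ++j)          /* columns of C */
        for (size_t i = 0; i < m; ++i)      /* rows of C */
            C[i + j * ldc] = alpha * elem(A, lda, transa, i, j)
                           + beta  * elem(B, ldb, transb, i, j);
}
```

Calling geam_ref with alpha = 1, beta = 0 and transa set gives a plain matrix transpose, which is one common use of geam on the GPU as well.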

The claim is that cuBLAS is a lot faster than MKL:

Code time analysis for Intel ICC or GCC

When parallelizing code, it is important to find out which functions take the most time. With gprof and gprof2dot it is possible to generate call graphs like this:

In a nutshell, on Debian:

# apt-get install python graphviz
# pip install gprof2dot
# icc yourSoftware.c -pg -o yourSoftware
  or, with GCC:
# gcc yourSoftware.c -pg -o yourSoftware
# ./yourSoftware
# gprof ./yourSoftware
# gprof ./yourSoftware | gprof2dot -n0 -e0 | dot -Tpng -o output.png ; eog output.png