[Neolith-users] current usage instructions (still very simple and early)

Peter Kjellstrom cap at nsc.liu.se
Fri Aug 3 17:41:50 CEST 2007


Available via http too: http://www.nsc.liu.se/systems/neolith/testpilot.html


Minimal Neolith testpilot instructions
What we expect

    * that you read the e-mail sent to testpilots carefully
    * lots of feedback (problems, wishes, questions, ...) to 
support at nsc.liu.se
    * performance figures (see previous info)
    * flexibility: be prepared to resubmit jobs, run in new ways, etc.
    * that you invest more time than usual in understanding what you're doing 
(don't expect things to work; verify that they do) 

What you can expect from the system

    * frequent updates
    * changes in documentation (read all e-mail sent to testpilots)
    * no long jobs allowed without special requests
    * that things can change from one day to the next
    * bugs 

System description
final configuration will be:

    * 805 compute nodes each with 8 x86_64 cores and 16-32 GiB RAM
    * next-generation InfiniBand interconnect (ConnectX)
    * around 60 TiB of fast /nobackup storage space
    * three login nodes
    * CentOS 5 64-bit Linux
    * Intel compilers, Scali MPI, and SLURM batch queueing system 

phase1 limitations:

    * 36 compute nodes (or fewer)
    * "normal" InfiniBand
    * one login node
    * no /nobackup storage, only home directories 

Main differences compared to Monolith

    * 8 processor cores per node instead of 2
    * 64-bit addressing instead of 32-bit
    * 16-32 GiB RAM per node instead of 2 GiB
    * SLURM batch queue system instead of PBS/torque 

Significant similarities compared to Monolith

    * Intel compilers
    * MKL (serial version) located 
at /software/intel/cmkl/9.1/lib_serial/em64t/
    * Scali MPI, located at /opt/scali
    * Storage layout with backup-protected home directories and /nobackup 
(/nobackup was called /global on Monolith and will only be available on the 
final system (stage2))
    * Moab scheduler (showq command)
    * Scratch disk on compute nodes: /disk/local (see the sketch below) 
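
A minimal sketch of how a serial, single-node job might use the node-local 
scratch disk. The per-user directory under /disk/local and all file names 
are made up for illustration; verify the local policy before relying on this:

 #!/bin/sh
 # Hypothetical staging pattern (not an official recipe): copy input to
 # node-local scratch, run there, then copy results back to $HOME.
 SCRATCH=/disk/local/$USER
 mkdir -p $SCRATCH
 cp $HOME/project/input.dat $SCRATCH/
 cd $SCRATCH
 $HOME/project/my_prog input.dat > output.dat
 cp output.dat $HOME/project/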

How to use, with examples
Compiling
Compilers are loaded by default (cf. module list), so building applications 
that do not require MPI should be straightforward.
To build with MPI support, the corresponding MPI module must be loaded.

 $ module list
 Currently loaded modules:
   1) ifort
   2) icc
   3) idb
   4) dotmodules
   5) base-config
   6) default

... compiler already loaded (ifort/icc), no mpi...

 $ module load scampi
 $ icc -Nmpi my_prog.c -o my_prog.mpibin
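
The same pattern presumably applies to Fortran; a sketch assuming the -Nmpi 
flag behaves the same with ifort as with icc (an assumption, so verify on 
the system):

 $ module load scampi
 $ ifort -Nmpi my_prog.f90 -o my_prog.mpibin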

Submitting jobs

 $ sbatch my_prog.sh

my_prog.sh:

 #!/bin/sh
 #
 # 4 nodes (total 32 cores/mpi ranks)
 #SBATCH -N 4
 #
 # 60 minutes
 #SBATCH -t 60

 module load scampi
 /software/tools/bin/mpprun my_prog.mpibin

Note1: mpprun needs the correct MPI to be loaded with the module command.
Note2: mpprun does not allow you to choose the number of ranks (-np). If you 
want to launch fewer (or more) than nodes x 8 ranks, use the SLURM option -n. 
To use 16 cores instead of the default of all 32 in the example above, add 
this line to the submit script:

 #SBATCH -n 16
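
For orientation, a sketch of what a submission round trip might look like, 
assuming SLURM's default output file naming (slurm-<jobid>.out in the submit 
directory; the job id 641 is made up):

 $ sbatch my_prog.sh
 Submitted batch job 641
 $ cat slurm-641.out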

Interactive jobs (like qsub -I on Monolith)
Use the command

 interactive

to submit an interactive job. Time, number of nodes, etc. are specified on 
the command line using the same syntax as in a batch script.

Request an interactive job using four nodes for 60 minutes:

 $ interactive -N4 -t 60
 Waiting for JOBID 640 to start
 ....
 $
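
Once the prompt returns you are inside the allocation, so the MPI binary from 
the compile example above could be launched directly; a sketch, assuming 
scampi must be loaded in the interactive shell just as in a batch script:

 $ module load scampi
 $ /software/tools/bin/mpprun ./my_prog.mpibin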

Monitoring jobs

    * squeue (shows jobs from the SLURM perspective, like qstat on Monolith)
    * scancel (cancels a job, like qdel on Monolith)
    * showq (same as on Monolith)
    * sinfo (shows node overview, free, used, down...)
    * ssh to the node (and run top, ps, etc.) 
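
A few illustrative invocations (the job id 640 is just the one from the 
interactive example above):

 $ squeue -u $USER      # only your own jobs
 $ sinfo                # node overview: free, used, down...
 $ scancel 640          # cancel job 640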

Login-node and storage quota

    * top (press M (shift+m) to sort by memory usage)
    * /home/diskinfo (file with nightly disk usage summary)
    * df 
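
For example, to get a quick view of home directory usage:

 $ df -h /home          # free space on the home filesystem
 $ du -sh $HOME         # total size of your home directory (can be slow)
 $ less /home/diskinfo  # the nightly summary mentioned above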

More information

    * SLURM user guides: quickstart guide (note: general documentation, not 
100% applicable to Neolith, maybe not even 30%...)
    * ScaMPI user guide: /opt/scali/doc/SMC55_UserGuide.pdf 
