[Neolith-users] Neolith changes next Thursday: longer wall time limit and more...

Mats Kronberg kronberg at nsc.liu.se
Thu Feb 2 14:48:25 CET 2012


Dear Neolith users,

Summary
=======

NSC has decided to increase the wall time limit on Neolith from three
days (72 hours) to seven days (168 hours). This change will be
performed next Thursday (2012-02-09). No downtime will be required for
this change.

We will also change the default wall time for jobs on Neolith from
three days to two hours.

At the same time, we will also upgrade the resource manager (SLURM) to
version 2.3. This upgrade will result in a 5-10 minute period in which
you will not be able to start new interactive jobs, schedule batch
jobs, check job status etc.


Details
=======

Many users have requested a higher wall time limit, and we have
decided that the advantages of raising the limit (users want to run
longer jobs, similar settings on all NSC SNIC systems) outweigh the
disadvantages (sometimes longer queue time for high-priority jobs,
more difficult maintenance planning). So we have decided to change the
maximum wall time limit for jobs to seven days (the same limit as on
Kappa and Matter).

We will also change a related setting: The default wall time limit
will be lowered from 3 days to 2 hours (the same as on Kappa). This
means that if you do not specify a time limit ("-t HH:MM:SS") for your
jobs, the time limit will be 2 hours.

NOTE: If you have not specified a time limit in your job scripts or on
the command line before, you need to start doing so now. If you do
not, your jobs will end after 2 hours, which might not be what you
want...

Please note that this is only the default time limit. You can choose
any time limit up to 7 days by giving the "-t" option to sbatch or
interactive, e.g.:

"interactive -t 12:00:00 -N1" - request one node for 12 hours.

"sbatch -t 4-18:30:00 -N2 job.sh" - request two nodes for 4 days, 18
hours, and 30 minutes.

See the man page for sbatch ("man sbatch") for more details.
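
As a rough sanity check on the time format, the sketch below converts a
limit written as HH:MM:SS or D-HH:MM:SS into seconds and compares it to
the new seven-day maximum. The helper names to_seconds and within_limit
are made up for this illustration and are not part of SLURM, and only
the two formats used in the examples above are handled.

```shell
#!/bin/sh
# Illustrative helpers only; not SLURM commands.

# New maximum wall time on Neolith: 7 days = 168 hours.
MAX_SECONDS=$((7 * 24 * 3600))

# Convert "HH:MM:SS" or "D-HH:MM:SS" to a number of seconds.
# awk reads fields such as "08" as decimal, so there is no octal issue.
to_seconds() {
    case $1 in
        *-*) echo "$1" | awk -F'[-:]' '{ print $1*86400 + $2*3600 + $3*60 + $4 }' ;;
        *)   echo "$1" | awk -F':'    '{ print $1*3600 + $2*60 + $3 }' ;;
    esac
}

# Succeed if the requested limit does not exceed the 7-day maximum.
within_limit() {
    [ "$(to_seconds "$1")" -le "$MAX_SECONDS" ]
}
```

For example, "to_seconds 4-18:30:00" prints 412200, which is below the
604800 seconds that make up seven days.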



While I have your attention - some guidelines on choosing a time limit
for your Neolith jobs:

+ Request a little more time than you think your job will need, but
not much more.

+ To get your test and development jobs to start quickly, request less
than 60 minutes of wall time. Your job will then be eligible to run on
the eight reserved development nodes.

+ Never request less than 60 minutes of wall time for production jobs.
If you do, your production jobs will use the development nodes and
there will be no such nodes available for quick test jobs. Your
project will only be billed for the time actually used, so you do not
lose allocated time by following this advice.

+ Do not run many similar short (less than 10 minutes) jobs. Instead,
combine many small jobs into fewer long jobs.
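
One possible way to combine short jobs is to loop over the inputs inside
a single job script, as in the sketch below. The function run_task and
the input names case01, case02, case03 are placeholders invented for
this example, and the three-hour limit is only an assumed value; replace
them with your real command, inputs, and a limit that covers all tasks
plus a small margin.

```shell
#!/bin/sh
#SBATCH -t 03:00:00
#SBATCH -N 1
# The two directives above request one node and a single time limit
# that must cover all tasks run by this script.

# Placeholder for one short computation; replace the body with the
# real command for a single input.
run_task() {
    echo "processed $1"
}

# Run every input in sequence inside this one job and stop at the
# first failure.
run_all() {
    for input in "$@"; do
        run_task "$input" || return 1
    done
}

run_all case01 case02 case03
```

Submitted with "sbatch", a script like this occupies one queue slot
instead of one per input.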


--
Mats Kronberg, NSC Support <support at nsc.liu.se>

