[Vagnekman-users] Vagn: end of fat node pilot test 2011-09-21

Mats Kronberg kronberg at nsc.liu.se
Tue Sep 13 15:18:52 CEST 2011


Dear Vagn users,

In one week, on 2011-09-21, we will end the pilot testing of the new
Vagn nodes a7 and a8, and configure them as normal analysis nodes.


Important information for all Vagn users
========================================

The new nodes will be configured in the same way as a6: they will be a
part of the share/noshare partitions, and can be used for small jobs
if no non-fat nodes are available.

The scheduler will try to fill a2-a5 with jobs first, then a6, then a7
and finally a8. As far as possible the fattest nodes will be kept idle
and ready to accept big jobs, but not if that would result in a small
job being prevented from starting.
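
As an illustration only, the fill order described above can be sketched
in a few lines of Python. This is not the scheduler's actual
configuration: the name FILL_ORDER, the function pick_node and the
ordering within a2-a5 are assumptions made for the example.

```python
# Hypothetical sketch of the fill order: a2-a5 first, then a6, a7, a8.
# The relative order within a2-a5 is an assumption; the announcement
# only says that these nodes are filled before the fat nodes.
FILL_ORDER = ["a2", "a3", "a4", "a5", "a6", "a7", "a8"]


def pick_node(nodes_with_room):
    """Return the first node in FILL_ORDER that can accept the job.

    nodes_with_room is a collection of node names that currently have
    enough free resources for the job. Returns None if no node fits,
    in which case the job would stay queued.
    """
    for node in FILL_ORDER:
        if node in nodes_with_room:
            return node
    return None
```

With this ordering a small job only lands on a8 when no other node has
room, which keeps the fattest nodes free for big jobs without blocking
small jobs from starting.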

Hopefully this will work well for all user groups. If not, we can
always change the configuration later. This initial configuration was
agreed on at the Vagn-Ekman user-group meeting at KTH on September
6th.

Note: if you have not yet verified that your applications will work on
the new nodes, you should do so as soon as possible. When we add the
new nodes to the share/noshare partitions, any queued jobs might start
on one of the new nodes. See
https://lists.nsc.liu.se/mailman/public/vagnekman-users/2011-July/000221.html
for information on how to recompile your applications.


Important information for users of the pilot partition
======================================================

In order to reconfigure the system, a7 and a8 will be reserved from
2011-09-21 at 13:00 CEST. This means that you can run jobs in the
pilot partition up until that time, but you need to choose a walltime
limit (-t HH:MM:SS) so that the job ends before that time, or the job
will never start. For example, two days before the reservation you can
only run jobs that request less than 48 hours of walltime.
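
The walltime arithmetic can be sketched in Python as follows. This is a
minimal example, not an NSC tool: the function name max_walltime and
the five-minute safety margin are assumptions, and whether the
scheduler accepts hour values above 23 in the HH:MM:SS form should be
checked against the local documentation.

```python
from datetime import datetime, timedelta, timezone

# CEST is UTC+2; the reservation start is taken from this announcement.
CEST = timezone(timedelta(hours=2))
RESERVATION_START = datetime(2011, 9, 21, 13, 0, tzinfo=CEST)


def max_walltime(now, margin=timedelta(minutes=5)):
    """Return the longest walltime (HH:MM:SS) that ends before the reservation.

    now must be a timezone-aware datetime. The margin is a hypothetical
    safety buffer so that the job ends strictly before the reservation.
    Returns None if no job can fit before the reservation starts.
    """
    remaining = RESERVATION_START - now - margin
    total = int(remaining.total_seconds())
    if total <= 0:
        return None
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


if __name__ == "__main__":
    # Print the limit for a job submitted right now.
    print(max_walltime(datetime.now(CEST)))
```

The returned string is meant to be passed as the value of the -t
option when submitting to the pilot partition.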

When the reservation starts, we will reconfigure the system and move
a7 and a8 to the share/noshare partitions as quickly as possible. When
this is done, jobs queued in the share or noshare partitions will
start running on a7 and a8.

Any jobs that remain in the pilot partition queue will be killed and
the pilot partition removed.


--
Mats Kronberg, NSC Support <vagnekman-support at snic.vr.se>

