[Vagnekman-users] Unscheduled reboot of the Vagn login node (analysis1)

Mats Kronberg kronberg at nsc.liu.se
Wed Sep 28 16:33:41 CEST 2011


Dear Vagn users,


A short while ago, the Vagn login node ran out of memory and had to be rebooted.

The cause was a single user running a very large "idl" process.

I would like to remind all users that running processes that use large
amounts of CPU or RAM is not allowed on the login node. Please
allocate an analysis node for such work (using sbatch or interactive). If
an analysis node runs out of memory, only one or a few users are
affected, but when the login node runs out of memory, all Vagn users,
and all data transfers, are affected.
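
As an illustration of the sbatch route, a minimal sketch might look like
the following. The file name idl_job.sh, the job name, the one-hour time
limit, and the IDL procedure name my_analysis are all placeholders and
not taken from Vagn's documentation; any site-specific options (such as
project or account settings) are deliberately left out.

```shell
# Write a small batch script. Everything below the shebang is illustrative:
# job name, time limit, and the IDL procedure are hypothetical placeholders.
cat > idl_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=idl-analysis
#SBATCH --nodes=1
#SBATCH --time=01:00:00

# Run the heavy IDL work on the allocated node, not on the login node.
idl -e "my_analysis"
EOF

# Submit the job only where sbatch is actually available.
if command -v sbatch >/dev/null 2>&1; then
    sbatch idl_job.sh
else
    echo "sbatch not found; submit idl_job.sh from the cluster login node"
fi
```

For work that needs a terminal session, the "interactive" command
mentioned above is the alternative; its options are not shown here.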

The login node was unavailable to users from approximately 15:42 CEST
to 16:25 CEST.

Since we had already planned to reboot the login node next week for a
routine operating system upgrade, we did the upgrade while the login
node was down, in order to avoid a second stop.


//Mats

