[Neolith-users] Neolith problems yesterday

Pär Andersson paran at nsc.liu.se
Fri Sep 26 18:26:17 CEST 2008


Dear users,

Yesterday at 14:30 we had problems with the ethernet network on Neolith. The 
problems lasted for about 10 minutes before they were noticed and fixed. 
During this time the /nobackup/global file system became unavailable. This 
caused several jobs using that file system to fail. Several other jobs were 
started during the outage and failed immediately because the file system was 
not available.

The problem was caused by a service technician replacing an ethernet switch. 
The new switch was not configured correctly when it was connected, and it 
created a loop in the network, which resulted in packet storms.

As /nobackup was not available, slurm-XXXX.out files may not exist for all 
failed jobs. If you need help determining whether a specific job failed due to 
this problem, please contact us at support at nsc.liu.se.

We are very sorry for the inconvenience caused by this.

Regards,

-- 
Pär Andersson
National Supercomputer Centre
Sweden