[Snic-users] File system problem at NSC

Pär Lindfors paran at nsc.liu.se
Thu Oct 2 18:24:05 CEST 2014


Dear SNIC users,

Over the last few weeks, some users have reported various strange
problems on SNIC systems at NSC. Software that had been working fine in
the past would suddenly crash or fail to work correctly in some other
way.

Examples of reported issues:

 * Could not compile software using CMake (for example DALTON)
 * Problems running RStudio
 * Problems running Intel VTune

We have narrowed this down to a problem in the GPFS software, which is
used for the file systems /home, /nobackup/global and /software.

The technical explanation is that under some conditions the system call
writev() will incorrectly fail with the error code EINVAL (invalid
argument). The problem has been assigned IBM APAR numbers IV64862 and
IV64863.
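
For users who want to see what this symptom looks like from a program,
here is a minimal sketch in Python. It writes a few small buffers with
writev() to a temporary file and reports whether the call failed with
EINVAL. The function name probe_writev and the choice of buffers are
only illustrative assumptions. The exact conditions that trigger the
GPFS problem are not described here, so an "ok" result does not prove
that a node has the fix.

```python
import errno
import os
import tempfile


def probe_writev(directory):
    """Write a few buffers with writev() to a temp file in `directory`.

    Returns "ok" if all bytes were written, "EINVAL" if the call failed
    with EINVAL, and "short write" if fewer bytes than expected were
    written. Any other error is re-raised.
    """
    # Hypothetical test data; the real trigger conditions are unknown.
    buffers = [b"hello ", b"GPFS ", b"writev\n"]
    expected = sum(len(b) for b in buffers)

    fd, path = tempfile.mkstemp(dir=directory)
    try:
        try:
            written = os.writev(fd, buffers)
        except OSError as exc:
            if exc.errno == errno.EINVAL:
                return "EINVAL"
            raise
        return "ok" if written == expected else "short write"
    finally:
        # Always clean up the temporary file.
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    # Probe the home directory, which is on an affected file system.
    print(probe_writev(os.path.expanduser("~")))
```

Software affected by the problem would typically see the same error
surface as an unexpected "Invalid argument" message when writing files.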

We received fixed software packages yesterday.

On Triolith, all jobs that have been started since yesterday evening
will run on compute nodes where the fix has been applied. Login servers
will soon be rebooted to apply the fix. More details will be sent out to
the triolith-users list.

Kappa and Matter have not been upgraded yet, but this will be done very
soon. More information will be sent out to the kappa-users and
matter-users lists.

Regards,
Pär Lindfors, NSC


