[Bi-users] Login node reboots 3 May at 17:00, node boots ongoing

Kent Engström kent at nsc.liu.se
Wed Apr 25 17:29:06 CEST 2018


Bi users,

As discussed with our user representatives at today's follow-up meeting,
it's time for a new attempt at installing a working mitigation for the
Spectre variant 2 processor security issue (Spectre/Meltdown was the big
news at the start of the year).

As you might remember, we installed microcode updates in January that we
had to revert as they turned out to affect system stability. Since then,
Intel has provided fixed microcode. Also, the latest Red Hat / CentOS
kernel uses a somewhat different version of the mitigation, with
Google's "retpoline" software trick as the main component.


We do not expect this version to crash jobs as the last attempt did, but
there may be a performance impact. In any case, we would like feedback
(especially from those of you who saw problems in January) if you notice
clear differences in how your jobs behave before and after today's
change.

The nodes are being updated as they become idle. This means that jobs
started after 16:47 today run on nodes with the new node image.

We will restart the login nodes and the system server next week on
Thursday 3 May starting at 17:00 CEST.

Best Regards,

-- 
Kent Engström, National Supercomputer Centre
kent at nsc.liu.se, +46 13 28 4444


