[Bi-users] Update - Re: Problems with smhid16 and rossby23 at the moment

Fredrik Nyström freny at nsc.liu.se
Fri Mar 6 16:12:07 CET 2020


On 2020-03-06 14:16, Fredrik Nyström wrote:
> On 2020-03-06 11:00, Fredrik Nyström wrote:
>> Dear Bi Users,
>>
>> we are having problems with 4 out of 6 servers for smhid16 and rossby23.
>>
>> Servers will be rebooted shortly...
> 
> Servers have been rebooted and access to smhid16 and rossby23 has been 
> restored.
> 
> Recovery after reboot completed at 13:27 CET. Jobs using smhid16 and 
> rossby23 before this time may have been affected.

Dear Bi Users,

We have been forced to reboot a server for smhid16 and rossby23 again.
We think smhid16 is the file system that is overloaded in some way.

If you have started running more or different jobs against smhid16 during
the last day or so, please consider holding off on them for the time
being, ideally until after the planned upgrade on 11-12 March.

If the file systems go down again during the weekend, we cannot promise
to bring them up promptly, and we also have limited time for
troubleshooting on Monday and Tuesday next week, as we need to prepare
for the upgrade.


Kind Regards,
-- 
Fredrik Nyström, National Supercomputer Centre
freny at nsc.liu.se

