[Neolith-users] /nobackup/global unavailable

Pär Andersson paran at nsc.liu.se
Sun Jan 11 22:31:38 CET 2009


Hi,

Pär Andersson wrote:
> The file system /nobackup/global on Neolith is currently having hardware 
> problems.

The hardware problems with /nobackup/global have been resolved and 
Neolith is back in normal operation.

We do not believe that any data has been lost. If you do find damaged 
files, please let us know. Files written between 2009-01-09 22:30 and 
2009-01-10 01:45 are the most likely to have been affected.
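
If you want to check your own files, one possible approach is to list 
everything whose modification time falls inside that window. The sketch 
below is only an illustration: the function name and the choice of 
Python are ours, the window is interpreted as local time on the machine 
running the script, and a file rewritten after the window will no longer 
show up, so treat the result as a hint rather than a complete list.

```python
import os
import sys
from datetime import datetime

# Window taken from the announcement; assumed to be local time.
WINDOW_START = datetime(2009, 1, 9, 22, 30)
WINDOW_END = datetime(2009, 1, 10, 1, 45)


def files_modified_in_window(root, start=WINDOW_START, end=WINDOW_END):
    """Yield paths under root whose mtime lies within [start, end]."""
    lo, hi = start.timestamp(), end.timestamp()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if lo <= mtime <= hi:
                yield path


if __name__ == "__main__":
    # Usage: python check_window.py <directory>  (script name is hypothetical)
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path in files_modified_in_window(root):
        print(path)
```

Run it against your own directory under /nobackup/global and verify the 
listed files, for example by comparing them with a copy or by re-reading 
them in your application.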


You can stop reading here if you are not interested in technical details 
about what happened.

* At 22:38 on Friday 2009-01-09 a RAID controller failed. This affected two 
NSDs (Network Shared Disks) used by the GPFS file system /nobackup/global.

* For unknown reasons, GPFS did not properly detect the problem 
until 01:39 on 2009-01-10.

* Before 01:39 several commands using the file system would just hang.

* After 01:39 only reads of files striped across the unavailable NSDs 
failed with read errors, as expected.

* The RAID controller was restarted today and both NSDs were recovered 
without problems.

Best Regards,

Pär Andersson
NSC


