[Neolith-users] Re: Planned downtime 2009-03-28 20:30-21:30, Earth Hour

Pär Andersson paran at nsc.liu.se
Sun Mar 29 00:22:11 CET 2009


Hi again,

Pär Andersson wrote:
 > Neolith will have planned downtime tomorrow, Saturday 2009-03-28
 > between 20:30 and 21:30. Jobs that can't finish before this will
 > remain in the queue and start after the downtime is over.
...
 > The unusual choice of scheduling this on a Saturday evening is
 > because this lets NSC participate in the WWF "Earth Hour"
 > campaign.


The downtime is now over! 99.5% of the compute nodes are online
and have been running jobs since 23:30.

The maintenance took a bit longer than expected, but I think the
results are worth it. All network filesystems on Neolith have
improved performance, especially GPFS (/nobackup/global) but also
/home.

If you are not interested in technical details you can stop
reading now.


Maintenance that was performed:

* Firmware was upgraded on the RAID controllers used by both
   /home and GPFS. The new firmware version has improved both
   read and write performance.

* To improve metadata performance, more metadata NSDs (disks)
   were added to the GPFS filesystem.

* More NSD servers were added to GPFS to improve IO
   performance. All NSDs were also redistributed across the
   available NSD servers to improve load balancing.

* Finally, we ran diagnostics on the InfiniBand leaf switches,
   as requested by the switch vendor.

Regards,

Pär Andersson
NSC

