[Vagnekman-users] update: forwarded flashnews - lost file-server

lars malinowsky lama at pdc.kth.se
Fri Nov 20 10:50:54 CET 2009


Hello again,

2009-11-20 at 10:40
    afs-server now salvaged, batch-operations gradually resuming.

As mentioned earlier, the explicit impact on batch jobs
is that they could not change state (since ~23:00 last night);
otherwise they should be unaffected.

The impact for users/jobs having files/applications on the broken
server is that access to those has typically failed with messages
similar to 'connection timed out'.

regards,
lars/pdc-staff.
- - - 
> Hello,
>
> forwarding this out of the flash-news at www.pdc.kth.se
>
> - - - 
> 2009-11-20 at 06:05
>   We seem to have problems with at least one afs-server.
>   Home catalogues, applications, and also batch-job-processing
>   affected. 
> - - -
>
> In addition to random home-catalogues and applications, important
> parts of the data for all batch-jobs, node-reservations et cetera for
> ekman live on this file-server.
>
> The impact is that no changes to jobs/nodes can be made;
> your reserved nodes/running jobs will stay reserved/running.
>
> Once the server is back on-line again they will show up. However,
> any attempted changes (e.g. jobs finishing and nodes being released)
> during the outage will have failed.
>
> Should the server fail to come on-line we have backups on tape.
>
> Sorry for the inconvenience,
>
> regards,
> lars/pdc-staff.

