[Vagnekman-users] Status update of Vagn

Johan Raber raber at nsc.liu.se
Wed Nov 25 23:33:36 CET 2009


Dear Vagn users,

Those of you with data stored on /nobackup/vagn1 should have received a
list of the files you had stored. If you have not received such a list but
should have, or have reason to believe the list is incomplete, you may
very well be right: there were errors accessing some files and directories,
and at present there is unfortunately little more we can do to find
these.

The status of Vagn is that after some clarifications of misunderstandings,
IBM is now working on restoring the critical on-disk data structures that
were damaged. No promises of success have been made, though. This was a
little over a week ago, and at the time IBM predicted this to be a
time-consuming process. We continue to inquire what this means specifically.

We have tried to open access to the cluster to allow retrieval of data from
the home directory, but have so far failed because the state of the cluster
does not allow us to start the filesystem daemon on the login node or
otherwise provide access to the /home filesystem there. Our plan of attack now is
therefore to provide access to a node in the cluster that has /home already
mounted, in effect moving the login. A realistic estimate for this to
happen is by Tuesday next week (Dec. 1); if it happens sooner, you will be
notified. Do note, however, that access at this stage is only to allow data
retrieval and that computationally intensive work must be avoided, since
working nodes are a precious resource now.

At this point, to resolve the data corruption issue on /nobackup/vagn1, all
we can do is provide as much information as we can to the vendor and make sure
there are no misunderstandings about the status of the system and the
severe impact this outage has on the Ekman-Vagn data flow. We are aware
that we are nearing a point where this situation is intolerable and where
the value of the data must be weighed against the cost of lost
productivity. We do not feel that we are there quite yet, though, and will
continue to push for a time estimate on the restoration of the filesystem.

Best regards,
Johan Raber -- NSC support


