[Dunder-users] NSC migftp (tapestorage) back in service
Peter Kjellstrom
cap at nsc.liu.se
Tue Feb 19 17:39:06 CET 2008
Short version for the impatient: migftp now works again on the login-nodes of
tornado, dunder and blixt with no known bugs. If you have previously used the
'-c' option with a migftp put or similar, please read on below.
The service stop for the NSC migftp/tape storage ended up a lot longer than
planned. This was due to the discovery, by a user, of failed (corrupted) file
transfers. Since this involved data integrity we had no choice but to leave
the system down until it had been properly investigated.
The result of the investigation is that using the '-c' option (continue)
with migftp when doing put or mput sometimes caused only parts of files to be
transferred correctly. The investigation did not find any cases where a normal
put/mput corrupted data.
We don't know how common the use of the '-c' option has been in the past. If
you know that you have used it, please attempt to verify past transfers or
contact support for help.
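One possible way to verify a past transfer, assuming the original file still
exists locally, is to fetch the stored copy back under a different name and
compare the two byte for byte. The sketch below shows only the comparison
step; the retrieval itself is not shown, the file names are made up, and
compare_copy is a hypothetical helper. A truncated copy stands in for a
partially transferred file.

```shell
#!/bin/sh
# Sketch only: compare_copy is a hypothetical helper name.

# Return 0 if the two files are byte-for-byte identical, non-zero otherwise.
compare_copy() {
    cmp -s "$1" "$2"
}

# Stand-in data: a local original and a deliberately truncated "retrieved"
# copy (in real use this would be the file fetched back from tape storage).
tmp=$(mktemp -d)
printf 'line one\nline two\n' > "$tmp/original.dat"
head -c 9 "$tmp/original.dat" > "$tmp/retrieved.dat"

if compare_copy "$tmp/original.dat" "$tmp/retrieved.dat"; then
    echo "copy is intact"
else
    echo "copy differs from original"
fi
```

This only helps when the original is still available; otherwise a checksum
recorded before the transfer is needed.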
The solution now implemented is to remove the possibility of using this
option. The migftp now available on our systems knows nothing of any '-c'
option and has been verified to transfer data correctly.
During the investigation (and after) it would have been very useful to have
checksums for user data. NSC therefore recommends that users create checksums
for datasets/datafiles as soon after creation as possible. It is not very
complicated and allows both system administrators and users to verify the
integrity of the data. Here follows a small example of how checksums can be
generated and verified.
$ ls -l
-rw-r--r-- 1 cap cap 2097152 2008-02-19 17:29 datafile_1.gz
-rw-r--r-- 1 cap cap 1048576 2008-02-19 17:29 datafile_2.gz
$ md5sum *.gz > MD5SUMS
$ echo "bad data" > datafile_1.gz
$ md5sum -c MD5SUMS
datafile_1.gz: FAILED
datafile_2.gz: OK
md5sum: WARNING: 1 of 2 computed checksums did NOT match
$
Explanation: We start with two datafiles and generate checksums for them. We
then overwrite one of the files (corrupting it) and finally ask md5sum to
verify all files. The output clearly shows that one of them is now damaged.
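For a dataset spread over several subdirectories, the same idea can be
extended along the following lines. This is a minimal sketch, assuming GNU
find, sort, xargs and md5sum; make_sums and check_sums are hypothetical helper
names, and the demonstration data is created in a temporary directory.

```shell
#!/bin/sh
# Sketch only: make_sums and check_sums are hypothetical helper names.

# Write an MD5SUMS file covering every regular file below directory $1.
# The MD5SUMS file itself is excluded from the list.
make_sums() {
    ( cd "$1" &&
      find . -type f ! -name MD5SUMS -print0 | sort -z | xargs -0 md5sum > MD5SUMS )
}

# Re-read every file listed in $1/MD5SUMS. Only mismatches are printed and
# the return value is non-zero if any file fails.
check_sums() {
    ( cd "$1" && md5sum -c --quiet MD5SUMS )
}

# Demonstration on throw-away data in a temporary directory.
demo=$(mktemp -d)
mkdir -p "$demo/dataset/run1"
printf 'first file\n'  > "$demo/dataset/run1/a.dat"
printf 'second file\n' > "$demo/dataset/b.dat"

make_sums "$demo/dataset"
check_sums "$demo/dataset" && echo "dataset verified"
```

Keeping the MD5SUMS file inside the dataset directory means it travels with
the data, so the same check can be repeated after the files have been
retrieved from storage.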
NSC apologises for the inconvenience this stop has caused and asks concerned
users to contact support.
Peter K and the NSC Support-team
--
------------------------------------------------------------
Peter Kjellström | E-mail: cap at nsc.liu.se
National Supercomputer Centre |
Sweden | http://www.nsc.liu.se