[Vagnekman-users] Vagn: slow access to SMHI/Accumulus file systems

Mats Kronberg kronberg at nsc.liu.se
Tue Jan 24 18:08:46 CET 2012


On Thu, Jan 12, 2012 at 14:22, Mats Kronberg <kronberg at nsc.liu.se> wrote:
> We have now upgraded the Vagn-Accumulus network link. Please let me know if
> you still see performance problems on the Accumulus file systems.

This upgrade solved one problem but introduced another. As you may
have noticed, several Vagn nodes stopped working during the past
week, which killed all jobs running on those nodes.

When we upgraded the Vagn-Accumulus connection, it became possible for
the Accumulus file servers to overload the network interface on Vagn
analysis nodes.

In severe cases this caused the scheduling system to mark the node as
down and kill all jobs on it. In less severe cases the analysis node
became very slow, sometimes too slow to use interactively.

We have now reconfigured the network flow control. I can no longer
reproduce this problem, so I hope Vagn is now back to normal.
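
As a rough illustration of the mechanism described above (a toy model
only, not the actual Vagn or Accumulus configuration), the sketch below
shows a fast sender filling a slower receiver's buffer. The function
name, the rates and the buffer size are all hypothetical values chosen
for the example. Without flow control the receiver drops packets once
its buffer is full; with flow control the sender is paused instead, so
nothing is dropped.

```python
def simulate(send_rate, drain_rate, buffer_size, ticks, flow_control):
    """Toy model of a receive buffer fed by a faster sender.

    Returns (delivered, dropped) packet counts. All parameters are
    hypothetical units (packets per tick), not measurements from Vagn.
    """
    queued = delivered = dropped = 0
    for _ in range(ticks):
        # With flow control, the receiver pauses the sender whenever the
        # next burst would not fit in the remaining buffer space.
        paused = flow_control and queued + send_rate > buffer_size
        if not paused:
            accepted = min(send_rate, buffer_size - queued)
            dropped += send_rate - accepted
            queued += accepted
        # The receiver processes at most drain_rate packets per tick.
        processed = min(drain_rate, queued)
        queued -= processed
        delivered += processed
    return delivered, dropped


if __name__ == "__main__":
    for fc in (False, True):
        delivered, dropped = simulate(10, 4, 20, 100, fc)
        print(f"flow_control={fc}: delivered={delivered} dropped={dropped}")
```

In this model the receiver delivers the same amount of data either way,
since it is limited by its own processing rate. The difference is that
flow control replaces packet loss with short pauses on the sender side.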

--
Mats Kronberg, NSC Support

