[Berzelius-users] Important updates to scheduling policy and automatic job termination
Henrik Henriksson
hx at nsc.liu.se
Mon Feb 26 11:37:13 CET 2024
Dear Berzelius users,
We will implement some changes to the automatic job termination.
# Upcoming changes
- A new reservation has been created, "safe". Jobs running within this
reservation will be safe from automatic job termination. However, this
reservation will be intentionally underprovisioned, so expect longer queue
times. Add `--reservation=safe` to your Slurm invocation to use the
reservation. With the creation of this reservation, we will apply stricter
rules to jobs outside of this reservation. Within this reservation we may
impose limits on job width.
- Job termination will no longer be based on average wattage since the start of
the job. Instead, we will use an exponential moving average [0].
- The power limit will be gradually increased to at least 100W.
- Interactive jobs have so far been fully exempted from automatic job
termination. In the future, interactive jobs will only be exempt from
automatic job termination for the first 8 hours of walltime.
The changes will be rolled out gradually over the next week or so. The `safe`
reservation is available as of today. Note that the `safe` reservation is not
intended to be a long-term solution for projects, but rather a stop-gap while
you work on improving your code. Efficient use of resources is considered by
the Berzelius allocation staff.
# Motivation
We are updating the scheduling policy in this manner based on metrics,
experiences and user interactions collected since we started working on
improving the efficiency. So far, we see a measurable and significant
improvement in how the system is used. These changes are intended to mitigate
limitations imposed on users, as well as to allow automatic job termination
in some common situations where this hasn't been done so far.
- The old scheduling policy places a small subset of users in a position where
they can't work at all. This is partially mitigated by the
`1g.10gb`-reservation, but not all users are able to use that. The
`safe`-reservation is intended to mitigate this by providing a way for jobs to
always run safely. However, to provide incentive to use the system
efficiently, the size of this reservation will be limited. That means that
queue-times will be artificially and intentionally longer.
- Basing job termination on average use works fine in most cases. "Delayed
starts", where nothing happens, are terminated after one hour, as are jobs
that simply don't manage to saturate the GPU. However, for "forgotten" jobs,
where the GPU was used for a few hours and then went idle, we need a
different averaging function to allow for a faster decay. In practice, we
don't think this particular change will affect users noticeably.
- A common pattern for interactive jobs is forgotten sessions: users allocate
resources, use them for a while and then forget to terminate the job. So far,
we have intentionally exempted interactive jobs entirely. In the future we will
terminate them according to the same policy as other jobs, but interactive
jobs will have a grace time of 8 hours (= a full workday), instead of the
normal one hour.
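The rules described above can be summarised as a small decision function. This
is only an illustrative sketch, not the actual implementation: the `Job` class,
its field names and `should_terminate` are hypothetical, the 100 W limit is the
"at least 100W" figure from this announcement, and other exemptions (such as
the `1g.10gb`-reservation) are not modelled.

```python
from dataclasses import dataclass
from typing import Optional

# Values taken from the announcement; the real limits may be adjusted.
POWER_LIMIT_W = 100.0            # power limit ("at least 100W")
GRACE_BATCH_S = 1 * 3600         # normal grace time: one hour
GRACE_INTERACTIVE_S = 8 * 3600   # interactive grace time: a full workday


@dataclass
class Job:
    """Hypothetical view of a running job, for illustration only."""
    walltime_s: float                  # elapsed walltime in seconds
    ema_power_w: float                 # moving-average GPU power draw in watts
    interactive: bool = False
    reservation: Optional[str] = None


def should_terminate(job: Job) -> bool:
    """Return True if the job would be terminated under the sketched policy."""
    # Jobs in the `safe` reservation are never terminated automatically.
    if job.reservation == "safe":
        return False
    # Every job gets a grace period; interactive jobs get a longer one.
    grace_s = GRACE_INTERACTIVE_S if job.interactive else GRACE_BATCH_S
    if job.walltime_s < grace_s:
        return False
    # After the grace period, low average power draw triggers termination.
    return job.ema_power_w < POWER_LIMIT_W


if __name__ == "__main__":
    idle_batch = Job(walltime_s=2 * 3600, ema_power_w=40.0)
    idle_interactive = Job(walltime_s=2 * 3600, ema_power_w=40.0, interactive=True)
    print(should_terminate(idle_batch), should_terminate(idle_interactive))
```

In this sketch an idle batch job two hours into its run would be terminated,
while an equally idle interactive job would still be within its grace time.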
As always, please contact berzelius-support at nsc.liu.se with any questions or comments.
[0] We will leave the exact parameters open for us to adjust. To start with, we
aim for a job that drops from full load to idle (a "step function") to be
allowed to continue running for approximately one hour.
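To illustrate how an exponential moving average can be tuned for such a step,
here is a minimal sketch. The exact parameters are deliberately left open
above, so everything numeric here is an assumption: a full-load draw of 400 W,
an idle draw of 50 W, a 100 W limit, one power sample per minute, and a
time-constant form of the EMA. The function names are hypothetical as well.

```python
import math


def tau_for_decay(full_w: float, idle_w: float, limit_w: float, decay_s: float) -> float:
    """Time constant for which an EMA starting at full_w and fed idle_w
    samples reaches limit_w after decay_s seconds (requires idle_w < limit_w < full_w)."""
    return decay_s / math.log((full_w - idle_w) / (limit_w - idle_w))


def ema_update(ema_w: float, sample_w: float, dt_s: float, tau_s: float) -> float:
    """One EMA step for a sample taken dt_s seconds after the previous one."""
    alpha = 1.0 - math.exp(-dt_s / tau_s)
    return ema_w + alpha * (sample_w - ema_w)


def seconds_until_below(full_w: float, idle_w: float, limit_w: float,
                        tau_s: float, dt_s: float = 60.0) -> float:
    """Simulate a job that goes idle at t=0 and return the time at which
    its EMA first falls below limit_w."""
    ema_w = full_w
    elapsed_s = 0.0
    while ema_w >= limit_w:
        ema_w = ema_update(ema_w, idle_w, dt_s, tau_s)
        elapsed_s += dt_s
    return elapsed_s


# Assumed example values, not the actual Berzelius parameters.
FULL_W, IDLE_W, LIMIT_W = 400.0, 50.0, 100.0
tau_s = tau_for_decay(FULL_W, IDLE_W, LIMIT_W, decay_s=3600.0)
elapsed_s = seconds_until_below(FULL_W, IDLE_W, LIMIT_W, tau_s)
print(f"tau = {tau_s:.0f} s, EMA drops below the limit after about {elapsed_s / 60:.0f} min")
```

Unlike an average taken since the start of the job, the decay time here does
not depend on how long the job ran at full load before going idle.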
Kind regards,
Berzelius Staff