[Berzelius-users] Temporarily cheaper GPU billing
Henrik Henriksson
hx at nsc.liu.se
Wed Aug 9 15:26:52 CEST 2023
Dear Berzelius users,
As much of the work done on Berzelius is somewhat interactive, and we have 34
new nodes, there is currently quite a lot of "air" in the queues. For
projects with smaller allocations this is a good opportunity to grab some extra
compute time on the cluster.
Slurm, the scheduler on Berzelius, schedules according to a "fair-share"
algorithm. The allocation (say 240 GPUh/month) is *not* a quota, but a weight
used when setting queue priorities. If everyone schedules as much as they can,
the monthly usage per project will approximate the allocation. The queue
priority is based on the allocation and "recent usage" (exponential falloff, 21
days half life).
As such, when the queues are short, even a low priority is enough to get a job
scheduled. However, if you use a lot more than your allocation (say 8x), it will
take quite a lot of time (something like log_2(8)*21 = 63 days) before your
queue priority returns to normal.
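
As a rough illustration of that estimate, here is a minimal Python sketch. It
assumes that recent usage decays as a pure exponential with a 21-day half-life,
that no new jobs are run in the meantime, and that priority is "normal" once the
decayed usage is back at 1x the allocation. The function names are hypothetical,
and the real Slurm priority calculation involves more factors than this.

```python
import math

HALF_LIFE_DAYS = 21.0  # half-life of "recent usage", as stated above


def decayed_usage(usage, days_elapsed, half_life=HALF_LIFE_DAYS):
    """Usage still counted after `days_elapsed` days of exponential decay."""
    return usage * 0.5 ** (days_elapsed / half_life)


def days_until_normal(overuse_factor, half_life=HALF_LIFE_DAYS):
    """Days without new usage until decayed usage is back at 1x allocation.

    `overuse_factor` is recent usage divided by allocation (e.g. 8 for 8x).
    This is a simplification: it ignores any usage added while waiting.
    """
    if overuse_factor <= 1:
        return 0.0
    return math.log2(overuse_factor) * half_life


if __name__ == "__main__":
    print(days_until_normal(8))    # 63.0 days for 8x overuse
    print(decayed_usage(8.0, 63))  # 1.0, i.e. back at 1x allocation
```

Under these assumptions, every doubling of overuse adds another 21 days of
waiting.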
To allow for projects to go above their allocation by *a lot* without being
punished in the future, we have *temporarily* reduced the cost of running jobs to
1/8th of normal billing.
When there is less air in the system and the new nodes fill up, we will return
this to normal billing. This means that GPU hours you use now will *not* reduce
your priority (that much) when we turn the cost back up again.
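
To make the effect of the discount concrete, here is a small Python sketch using
the example allocation of 240 GPUh/month from above. It assumes that usage is
recorded at the billed (discounted) cost at the time the job runs; the names
`BILLING_FACTOR` and `billed_hours` are made up for the illustration and are not
Slurm settings.

```python
BILLING_FACTOR = 1 / 8  # temporary cost: 1/8th of normal billing


def billed_hours(gpu_hours, factor=BILLING_FACTOR):
    """GPU hours charged against fair-share for `gpu_hours` of actual use."""
    return gpu_hours * factor


if __name__ == "__main__":
    allocation = 240       # example allocation, GPUh/month
    used = 8 * allocation  # running 8x the allocation: 1920 GPUh
    print(billed_hours(used))               # 240.0, same as 1x at normal cost
    print(billed_hours(used) / allocation)  # 1.0
```

In other words, under this assumption a project that runs 8x its allocation
during the discount is charged about the same as one that ran exactly its
allocation at normal billing.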
We will send out a notice before we turn the billing back to normal again, at
least a day before doing so, so that users utilizing this temporary billing
reduction can cancel their jobs if they so desire.
If you want more information on how the scheduling algorithm works, Berzelius is
set up very similarly to Tetralith, as described on our website [1].
As always, please reach out to us if you have any questions or comments.
TL;DR: Firesale on Berzelius-GPUs, 87.5% off everything until we go back to
normal pricing. Discount does not apply to storage and is time limited. (But
please read the entire email.)
[1] https://www.nsc.liu.se/support/batch-jobs/tetralith/fair-share/
--
Henrik Henriksson
Systems Administrator
National Supercomputer Centre