[Triolith-users] Scheduling problem on Triolith (2015-08-10)

Marvin Lie marvin at nsc.liu.se
Mon Aug 10 09:18:41 CEST 2015

Dear All,

I took a look at our scheduler status around 08:15.
There are more than 1000 nodes idle but many jobs are held with status 
I confirmed that there were no large reservations being made.
There was also no high-priority wide job hogging the queue by requesting 
1000 nodes.
That does not look right to me.
I then restarted our scheduler, Slurm, twice, at 08:21 and 08:22,
but the problem was still there.
Something was preventing Slurm from scheduling the jobs.
That left me no choice but to start the scheduler with a clean state at 08:38.
Unfortunately, this causes many running jobs to fail and pending jobs 
I apologize for the inconvenience, and we ask for your cooperation in 
resubmitting your jobs.
We will investigate this issue further with our scheduler expert who is 
still on vacation.

Marvin - NSC
