Indiana University
University Information Technology Services
  

What is Moab?


Introduction

Moab is an advanced job scheduler for use on clusters and supercomputers. It is a highly optimized and configurable tool capable of supporting a large array of scheduling and fairness policies, dynamic priorities, and extensive reservations. Acknowledged by many as one of the most advanced schedulers available, Moab is currently in use at hundreds of leading government, academic, and commercial sites throughout the world. Moab improves the manageability and efficiency of machines ranging from clusters of a few processors to multi-teraflop supercomputers.

Moab at IU

On the Quarry system at Indiana University, Moab serves as the job scheduler for the TORQUE resource manager. TORQUE is based on OpenPBS; if you are familiar with PBS Pro, you'll find much of the syntax the same.

Once a job has been submitted to one of the TORQUE queues, it may become eligible for dispatch by Moab. The following commands provide useful information on the status of a queued or running job:

showq              Display active, idle, or all jobs
showstart jobid    Display estimated dispatch time for jobid
checkjob jobid     Display attributes for jobid

For more information about these commands as well as other Moab utilities, see the Moab Workload Manager User's Manual.

Fairshare scheduling

Fairshare scheduling allows historical resource usage to affect job priority decisions. Administrators can set target utilization goals for each user, group, class, or service group. When one usage class exceeds its utilization goal, jobs from other usage classes take precedence over jobs from the offending class.

Currently, the fairshare policy on Quarry tracks usage over the last seven days with a decay factor of 0.80 per day: usage from each day counts 80% as much as usage from the day after it. Each usage class (usually a username) has a target of 20% usage. Anything above that will cause that user's jobs to have a lower scheduling priority.
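
The per-day weighting can be sketched in a few lines of Python. This only illustrates the arithmetic and is not Moab code; the function name interval_weights is made up for this example, and the defaults (a depth of 7 intervals and a decay factor of 0.80) are the Quarry settings described above.

```python
def interval_weights(depth=7, decay=0.80):
    """Return the weight applied to each fairshare interval.

    Interval 0 is the most recent day; each older interval counts
    `decay` times as much as the interval after it.
    """
    return [decay ** i for i in range(depth)]


if __name__ == "__main__":
    for day, weight in enumerate(interval_weights()):
        print(f"interval {day}: weight {weight:.4f}")
```

With these settings, usage from six days ago counts roughly a quarter as much as usage from today.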

Use the diagnose -f command to display the fairshare scheduling usage table. The following example shows that users baikgrp and dsheen have exceeded their "fair share" and will be given lower priorities over the next week:

[root@Quarry]# diagnose -f
FairShare Information

Depth: 7 intervals   Interval Length: 1:00:00:00   Decay Rate: 0.80

FS Policy: DEDICATEDPS
System FS Settings:  Target Usage: 0.00   Flags: 0

FSInterval         %  Target       0       1       2       3       4       5       6
FSWeight     ------- -------  1.0000  0.8000  0.6400  0.5120  0.4096  0.3277  0.2621
TotalUsage    100.00 -------  1872.2  1605.8   631.7  1868.0  3222.6  1857.5  1439.1

USER
-------------
haiyang*        0.00   20.00 ------- ------- ------- ------- ------- ------- -------
baikgrp*       45.91   20.00   81.11   45.57   79.98   49.70    4.88   20.49   10.79
balin*          0.00   20.00 ------- ------- ------- ------- ------- ------- -------
akewalra*       0.00   20.00 ------- ------- ------- ------- ------- ------- -------
kevidale*       0.25   20.00 ------- ------- -------    0.23    0.74    0.78 -------
dlauer*         0.00   20.00 ------- ------- ------- ------- ------- ------- -------
kmane*          0.00   20.00 ------- ------- ------- ------- ------- ------- -------
bramley*        0.00   20.00 ------- ------- ------- ------- ------- ------- -------
qzou*           0.18   20.00 ------- ------- -------    0.05    0.34    0.53    1.01
mathess*        0.00   20.00 ------- ------- ------- ------- ------- ------- -------
iyengar*        0.54   20.00 ------- ------- ------- -------    0.63    2.58    3.34
pewang*         0.00   20.00 ------- ------- ------- ------- ------- ------- -------
rrepasky*       0.00   20.00 ------- ------- ------- ------- ------- ------- -------
agopu*          0.00   20.00 ------- ------- ------- ------- ------- ------- -------
heap*           0.02   20.00    0.09 ------- ------- ------- ------- ------- -------
vsingan*        0.00   20.00 ------- ------- ------- ------- ------- ------- -------
huili*          0.00   20.00 ------- ------- ------- ------- ------- ------- -------
dsheen*        39.26   20.00   14.97   43.03 -------   33.74   86.48   62.68 -------
turnerg*        0.00   20.00 ------- ------- ------- ------- ------- ------- -------
ejolson*        0.00   20.00 ------- ------- ------- ------- ------- ------- -------
ssrivast*       0.00   20.00 ------- ------- ------- ------- ------- ------- -------
smiddha*        0.00   20.00 ------- ------- ------- ------- ------- ------- -------
mburland*       4.90   20.00    0.17    5.37    4.83   11.14    3.11    5.19   16.89
febertra*       0.00   20.00 ------- ------- ------- ------- ------- ------- -------
lsandvos*       0.01   20.00 -------    0.06 ------- ------- ------- ------- -------
mswat*          0.00   20.00 ------- ------- ------- ------- ------- ------- -------
acolubri*       5.72   20.00    3.67    5.97   15.20    5.14    3.75    7.75   10.01
mbaik*          3.22   20.00 ------- ------- ------- -------    0.07 -------   57.96
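
The % column appears to be each user's decay-weighted usage divided by the decay-weighted total usage. The Python sketch below applies that assumption to the baikgrp and dsheen rows of the table; it illustrates the arithmetic and is not Moab's actual implementation, and the function name and data layout are invented for this example.

```python
DECAY = 0.80

# Total usage per interval (interval 0 = most recent), from the TotalUsage row.
TOTAL_USAGE = [1872.2, 1605.8, 631.7, 1868.0, 3222.6, 1857.5, 1439.1]

# Per-interval share of total usage, in percent; "-------" entries count as 0.
SHARES = {
    "baikgrp": [81.11, 45.57, 79.98, 49.70, 4.88, 20.49, 10.79],
    "dsheen": [14.97, 43.03, 0.0, 33.74, 86.48, 62.68, 0.0],
}


def fairshare_percent(shares, totals=TOTAL_USAGE, decay=DECAY):
    """Decay-weighted usage as a percentage of decay-weighted total usage."""
    weights = [decay ** i for i in range(len(totals))]
    # Convert each interval's percentage back to absolute usage, then weight it.
    used = sum(w * t * s / 100.0 for w, t, s in zip(weights, totals, shares))
    available = sum(w * t for w, t in zip(weights, totals))
    return 100.0 * used / available


if __name__ == "__main__":
    for user, shares in SHARES.items():
        print(f"{user}: {fairshare_percent(shares):.2f}% (target 20.00%)")
```

Under that assumption, the computed values agree with the 45.91 and 39.26 reported for these two users, both well above the 20.00 target.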

When to expect your job to start

Moab uses the fairshare tables to determine which job will be assigned to the next open processors. The showq command shows the state of submitted jobs. Following is sample output:

[root@Quarry]# showq

active jobs--------------------
JOBID     USERNAME  STATE    PROCS    REMAINING  STARTTIME

17199     heap      Running      1      2:53:12  Wed Sep 17 11:20:45
17200     heap      Running      1      2:53:52  Wed Sep 17 11:21:25
17201     heap      Running      1      2:54:32  Wed Sep 17 11:22:05
17202     heap      Running      1      2:55:13  Wed Sep 17 11:22:46
17203     heap      Running      1      2:55:53  Wed Sep 17 11:23:26
17204     heap      Running      1      2:56:33  Wed Sep 17 11:24:06
17205     heap      Running      1      2:57:13  Wed Sep 17 11:24:46
. . .

6 active jobs

eligible jobs----------------------
JOBID     USERNAME  STATE    PROCS      WCLIMIT  QUEUETIME

16672     ejolson   Idle         1   8:08:00:00  Tue Sep 16 23:27:05
16673     ejolson   Idle         1   8:08:00:00  Tue Sep 16 23:27:06
16674     ejolson   Idle         1  16:16:00:00  Tue Sep 16 23:27:06
16675     ejolson   Idle         1  16:16:00:00  Tue Sep 16 23:27:06
16676     ejolson   Idle         1   8:08:00:00  Tue Sep 16 23:27:06
16677     ejolson   Idle         1   8:08:00:00  Tue Sep 16 23:27:06

6 eligible jobs

blocked jobs----------------
JOBID     USERNAME  STATE    PROCS      WCLIMIT  QUEUETIME

0 blocked jobs

Total Jobs: 116   Active Jobs: 104   Eligible Jobs: 6   Blocked Jobs: 0

The jobs at the top of the eligible jobs list will run next if resources are available. Various reservations can prevent jobs from running if they have blocked off resources that waiting jobs would need. You can use the command showres to examine the list of reservations.

To find the estimated start time of a particular job, try:

showstart $JOBID

[root@Quarry]# showstart 16672
job 16672 requires 1 proc for 8:08:00:00
Earliest start in 5:03:54:32 on Mon Sep 22 17:00:00
Earliest completion in 13:11:54:32 on Wed Oct 1 01:00:00
Best Partition: DEFAULT

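
The durations in this output are written as days:hours:minutes:seconds. As a quick consistency check, the earliest completion offset should equal the earliest start offset plus the job's wallclock limit. The Python sketch below does that arithmetic; parse_duration is a helper written for this example, not part of Moab.

```python
from datetime import timedelta


def parse_duration(text):
    """Convert a [[[DD:]HH:]MM:]SS duration string to a timedelta."""
    fields = [int(part) for part in text.split(":")]
    if not 1 <= len(fields) <= 4:
        raise ValueError(f"unexpected duration format: {text!r}")
    # Left-pad so the fields always line up as days, hours, minutes, seconds.
    days, hours, minutes, seconds = [0] * (4 - len(fields)) + fields
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


if __name__ == "__main__":
    wallclock = parse_duration("8:08:00:00")     # limit requested by job 16672
    start_in = parse_duration("5:03:54:32")      # "Earliest start in"
    finish_in = parse_duration("13:11:54:32")    # "Earliest completion in"
    # True when the completion estimate is simply start plus wallclock limit.
    print(start_in + wallclock == finish_in)
```

For job 16672 the check holds: 5 days 3:54:32 plus the 8 days 8:00:00 limit gives 13 days 11:54:32, so the completion estimate assumes the job runs for its full wallclock limit.
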
This is document avmu in domain all.
Last modified on August 27, 2007.