Running jobs

Creating jobs

To create and immediately execute a job, select "Create job" from the space explorer page. Then set the execution parameters, including the wall clock timeout and the CPU timeout (see here for the difference).

Options for Creating Jobs

Selecting an execution queue

You can also select a queue for execution of your job. Each job queue is assigned a number of nodes with identical technical specs. The queues all.q and all2.q are always available for all users to use. They differ in the kind of compute nodes that are assigned to them. Those in all2.q are slightly slower and have less RAM. Community leaders may request the creation of new queues for exclusive access to nodes, for special, community-related tasks (such as running a competition).

Choosing a pre-processor

With this option you can choose a benchmark pre-processor that will alter the benchmarks before the solvers are run on them.

Choosing a post-processor

With this option you can select a post-processor that will extract attributes from the job results, as defined by the post-processor.

Timeouts

Timeouts are specified in seconds. Any job pair that exceeds the wall clock timeout or the CPU timeout you specify will be terminated. These options ensure that job pairs taking an unreasonable amount of time do not keep your other pairs from running.
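The two limits are checked independently, and they can diverge because CPU time is summed over all cores a solver uses. The following is a minimal Python sketch of that check; the function name and the example values are hypothetical and do not reflect how StarExec implements the limits internally.

```python
def should_terminate(wallclock_used, cpu_used, wallclock_timeout, cpu_timeout):
    """Return True if a pair has exceeded either limit (all values in seconds)."""
    return wallclock_used > wallclock_timeout or cpu_used > cpu_timeout


# A solver running 4 busy threads for 30 seconds of wall clock time
# can accumulate up to 4 * 30 = 120 seconds of CPU time.
print(should_terminate(30, 120, wallclock_timeout=60, cpu_timeout=100))  # True: CPU limit exceeded
print(should_terminate(30, 30, wallclock_timeout=60, cpu_timeout=100))   # False: within both limits
```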

Maximum Memory

The maximum memory field is specified in gigabytes. This option limits the amount of memory a pair can use before it is terminated.

Subscribe to Job Notifications

Choose to receive an email notification when the job status changes (e.g., when the job completes).

Advanced Options

Choosing an execution order

Job pairs can be executed in depth-first order, which executes all job pairs in one subspace before moving on to the next, or in round-robin order, which produces a workload in which all subspaces make progress concurrently. After setting these options, select "next".
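The difference between the two orders can be illustrated with a small Python sketch. The subspace names and pair labels are hypothetical, and the code models only the resulting order of pairs, not how StarExec actually schedules them on the cluster.

```python
from itertools import zip_longest


def depth_first(subspaces):
    """List every pair of one subspace before moving on to the next subspace."""
    order = []
    for pairs in subspaces.values():
        order.extend(pairs)
    return order


def round_robin(subspaces):
    """Take one pair from each subspace in turn so all subspaces progress together."""
    missing = object()  # placeholder for subspaces that have run out of pairs
    order = []
    for batch in zip_longest(*subspaces.values(), fillvalue=missing):
        order.extend(pair for pair in batch if pair is not missing)
    return order


# Hypothetical subspaces with their job pairs.
subspaces = {
    "A": ["A1", "A2", "A3"],
    "B": ["B1"],
    "C": ["C1", "C2"],
}

print(depth_first(subspaces))  # ['A1', 'A2', 'A3', 'B1', 'C1', 'C2']
print(round_robin(subspaces))  # ['A1', 'B1', 'C1', 'A2', 'C2', 'A3']
```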

Create Paused

The "Create Paused" option will pause the job as soon as it is created.

Pre-processor seed

This option allows you to specify the seed for a pseudo-random number generator, which will generate a number and pass it to the pre-processor.
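The point of a seed is reproducibility: a pseudo-random generator started from the same seed always produces the same number. The sketch below shows this with Python's standard random module. The function name and the range of the generated number are assumptions for illustration; the generator StarExec uses and the range of its output are not specified here.

```python
import random


def number_for_preprocessor(seed):
    """Return the first number drawn from a generator initialized with `seed`."""
    rng = random.Random(seed)      # independent generator, fully determined by the seed
    return rng.randrange(2 ** 31)  # hypothetical range: 0 <= n < 2**31


# The same seed yields the same number on every call.
first = number_for_preprocessor(42)
second = number_for_preprocessor(42)
print(first == second)  # True
```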

Suppress Timestamps

Setting this option to "yes" will prevent runsolver from adding timestamps to the job pair output.

Results Interval

The interval, in seconds, at which to receive incremental results for pairs that are running. 0 means results are only obtained after pairs finish. Any nonzero value must be at least 10.
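The rule for accepted values can be written as a short check. This is only a sketch of the rule as stated above; the function name is hypothetical, and the way StarExec itself reports an invalid value is not covered.

```python
def validate_results_interval(seconds):
    """Accept 0 (no incremental results) or any interval of at least 10 seconds."""
    if seconds == 0:
        return 0
    if seconds < 10:
        raise ValueError("results interval must be 0 or at least 10 seconds")
    return seconds


print(validate_results_interval(0))   # 0: results only after pairs finish
print(validate_results_interval(30))  # 30: incremental results every 30 seconds
```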

Save Additional Output Files

Saves solver output that is placed into the extra output directory given to each solver.

Soft time limit

This option is only available when using BenchExec.

If greater than zero, the solver will be sent a TERM signal after running for this number of seconds, but will not be sent a KILL signal until either the wall clock timeout or the CPU timeout is reached. This offers the solver an opportunity to do any necessary cleanup before being terminated.

Delay before termination

This option is only available when using runsolver.

If greater than zero, the solver will be sent a TERM signal after the memory or time limits have been exceeded, and a KILL signal will then be sent after this number of seconds. This offers the solver an opportunity to do any necessary cleanup before being terminated.
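The TERM-then-KILL pattern can be sketched with Python's subprocess module. This is an illustration of the general technique, not runsolver's implementation: the function name is hypothetical, only a wall clock limit is modeled (no memory or CPU limit), and on Windows terminate() ends the process immediately instead of sending a signal.

```python
import subprocess
import sys


def run_with_grace(cmd, time_limit, delay):
    """Run cmd; on timeout send TERM, then KILL if it is still alive after `delay` seconds."""
    proc = subprocess.Popen(cmd)
    try:
        proc.wait(timeout=time_limit)
        return "finished"
    except subprocess.TimeoutExpired:
        pass

    proc.terminate()  # SIGTERM on POSIX: the process may clean up and exit
    try:
        proc.wait(timeout=delay)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()   # SIGKILL on POSIX: cannot be caught or ignored
        proc.wait()
        return "killed"


# A stand-in "solver" that sleeps longer than the limit allows.
sleeper = [sys.executable, "-c", "import time; time.sleep(30)"]
print(run_with_grace(sleeper, time_limit=0.5, delay=5))
```

A solver that installs a TERM handler can use the delay to flush partial results; one that ignores TERM is killed once the delay has passed.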

You can then select whether to "run and keep the hierarchy structure" or choose which benchmarks and solvers to execute. The first option will find all subspaces in the current hierarchy (rooted at the node where you are creating the job) that have solvers and benchmarks, and execute all possible combinations of those benchmarks and solvers. So to run several divisions of a competition, for example, you can create a subspace for each division, copy the solvers and benchmarks for that division into that subspace, and then run a single job from the space containing those subspaces.
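The pairs produced by the "run and keep the hierarchy structure" option can be sketched as a recursive walk over the space hierarchy. The dictionary layout, the function name, and the example names are hypothetical, and the sketch assumes that combinations are formed within each space separately, which matches the competition example above.

```python
from itertools import product


def job_pairs(space):
    """Collect (space name, solver, benchmark) for a space and all of its subspaces."""
    pairs = []
    solvers = space.get("solvers", [])
    benchmarks = space.get("benchmarks", [])
    # product() yields nothing if either list is empty, so spaces that
    # lack solvers or benchmarks contribute no pairs.
    for solver, benchmark in product(solvers, benchmarks):
        pairs.append((space["name"], solver, benchmark))
    for subspace in space.get("subspaces", []):
        pairs.extend(job_pairs(subspace))
    return pairs


# Hypothetical competition with one subspace per division.
competition = {
    "name": "competition",
    "subspaces": [
        {"name": "div1", "solvers": ["s1", "s2"], "benchmarks": ["b1"]},
        {"name": "div2", "solvers": ["s3"], "benchmarks": ["b2", "b3"]},
        {"name": "empty", "solvers": ["s4"]},  # no benchmarks: skipped
    ],
}

for pair in job_pairs(competition):
    print(pair)
```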

Monitoring the execution

Once the job is created, it will begin queueing immediately. If other jobs are running, of course, it may take some time for your job to make it through the queue to execute on the compute nodes. You can look at the cluster status page to monitor queues and nodes.