
How Deadline Works


The Deadline Repository

Instead of using a server application to coordinate network rendering, Deadline coordinates it by reading and writing files in a shared directory, which we call the Deadline Repository. The Repository stores data such as jobs, slave settings, and network settings, and the Deadline Slaves use this information to search for jobs themselves. Because the Slaves are smart and no server application is needed, network rendering only ceases if the machine that is hosting the Repository goes down.

When a user submits a rendering job to Deadline, a new job folder is created in the repository along with the necessary files to store the job's settings. When a Slave picks up the job for rendering, it copies the job folder locally before proceeding. When rendering is complete, the local folder is deleted and the Slave moves on to find another job.
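The folder lifecycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `jobs/<job_id>/settings.json` layout and the function names `submit_job` and `render_job` are hypothetical and do not reflect Deadline's actual file format.

```python
import json
import shutil
import tempfile
from pathlib import Path


def submit_job(repository, job_id, settings):
    """Create a job folder in the repository and store the job's settings.

    The folder layout and file name are hypothetical.
    """
    job_dir = Path(repository) / "jobs" / job_id
    job_dir.mkdir(parents=True)
    (job_dir / "settings.json").write_text(json.dumps(settings))
    return job_dir


def render_job(job_dir, local_root):
    """Copy the job folder locally, 'render', then delete the local copy."""
    local_dir = Path(local_root) / job_dir.name
    shutil.copytree(job_dir, local_dir)
    try:
        settings = json.loads((local_dir / "settings.json").read_text())
        # A real Slave would launch the renderer here.
        return settings
    finally:
        # The local folder is removed whether or not rendering succeeded.
        shutil.rmtree(local_dir)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as repo, tempfile.TemporaryDirectory() as local:
        job_dir = submit_job(repo, "job_001", {"frames": "1-10"})
        print(render_job(job_dir, local))
```

The repository copy of the job folder is left untouched; only the Slave's local copy is deleted.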

Note that all external references the job relies on, such as textures or other scene data, should be network accessible because these files are not copied to the Repository. In addition, rendering output is not saved to the Repository, so the Slaves need to be able to access the job's output path(s) so that the images can be saved. This is explained further in the Network Setup documentation.

The Deadline Job

As explained above, when a job is submitted to Deadline, it is placed in the Repository where it becomes visible to Deadline Monitor and accessible to the computers running the Deadline Slave application.

Most jobs can be broken down into several tasks that make up that job - for an animation being rendered in 3dsmax, each frame becomes a task. This allows multiple Slaves to be able to work on the same job at the same time. Some other jobs cannot be divided into tasks this way, such as creating a Quicktime movie from the individual frames of an animation.
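As a rough sketch of how an animation's frame range maps onto tasks, the function below returns one frame range per task. The name `split_into_tasks` and the `frames_per_task` parameter are assumptions made for this example; with the default of 1, each frame becomes its own task as described above.

```python
def split_into_tasks(first_frame, last_frame, frames_per_task=1):
    """Return a list of inclusive (start, end) frame ranges, one per task.

    frames_per_task is a hypothetical parameter; the default of 1 gives
    one task per frame.
    """
    if frames_per_task < 1:
        raise ValueError("frames_per_task must be at least 1")
    tasks = []
    start = first_frame
    while start <= last_frame:
        end = min(start + frames_per_task - 1, last_frame)
        tasks.append((start, end))
        start = end + 1
    return tasks


if __name__ == "__main__":
    # Five frames become five tasks that different Slaves can pick up.
    print(split_into_tasks(1, 5))
```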

There are several states that a job can be in during its lifetime in the Repository. The possible job states are:

  • Queued: Job is waiting to be rendered.
  • Active: One or more of the job's tasks are currently being rendered.
  • Suspended: The job is paused, and will not be rendered until it is resumed.
  • Pending: The job is either scheduled, or is dependent on another job.
  • Completed: All of the job's tasks have finished rendering.
  • Failed: The job has reported the maximum number of errors allowed.
  • Archived: The job can no longer be modified.
  • Deleted: The job no longer exists in the Repository.

Below are some examples of the possible paths a job can follow. For information on how a slave actually selects a job, see the Job Scheduling section of the manual.

Normal Job Flow

  1. The job is submitted to Deadline and placed in the Repository. It will have the status of Queued when viewed from the Monitor.
  2. The job is picked up by one or more Slaves. It will have the status of Active in the Monitor. The number in brackets beside the Active status indicates the number of slaves currently rendering the active job.
  3. All of the job's tasks have finished rendering. It will have the status of Completed in the Monitor.

Pending Job Flow

  1. A job that is dependent on another job is submitted to Deadline. It will have the status of Pending when viewed from the Monitor.
  2. The other job that this job is dependent on finishes. It will have the status of Queued when viewed from the Monitor.
  3. The job is picked up by one or more Slaves. It will have the status of Active in the Monitor. The number in brackets beside the Active status indicates the number of slaves currently rendering the active job.
  4. All of the job's tasks have finished rendering. It will have the status of Completed in the Monitor.

Suspended Job Flow

  1. The job is submitted to Deadline and placed in the Repository. It will have the status of Queued when viewed from the Monitor.
  2. The job is picked up by one or more Slaves. It will have the status of Active in the Monitor. The number in brackets beside the Active status indicates the number of slaves currently rendering the active job.
  3. The job is suspended by right-clicking on the job in Monitor and selecting Suspend Job. It will have the status of Suspended when viewed from the Monitor. While a job is suspended, the Slaves will effectively ignore it and not choose any tasks associated with that job.
  4. The job is resumed by right-clicking on the job in the Monitor and selecting Resume Job. It will have the status of Queued when viewed from the Monitor. It will eventually enter the Active state when Slaves begin rendering the job again.
  5. All of the job's tasks have finished rendering. It will have the status of Completed in the Monitor.

Failed Job Flow

  1. The job is submitted to Deadline and placed in the Repository. It will have the status of Queued when viewed from the Monitor.
  2. The job is picked up by one or more Slaves. It will have the status of Active in the Monitor. The number in brackets beside the Active status indicates the number of slaves currently rendering the active job.
  3. The job reports the maximum number of errors allowed. It will have the status of Failed when viewed from the Monitor. While a job is in the Failed state, the Slaves will effectively ignore it and not choose any tasks associated with that job.
  4. The job is resumed by right-clicking on the job in the Monitor and selecting Resume Failed Job. It will have the status of Queued when viewed from the Monitor. It will eventually enter the Active state when Slaves begin rendering the job again.
  5. All of the job's tasks have finished rendering. It will have the status of Completed in the Monitor.

Requeued Job Flow

  1. The job is submitted to Deadline and placed in the Repository. It will have the status of Queued when viewed from the Monitor.
  2. The job is picked up by one or more Slaves. It will have the status of Active in the Monitor. The number in brackets beside the Active status indicates the number of slaves currently rendering the active job.
  3. All of the job's tasks have finished rendering. It will have the status of Completed in the Monitor.
  4. One or more of the job's tasks are requeued by right-clicking the tasks in the Monitor and selecting Requeue Tasks. One reason to requeue a task is that its rendered output may not be satisfactory (the software on a Slave may be misconfigured, for instance). The job will have the status of Queued when viewed from the Monitor. It will eventually enter the Active state when Slaves begin rendering the job again.

Archived Job Flow

  1. The job is submitted to Deadline and placed in the Repository. It will have the status of Queued when viewed from the Monitor.
  2. The job is picked up by one or more Slaves. It will have the status of Active in the Monitor. The number in brackets beside the Active status indicates the number of slaves currently rendering the active job.
  3. All of the job's tasks have finished rendering. It will have the status of Completed in the Monitor.
  4. The job is archived by right-clicking on the job in the Monitor and selecting Archive Job. It will have the status of Archived when viewed from the Monitor. Archived jobs cannot have their status or properties changed, but you can still explore their output or retrieve the data file submitted with the job.
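The flows above can be read as a small state machine. The sketch below encodes only the transitions that appear in those flows; the event names (`pick_up`, `finish`, and so on) are invented for this example, and the table is not an exhaustive list of what Deadline allows (the Deleted state, for instance, is not covered by any flow above).

```python
from enum import Enum


class JobState(Enum):
    QUEUED = "Queued"
    ACTIVE = "Active"
    SUSPENDED = "Suspended"
    PENDING = "Pending"
    COMPLETED = "Completed"
    FAILED = "Failed"
    ARCHIVED = "Archived"
    DELETED = "Deleted"


# (current state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    (JobState.PENDING, "dependency_finished"): JobState.QUEUED,
    (JobState.QUEUED, "pick_up"): JobState.ACTIVE,
    (JobState.ACTIVE, "finish"): JobState.COMPLETED,
    (JobState.ACTIVE, "suspend"): JobState.SUSPENDED,
    (JobState.SUSPENDED, "resume"): JobState.QUEUED,
    (JobState.ACTIVE, "max_errors"): JobState.FAILED,
    (JobState.FAILED, "resume_failed"): JobState.QUEUED,
    (JobState.COMPLETED, "requeue_tasks"): JobState.QUEUED,
    (JobState.COMPLETED, "archive"): JobState.ARCHIVED,
}


def next_state(state, event):
    """Return the state reached from `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition for {event!r} from {state.value}") from None


def run_flow(start, events):
    """Apply a sequence of events and return the final state."""
    state = start
    for event in events:
        state = next_state(state, event)
    return state


if __name__ == "__main__":
    # Suspended Job Flow: queued, picked up, suspended, resumed, rendered.
    flow = ["pick_up", "suspend", "resume", "pick_up", "finish"]
    print(run_flow(JobState.QUEUED, flow).value)
```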


Job Scheduling

How A Job Is Selected For A Slave

By default, a job is selected for a slave based on the following properties in this order:

  1. The Group/Pool that the job is submitted to:
    • A slave will only select a job if it has been assigned to the group and pool that the job belongs to. Pools are priority based, so everything else being equal, a job submitted to a pool with the highest priority will be chosen first when a slave is selecting its next job.
  2. The Priority of the job:
    • A job's priority can range from 0 to 100, where 0 is the lowest priority and 100 is the highest priority.
    • Everything else being equal, a job with the highest priority will always be chosen first when a slave is selecting its next task.
  3. The Date and Time the job is submitted:
    • This is set automatically and is the timestamp of when the job was submitted to Deadline.
    • Everything else being equal, an older job will take priority over a newer job when a slave is selecting its next task.
  4. The job's Limit Groups and Machine Limit:
    • With limit groups, if a job has the highest priority, but requires a limit group that is maxed out, a slave will try to select a job with the next highest priority.
    • A machine limit is essentially a special limit group, except that the limit restricts the number of machines that can render that particular job at a time.
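The default selection order can be approximated as a filter followed by a sort. The sketch below is a simplified model, not Deadline's implementation: the `Job` class, the `select_job` function, and the dictionaries tracking limit group usage are all hypothetical, the slave's pool list is assumed to be ordered from highest to lowest priority, and machine limits are left out for brevity.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Job:
    name: str
    pool: str
    group: str
    priority: int            # 0 (lowest) to 100 (highest)
    submitted: datetime
    limit_groups: list = field(default_factory=list)


def select_job(jobs, slave_pools, slave_groups, limits_in_use, limits_max):
    """Pick the next job for a slave, or None if nothing is eligible.

    slave_pools is ordered: earlier pools rank higher for this slave.
    limits_in_use and limits_max map limit group names to counts.
    """
    def eligible(job):
        if job.pool not in slave_pools or job.group not in slave_groups:
            return False
        # Skip jobs that need a limit group that is already maxed out.
        return all(
            limits_in_use.get(name, 0) < limits_max.get(name, float("inf"))
            for name in job.limit_groups
        )

    candidates = [job for job in jobs if eligible(job)]
    if not candidates:
        return None
    # Pool first, then higher priority, then older submission time.
    return min(
        candidates,
        key=lambda job: (slave_pools.index(job.pool), -job.priority, job.submitted),
    )


if __name__ == "__main__":
    jobs = [
        Job("a", "maya", "none", 50, datetime(2020, 1, 1)),
        Job("b", "maya", "none", 90, datetime(2020, 1, 2), ["fusion"]),
        Job("c", "maya", "none", 50, datetime(2019, 12, 31)),
    ]
    # Job "b" has the highest priority, but its limit group is full,
    # so the slave falls through to the next best job.
    chosen = select_job(jobs, ["maya", "none"], ["none"], {"fusion": 10}, {"fusion": 10})
    print(chosen.name)
```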

Changing The Scheduling Order

It is possible to change the order in which jobs are scheduled from the Deadline Monitor. In Super User mode, select Tools -> Configure Repository Options, and scroll down to find the Scheduling Settings section.

The following options are available:

  • Date_Pool_Priority: This represents a First In First Out (FIFO) scheduling system. Since it is rare for two jobs to be submitted at exactly the same time, the job's pool and priority will likely never affect how the job is scheduled.
  • Date_Priority_Pool: Because it is rare for two jobs to be submitted at exactly the same time, this will work the same as Date_Pool_Priority, and is simply included for completeness.
  • Pool_Priority_Date: This is the default scheduling order that is used.
  • Pool_Date_Priority: Since it is rare for two jobs to be submitted at exactly the same time, only the job's pool and submission date/time will affect how the job is scheduled.
  • Priority_Date_Pool: Since it is rare for two jobs to be submitted at exactly the same time, only the job's priority and submission date/time will affect how the job is scheduled.
  • Priority_Pool_Date: Similar to the default scheduling order, except that the job's priority takes precedence over its pool.
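One way to picture these options is as the order of fields in a sort key. In the sketch below, the option name is split on underscores and each part selects one component of the key; the dictionary layout of a job and the `pool_rank` field (lower means a higher-priority pool) are assumptions made for this example.

```python
from datetime import datetime

# Each field maps a job to a value where "smaller sorts first".
KEYS = {
    "Pool": lambda job: job["pool_rank"],      # hypothetical: 0 is the top pool
    "Priority": lambda job: -job["priority"],  # higher priority sorts first
    "Date": lambda job: job["submitted"],      # older jobs sort first
}


def sort_jobs(jobs, order="Pool_Priority_Date"):
    """Return jobs sorted according to a scheduling order such as 'Priority_Pool_Date'."""
    fields = order.split("_")
    return sorted(jobs, key=lambda job: tuple(KEYS[name](job) for name in fields))


if __name__ == "__main__":
    jobs = [
        {"name": "old_low", "pool_rank": 1, "priority": 10, "submitted": datetime(2020, 1, 1)},
        {"name": "new_high", "pool_rank": 0, "priority": 90, "submitted": datetime(2020, 1, 2)},
    ]
    for order in ("Pool_Priority_Date", "Date_Pool_Priority"):
        print(order, [job["name"] for job in sort_jobs(jobs, order)])
```

Because submission times almost never tie, any order that starts with Date is decided by the first key component alone, which is why both Date-first options behave like FIFO.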

Job Groups/Pools

What Are Groups/Pools?

Groups can be used to organize your farm based on machine configurations (specs, installed software, etc.). For example, if you have a 64-bit machine with 3ds Max installed, you could assign it to groups like '3dsmax', or '3dsmax_64', or simply '3D'. Groups have no impact on the order in which jobs are rendered; they just help to ensure that your job renders on the machines you want it to. If you don't care about grouping your machines, just use the default 'none' group.

Pools are similar to groups, except that they are priority based and affect the order in which jobs are rendered. Because of this, it's encouraged to use pools for prioritizing shows, shots, etc. If you don't want to set up pools on your farm, you can use the default 'none' pool. Note that the 'none' pool always has the least priority of all the pools.

Managing Groups/Pools

Groups and pools can be managed from the Monitor while in Super User mode. Just select Tools -> Manage Groups or Tools -> Manage Pools to open the appropriate dialog. Here, you can add and remove groups/pools. You can also see which slaves are in which groups/pools.

To assign groups/pools to slaves, right-click on one or more slaves in the Slaves list and select Modify Pools or Modify Groups.

Pools and Job Scheduling

How pools affect the job selection process is best explained through an example. Say we have a slave that has been assigned to the pools 'maya', '3dsmax' and 'fusion', in that specific order. This means that when the slave is looking for its next job, it will first look for a job that was submitted to the 'maya' pool. If there are no available jobs in the 'maya' pool, it will then look for jobs in the '3dsmax' pool. If there are no available jobs in the '3dsmax' pool, it will then look for jobs in the 'fusion' pool. If there are no jobs available in any of the 'maya', '3dsmax' or 'fusion' pools, then the slave will select the next available task from a job in the default 'none' pool.
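The pool lookup in this example amounts to walking the slave's pool list in order and falling back to 'none'. A minimal sketch, assuming a hypothetical `pick_pool` function and a dictionary that maps each pool name to its available jobs:

```python
def pick_pool(slave_pools, jobs_by_pool):
    """Return the first pool, in the slave's order, that has an available job.

    The default 'none' pool is always checked last. Returns None when no
    pool has any available jobs.
    """
    for pool in list(slave_pools) + ["none"]:
        if jobs_by_pool.get(pool):
            return pool
    return None


if __name__ == "__main__":
    slave_pools = ["maya", "3dsmax", "fusion"]
    # No 'maya' or '3dsmax' jobs are available, so 'fusion' is chosen.
    print(pick_pool(slave_pools, {"fusion": ["comp_01"], "none": ["misc_01"]}))
```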

Limit Groups and Job Machine Limits

What are Limit Groups and Job Machine Limits?

In order to support products which use network licensing to limit the number of render clients rendering at a time, Deadline supports the ability to create limit groups to manage this restriction. When creating the limit group, be sure to set the limit to the number of network licenses you have for that product.

For example, if you have 20 nodes in your render farm and only 10 licenses of Fusion, you can create a Fusion limit group with a limit of 10. When you submit a Fusion job to Deadline, be sure to add the Fusion limit group to it. During rendering, Deadline will ensure that no more than 10 machines will be rendering jobs with the Fusion limit group at any given time. Because of this, you never have to worry about licensing issues.

Machine limits are like a special type of limit group. Instead of limiting how many slaves can render a group of jobs, they limit the number of slaves that can render one particular job.
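Conceptually, a limit group behaves like a counter with a ceiling. The sketch below models this in memory with a hypothetical `LimitGroup` class; it is not how Deadline stores or enforces limits, which it coordinates through the Repository. A machine limit can be pictured the same way, with one such counter attached to a single job.

```python
class LimitGroup:
    """In-memory model of a limit group: at most `limit` slaves at a time."""

    def __init__(self, name, limit):
        self.name = name
        self.limit = limit
        self.holders = set()

    def try_acquire(self, slave):
        """Return True if `slave` may render under this limit group."""
        if slave in self.holders:
            return True
        if len(self.holders) >= self.limit:
            return False
        self.holders.add(slave)
        return True

    def release(self, slave):
        """Free the slot held by `slave`, if any."""
        self.holders.discard(slave)


if __name__ == "__main__":
    # 20 render nodes, but only 10 Fusion licenses.
    fusion = LimitGroup("fusion", 10)
    granted = [node for node in range(20) if fusion.try_acquire(node)]
    print(len(granted))
```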

Managing Limit Groups and Machine Limits

Limit groups can be managed from the limit group list in the Deadline Monitor while in Super User mode. See the Limit Group List documentation for more details. Machine limits can be managed by right-clicking on a job in the Deadline Monitor and selecting Modify Machine Limit.