Skip to main content
Department of Information Technology

Dynamic Circular Work-Stealing Deque (SPAA'05)

David Chase (Sun Microsystems Laboratories, Burlington, MA)
Yossi Lev (Brown University & Sun Microsystems Laboratories, Burlington, MA)


This paper describes a simple lock-free work-stealing deque. A work-stealing deque allows a single thread to push and pop elements to/from one end of a queue, while other threads are allowed to steal elements from the other end of the queue.

The authors present an improvement over a previous work-stealing deque by Arora, Blumofe, and Plaxton, which uses fixed array sizes. In the deque presented in this paper, the elements are stored in a cyclic array that can grow when it overflows. The algorithm has no limit other than integer overflow (and the system´s memory size) on the number of elements that may be on the deque, and the total memory required is linear in the number of elements in the deque.

Work-stealing deques are important for dynamic load balancing, and used in systems like Cilk Plus and Intel Thread Building Blocks (although none of them actually use Chase-Lev deques). The ForkJoinPool in Java 7 however states that "The main work-stealing queue design is roughly similar to those in the papers Dynamic Circular Work-Stealing Deque by Chase and Lev, SPAA 2005 and Idempotent work stealing by Michael, Saraswat, and Vechev, PPoPP 2009.

Link to the paper


Question 1:

Which deque operations may execute concurrently?

Question 2:

In which situations do the deque issue the more expensive compare-and-swap instruction?


The only operation that can be invoked concurrently by other threads than the owning thread is the steal() operation. This method works in the 'top' end of the deque. The 'bottom' end of the deque is manipulated via the operations pushBottom() and popBottom() which can only be invoked by the thread that owns the deque.

A compare-and-swap operation must be performed either (1) when threads try to steal objects from a deque, since this is done concurrently, or (2) when the owning thread pops the last element in the deque (such that it becomes empty), since the pop then may interfere with another thread trying to steal.


Among steal() , popBottom() and pushBottom(), only the steal() operation can be run concurrently by multiple processes.

The deque issues CAS in

  1. steal() operation for avoiding two different thieves steal the same object
  2. popBottom() operation to avoid concurrent pop operations (i.e. popBottom and steal)
  3. perhapsShrink() operation to avoid concurrent steal and shrink (in shared pool version) operations

Only the process owning the pool uses the 'pushBottom' and 'popBottom' methods, while any other process uses the 'steal' method. Therefore any number of 'steals' may run concurrently, possibly together with either one 'pushBottom' or one 'popBottom'.

The deque uses CAS in:

  1. 'steal', to ensure that each object is stolen exactly once
  2. 'popBottom', to detect cases where the popped object is the last one and was just stolen by another process

CAS may is also used when trying to shrink the pool, to detect cases where the shrink operation was completed by another process.


Q1: 'pushBottom' and 'popBottom' are only executed by the deque's owner. 'steal' can be executed concurrently.

Q2: 'steal' does a CAS on the top index. This prevents two concurrent stealer threads from getting the same element.
'popBottom' uses CAS when the owner pops the last element. This is to detect if the owner won the race if there is any concurrent stealer trying to pop the last element.
'perhapsShrink' does a CAS on top.


This is the "steal" function which operates on the top of the queue. It can be executed concurrent to other steals, popBottom and pushBottom operations.

1) steal + popBottom: tries to increment top, fail indicates loss of a race and the function returns "Abort". The CAS is specially necessary for the case when the queue becomes emtpy. 2) perhapsShrink: modified top again, the CAS makes sure a concurrent steal didn't change it in between.


Several threads can execute the steal method concurrently but might get "Abort" as return return value if the thread loses a race with another thread. The "owner" of the queue is the only one that can execute pop and push. Pop and push calls can execute in parallel with steal calls.

In the basic algorithm a CAS is issued in popBottom() and steal().


The steal operation

Read the explanation page 26: the 2 last paragraphs before section 4.2.



'steal' and 'popbottom', to avoid conflicts between concurrent threads removing elements from the deque. 'perhapsShrink' in the shared pool version, to coordinate between 'steal' and shrink operations.


Several threads may execute the 'steal' operation concurrently.

The deque issues a CAS in the 'steal' and 'popBottom' operations (to detect if the top of the deque is being modified by some other thread) and in the 'perhapsShrink' operation to ensure that no one is stealing while we are trying to shrink the deque.

Updated  2013-06-26 23:02:51 by Kjell Winblad.