Parallelization in Oscar

The functionality in the experimental package Parallel provides tools to do simple things in parallel in Oscar. The main obstruction to using regular functions like pmap and friends is that most Oscar datatypes have parents which need to be coherent throughout the worker pool and the main process. In order to achieve this, some additional information on "type parameters" needs to be sent up front when communicating with other processes; see the documentation of the Oscar serialization for details.

Below we document specialized structs and methods which we overload in order to achieve a hopefully seamless integration of Oscar data types into pmap and other functions. Note that this is a non-exhaustive work in progress, and that the user always remains responsible for making sure that any object they want to pass to the workers can be serialized!

OscarWorkerPool, pmap, and friends

OscarWorkerPool - Type
 OscarWorkerPool

An OscarWorkerPool is used to handle a pool of separate worker processes running Oscar. The messaging between processes in an OscarWorkerPool uses the Oscar serialization, which is needed to implement the parallel methods. The Julia Distributed functions remotecall, remotecall_fetch, and pmap work with an OscarWorkerPool, since it is a subtype of AbstractWorkerPool.

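For illustration, here is a minimal sketch of a direct remotecall_fetch through such a pool; the concrete computation (evaluating a polynomial at a rational point) is made up for demonstration purposes, and the pool is created with oscar_worker_pool, documented below.

using Oscar
using Distributed  # provides remotecall_fetch, pmap, etc.

result = oscar_worker_pool(2) do wp
  R, x = QQ[:x]
  # the polynomial and its parent ring travel through the Oscar serialization
  remotecall_fetch((p, a) -> evaluate(p, a), wp, x^3 + 1, QQ(2))  # QQ(9)
end
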
oscar_worker_pool - Function
 oscar_worker_pool(n::Int)
 oscar_worker_pool(f::Function, n::Int)

Create an OscarWorkerPool with n separate processes running Oscar. With the second form, the OscarWorkerPool is used within the context of the function f, so that shutting down the processes happens automatically.

Example

The following code starts up 3 processes running Oscar, runs a parallel computation over each element of an array, and then shuts down the processes.

results = oscar_worker_pool(3) do wp
  Qxy, (x, y) = QQ[:x, :y]
  pmap(z -> z^2, wp, [x^2, x*y, x*y^2])
end

Experimental

This function is part of the experimental code in Oscar. Please consult the documentation on experimental code for more details.


Slightly advanced distributed computing with centralized bookkeeping

compute_distributed! - Function
compute_distributed!(ctx, wp::OscarWorkerPool; wait_period=0.1)

Given an OscarWorkerPool wp and a context object ctx, distribute tasks coming out of ctx to be carried out on the workers (based on availability) and process the results.

For this to work, the following methods must be implemented for ctx:

  • pop_task!(ctx) to either return a tuple (task_id::Int, func::Any, args), in which case remotecall(func, wp, args) is called to deploy the task to the workers; or return nothing to indicate that all tasks in ctx have been exhausted; or return an instance of WaitForResults to indicate that some information on the results of tasks which have already been given out is needed in order to proceed with giving out new tasks.
  • process_result!(ctx, task_id::Int, res) to process the result res of the computation of the task with id task_id.

Note: The programmer is responsible for delivering task_ids which are unique for the respective tasks!

Deploying tasks to the workers continues until pop_task!(ctx) returns nothing. Computation continues until all deployed tasks have been treated with process_result!. The return value is the ctx in its current state after computation; a minimal sketch of a full implementation follows below.
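
As an illustration, here is a minimal sketch of a context type implementing this interface. The type SquaringContext and its fields are hypothetical, and we assume that pop_task! and process_result! are extended in the Oscar namespace.

using Oscar

# hypothetical context: square a list of polynomials on the workers
# and collect the results keyed by task id
mutable struct SquaringContext
  inputs::Vector{QQMPolyRingElem}
  results::Dict{Int, QQMPolyRingElem}
  next_id::Int
end

SquaringContext(inputs) = SquaringContext(inputs, Dict{Int, QQMPolyRingElem}(), 1)

function Oscar.pop_task!(ctx::SquaringContext)
  # all tasks exhausted?
  ctx.next_id > length(ctx.inputs) && return nothing
  id = ctx.next_id
  ctx.next_id += 1
  return (id, p -> p^2, ctx.inputs[id])  # deployed via remotecall
end

function Oscar.process_result!(ctx::SquaringContext, id::Int, res)
  ctx.results[id] = res  # record the result under its unique task id
  return ctx
end

R, (x, y) = QQ[:x, :y]
squares = oscar_worker_pool(2) do wp
  compute_distributed!(SquaringContext([x + y, x*y, y^2]), wp).results
end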

pop_task! - Function
pop_task!(ctx::CtxType) where {CtxType}

Internal function for compute_distributed!; to be overwritten with methods for specific CtxTypes.

Whatever type of object the user would like to use as the context object ctx for compute_distributed! should implement a method of this function which does the following.

  • If there is another task to be deployed to a worker, return a triple (task_id, func, args) consisting of a unique task_id::Int, a function func, and arguments args so that func(args) is called on some worker.
  • If no new task can be given out with the current state of the context object ctx, and we first need to wait for the processing of results of tasks given out already, return WaitForResults() (see the sketch after this list).
  • If all tasks in ctx have been exhausted, return nothing.
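
To illustrate the second case, here is a hypothetical sketch of a pop_task! method for a context with two phases, where the tasks of phase 2 require all results of phase 1; the type TwoPhaseContext and all of its fields are made up.

# phase1_pending/phase2_pending hold pairs (id, polynomial),
# phase1_results collects the processed results of phase 1, and
# n_phase1 is the total number of phase 1 tasks
function Oscar.pop_task!(ctx::TwoPhaseContext)
  if !isempty(ctx.phase1_pending)
    (id, p) = pop!(ctx.phase1_pending)
    return (id, q -> q^2, p)
  end
  # phase 1 is fully deployed, but some results are still unprocessed
  length(ctx.phase1_results) < ctx.n_phase1 && return WaitForResults()
  isempty(ctx.phase2_pending) && return nothing  # everything exhausted
  (id, p) = pop!(ctx.phase2_pending)
  # pass Oscar objects as arguments instead of capturing them in the
  # closure, so that they travel through the Oscar serialization
  total = sum(values(ctx.phase1_results))
  return (id, t -> t[1] + t[2], (p, total))
end
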
process_result! - Function
process_result!(ctx::CtxType, id::Int, res) where {CtxType}

Internal function for compute_distributed!; to be overwritten with methods for specific CtxTypes.

For a task (task_id, func, args) returned by a call to pop_task!(ctx), the result of the call func(args) is delivered as res for processing on the main process.
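
Continuing the hypothetical TwoPhaseContext sketch from above, a matching method could look as follows.

function Oscar.process_result!(ctx::TwoPhaseContext, id::Int, res)
  # store the result under its task id, sorted by phase
  if id <= ctx.n_phase1
    ctx.phase1_results[id] = res
  else
    ctx.phase2_results[id] = res
  end
  return ctx
end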
