You can use parallel computing to carry out many calculations simultaneously. Split large problems into smaller ones, which you can process at the same time.
With parallel computing, you can:
This table lists some essential parallel computing terms and their definitions.
Term | Definition |
---|---|
Thread | Smallest set of instructions that a CPU can schedule and execute independently. A GPU, multiprocessor, or multicore computer can perform multithreading, that is, execute multiple threads simultaneously. |
Process | Execution of an instance of a computer program by one or many threads. Each process has its own blocks of memory. |
Node | Standalone computer containing one or more CPUs or GPUs. Nodes can be networked to form a cluster or supercomputer. |
Cluster | Collection of interconnected computers that work together as a unified system to provide high-performance computing power for processing complex and data-intensive tasks. |
Scalability | Increase in parallel speedup with the addition of more resources. |
To run the examples on this page, you must have a Parallel Computing Toolbox™ license. To determine whether you have Parallel Computing Toolbox installed, and whether your machine can create a default parallel pool, enter this code in the MATLAB® Command Window.

    if canUseParallelPool
        disp("Parallel Computing Toolbox is installed")
    else
        disp("Parallel Computing Toolbox is not installed")
    end

Alternatively, to see which MathWorks® products you have installed, in the Command Window, enter ver.
Before you parallelize your code, you can use techniques such as vectorization and preallocation to improve the sequential performance of your MATLAB code. Sequential acceleration and parallelization can often work together to give cumulative performance improvements.
MATLAB is optimized for operations involving matrices and vectors. The process of revising loop-based, scalar-oriented code to use MATLAB matrix and vector operations is called vectorization. Using vectorized code instead of loop-based operations often improves your code performance.
These code snippets compare the amount of time the software needs to calculate the square root of 1,000,000 values with loop-based code against vectorized code.
    tic
    for k = 1:1000000
        x(k) = sqrt(k);
    end
    toc
Elapsed time is 0.112298 seconds.
    tic
    k = 1:1000000;
    x = sqrt(k);
    toc
Elapsed time is 0.006783 seconds.
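The loop-based snippet above also grows x on every iteration, so part of its cost comes from resizing rather than from the loop itself. The following sketch, which is an illustration and not part of the original timing study, preallocates the loop output so that the comparison isolates vectorization, and checks that both approaches return the same values. The variable names xLoop, xVec, tLoop, and tVec are chosen for this example, and timings depend on your machine.

```matlab
% Sketch: compare a preallocated loop with vectorized code.
n = 1000000;

% Loop-based version with a preallocated output.
xLoop = zeros(1,n);
tic
for k = 1:n
    xLoop(k) = sqrt(k);
end
tLoop = toc;

% Vectorized version.
tic
xVec = sqrt(1:n);
tVec = toc;

% Both approaches should agree to within floating-point round-off.
maxDiff = max(abs(xLoop - xVec));
fprintf("Loop: %.4f s, vectorized: %.4f s, max difference: %g\n", ...
    tLoop, tVec, maxDiff);
```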
In some cases, while- and for-loops that incrementally increase the size of an array each time through the loop can adversely affect performance and memory use. Instead of continuously resizing an array when you run loop-based code, you can preallocate the maximum amount of space that the array requires.
These code snippets compare the amount of time the software needs to fill a vector x when you gradually increase the size of x in a for-loop against when you preallocate a 1-by-1,000,000 block of memory for x.
    tic
    x = 0;
    for k = 2:1000000
        x(k) = x(k-1) + 5;
    end
    toc
Elapsed time is 0.103415 seconds.
    tic
    x = zeros(1,1000000);
    for k = 2:1000000
        x(k) = x(k-1) + 5;
    end
    toc
Elapsed time is 0.018758 seconds.
This table shows the appropriate preallocation function for the type of array you want to initialize.
Array Type to Initialize | Preallocation Function |
---|---|
Numeric | zeros |
String | strings |
Cell | cell |
Table | table |
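As a rough illustration of the table above, this sketch preallocates one array of each type and then fills it in a loop. The sizes, the variable names, and the table variable names Value and Label are hypothetical choices made for this example.

```matlab
% Sketch: preallocate each array type, then fill it in a loop.
n = 3;
numArr  = zeros(1,n);     % 1-by-3 numeric array of zeros
strArr  = strings(1,n);   % 1-by-3 string array of empty strings
cellArr = cell(1,n);      % 1-by-3 cell array of empty matrices
tbl = table('Size',[n 2], ...
    'VariableTypes',{'double','string'}, ...
    'VariableNames',{'Value','Label'});   % hypothetical variable names

for k = 1:n
    numArr(k)    = k^2;
    strArr(k)    = "item" + k;   % string concatenation gives "item1", "item2", ...
    cellArr{k}   = 1:k;
    tbl.Value(k) = k^2;
    tbl.Label(k) = "item" + k;
end
```

Because every container already has its final size, the loop only overwrites existing elements and never triggers a resize.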
MATLAB supports two ways to parallelize your code on multicore and multiprocessor nodes.
Some MATLAB functions implicitly use multithreading to parallelize their execution. These functions automatically execute on multiple computational threads in a single MATLAB session, which means they run faster on multicore-enabled machines. Examples include linear algebra and numerical functions such as fft, mldivide, eig, svd, and sort. If you use these functions on a machine with many cores, you can observe an increase in performance without changing your code.
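One way to observe implicit multithreading is to time a multithreaded function with the default thread count and again with the thread count limited to one. This is a minimal sketch, assuming a 1000-by-1000 test matrix chosen for this example. It uses maxNumCompThreads, which returns the previous setting when you pass it a new value. The size of the difference, if any, depends on your hardware.

```matlab
% Sketch: time svd with the default thread count and with one thread.
A = rand(1000);
nThreads = maxNumCompThreads;        % current maximum number of computational threads

tic
s = svd(A);
tMulti = toc;

previous = maxNumCompThreads(1);     % limit MATLAB to one thread; returns the old setting
tic
s1 = svd(A);
tSingle = toc;
maxNumCompThreads(previous);         % restore the original setting

fprintf("svd with %d thread(s): %.3f s, with 1 thread: %.3f s\n", ...
    nThreads, tMulti, tSingle);
```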
MATLAB and Parallel Computing Toolbox software uses MATLAB workers to explicitly parallelize your code. MATLAB workers are MATLAB computational engines that run in the background without a graphical desktop. The MATLAB session you interact with, also called the MATLAB client, instructs the workers with parallel language functions. You use Parallel Computing Toolbox functions to automatically divide tasks and assign them to these workers to execute the computations in parallel.
If you have Parallel Computing Toolbox installed on your machine, you can start an interactive parallel pool of workers to take advantage of the cores in your multicore computer.
A parallel pool (parpool) is a group of MATLAB workers on which you can interactively run code.
You can create a parallel pool of workers using parpool or functions with automatic parallel support. By default, parallel language functions such as parfor, parfeval, and spmd automatically create a parallel pool when you need one. When the workers start, your MATLAB session connects to them. For example, this code automatically starts a parallel pool and runs the statement in the parfor-loop in parallel on six workers.
    parfor i = 1:100
        c(i) = max(eig(rand(1000)));
    end
    Starting parallel pool (parpool) using the 'Processes' profile ...
    Connected to parallel pool with 6 workers.
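If you prefer to start the pool explicitly rather than rely on automatic creation, a sketch along these lines reuses an existing pool when one is running and otherwise starts one. It assumes a Parallel Computing Toolbox license and a release in which the local profile is named Processes.

```matlab
% Sketch: reuse the current pool if there is one, otherwise start one.
pool = gcp("nocreate");              % returns empty if no pool is running
if isempty(pool)
    pool = parpool("Processes");     % assumes the default local profile name
end
fprintf("The pool has %d workers.\n", pool.NumWorkers);
```

When you no longer need the pool, delete(pool) shuts down the workers.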
You can also use the parallel status indicator in the lower left corner of MATLAB desktop to start a parallel pool manually. Click the indicator icon, and then select Start Parallel Pool.
To stop a parallel pool while it is starting, press Ctrl+C or Ctrl+Break. On Apple macOS operating systems, you also can use Command+. (the Command key and the period key).
Starting a parallel pool often takes a long time, which can impact performance for code that takes only a few seconds to execute. For longer running code, the overhead becomes less significant.
Your default parallel environment determines the parallel pool cluster. The default parallel environment of your local machine is called Processes. This environment starts a parallel pool of process workers. You can see the selection of available cluster profiles in the Parallel menu on the MATLAB Home tab.
Note
For the default Processes profile, the default number of process workers is one per physical CPU core using a single computational thread. This restriction ensures that each worker has exclusive access to a floating-point unit, and generally optimizes performance of computational code. If your code is not computationally intensive, for example, code that is input/output (I/O) intensive, then consider using up to two workers per physical core. Running too many workers on too few resources can impact the performance and stability of your machine.
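To choose the number of workers yourself, you can pass a pool size to parpool. The sketch below is one possible approach. It assumes the local profile is named Processes and picks an arbitrary size of two workers, capped at the NumWorkers property of the cluster object so that the request does not exceed what the profile allows.

```matlab
% Sketch: start a process pool with an explicit number of workers.
c = parcluster("Processes");         % cluster object for the local profile
nWorkers = min(2, c.NumWorkers);     % example size, capped at the profile limit

delete(gcp("nocreate"));             % shut down any existing pool first
pool = parpool(c, nWorkers);
fprintf("Requested %d workers, pool has %d.\n", nWorkers, pool.NumWorkers);
```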
This table summarizes the different ways you can create interactive parallel pools.
Parallel Environment | Maximum Pool Size |
---|---|
Processes | Up to 512 cores |
Threads | Up to 512 threads |
backgroundPool | Without a Parallel Computing Toolbox license: 1 thread. With a Parallel Computing Toolbox license: Up to the number of threads that the maxNumCompThreads function returns |
Cluster | Up to the maximum number of workers the cluster can start |
Parallel Computing Toolbox also supports running a parallel pool of workers that are backed by computing threads instead of process workers. This parallel environment is called Threads. Thread workers have reduced memory usage, faster scheduling, and lower data transfer costs. However, thread workers support only a subset of the MATLAB functions that are available to process workers.
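As a small illustration, this sketch starts a thread-based pool and runs a parfor-loop on it. It assumes a Parallel Computing Toolbox license and a release that supports the Threads environment. The loop body is an arbitrary example that uses only basic numeric functions.

```matlab
% Sketch: run a parfor-loop on thread workers.
delete(gcp("nocreate"));             % only one parallel pool can be open at a time
pool = parpool("Threads");

r = zeros(1,8);
parfor i = 1:8
    r(i) = sum(sqrt(1:i));           % basic numeric work, independent per iteration
end
disp(r)
```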
MATLAB also supports an additional local parallel environment called backgroundPool. The backgroundPool environment is backed by thread workers and supports running code in the background while you run other code in your session at the same time. You can use one thread worker in the backgroundPool environment when you do not have a Parallel Computing Toolbox license. If you have a Parallel Computing Toolbox license, the maximum number of thread workers in your backgroundPool is the value that the maxNumCompThreads function returns.
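A minimal sketch of background execution, assuming nothing beyond base MATLAB: parfeval schedules a function on backgroundPool and returns a future immediately, so the client can continue with other work until it calls fetchOutputs. The anonymous function and the variable names are illustrative.

```matlab
% Sketch: run a computation in the background with parfeval.
sumOfSquares = @(n) sum((1:n).^2);

% Schedule the function on the background pool and request one output.
f = parfeval(backgroundPool, sumOfSquares, 1, 1000);

% The client is free to do other work while the future runs.
localResult = sum(1:1000);

% Block until the background result is available.
bgResult = fetchOutputs(f);
fprintf("Local result: %d, background result: %d\n", localResult, bgResult);
```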
If you have access to onsite or cloud clusters, you can discover other clusters running on your network or on Cloud Center by clicking Parallel > Discover Clusters and following the prompts. Parallel pools on clusters are backed by process workers and support the full parallel language.
When you have an interactive parallel pool of workers, you can use parallel language functions to split large problems into smaller tasks that workers can execute in parallel. To accelerate your MATLAB code, use interactive parallel features such as parfor.
This example shows how to convert a for-loop into a parfor-loop and calculate the scalability of the parfor-loop with the number of workers.
You can convert for-loops to run in parallel by using a parfor-loop. In simple cases, you can replace for with parfor. In other cases, you need to adjust your code further to run it in parallel.
Mechanics of parfor-loops
When you run a parfor-loop, MATLAB executes the statements in the loop body in parallel. Each execution of the parfor-loop body is an iteration. The MATLAB client issues the parfor command and coordinates with the workers to execute the loop iterations in parallel on the workers in a parallel pool. A parfor-loop can provide significantly better performance than its analogous for-loop because several workers compute iterations simultaneously.
When you run a parfor-loop, the MATLAB client divides the loop iterations into subranges and assigns them to the workers. If the number of workers is equal to the number of loop iterations, each worker performs one iteration of the loop. If the number of iterations is greater than the number of workers, some workers perform more than one loop iteration. In this case, a worker receives multiple iterations at once to reduce communication time.

The client also performs a static analysis of the parfor-loop code to determine which data to transfer to each worker and which data to transfer back to the client. The client sends the necessary data to the workers, which execute most of the computation. The workers then send the results back to the client, which assembles those results. MATLAB workers evaluate iterations in no particular order and independently of each other. Because each iteration is independent, the iterations need not be synchronized, and often are not.
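The following sketch illustrates two common ways that results come back from independent iterations. Each iteration writes to its own element of squares, and the running sum in total is combined across iterations regardless of the order in which they run. The variable names are hypothetical. If no parallel pool is available, the loop still runs and produces the same values.

```matlab
% Sketch: independent iterations that return results to the client.
n = 100;
squares = zeros(1,n);     % each iteration writes only element k
total = 0;                % accumulated across iterations in any order

parfor k = 1:n
    squares(k) = k^2;     % no iteration reads a value written by another
    total = total + k;    % addition gives the same sum in any iteration order
end

fprintf("squares(10) = %d, total = %d\n", squares(10), total);
```

A loop such as x(k) = x(k-1) + 5 does not fit this pattern, because each iteration depends on the result of the previous one.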
A parfor-loop must satisfy these basic requirements.
Convert for-loops to parfor-loops
Convert a for-loop into a parfor-loop by replacing for with parfor in code that calculates the largest singular value of each of 5000 200-by-200 random matrices. Execute the parfor-loop on six workers. Compare the execution times of the two loops.
When you use parfor and you have Parallel Computing Toolbox software installed, MATLAB automatically starts a parallel pool of workers. The parallel pool can take a long time to start. This example shows a second run with the pool already started. You can observe that the parfor code executed on six workers runs much faster than the for-loop code.
    tic
    y = zeros(5000,1);
    for n = 1:5000
        y(n) = max(svd(randn(200)));
    end
    toc
Elapsed time is 21.837346 seconds.