Updated 2019-09-23

Run Distributed MATLAB on the Cluster

Overview

  • MATLAB is a multi-paradigm numerical computing environment and proprietary programming language developed by MathWorks.
  • This guide covers how to run distributed MATLAB on the cluster.
  • MATLAB home page (https://www.mathworks.com/products/matlab.html)
  • Distributed MATLAB page (https://www.mathworks.com/help/parallel-computing/run-code-on-parallel-pools.html)

Tips

  • Before running MATLAB in distributed mode, you need to set up a cluster profile from an interactive MATLAB session (see the walkthrough below).
  • In the cluster profile, you can specify the maximum number of workers; your job cannot request more workers than that number.
  • If you are running on RHEL6 queues, be sure to add:
module load zlib/1.2.8
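
The worker limit described in the tips can be checked in the job script before MATLAB starts. The following is a minimal shell sketch of such a guard, not part of the original guide: MAX_WORKERS is assumed to mirror the NumWorkers value entered in the cluster profile (32 in this guide), and check_workers is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: refuse to continue if the requested pool is larger than the
# cluster profile allows. MAX_WORKERS is an assumption and must be kept
# in sync by hand with NumWorkers in the MATLAB cluster profile.
MAX_WORKERS=32

# check_workers N: succeeds if N is within the limit, fails otherwise.
check_workers() {
    requested=$1
    if [ "$requested" -gt "$MAX_WORKERS" ]; then
        echo "Requested $requested workers, but the profile allows $MAX_WORKERS" >&2
        return 1
    fi
    return 0
}

NPROCS=16
if check_workers "$NPROCS"; then
    echo "OK to request $NPROCS workers"
fi
```

A guard like this fails fast in the shell, instead of waiting for MATLAB to start and reject the pool request.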

Walkthrough: Run MATLAB on the Cluster

Open a terminal: System Tools -> Konsole
>module load matlab/r2018b
>matlab
1. Click ENVIRONMENT -> Parallel -> Create and Manage Clusters...
2. In the popup window (Cluster Profile Manager), click Add Cluster Profile -> Torque.
Alternatively, use an existing profile if you already have one.
3. Click Edit and fill in the form. The screenshots show an example, but use only the values below as parameters:

Description of this cluster (Description): PACE Cluster
Number of workers available to cluster (NumWorkers): 32 (your code cannot request more workers than this number)
Number of computational threads to use on each worker (NumThreads): 1

4. Additional TORQUE PROPERTIES
Resource argument for job submission: -l nodes=^N^:ppn=^T^
Additional command line arguments for job submission: -q <your queue>
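
The ^N^ and ^T^ markers in the resource argument are placeholders. Assuming they are filled in with the number of workers and the number of threads per worker, the shell sketch below shows what the expanded argument would look like for the values used in this guide. expand_template is a hypothetical helper written only for illustration; MATLAB performs the real substitution itself when it submits the pool job.

```shell
#!/bin/sh
# Sketch: expand the ^N^ and ^T^ placeholders of a resource template.
# Assumption: ^N^ stands for the worker count and ^T^ for threads per worker.
expand_template() {
    tmpl=$1
    n=$2
    t=$3
    printf '%s\n' "$tmpl" | sed -e "s/\^N\^/$n/g" -e "s/\^T\^/$t/g"
}

# With 32 workers and 1 thread per worker:
expand_template '-l nodes=^N^:ppn=^T^' 32 1
# prints: -l nodes=32:ppn=1
```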

Screenshot

Screenshot

  • You can transfer the files to your account on the cluster to follow along. The file transfer guide may be helpful.

Part 1: The PBS Script

#PBS -l nodes=1
#PBS -l walltime=1:00:00
#PBS -q <your queue>
#PBS -j oe
#PBS -o matlab.output.$PBS_JOBID

cd $PBS_O_WORKDIR
module purge
module load matlab/r2018b
# module load zlib/1.2.8 #(load only on RHEL6) 

NPROCS=32 # any number up to NumWorkers in the cluster profile

(time matlab -nodisplay -singleCompThread -r "myParallelAlgorithmFcn($NPROCS);exit") 2>&1 | tee -a time.output

Part 2: The MATLAB Script

Example myParallelAlgorithmFcn.m

function [numWorkers,time] = myParallelAlgorithmFcn(ncores)

complexities =  [2^21];
numWorkers= [ncores];

% To obtain predictable sequences of composite numbers, fix the seed
% of the random number generator.
rng(0,'twister');
myCluster=parcluster('TorqueProfile1');
myCluster.NumWorkers=ncores;

% Stop parfor from creating a pool automatically; the pool is opened explicitly below
ps=parallel.Settings;
ps.Pool.AutoCreate=false;
parpool('TorqueProfile1',ncores);
tic;
for c = 1:numel(complexities)

    primeNumbers = primes(complexities(c));
    compositeNumbers = primeNumbers.*primeNumbers(randperm(numel(primeNumbers)));
    factors = zeros(numel(primeNumbers),2);

    for w = 1:numel(numWorkers)
        parfor (idx = 1:numel(compositeNumbers), ncores)
            factors(idx,:) = factor(compositeNumbers(idx));
        end
    end
end
time = toc;
fileID=fopen('time.output','w');
fprintf(fileID,'%8.6f seconds',time);
fclose(fileID);
poolobj=gcp('nocreate')
delete(poolobj)
  • The #PBS directives in the PBS script are standard, requesting 1 hour of walltime and 1 node. More on #PBS directives can be found in the PBS guide.
  • $PBS_O_WORKDIR is a variable that holds the directory you submit the PBS script from. Make sure the files you want to use are in the same directory as the PBS script.
  • Output files will also appear in this directory.
  • module load matlab/r2018b loads the r2018b version of MATLAB. To see what MATLAB versions are available, run module avail matlab, and load the one you want.
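
Because the job runs from the submit directory, it can help to confirm that the needed files are there before submitting. The sketch below is one way to do that in shell. check_job_files is a hypothetical helper, and matlab_job.pbs is an assumed name for the PBS script, which the guide does not name; qsub is the standard Torque submission command.

```shell
#!/bin/sh
# Sketch: verify that every listed file exists in the given directory.
# check_job_files DIR FILE... succeeds only if all files are present.
check_job_files() {
    dir=$1
    shift
    for f in "$@"; do
        if [ ! -f "$dir/$f" ]; then
            echo "Missing $f in $dir" >&2
            return 1
        fi
    done
    return 0
}

# Example use from the submit directory (matlab_job.pbs is an assumed name):
#   check_job_files "$PWD" matlab_job.pbs myParallelAlgorithmFcn.m \
#       && qsub matlab_job.pbs
```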

Part 3: Collecting Results

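The output file is named matlab.output.$PBS_JOBID (for example, matlab.output.5195472.dedicated-sched.pace.gatech.edu) and should look like the sample below. The profile name in the output matches the profile you used; this sample run used 'TorqueProfile2'.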



                            < M A T L A B (R) >
                  Copyright 1984-2018 The MathWorks, Inc.
                   R2018b (9.5.0.944444) 64-bit (glnxa64)
                              August 28, 2018


To get started, type doc.
For product information, visit www.mathworks.com.

Starting parallel pool (parpool) using the 'TorqueProfile2' profile ...
connected to 32 workers.

poolobj =

 Pool with properties:

            Connected: true
           NumWorkers: 32
              Cluster: TorqueProfile2
        AttachedFiles: {}
    AutoAddClientPath: true
          IdleTimeout: 30 minutes (30 minutes remaining)
          SpmdEnabled: true
 EnvironmentVariables: {}

Parallel pool using the 'TorqueProfile2' profile is shutting down.

real    2m7.116s
user    0m23.492s
sys     0m1.496s
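
The two values most often wanted from this output are the pool size and the wall-clock time. The shell sketch below pulls them out of an output file with the layout shown above. pool_size, to_seconds, and real_seconds are hypothetical helper names, and the sketch assumes the "connected to N workers." and "real XmY.YYYs" lines appear exactly as in this sample.

```shell
#!/bin/sh
# Sketch: helpers for reading a job output file like the sample above.

# Print the pool size from a line such as "connected to 32 workers."
pool_size() {
    sed -n 's/^connected to \([0-9][0-9]*\) workers.*/\1/p' "$1"
}

# Convert a `time` value such as "2m7.116s" to seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print the wall-clock ("real") time recorded in the file, in seconds.
real_seconds() {
    to_seconds "$(awk '$1 == "real" { print $2 }' "$1")"
}

# Example use (substitute your own output file name):
#   pool_size matlab.output.<jobid>
#   real_seconds matlab.output.<jobid>
```

For the sample above, the "real" value 2m7.116s corresponds to 127.116 seconds.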

  • After the result files are produced, you can move them off the cluster. Refer to the file transfer guide for help.
  • Congratulations! You successfully ran distributed MATLAB on the cluster.