Run Comsol on the Cluster - Batch Mode¶
- The Comsol workflow has three general parts:
  - Make the model in the GUI or import a model
  - Solve interactively (GUI) or in batch mode
  - Analyze the results
- This guide focuses on solving models in batch mode, which is especially helpful if you want to solve multiple models at once.
- Important: models imported from Windows may contain binaries that don't work on the cluster. Make sure models are .mphbin, or create them on the cluster using Comsol interactively.
Set up Storage for Comsol¶
IMPORTANT: Comsol uses a hidden directory, ~/.comsol, in your home directory to store configuration and temporary files.
- Since the home directory quota is only 5 GB, this directory may cause a quota exceeded error. Because it is hidden (its name starts with a dot), it may look like there is nothing in your home directory taking up the space.
- Solution: Move the ~/.comsol directory to your ~/data directory and link it back to the old location.
    cd ~
    mv .comsol ~/data
    ln -s ~/data/.comsol ~/.comsol
    # Check to make sure everything worked
    ls -ld ~/.comsol
    # Should display:
    # lrwxrwxrwx 1 <username> <group> <date> /nv/hp16/<username>/.comsol -> /nv/hp16/<username>/data/.comsol
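The manual steps above can also be wrapped in a small guard so that they are safe to rerun. The following is a minimal sketch, not an official PACE tool: relocate_dir is a hypothetical helper name, and it assumes the destination directory (for example ~/data) already exists.

```shell
#!/bin/bash
# relocate_dir SRC DEST_DIR
# Hypothetical helper (not part of Comsol or PACE): moves SRC into DEST_DIR
# and leaves a symlink at the old location.
relocate_dir() {
    local src="$1"        # e.g. "$HOME/.comsol"
    local dest_dir="$2"   # e.g. "$HOME/data" (assumed to exist)
    local name
    name="$(basename "$src")"

    # Already a symlink: nothing to do, so rerunning is harmless.
    if [ -L "$src" ]; then
        echo "$src already links to $(readlink "$src")"
        return 0
    fi

    # The destination directory must exist before anything is moved.
    if [ ! -d "$dest_dir" ]; then
        echo "Destination $dest_dir does not exist" >&2
        return 1
    fi

    # Never overwrite something that is already at the destination.
    if [ -e "$dest_dir/$name" ]; then
        echo "Refusing to overwrite $dest_dir/$name" >&2
        return 1
    fi

    if [ -d "$src" ]; then
        mv "$src" "$dest_dir/"
    else
        # The directory has not been created yet; create it in the new place.
        mkdir -p "$dest_dir/$name"
    fi
    ln -s "$dest_dir/$name" "$src"
}

# Example call (commented out so that sourcing this file changes nothing):
# relocate_dir "$HOME/.comsol" "$HOME/data"
```

Calling the function a second time only reports the existing link, which makes it convenient to keep in a setup script.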
We strongly recommend that you create models on the cluster. If a model is created in Windows, the binary of the model may not work on the cluster (Linux).
- If you are a researcher, use the -research versions of Comsol; otherwise (for example, for coursework), use the non-research version.
- Make sure you load matlab and then comsol in your PBS script: module load <matlab version> <comsol version>. Find available versions with module avail comsol.
Run Multithreaded Batch Job¶
- Add the following line to your PBS script (after you have loaded the comsol module) to run Comsol on multiple cores:
    comsol batch -np 8 -inputfile <input.mph> -outputfile <output_name.mph>
The number after the -np flag (number of processors) must equal the number of cores you requested in the PBS script.
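To keep the two numbers in sync, the core count can be derived inside the job instead of being typed twice. This is a minimal sketch under two assumptions: the scheduler is Torque-style and sets $PBS_NODEFILE to a file with one line per allocated core, and the job runs on a single node. cores_from_nodefile is a hypothetical helper name.

```shell
#!/bin/bash
# cores_from_nodefile FILE
# Hypothetical helper: prints the number of non-empty lines in FILE.
# Assumption: the scheduler writes one line per allocated core to
# $PBS_NODEFILE, so for a single-node job this count equals ppn.
cores_from_nodefile() {
    grep -c . "$1"
}

# Inside a PBS script this could be used as follows (commented out here):
# NP="$(cores_from_nodefile "$PBS_NODEFILE")"
# comsol batch -np "$NP" -inputfile <input.mph> -outputfile <output_name.mph>
```

If your scheduler fills the node file differently, check the count it reports before relying on it.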
Run Multiple Models¶
- One option is to use a job array. Please see the array guide for more information.
- Another option is to supply a script that lists multiple jobs to be run, which will be explained below.
- When logged into the cluster, create a plain text file called COMSOL_BATCH_COMMANDS.bat (you can name it whatever you want, as long as it ends in .bat). Open the file in a text editor.
- With the file open, list the run command from above once for every model, giving each model its own output file so that results are not overwritten:
    # Contents of the .bat file
    comsol batch -np 8 -inputfile <model1.mph> -outputfile <output_name1.mph>
    comsol batch -np 8 -inputfile <model2.mph> -outputfile <output_name2.mph>
    comsol batch -np 8 -inputfile <model3.mph> -outputfile <output_name3.mph>
- Then, in the PBS script, instead of writing out the run command (the one that starts with comsol batch), include the name of the .bat file without the .bat extension. For example, just write COMSOL_BATCH_COMMANDS.
- If you are using $PBS_O_WORKDIR in your script, make sure the .bat file and the model files are in the same directory as your PBS script.
- Since you are running multiple models in one job, you will have to increase the walltime of your job.
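Writing the command list by hand becomes tedious with many models. The sketch below generates it from the .mph files in a directory. It is an illustration only: generate_batch_file is a hypothetical helper, and the _solved output suffix is an assumed naming choice, borrowed from the cold_water_glass_solved.mph name used in the walkthrough.

```shell
#!/bin/bash
# generate_batch_file MODEL_DIR NP OUT_FILE
# Hypothetical helper: writes one "comsol batch" line per .mph file found
# in MODEL_DIR to OUT_FILE, each with its own output file name.
generate_batch_file() {
    local model_dir="$1" np="$2" out_file="$3"
    local model base

    : > "$out_file"                          # start with an empty file
    for model in "$model_dir"/*.mph; do
        [ -e "$model" ] || continue          # directory has no .mph files
        base="$(basename "$model" .mph)"
        case "$base" in
            *_solved) continue ;;            # skip outputs of earlier runs
        esac
        printf 'comsol batch -np %s -inputfile %s.mph -outputfile %s_solved.mph\n' \
            "$np" "$base" "$base" >> "$out_file"
    done
}

# Example call (commented out):
# generate_batch_file . 8 COMSOL_BATCH_COMMANDS.bat
```

The generated lines use bare file names, so the file is meant to be run from the directory that holds the models.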
Walkthrough: Run Comsol in Batch Mode¶
- This walkthrough uses an example model, cold_water_glass.mph. The model is provided by Comsol, and more detail on it can be found on their website here
- Model file can be found here
- PBS Script can be found here
- After logging in, you can transfer the files to your account on the cluster to follow along. The file transfer guide may be helpful.
Part 1: The PBS Script¶
    #PBS -N comsolTest
    #PBS -A [Account]
    #PBS -l nodes=1:ppn=8
    #PBS -l pmem=8gb
    #PBS -l walltime=10:00
    #PBS -q inferno
    #PBS -j oe
    #PBS -o comsolTest.out

    cd $PBS_O_WORKDIR
    module load comsol/5.3a-research
    comsol batch -np 8 -inputfile cold_water_glass.mph -outputfile cold_water_glass_solved.mph
- The #PBS directives are standard, requesting 10 minutes of walltime and 1 node with 8 cores. More on #PBS directives can be found in the PBS guide.
- $PBS_O_WORKDIR is a variable that represents the directory you submit the PBS script from. Make sure the .mph Comsol model you want to run (in this case, cold_water_glass.mph) is in the same directory as the PBS script.
- module load comsol/5.3a-research loads the 5.3a research version of Comsol. To see which Comsol versions are available, run module avail comsol and load the one you want.
- comsol batch runs Comsol in batch mode.
- For multiple CPUs (parallel), make sure the number of processors you request in the directives (top) part of the script is equal to the number you specify in the -np part of the comsol batch command.
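A mismatch between the two numbers is easy to introduce when editing a script. As a rough pre-submission check, the sketch below compares the first ppn= value with the first -np value found in a PBS script. check_np_matches_ppn is a hypothetical helper, and the pattern matching assumes the script is laid out like the example in this part.

```shell
#!/bin/bash
# check_np_matches_ppn SCRIPT
# Hypothetical helper: returns 0 if the first "ppn=<n>" and the first
# "-np <n>" in SCRIPT agree, 1 if they differ, 2 if either is missing.
check_np_matches_ppn() {
    local script="$1" ppn np

    # First "ppn=<number>", e.g. from "#PBS -l nodes=1:ppn=8".
    ppn="$(grep -oE 'ppn=[0-9]+' "$script" | head -n 1 | cut -d= -f2)"
    # First "-np <number>", e.g. from "comsol batch -np 8 ...".
    np="$(grep -oE -- '-np [0-9]+' "$script" | head -n 1 | cut -d' ' -f2)"

    if [ -z "$ppn" ] || [ -z "$np" ]; then
        echo "Could not find both ppn= and -np in $script" >&2
        return 2
    fi
    if [ "$ppn" != "$np" ]; then
        echo "Mismatch: ppn=$ppn but -np $np" >&2
        return 1
    fi
    echo "OK: ppn and -np are both $ppn"
}

# Example call (commented out):
# check_np_matches_ppn <pbs script name> && qsub <pbs script name>
```

The check only reads the file, so it can be run on the login node before submitting.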
Part 2: Submit Job and Check Status¶
- Make sure you're in the directory that contains the PBS script.
- Submit as normal, with qsub <pbs script name>.
- Check job status with qstat -t 22182721, replacing the number with the job ID returned after running qsub.
- You can delete the job with qdel 22182721, again replacing the number with the job ID returned after running qsub.
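Copying the job ID by hand can be avoided by capturing what qsub prints. The sketch below assumes that qsub returns an ID of the form <number>.<server>, as in the Job ID line of the sample output in Part 3 (22409572.shared-sched.pace.gatech.edu). job_number is a hypothetical helper name.

```shell
#!/bin/bash
# job_number FULL_ID
# Hypothetical helper: strips the server part from a job ID of the form
# "<number>.<server>" and prints only the leading number.
job_number() {
    local full_id="$1"
    # Remove the longest suffix that starts with a dot.
    echo "${full_id%%.*}"
}

# On the cluster this could be combined with qsub and qstat (commented out):
# full_id="$(qsub <pbs script name>)"
# qstat -t "$(job_number "$full_id")"
```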
Part 3: Collecting Results¶
- In the directory where you submitted the PBS script, you should see all the generated output files, including the solved model, cold_water_glass_solved.mph.
- Use cat comsolTest.out to view information on the completed job, which should look like:
    ---------- Current Progress: 100 % - Assembling matrices
    Memory: 1046/1118 7077/7081
    1726 119.82 0.087925 3567 1732 3567 1 1 2 3.4e-11 5.4e-16
    1727 119.9 0.087925 3569 1733 3569 1 1 2 3.3e-11 6.8e-16
    ---------- Current Progress: 100 % - Assembling sparsity pattern
    Memory: 957/1118 7046/7081
    1728 119.99 0.087925 3571 1734 3571 1 1 2 2.8e-11 6.3e-16
    ---------- Current Progress: 100 % - Assembling matrices
    Memory: 989/1118 7081/7081
    - 120 - out
    1729 120.08 0.087925 3573 1735 3573 1 1 2 6e-11 5.3e-16
    Time-stepping completed.
    ---------- Current Progress: 100 % -
    Memory: 957/1118 7047/7081
    Solution time: 559 s. (9 minutes, 19 seconds)
    Physical memory: 1.12 GB
    Virtual memory: 7.08 GB
    Ended at 14-Sep-2018 10:42:08.
    ----- Time-Dependent Solver 1 in Study 1/Solution 1 (sol1) -------------------->
    Run time: 563 s.
    Saving model: /gpfs/pace2/project/pf1/shollister7/comsol/cold_water_glass_solved.mph
    Save time: 0 s.
    Total time: 572 s.
    ---------- Current Progress: 100 % - Done
    ---------------------------------------
    Begin PBS Epilogue Fri Sep 14 10:42:09 EDT 2018
    Job ID: 22409572.shared-sched.pace.gatech.edu
    User ID: shollister7
    Job name: comsolTest
    Resources: neednodes=1:ppn=8,nodes=1:ppn=8,pmem=8gb,walltime=00:16:00
    Rsrc Used: cput=01:04:15,energy_used=0,mem=1080752kb,vmem=7572812kb,walltime=00:09:41
    Queue: inferno
- After the result files are produced, you can move the files off the cluster. Refer to the file transfer guide for help.
- To open the solved model in the Comsol postprocessor, see the Comsol interactive guide
- Congratulations! You successfully ran Comsol in batch mode on the cluster.
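For longer runs or several models, it can help to check the log from a script rather than by eye. The sketch below looks for markers seen in the sample output in this part (the Done progress line and the time and save lines). Whether other Comsol versions print exactly these strings is an assumption, and run_summary is a hypothetical helper name.

```shell
#!/bin/bash
# run_summary LOG
# Hypothetical helper: prints "finished" plus the timing and save lines if
# LOG contains the final progress marker, otherwise prints "not finished"
# and returns 1.
run_summary() {
    local log="$1"

    # Marker taken from the sample log; other versions may differ.
    if grep -q 'Current Progress: 100 % - Done' "$log"; then
        echo "finished"
    else
        echo "not finished"
        return 1
    fi

    # Show the timing and save lines when present; their absence is not an error.
    grep -E 'Solution time:|Total time:|Saving model:' "$log" || true
}

# Example call (commented out):
# run_summary comsolTest.out
```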
If you get strange Java errors when running Comsol on PACE, double-check that ~/.comsol points to ~/data/.comsol and that no recursive symbolic link exists.
If the same input file produces results on Windows but not on Linux, the fix may be to initialize the results: right-click "Study" and select "Get Initial Value".