Updated 2023-04-10
Keras¶
License¶
Keras on PACE uses the Georgia Tech license, which requires an annual access fee per user. See the CoE software documentation for more information about access.
Overview¶
Keras is a high-level neural networks library, written in Python and capable of running on top of either TensorFlow or Theano. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.
Running Keras Interactively¶
Allocating Resources¶
-
To run Keras interactively, use the salloc command to request resources, specifying the account, node count, task count, walltime, and QOS.
-
Here is an example of an salloc command you can use:
salloc -A [Account] -N 1 -n 8 -t 15 -q embers
-
This requests 1 node and 8 tasks for 15 minutes under the embers QOS, which is enough to run the Keras example below.
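Once the allocation starts, it can be useful to confirm what Slurm actually granted. The sketch below reads a few environment variables that Slurm normally exports inside a job. The helper name `describe_allocation` is hypothetical, and which variables are set can vary with the site configuration and the flags used, so treat this as a quick check rather than a definitive test.

```python
import os

# Environment variables Slurm normally exports inside an allocation.
SLURM_VARS = (
    "SLURM_JOB_ID",
    "SLURM_JOB_NUM_NODES",
    "SLURM_NTASKS",
    "SLURM_JOB_NODELIST",
)


def describe_allocation(env=None):
    """Return the Slurm variables present in env (defaults to os.environ)."""
    env = os.environ if env is None else env
    return {name: env[name] for name in SLURM_VARS if name in env}


if __name__ == "__main__":
    found = describe_allocation()
    if found:
        for name, value in found.items():
            print(f"{name}={value}")
    else:
        print("No Slurm variables found; this shell is probably not inside an allocation.")
```

With the salloc command shown above, one would expect SLURM_JOB_NUM_NODES to be 1 and SLURM_NTASKS to be 8.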
Using MNIST Example¶
-
The following example shows how to run Keras interactively using an MNIST convnet.
-
We use the MNIST convnet from the Keras examples as the script for this interactive run. MNIST is a database of handwritten digits that is widely used for training image processing systems. For more information about this example, see https://keras.io/examples/vision/mnist_convnet/
-
Here is what the mnist_cnn.py script should look like:
"""
Title: Simple MNIST convnet
Author: [fchollet](https://twitter.com/fchollet)
Date created: 2015/06/19
Last modified: 2020/04/21
Description: A simple convnet that achieves ~99% test accuracy on MNIST.
Accelerator: GPU
"""
"""
## Setup
"""
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
"""
## Prepare the data
"""
# Model / data parameters
num_classes = 10
input_shape = (28, 28, 1)
# Load the data and split it between train and test sets
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
# Scale images to the [0, 1] range
x_train = x_train.astype("float32") / 255
x_test = x_test.astype("float32") / 255
# Make sure images have shape (28, 28, 1)
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)
print("x_train shape:", x_train.shape)
print(x_train.shape[0], "train samples")
print(x_test.shape[0], "test samples")
# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
"""
## Build the model
"""
model = keras.Sequential(
    [
        keras.Input(shape=input_shape),
        layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ]
)
model.summary()
"""
## Train the model
"""
batch_size = 128
epochs = 8
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1)
"""
## Evaluate the trained model
"""
score = model.evaluate(x_test, y_test, verbose=0)
print("Test loss:", score[0])
print("Test accuracy:", score[1])
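The layer shapes and parameter counts that model.summary() reports for this script can be reproduced by hand. The sketch below is plain Python, not part of mnist_cnn.py. It assumes the Keras defaults used in the script: "valid" padding and stride 1 for Conv2D, and a pooling stride equal to the pool size. The function names are hypothetical.

```python
def conv2d(shape, filters, kernel=3):
    """Output shape and parameter count for Conv2D with 'valid' padding, stride 1."""
    h, w, c = shape
    params = kernel * kernel * c * filters + filters  # weights plus one bias per filter
    return (h - kernel + 1, w - kernel + 1, filters), params


def max_pool(shape, pool=2):
    """Output shape for MaxPooling2D whose stride equals the pool size."""
    h, w, c = shape
    return (h // pool, w // pool, c)


def trace_model(input_shape=(28, 28, 1), num_classes=10):
    """Follow a tensor through the layers defined in the script."""
    total = 0
    shape, params = conv2d(input_shape, 32)   # (26, 26, 32)
    total += params
    shape = max_pool(shape)                   # (13, 13, 32)
    shape, params = conv2d(shape, 64)         # (11, 11, 64)
    total += params
    shape = max_pool(shape)                   # (5, 5, 64)
    flat = shape[0] * shape[1] * shape[2]     # Flatten; Dropout adds no parameters
    total += flat * num_classes + num_classes  # Dense weights plus biases
    return shape, flat, total


last_shape, flat_size, total_params = trace_model()
print(last_shape, flat_size, total_params)  # (5, 5, 64) 1600 34826
```

The Dense layer sees 1600 inputs, and the whole model has 34,826 trainable parameters, which is small enough to train quickly on CPU cores.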
-
We can do the following in order to run this script with Keras:
-
Load module:
module load keras/2.9.0
-
Activate Conda environment:
conda activate $KERAS_ROOT
-
Run Script:
srun python mnist_cnn.py
-
Your output should look something like this:
... Test accuracy: 0.9812
Running Keras in Batch Mode¶
- We can also run this example as a batch job. Here is an example batch script:
#!/bin/bash
#SBATCH -J keras_test
#SBATCH -A [Account]
#SBATCH -N 1 -n 8
#SBATCH -t 15
#SBATCH -q embers
#SBATCH -o Report-%j.out
#SBATCH -e Report-%j.err
cd $SLURM_SUBMIT_DIR
module load keras
srun python mnist_cnn.py
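- After the job completes, we can expect the following output. The data summary appears eight times because srun launches one copy of the script for each of the 8 tasks requested with -n 8: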
---------------------------------------
Begin Slurm Prolog: Mar-22-2023 01:11:28
Job ID: 1341319
User ID: gburdell3
Account: [Account]
Job name: keras_test
Partition: cpu-small
QOS: embers
---------------------------------------
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
---------------------------------------
Begin Slurm Epilog: Mar-22-2023 01:15:30
Job ID: 1341319
Array Job ID: _4294967294
User ID: gburdell3
Account: [Account]
Job name: keras_test
Resources: cpu=8,mem=8G,node=1
Rsrc Used: cput=00:32:40,vmem=9380K,walltime=00:04:05,mem=7928K,energy_used=0
Partition: cpu-small
QOS: embers
Nodes: atl1-1-02-010-23-1
---------------------------------------
- Congratulations! You have successfully run a parallel Python script using Keras on the cluster.
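The Rsrc Used line in the epilog can be cross-checked with a little arithmetic. The sketch below parses that line and compares cput against walltime multiplied by the number of allocated CPUs. It assumes that cput is core time summed over all allocated CPUs. The epilog format is site-specific, so that reading is an assumption, and the function names are hypothetical.

```python
def hms_to_seconds(text):
    """Convert an HH:MM:SS string to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse_rsrc_used(line):
    """Turn 'Rsrc Used: key=value,key=value,...' into a dict of strings."""
    _, _, fields = line.partition(":")  # split at the colon after the label
    return dict(item.split("=", 1) for item in fields.strip().split(","))


def core_time_ratio(line, cpus):
    """Ratio of reported cput to walltime * cpus (assumes cput is summed core time)."""
    used = parse_rsrc_used(line)
    cput = hms_to_seconds(used["cput"])
    walltime = hms_to_seconds(used["walltime"])
    return cput / (walltime * cpus)


line = "Rsrc Used: cput=00:32:40,vmem=9380K,walltime=00:04:05,mem=7928K,energy_used=0"
print(f"{core_time_ratio(line, cpus=8):.0%}")  # prints 100%
```

Here 00:32:40 is 1960 seconds and 00:04:05 is 245 seconds, so 245 × 8 = 1960 matches cput exactly. Under the stated assumption, all 8 CPUs were accounted to the job for its full walltime.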