Best Practices - GPUs on Pinnacles
This guide provides an overview of the GPU resources available on the Pinnacles cluster at UC Merced. GPUs offer accelerated performance for machine learning, scientific computing, and data-intensive workloads.
Accessing and Running GPU Jobs
GPU usage on Pinnacles has been extremely high recently. To ensure fair access and maximize resource efficiency, please only request the amount of GPU resources you actually need.
Please be aware that CIRT may terminate GPU jobs that are significantly underutilizing or not actively using the GPUs, in order to free resources for other users.
GPU Allocation
Users must request GPU resources explicitly in their job submissions using the appropriate Slurm directives.
Submission Guidelines
Below are some example directives that may be used when preparing Slurm Job Scripts or Interactive Sessions.
| Directive | Purpose |
|---|---|
| --gres=gpu:N | Tell Slurm that the job requires GPUs and how many (1 or 2). |
| *Optional directives below* | *May be used as an additional layer of control/optimization* |
| --constraint=\<feature\> | Specify which GPU type (A100, L40S, etc.); useful when running on the test partition |
| --cpus-per-gpu=\<N\> | Reserve N CPU threads per GPU |
| --gpus-per-node=\<N\> | Specify how many GPUs per node |
| --mem-per-gpu=\<N\> | Amount of RAM (GB/MB) to be apportioned per GPU |
For general Slurm directives and example submission scripts, please see here
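Putting the directives above together, a minimal batch script might look like the following sketch. The partition name, resource amounts, and time limit are illustrative; adjust them to your allocation:

```shell
#!/bin/bash
#SBATCH --job-name=gpu-test
#SBATCH --partition=gpu          # GPU partition (adjust to your allocation)
#SBATCH --gres=gpu:1             # request one GPU
#SBATCH --cpus-per-gpu=4         # reserve 4 CPU threads for the GPU
#SBATCH --mem-per-gpu=16G        # 16 GB of RAM apportioned to the GPU
#SBATCH --time=01:00:00          # one-hour wall-clock limit

# Report which GPU(s) Slurm assigned to this job
nvidia-smi

# Launch the GPU application
# srun ./my_gpu_program
```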
GPU Types
GPU Comparison Table
| GPU | Technical Notes | Best Use Cases |
|---|---|---|
| A100 | High precision (FP64) | Data- and compute-intensive workloads that require higher numerical precision |
| L40S | Lower precision (FP32) | Machine learning/deep learning training, AI workloads |
Harnessing GPUs for Machine Learning
Key Considerations
- Framework Selection: Choose GPU-optimized frameworks that leverage CUDA
- Batch Sizing: Balance between GPU memory limits and training efficiency
Resource Planning
- Estimate GPU memory requirements based on model architecture.
- Consider multi-GPU strategies for large-scale training.
- Plan for checkpointing for fault tolerance.
- Monitor GPU utilization to ensure efficient resource usage. After logging into a GPU node, run the following command:
nvidia-smi
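To track utilization over the course of a run rather than at a single instant, `nvidia-smi` can be asked to report selected fields repeatedly (the query fields and refresh flag below are standard `nvidia-smi` options; the 5-second interval is just an example):

```shell
# Print GPU utilization and memory usage in CSV form, refreshing every 5 seconds
nvidia-smi --query-gpu=utilization.gpu,memory.used,memory.total --format=csv -l 5
```

Sustained utilization near 0% usually means the job is CPU-bound or idle and may be a candidate for termination under the policy above.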
Harnessing GPUs for Scientific Computing
Application Domains
GPUs excel in scientific computing applications including:
- Computational fluid dynamics
- Molecular dynamics simulations
- Climate and weather modeling
- Bioinformatics and genomics
- Physics simulations
- Image and signal processing
Optimization Strategies
- Identify the components of your algorithm or program that can be parallelized
- Minimize CPU-GPU data transfers
- Utilize GPU-accelerated libraries when available
- Consider domain-specific GPU implementations
Common and Supported Frameworks
Machine Learning Frameworks
The following frameworks are commonly used on Pinnacles GPUs:
- PyTorch: Dynamic computational graphs, research-friendly
- TensorFlow: Production-ready, extensive ecosystem
Scientific Computing Libraries
- CUDA Toolkit: NVIDIA GPU programming platform
- cuDNN: Deep learning primitives library
- cuBLAS: GPU-accelerated BLAS operations
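On most clusters these libraries are provided through the module system. As a sketch (exact module names are cluster-specific and an assumption here, so check `module avail` first):

```shell
# List the CUDA modules available on the system, then load one
module avail cuda
module load cuda

# Confirm the CUDA compiler is now on the PATH
nvcc --version
```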
Best Practices and Performance Tips
Best Practices for Pinnacles
- Share GPUs when possible through efficient job scheduling. Do not reserve or allocate more resources than the job is anticipated to use.
- Right-sizing: Request only the GPU resources you need
Job Optimization
- Profiling: Use profiling tools to identify bottlenecks
- Queue Selection: Select the gpu queue for A100 GPUs; select the cenvalarc.gpu queue for L40S GPUs.
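For interactive work, the same queue choice applies when requesting a session with `srun`; a sketch, with the one-hour time limit purely illustrative:

```shell
# Interactive one-hour session with a single A100 on the gpu queue
srun --partition=gpu --gres=gpu:1 --time=01:00:00 --pty /bin/bash

# Or target an L40S via the cenvalarc.gpu queue
srun --partition=cenvalarc.gpu --gres=gpu:1 --time=01:00:00 --pty /bin/bash
```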