Partitions and Compute Nodes
HPC jobs on the EdUHK-HPC platform are managed by the Slurm workload manager, which schedules queued tasks and improves utilisation of the hardware resources.
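To see how these partitions and nodes appear on the system, the standard Slurm client commands can be run from a login node. The exact output columns depend on the site configuration, so the commands below are only a quick sketch:

```bash
# Summarise the available partitions, their state and node counts (standard Slurm command).
sinfo -s

# Show your own pending and running jobs in the queue.
squeue -u $USER
```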
The current Slurm partition settings on the EdUHK-HPC platform
| Partition Name | Priority | Grace Time (Hours) | Max. Job Duration (Hours) | Allowed Users | Limit | CPU | Memory | GPU |
|---|---|---|---|---|---|---|---|---|
| shared_cpu* | Normal | 24 | 24 | All Registered Users | Default | | 4GB/CPU | None |
| | | | | | Max. | 15 cores/Job | 4GB/CPU | |
| | | | | | Total | 96 CPU cores + 2048GB RAM | | |
| shared_gpu_l40 | Normal | 24 | 24 | All Registered Users | Default | | 4GB/CPU | |
| | | | | | Max. | 10 cores/Job | 4GB/CPU | 2 GPU/Job |
| | | | | | Total | 380 CPU cores + 3548GB RAM + 16 x L40 GPUs | | |
| shared_gpu_h20 | Normal | 24 | 24 | All Registered Users | Default | | 4GB/CPU | |
| | | | | | Max. | 10 cores/Job | 4GB/CPU | 2 GPU/Job |
| | | | | | Total | 96 CPU cores + 2048GB RAM + 16 x H20 GPUs | | |
| udsai_gpu_l40 | High | 336 | 336 | UDSAI Assigned Users | Default | | 4GB/CPU | |
| | | | | | Max. | 20 cores/Job | 4GB/CPU | 2 GPU/Job |
| | | | | | Total | 128 CPU cores + 1024GB RAM + 16 x L40 GPUs | | |
- \* shared_cpu is the default partition, used when no partition is specified in the job script.
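As an illustration of these limits, a minimal batch script for the shared_gpu_l40 partition might look like the sketch below; the job name, log file and application command are placeholders, and the resource requests simply stay within the per-job limits listed above:

```bash
#!/bin/bash
#SBATCH --job-name=l40_example         # placeholder job name
#SBATCH --partition=shared_gpu_l40     # partition from the table above
#SBATCH --gres=gpu:l40:1               # 1 L40 GPU (limit: 2 GPU/Job)
#SBATCH --cpus-per-task=4              # within the 10 cores/Job limit
#SBATCH --mem-per-cpu=4G               # matches the 4GB/CPU setting
#SBATCH --time=24:00:00                # max. job duration for this partition
#SBATCH --output=l40_example_%j.log    # placeholder log file (%j = job ID)

# Placeholder workload; replace with your own application.
srun python train.py
```

The script would be submitted with `sbatch l40_example.sh`; jobs that omit `--partition` land in the default shared_cpu partition.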
| Compute Node | Node Type | Features | GRES |
|---|---|---|---|
| eduhkhpc1 - 2 | GPU Node | CPU_MNF:INTEL, CPU_FRQ:2.1GHz, CPU_GEN:XEON-4, CPU_SKU:6430, GPU_CC:8.9, GPU_GEN:ADA, GPU_MEM:48GB, GPU_SKU:L40_PCIE | gpu:l40 |
| eduhkhpc3 - 4 | CPU Node | CPU_MNF:INTEL, CPU_FRQ:2.9GHz, CPU_GEN:XEON-5, CPU_SKU:6542Y, NO_GPU | None |
| h20gpu1 | GPU Node | CPU_MNF:INTEL, CPU_FRQ:2.0GHz, CPU_GEN:XEON-4, CPU_SKU:8460Y+, GPU_CC:9.0, GPU_GEN:HOPPER, GPU_MEM:96GB, GPU_SKU:H20_DGX | gpu:h20 |
| h20gpu2 | GPU Node | CPU_MNF:AMD, CPU_FRQ:2.4GHz, CPU_GEN:EPYC-4, CPU_SKU:9654, GPU_CC:9.0, GPU_GEN:HOPPER, GPU_MEM:96GB, GPU_SKU:H20_DGX | gpu:h20 |
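To verify the features and generic resources (GRES) that Slurm actually advertises for these nodes, the standard query commands can be used (node names as listed above):

```bash
# Show everything Slurm reports for one of the L40 GPU nodes,
# including its Features= and Gres= lines.
scontrol show node eduhkhpc1

# List every node together with its partition and GRES.
sinfo -N -o "%N %P %G"
```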
| Feature | Description |
|---|---|
| CPU_MNF:INTEL | Select only nodes with Intel CPUs |
| CPU_MNF:AMD | Select only nodes with AMD CPUs |
| CPU_FRQ:2.0GHz | Select only nodes with CPUs clocked at 2.0GHz |
| CPU_FRQ:2.1GHz | Select only nodes with CPUs clocked at 2.1GHz |
| CPU_FRQ:2.4GHz | Select only nodes with CPUs clocked at 2.4GHz |
| CPU_FRQ:2.9GHz | Select only nodes with CPUs clocked at 2.9GHz |
| CPU_GEN:XEON-4 | Select only nodes with 4th generation Intel Xeon CPU chips |
| CPU_GEN:XEON-5 | Select only nodes with 5th generation Intel Xeon CPU chips |
| CPU_GEN:EPYC-4 | Select only nodes with 4th generation AMD EPYC CPU chips |
| CPU_SKU:6542Y | Select only nodes with Intel Xeon Gold 6542Y CPU chips |
| CPU_SKU:6430 | Select only nodes with Intel Xeon Gold 6430 CPU chips |
| CPU_SKU:8460Y+ | Select only nodes with Intel Xeon Gold 8460Y+ CPU chips |
| CPU_SKU:9654 | Select only nodes with AMD EPYC 9654 CPU chips |
| GPU_CC:8.9 | Select only nodes with Nvidia GPUs by Compute Capability 8.9 (used with --gres) |
| GPU_CC:9.0 | Select only nodes with Nvidia GPUs by Compute Capability 9.0 (used with --gres) |
| GPU_GEN:ADA | Select only nodes with Ada Lovelace architecture GPUs (used with --gres) |
| GPU_GEN:HOPPER | Select only nodes with Hopper architecture GPUs (used with --gres) |
| GPU_MEM:48GB | Select only nodes with Nvidia 48GB VRAM GPUs (used with --gres) |
| GPU_MEM:96GB | Select only nodes with Nvidia 96GB VRAM GPUs (used with --gres) |
| GPU_SKU:L40_PCIE | Select only nodes with Nvidia L40 GPUs – PCI-e (used with --gres) |
| GPU_SKU:H20_DGX | Select only nodes with Nvidia H20 GPUs – DGX (used with --gres) |
| NO_GPU | Select only nodes with no GPUs installed |
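The feature names above can be combined in a `--constraint` expression to steer a job onto a particular node type. The sketch below is only an illustration that requests an H20 GPU on a Hopper node; the CPU, memory, time values and the workload command are placeholders:

```bash
#!/bin/bash
#SBATCH --partition=shared_gpu_h20
#SBATCH --gres=gpu:h20:1                             # 1 H20 GPU
#SBATCH --constraint="GPU_GEN:HOPPER&GPU_MEM:96GB"   # features from the table above
#SBATCH --cpus-per-task=4                            # placeholder CPU request
#SBATCH --mem-per-cpu=4G
#SBATCH --time=12:00:00

# Placeholder command; replace with your own workload.
srun nvidia-smi
```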
For more information about the Slurm workload manager, please refer to the Slurm Official Documentation.