GPU Cluster User Guide
Quick Start
- Log in to the cluster via SSH. (Recommended SSH clients: PuTTY or MobaXterm for Windows; iTerm2 or Terminus for Mac.) For example:
ssh [username]@[hostname]
IMPORTANT! If this is your first time logging in to the system, set a strong password for your account IMMEDIATELY using the command:
passwd [username]
- Log in to the Docker container using the container ID you received:
sudo docker attach [container-id]
- You are now a sudo user in the container. Get started with your program!
- Detach from the container by pressing Ctrl+P followed by Ctrl+Q. The container stops if you exit it (e.g., with the exit command). To keep the container running, detach instead of exiting.
Start the container if you accidentally exit it:
sudo docker start [container-id]
Restart the container if required:
sudo docker restart [container-id]
Copy files from the container:
sudo docker cp [OPTIONS] [container-id]:[src_path] [dest_path]
Copy files to the container:
sudo docker cp [OPTIONS] [src_path] [container-id]:[dest_path]
More details at https://docs.docker.com/engine/reference/commandline/cp/
If you need a large amount of storage (>200 GB), contact the administrator to create a volume for your container so that you do not need to copy files.
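As an illustration of the two docker cp directions described above, the following sketch wraps them in two small shell functions. The container ID my-container, the function names, and the example paths are hypothetical. The DOCKER variable is an assumption added only for this sketch, so that the commands can be previewed with echo instead of being executed.

```shell
#!/usr/bin/env bash
# Hypothetical placeholder: replace with the container ID you received.
CONTAINER_ID="my-container"

# Command used to invoke Docker. Setting DOCKER=echo (an assumption made for
# this sketch) prints the docker arguments instead of running them.
DOCKER="${DOCKER:-sudo docker}"

# copy_to_container SRC DEST: copy a host path into the container.
copy_to_container() {
    $DOCKER cp "$1" "${CONTAINER_ID}:$2"
}

# copy_from_container SRC DEST: copy a container path back to the host.
copy_from_container() {
    $DOCKER cp "${CONTAINER_ID}:$1" "$2"
}
```

For example, copy_to_container ./data /workspace/data would copy the local directory ./data into the container, and copy_from_container /workspace/results ./results would copy results back to the host.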
Tips
- Want to keep your program running after network disconnection?
Try a session management tool, e.g., Byobu (https://www.byobu.org/home) or screen (https://www.digitalocean.com/community/tutorials/how-to-install-and-use-screen-on-an-ubuntu-cloud-server).
- Check the status of GPU devices:
nvidia-smi
- Select the GPU device(s) for your program if you have multiple GPUs ([GPU-IDs] is a comma-separated list of device indices, e.g., 0,1):
CUDA_VISIBLE_DEVICES=[GPU-IDs] python your_program.py
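When several GPUs are available, one possible way to choose a value for CUDA_VISIBLE_DEVICES is to pick the device with the least memory in use. The sketch below is an assumption about how that could be done and is not part of the cluster's tooling: the function name pick_free_gpu is hypothetical, and it expects the CSV output of nvidia-smi's --query-gpu option on standard input.

```shell
#!/usr/bin/env bash
# pick_free_gpu (hypothetical helper): reads lines of the form
# "index, memory_used_MiB" on stdin and prints the index of the GPU
# with the least memory in use.
pick_free_gpu() {
    sort -t, -k2,2n | head -n 1 | cut -d, -f1
}

# Usage (commented out so the sketch can be sourced on a machine without GPUs):
# gpu_id="$(nvidia-smi --query-gpu=index,memory.used --format=csv,noheader,nounits | pick_free_gpu)"
# CUDA_VISIBLE_DEVICES="$gpu_id" python your_program.py
```

Low memory use does not guarantee that a GPU is idle, so it is still worth checking the full nvidia-smi output before starting a long job.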