Available machines

As an overview, you have three choices in order to run GPU-enabled code:

The very heavily-used lxplus-gpu machines ([NOT RECOMMENDED]).
The Point5's Online Machines (RECOMMENDED).
Special nodes.

lxplus-gpu

The lxplus service offers lxplus-gpu.cern.ch for shared GPU instances - with limited isolation and performance.

One can connect similary as would do to the lxplus.cern.ch host domain.

ssh <username>@lxplus-gpu.cern.ch [-X]

If connecting directly from your computer

You might need to initialize kerberos and execute bash again from within the machine:

kinit
exec bash

CMS Point 5 Machines

Info

This section is taken from the CMS TWiki TriggerDevelopmentWithGPUs page.

There are 10 machines available for general development and validation of the online reconstruction on GPUs:

gpu-c2a02-35-01.cms
gpu-c2a02-35-02.cms
gpu-c2a02-37-01.cms
gpu-c2a02-37-02.cms (currently without a GPU)
gpu-c2a02-37-03.cms
gpu-c2a02-37-04.cms
gpu-c2a02-39-01.cms
gpu-c2a02-39-02.cms (Preferred)
gpu-c2a02-39-03.cms
gpu-c2a02-39-04.cms

These are dedicated machines for the development of the online reconstruction.

To access them, you will first need a CMS online account. See below for instructions.

Request a CMS Online account

To request access, please subscribe to the cms-hlt-gpu e-group and send an email to andrea.bocci@cern.ch, indicating:

whether you already have an online account;
your online or lxplus username;
your full name and email.

How to connect

Requirements:

Have a CMS online account and
Be in the gpudev group.

To connect directly from your computer:

Create a proxy:

ssh -f -N -D18080 <username>@cmsusr.cern.ch

Connect via SSH:

ssh -o ProxyCommand='nc --proxy localhost:18080 --proxy-type socks5 %h %p' <username>@gpu-c2a02-39-02.cms

or

ssh -o ProxyCommand='nc -x localhost:18080 -X 5 %h %p' <username>@gpu-c2a02-39-02.cms

Note

More detailed instructions here

Special configuration required

To make commands like cmsenv and cmsrel available, run
```
source /cvmfs/cms.cern.ch/cmsset_default.sh
```
first.

To allow connecting to GitHub via HTTP:

Configure the SOCKS proxy
Open the proxy:
```
ssh -f -N cmsusr.cms
```

Configure git:

git config --global --replace-all http.proxy socks5://localhost:18080

Set the correct SCRAM_ARCH for these machines:
```
export SCRAM_ARCH=el8_amd64_gcc10
```

Notes

These machines lie in a different subnet than the one that the LXPLUS machines belong to.
A side-effect of the previous point is that those machines do not have access to the Grid.
The /nfshome0/<username> directory is shared and available from all the machines above, but has limited space.
The /data/user/<username> directory is not shared across the devices, but has larger capacity.
CMS Cluster Users Guide
Use curl with the --socks5 flag like so: curl --socks5 socks5://localhost:18080 <url>

Useful commands

Transfering files to/from P5 machines

From your own computer:

scp -r -o ProxyCommand='nc -x localhost:18080 -X 5 %h %p' <username>@gpu-c2a02-39-01.cms:/remote/path /local/path

This prevents the nc: invalid option -- '-' error.

Special GPU nodes

Info

This section is more or less taken from the Patatrack website systems subpage.

cmg-gpu1080

System information

Topology of the machine

Getting access to the machine

In order to get access to the machine you should send a request to subscribe to the CERN e-group: cms-gpu-devel.

You should also send an email to Felice Pantaleo motivating the reason for the requested access.

Usage Policy

Normally, no more than 1 GPU per users should be used. To limit visible devices use

export CUDA_VISIBLE_DEVICES=<list of numbers>

Where <list of numbers> can be e.g. 0, 0,4, 1,2,3. Use nvidia-smi to check available resources.

Usage for ML studies

If you need to use the machine for training DNNs you could accidentally occupy all the GPUs, making them unavailable for other users.

For this reason you're kindly asked to use

import setGPU

before any import that will use a GPU (e.g. tensorflow). This will assign to you the least loaded GPU on the system.

It is strictly forbidden to use GPUs from within your jupyter notebook. Please export your notebook to a python program and execute it. The access to the machine will be revoked when failing to comply to this rule.