Jupyter


NOTE: this service is legacy. Please consider using our JupyterHub instead: https://wikis.uni-paderborn.de/pc2doc/JupyterHub

General Usage

Jupyter notebooks (https://jupyter.org/) with IPython can be used interactively on the compute nodes of OCuLUS. Please note that notebooks are executed directly on a compute node and not on the frontend nodes. We provide a wrapper script named jupyterspawner that takes care of the configuration of the jupyter notebook and helps you to redirect the notebook to your local computer. It thus allows you to work interactively with a jupyter notebook that is running on a compute node.

To be able to use the wrapper-script jupyterspawner, you first have to load the corresponding module with

module load jupyterspawner

Then you can start a new jupyter notebook with the default options (runtime 2h, ncpus=8) with the command

jupyterspawner --start

This prepares a job and submits it to the workload management system (CCS). Depending on whether the resources are currently available, the job either starts directly or is planned for execution at a later point in time. Many additional options are available. Please consult the help (jupyterspawner --help) for a complete list and see the examples below.

You can query the current status of your jupyter notebooks with the command

jupyterspawner --list

If the state of a jupyter notebook is listed as running, you can connect to it. The procedure for connecting is as follows:

  1. First you need to redirect the notebook to your local computer. For this purpose, jupyterspawner gives you an ssh command with a port redirection. You have to execute this command in a terminal on your local computer.
  2. Secondly, you have to open a web-browser on your local computer and enter the url, e.g. http://127.0.0.1:8000, in the address bar. Even though the web-browser uses unencrypted http, your data is encrypted because it is tunneled through an encrypted ssh connection.
  3. Thirdly, you have to enter the password to access the notebook. The password is listed in the corresponding section in the output of jupyterspawner --list. The default is to generate a random password. However, you can also set a custom password when you start a new jupyter notebook with the commandline option --password.
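
Before opening the web-browser in step 2, it can help to check that the ssh port redirection from step 1 is actually listening on your local computer. The following is a minimal sketch in Python, to be run on your local computer. The helper name port_is_open is hypothetical, and port 8000 is simply the port from the example url above. The check only confirms that something accepts connections on the local port. It does not prove that the notebook on the compute node has finished starting.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


# 8000 is the local port from the example url http://127.0.0.1:8000.
if port_is_open("127.0.0.1", 8000):
    print("Local port 8000 accepts connections: try http://127.0.0.1:8000")
else:
    print("Nothing is listening on local port 8000: is the ssh command still running?")
```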

All calculations in the jupyter notebooks are executed on the compute node of the cluster and have access to the file systems on the cluster but NOT to data on your local computer.

Please also consider that the jupyter notebook runs as a job on the cluster. Hence, the runtime is limited and resource limitations (ncpus, mem, vmem, see OpenCCS) apply.
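
To see for yourself where a notebook is running, you can execute a short cell such as the following sketch. It uses only the Python standard library. The function name describe_session is hypothetical. The reported hostname should be that of a compute node and not of your local computer. Whether the number of usable cpu-cores reflects the requested ncpus depends on how the workload manager restricts the job, which is an assumption here and not something this page states.

```python
import os
import socket


def describe_session() -> dict:
    """Collect basic facts about the machine this code is running on."""
    try:
        # Linux only: the set of cpu-cores this process is allowed to use.
        usable_cpus = len(os.sched_getaffinity(0))
    except AttributeError:
        # Other platforms: fall back to the total number of cpu-cores.
        usable_cpus = os.cpu_count()
    return {
        "hostname": socket.gethostname(),
        "working_dir": os.getcwd(),
        "usable_cpus": usable_cpus,
    }


for key, value in describe_session().items():
    print(f"{key}: {value}")
```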

The maximal runtime of a jupyter notebook can be set with the command line argument --time, e.g., --time 10:00 for 10 hours or --time 1-00:00 for one day. If you know in advance when you want to use a jupyter notebook, you can set the start time with the command line option --begin. For example, --begin 1400 sets the start time to 14:00. The workload manager (CCS) determines whether the job can be started at the given time. If this is not possible because the resources are not available, the job is discarded. The output of the command to start a notebook, e.g., jupyterspawner --start --begin 1400, tells you directly whether the workload manager was able to allocate or plan the job.
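
The --time examples above suggest a format of hours:minutes with an optional leading number of days. The following Python sketch spells out that reading. It is an interpretation of the examples on this page and not the parser used by jupyterspawner. The function name parse_runtime is hypothetical.

```python
from datetime import timedelta


def parse_runtime(spec: str) -> timedelta:
    """Convert a '[days-]hours:minutes' string into a timedelta."""
    days = 0
    if "-" in spec:
        # A leading 'N-' is read as a number of days.
        day_part, spec = spec.split("-", 1)
        days = int(day_part)
    hours, minutes = (int(part) for part in spec.split(":"))
    return timedelta(days=days, hours=hours, minutes=minutes)


for example in ("10:00", "1-00:00", "4:0"):
    print(example, "->", parse_runtime(example))
```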

To stop a jupyter notebook you can either

  • kill the corresponding job with ccskill JOBID, or
  • use jupyterspawner --kill JOBID

Use with different operating systems on your local computer

Linux, ...

The only thing you need locally is a web-browser (Chrome, Firefox,...) and ssh.

MacOS

The only thing you need locally is a web-browser (Chrome, Firefox,...) and ssh.

Microsoft Windows

You need a web-browser like Firefox or Chrome. Microsoft Windows does not come with a command line ssh-client. Thus, in order to use the forwarding of the jupyter notebook to your local web-browser you have to configure your ssh-client to forward the required ports.

The procedure for port-forwarding depends on the ssh-client.

In the ssh-client PuTTY (https://www.putty.org/) you can set up port forwarding by selecting Connection -> SSH -> Tunnels on the left side. If jupyterspawner --list returns something like 'ssh ... -L 8000:gpu005:10000', you have to use 8000 as the Source port. The Destination is gpu005:10000 in this case. Please make sure that Local is selected. Then click Add. A new entry should show up in the list of Forwarded ports. Then open your connection as usual.
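
The mapping from the -L argument to the two PuTTY input fields can be sketched in a few lines of Python. The function name putty_fields is hypothetical. The value 8000:gpu005:10000 is the example from above, so your own port and node name will differ.

```python
def putty_fields(forward_spec: str) -> dict:
    """Split 'LOCALPORT:HOST:HOSTPORT' into PuTTY's two input fields."""
    # Everything before the first colon is the local (source) port,
    # the remainder is the destination as PuTTY expects it.
    source_port, destination = forward_spec.split(":", 1)
    return {"Source port": source_port, "Destination": destination}


print(putty_fields("8000:gpu005:10000"))
```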

Requesting Resources (cpu-cores, gpus,...)

A jupyter notebook always runs on a single compute node. To request resources other than the defaults, you can use the command line arguments

  • --ncpus: requested number of cpu-cores
  • --mem: amount of requested main memory, e.g. --mem 16g
  • --vmem: amount of requested virtual memory, e.g. --vmem 30g
  • --gpu: request a GPU. Possible choices are none, rtx2080=1 (one RTX 2080Ti), rtx2080=2 (two RTX 2080Ti), gtx1080=1 (one GTX 1080 TI), gtx1080=2 (two GTX 1080 TI) or tesla (one Tesla K20).

Alternatively you can also provide the resource specification in the CCS-format with the argument --res, e.g., --res rset=1:ncpus=8:mem=12g.
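
A resource specification such as the one above is a colon-separated list of key=value pairs. The following Python sketch splits such a string so that you can inspect the individual requests. It only covers the simple form shown in the example and makes no claim about the full CCS syntax. The function name parse_resources is hypothetical, and values are kept as strings because the unit handling of CCS is not described on this page.

```python
def parse_resources(spec: str) -> dict:
    """Split a colon-separated 'key=value' resource string into a dict."""
    resources = {}
    for item in spec.split(":"):
        key, _, value = item.partition("=")
        resources[key] = value
    return resources


print(parse_resources("rset=1:ncpus=8:mem=12g"))
```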

Jupyter Notebooks with Existing Singularity Containers

It is possible to use existing singularity containers with jupyterspawner. You can specify the path to the container with the command line argument --singularity PATH. Jupyter assumes that the python3 binary is located at /usr/bin/python3 in the container. To use a different path, use the command line argument --pythonpath. With --kernelname you can set the name of the kernel explicitly. PC2PFS, PC2DATA, PC2SCRATCH and your home directory are accessible from within the container.

Example

The command 'jupyterspawner --start --time 4:0 --mail --gpu gtx1080=1 --jobname jupyter_on_gtx1080ti --begin 1345 --mem 16g --vmem 85g --singularity $PC2SW/JUPYTERSPAWNER/example/keras.simg --kernelname Keras' starts a notebook with the properties

  • --time 4:0: maximal runtime of 4 hours
  • --mail: mail notifications
  • --gpu gtx1080=1: request one GTX 1080 TI
  • --jobname jupyter_on_gtx1080ti: set the name of the job to jupyter_on_gtx1080ti
  • --begin 1345: start the notebook at 13:45
  • --mem 16g: request 16 GB of main memory
  • --vmem 85g: request 85 GB of virtual memory
  • --singularity $PC2SW/JUPYTERSPAWNER/example/keras.simg: use the singularity container at $PC2SW/JUPYTERSPAWNER/example/keras.simg
  • --kernelname Keras: set the name of the kernel to Keras

The command outputs:

Requested jupyter notebook is allocated or planned.
############################################################
state:       planned
job id:      7001504
start:       13:45
time:        4h
resources:   1:ncpus=1:gtx1080=t:gpus=1:mem=16g:vmem=85g
working dir: /upb/departments/pc2/users/r/rschade
password:    f3Ja1cmEkJ8

This tells you that the notebook is planned to start at 13:45.

Shortly after 13:45, jupyterspawner --list gives the output:

####################################################################################################
WARNING: notebook is starting, port is not yet determined. Please try again in a few seconds.
state:         starting
job id:        7001504
node:          gpu005
port (node):   not yet known
port (remote): 8000
start:         13:45
time:          4h
resources:     1:ncpus=1:gtx1080=t:gpus=1:mem=16g:vmem=85g
working dir:   /upb/departments/pc2/users/r/rschade
password:      ciPDlmvTvwf1zZ
logfile:       /upb/departments/pc2/users/r/rschade/.jupyter/jupyterspawn/oculus//1551357797.343124/jupyter.log

This tells you that the job is now running and the jupyter notebook is starting.

A few seconds later, once the notebook has started, jupyterspawner --list gives the output:

####################################################################################################
state:         running
job id:        7001504
node:          gpu005
port (node):   10000
port (remote): 8000
start:         13:45
time:          4h
resources:     1:ncpus=1:gtx1080=t:gpus=1:mem=16g:vmem=85g
working dir:   /upb/departments/pc2/users/r/rschade
password:      ciPDlmvTvwf1zZ
logfile:       /upb/departments/pc2/users/r/rschade/.jupyter/jupyterspawn/oculus//1551357797.343124/jupyter.log

Please use one of the ssh-commands at your local computer to redirect the jupyter notebook from the node to your local computer:

ssh ACCOUNT@fe.pc2.uni-paderborn.de -L 8000:gpu005:10000
ssh ACCOUNT@fe-2.cv2012.pc2.uni-paderborn.de -L 8000:gpu005:10000

Then open a webbrowser on your local computer and enter  http://127.0.0.1:8000 in the address field.

This output tells you that the jupyter notebook is running and lists the commands to redirect the notebook to your local web-browser. Now enter one of the ssh commands in a new terminal window and log in. Then open a web-browser, enter the url listed in the last line and log in with the password given in the output of jupyterspawner --list.
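
The relation between the status fields and the ssh command can be sketched in Python as follows. The sketch assumes the "key: value" layout of the output shown above, which may differ between versions of the tool. The names SAMPLE, parse_status and tunnel_command are hypothetical, and ACCOUNT is a placeholder for your own account name.

```python
# A shortened copy of the status block shown above.
SAMPLE = """\
state:         running
job id:        7001504
node:          gpu005
port (node):   10000
port (remote): 8000
start:         13:45
"""


def parse_status(text: str) -> dict:
    """Turn 'key: value' lines into a dict; lines without a value are skipped."""
    status = {}
    for line in text.splitlines():
        # Split at the first colon only, so values like 13:45 stay intact.
        key, sep, value = line.partition(":")
        if sep and value.strip():
            status[key.strip()] = value.strip()
    return status


def tunnel_command(status: dict, account: str, frontend: str) -> str:
    """Build the ssh port-forwarding command for a running notebook."""
    return (
        f"ssh {account}@{frontend} "
        f"-L {status['port (remote)']}:{status['node']}:{status['port (node)']}"
    )


status = parse_status(SAMPLE)
if status["state"] == "running":
    print(tunnel_command(status, "ACCOUNT", "fe.pc2.uni-paderborn.de"))
```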

In the upper right corner of the jupyter notebook, click New and select the kernel Keras. This opens a new working area.

You can now initialize Keras/Tensorflow:
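
A first cell could, for instance, check whether TensorFlow sees the allocated GPU. The following is a minimal sketch and not the content of the example notebook. It assumes that the container provides TensorFlow 2.1 or newer, where tf.config.list_physical_devices is available. Older versions need a different call. The function name list_gpus is hypothetical.

```python
def list_gpus():
    """Return the names of GPUs TensorFlow can see, or None without TensorFlow."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = list_gpus()
if gpus is None:
    print("TensorFlow is not installed in this kernel.")
elif gpus:
    print("GPUs visible to TensorFlow:", gpus)
else:
    print("TensorFlow is installed but sees no GPU.")
```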

You can then try training a neural net on the allocated GPU. The complete example is available at $PC2SW/JUPYTERSPAWNER/example/keras_gpu_test.ipynb.

In Case of Problems

  • If you select the Singularity kernel in your jupyter notebook and it doesn't start:
    • If the logfile (see jupyterspawner --list, e.g. /upb/departments/pc2/users/r/rschade/.jupyter/jupyterspawn/oculus//1551357797.343124/jupyter.log) contains something like /usr/bin/python3: No module named ipykernel, you have tried to run a singularity container in which the python installation doesn't contain ipykernel. As a workaround you can perform the following steps:
      • module load singularity
      • singularity shell --cleanenv !PATH_TO_SINGULARITY_CONTAINER!
      • pip install --user ipykernel (if that doesn't work, use python3 -m pip install --user ipykernel)
      • exit the singularity container with exit
      • after this, restart your jupyter notebook and try to launch the kernel again.
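
To verify the workaround before restarting the notebook, you can check from within the container whether python3 finds ipykernel. The following sketch uses only the Python standard library and can be pasted into a python3 session started inside the singularity shell. The function name has_module is hypothetical.

```python
import importlib.util
import sys


def has_module(name: str) -> bool:
    """Return True if the running interpreter can find the named module."""
    return importlib.util.find_spec(name) is not None


# The interpreter path shows which python installation is being checked.
print("interpreter:", sys.executable)
print("ipykernel available:", has_module("ipykernel"))
```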