Python is one of the most widely used programming languages today, particularly in data analysis. According to the TIOBE index, Python has ascended to the top spot. Tools like IPython serve an individual's local Python environment well, but for team collaboration JupyterHub is a better fit.

JupyterHub is an open-source, multi-user Jupyter Notebook server enabling centralized management of multiple users’ Jupyter Notebooks. For a clear understanding of the distinctions between Jupyter Notebook, JupyterLab, and JupyterHub, refer to the official documentation. In essence, Jupyter Notebook is the classic interactive interface, JupyterLab offers a modern alternative, and JupyterHub serves as the server-side platform for managing multi-user notebooks.

Environment Preparation

In this guide, we’ll use Ubuntu 22.04 as the server OS and pyenv for Python version management. As per the official README, JupyterHub also relies on configurable-http-proxy, which requires a Node.js environment. We’ll install Node.js with nvm.

  1. SSH and System Check:

    • Log in to your system via SSH.
    • Verify the system version using the command cat /etc/os-release.
  2. Install pyenv:

    • Run curl https://pyenv.run | bash to install pyenv. This command downloads and executes the installation script, placing pyenv in the user’s home directory under .pyenv.
  3. Configure Shell Environment:

    • Add pyenv environment variables to your shell configuration file:

    For Bash:

    echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
    echo 'command -v pyenv >/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
    echo 'eval "$(pyenv init -)"' >> ~/.bashrc
    

    For Zsh:

    echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.zshrc
    echo '[[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.zshrc
    echo 'eval "$(pyenv init -)"' >> ~/.zshrc
    
    • Activate the changes with exec "$SHELL".
    • Verify the installation using pyenv --version.
  4. Install Python:

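    • Install Python 3.10 using pyenv install 3.10.
    • Select the installed version for your user, for example by running pyenv local 3.10.13 in your home directory, which writes the ~/.python-version file.
    • List the installed versions with pyenv versions; the asterisk marks the active one: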
    pyenv versions
      system
      3.9.10
    * 3.10.13 (set by /home/ubuntu/.python-version)
    
  5. Install nvm and Node.js:

    • Run curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash to install nvm.
    • Activate the changes with exec "$SHELL".
    • Install Node.js version 18:
    nvm install 18
    nvm use 18
    
  6. Install configurable-http-proxy:

    • Run npm install -g configurable-http-proxy.
    • Locate its installation path using which configurable-http-proxy. You’ll need this path later when configuring the systemd service. For example, the path might be /home/ubuntu/.nvm/versions/node/v18.19.0/bin/configurable-http-proxy.

With these steps, the initial environment setup is complete.
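
To confirm that the tools installed above are actually reachable, a small Python check can help. This is a minimal sketch: the tool list is an assumption based on the steps in this guide, and nvm is left out because it is a shell function rather than an executable on the PATH.

```python
import shutil

# Tool names taken from the steps above; adjust the list to match your setup.
REQUIRED_TOOLS = ["pyenv", "python", "node", "npm", "configurable-http-proxy"]


def find_tools(names):
    """Map each tool name to its absolute path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools(REQUIRED_TOOLS).items():
        print(f"{name:28} {path or 'NOT FOUND'}")
```

The path printed for configurable-http-proxy is the same one that which reports, so it can be reused later for the systemd service.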

Installing JupyterHub

  1. Install JupyterHub and JupyterLab:

    pip install jupyterhub
    pip install --upgrade jupyterlab jupyter
    
  2. Test the Installation:

    • Run jupyterhub to test if the environment is set up correctly. You should see log output similar to:
    [I 2023-12-25 14:06:11.122 JupyterHub app:2859] Running JupyterHub version 4.0.2
    [I 2023-12-25 14:06:11.122 JupyterHub app:2889] Using Authenticator: jupyterhub.auth.PAMAuthenticator-4.0.2
    [I 2023-12-25 14:06:11.122 JupyterHub app:2889] Using Spawner: jupyterhub.spawner.LocalProcessSpawner-4.0.2
    [I 2023-12-25 14:06:11.122 JupyterHub app:2889] Using Proxy: jupyterhub.proxy.ConfigurableHTTPProxy-4.0.2
    [I 2023-12-25 14:06:11.128 JupyterHub app:1664] Loading cookie_secret from /home/ubuntu/jupyterhub_cookie_secret
    [I 2023-12-25 14:06:11.191 JupyterHub proxy:556] Generating new CONFIGPROXY_AUTH_TOKEN
    [I 2023-12-25 14:06:11.200 JupyterHub app:1984] Not using allowed_users. Any authenticated user will be allowed.
    [I 2023-12-25 14:06:11.220 JupyterHub app:2928] Initialized 0 spawners in 0.002 seconds
    [I 2023-12-25 14:06:11.228 JupyterHub metrics:278] Found 0 active users in the last ActiveUserPeriods.twenty_four_hours
    [I 2023-12-25 14:06:11.228 JupyterHub metrics:278] Found 0 active users in the last ActiveUserPeriods.seven_days
    [I 2023-12-25 14:06:11.229 JupyterHub metrics:278] Found 0 active users in the last ActiveUserPeriods.thirty_days
    [W 2023-12-25 14:06:11.230 JupyterHub proxy:746] Running JupyterHub without SSL.  I hope there is SSL termination happening somewhere else...
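
With jupyterhub still running in the foreground, a second terminal can confirm that the Hub is answering requests. The sketch below assumes the default port 8000 and that the root of the Hub's REST API at /hub/api/ returns a small JSON document with a version field; the function name hub_version is only illustrative.

```python
import json
import urllib.request


def hub_version(base_url="http://localhost:8000", timeout=5):
    """Return the version reported by the Hub's REST API, or None if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/hub/api/", timeout=timeout) as resp:
            data = json.load(resp)
    except (OSError, ValueError):
        # OSError covers refused connections, HTTP errors and timeouts;
        # ValueError covers a response body that is not valid JSON.
        return None
    return data.get("version") if isinstance(data, dict) else None


if __name__ == "__main__":
    version = hub_version()
    print(f"JupyterHub {version} is responding" if version else "Hub not reachable")
```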
    

Configuration and Service Management

JupyterHub is now installed, but it still needs to be configured before it can run as a service. We’ll store its files, including jupyterhub_config.py and jupyterhub_cookie_secret, in the /etc/jupyterhub/ directory.

  1. Generate Default Configuration:

    jupyterhub --generate-config
    
  2. Modify Configuration:

    • Edit the generated jupyterhub_config.py in place with sed:
    # inline replace
    sed -i "s#.*c.Authenticator.allowed_users.*#c.Authenticator.allowed_users = {'`whoami`'}#" jupyterhub_config.py
    sed -i "s#.*c.Authenticator.admin_users.*#c.Authenticator.admin_users = {'`whoami`'}#" jupyterhub_config.py
    sed -i "s#.*c.JupyterHub.cookie_secret_file.*#c.JupyterHub.cookie_secret_file = '/etc/jupyterhub/jupyterhub_cookie_secret'#" jupyterhub_config.py
    sed -i "s#.*c.JupyterHub.db_url.*#c.JupyterHub.db_url = 'sqlite:////etc/jupyterhub/jupyterhub.sqlite'#" jupyterhub_config.py
    
    # append line for c.ConfigurableHTTPProxy.pid_file
    sed -i "\#.*c.JupyterHub.pid_file.*#a c.ConfigurableHTTPProxy.pid_file = '/tmp/jupyterhub-proxy.pid'" jupyterhub_config.py
    

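    The `whoami` command substitution expands to the current username when the command runs; substitute a different name if another account should be the allowed and admin user. For multi-user teams, extend the allowed_users set accordingly.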

  3. Move Configuration File:

    sudo mkdir -p /etc/jupyterhub/
    sudo chown -R "$USER:$USER" /etc/jupyterhub/
    mv jupyterhub_config.py /etc/jupyterhub/
    
  4. Create Systemd Service:

    • Create a systemd service file and start the service:
    sudo tee /etc/systemd/system/jupyterhub.service << END
    [Unit]
    Description=JupyterHub
    After=syslog.target network.target
    
    [Service]
    User=ubuntu
    Environment="PATH=/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/home/ubuntu/.nvm/versions/node/v18.19.0/bin:/home/ubuntu/.pyenv/versions/3.10.13/bin"
    ExecStart=/home/ubuntu/.pyenv/versions/3.10.13/bin/jupyterhub -f /etc/jupyterhub/jupyterhub_config.py
    
    [Install]
    WantedBy=multi-user.target
    END
    
    sudo systemctl daemon-reload
    sudo systemctl start jupyterhub
    sudo systemctl enable jupyterhub
    

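    Adjust the User and Environment settings to match your setup. The PATH in Environment must include the directory that holds the configurable-http-proxy binary installed via npm (the path reported earlier by which configurable-http-proxy) as well as the pyenv Python bin directory.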

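    JupyterHub's default PAMAuthenticator logs users in with their system passwords. If your virtual machine relies on SSH keys for login and the current user has no password, set one using sudo passwd $USER (a bare sudo passwd would change the root password instead).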

  5. Set Up Reverse Proxy (Optional):

    • Point your domain name to the server’s IP address.
    • Use Caddy as a reverse proxy by adding the following to /etc/caddy/Caddyfile:
    hub.example.com {
        reverse_proxy localhost:8000
    }
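
Looking back at step 2, each sed command replaces every line that mentions a setting with an active assignment. The same idea can be sketched in Python, which may be easier to adapt for a longer list of settings. This is only an illustration: override_settings is a hypothetical helper, it matches setting names as literal substrings rather than as sed regular expressions, and the sample text is hand-written to resemble commented-out defaults.

```python
def override_settings(config_text, overrides):
    """Replace each line that mentions a setting with `name = value`."""
    lines = []
    for line in config_text.splitlines():
        for name, value in overrides.items():
            if name in line:
                # repr() yields a valid Python literal for strings and sets.
                line = f"{name} = {value!r}"
                break
        lines.append(line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hand-written sample, not the real generated file.
    sample = (
        "# c.Authenticator.allowed_users = set()\n"
        "# c.JupyterHub.db_url = 'sqlite:///jupyterhub.sqlite'\n"
    )
    print(override_settings(sample, {
        "c.Authenticator.allowed_users": {"ubuntu"},
        "c.JupyterHub.db_url": "sqlite:////etc/jupyterhub/jupyterhub.sqlite",
    }))
```

Like the sed version, this rewrites every matching line, so it is worth reviewing the result before moving the file into /etc/jupyterhub/.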
    

Now, you can access your JupyterHub instance by visiting your domain name and logging in with the configured user credentials.
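
The unit file from step 4 repeats the pyenv and nvm paths in several places, which makes it easy to update one and forget another after a Python or Node.js upgrade. One possible way to keep them consistent is to render the file from a template. The sketch below is an assumption-laden example, not part of JupyterHub: render_unit and its parameters are hypothetical names, and the paths in the usage section are the example values from this guide.

```python
from pathlib import PurePosixPath

UNIT_TEMPLATE = """\
[Unit]
Description=JupyterHub
After=syslog.target network.target

[Service]
User={user}
Environment="PATH={path}"
ExecStart={jupyterhub} -f {config}

[Install]
WantedBy=multi-user.target
"""

# System directories copied from the unit file in step 4.
SYSTEM_PATH = "/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin"


def render_unit(user, python_bin, node_bin,
                config="/etc/jupyterhub/jupyterhub_config.py"):
    """Fill in the unit file from the pyenv and nvm bin directories."""
    return UNIT_TEMPLATE.format(
        user=user,
        path=":".join([SYSTEM_PATH, str(node_bin), str(python_bin)]),
        jupyterhub=PurePosixPath(python_bin) / "jupyterhub",
        config=config,
    )


if __name__ == "__main__":
    # Example values from this guide; replace them with your own paths.
    print(render_unit(
        user="ubuntu",
        python_bin="/home/ubuntu/.pyenv/versions/3.10.13/bin",
        node_bin="/home/ubuntu/.nvm/versions/node/v18.19.0/bin",
    ))
```

The printed text can be piped to sudo tee /etc/systemd/system/jupyterhub.service in place of the here-document.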
