
Setup & Configure Jupyterhub

Balthasar Hofer edited this page Oct 13, 2020 · 3 revisions

TLJH - The Littlest Jupyterhub

Set up a new JupyterHub on an Ubuntu 20.04 server.

  1. Install your preferred Python version and set it as the system's default.
  2. Install tljh as described here
  3. Configure everything as you need it, for example set up SSH and configure an admin user

Setup Oauth2

Register your app in Azure AD and note the TENANT_ID, CLIENT_SECRET and CLIENT_ID.

Install oauthenticator into the Python 3 environment that runs tljh. By default:

/opt/tljh/hub/bin/python3 -m pip install oauthenticator

Access the logs

journalctl -n 100 -f -u jupyterhub

To handle the special requirements of the GBSL response (in order to create proper user names), we create our own authenticator inside the installed oauthenticator package folder. Locate the package directory by running

/opt/tljh/hub/bin/python3 -m pip show oauthenticator

Then add a new file called my_azuread.py to that directory with the following content:

"""
Custom Authenticator to use Azure AD with JupyterHub
"""
import json
import jwt
import urllib.parse

from tornado.httpclient import HTTPRequest, AsyncHTTPClient

from jupyterhub.auth import LocalAuthenticator

from traitlets import default

from .azuread import AzureAdOAuthenticator, azure_token_url_for


class MyAzureAdOAuthenticator(AzureAdOAuthenticator):
    login_service = "Office365 GBSL"

    @default('username_claim')
    def _username_claim_default(self):
        return 'unique_name'

    async def authenticate(self, handler, data=None):
        code = handler.get_argument("code")
        http_client = AsyncHTTPClient()

        params = dict(
            client_id=self.client_id,
            client_secret=self.client_secret,
            grant_type='authorization_code',
            code=code,
            redirect_uri=self.get_callback_url(handler))

        data = urllib.parse.urlencode(
            params, doseq=True, encoding='utf-8', safe='=')

        url = azure_token_url_for(self.tenant_id)

        headers = {
            'Content-Type':
            'application/x-www-form-urlencoded; charset=UTF-8'
        }
        req = HTTPRequest(
            url,
            method="POST",
            headers=headers,
            body=data  # Body is required for a POST...
        )

        resp = await http_client.fetch(req)
        resp_json = json.loads(resp.body.decode('utf8', 'replace'))

        # app_log.info("Response %s", resp_json)
        access_token = resp_json['access_token']

        id_token = resp_json['id_token']
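        # Signature verification is skipped here. The options form works with
        # PyJWT 1.x and 2.x; the older verify=False argument was removed in 2.0.
        decoded = jwt.decode(id_token, options={"verify_signature": False})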

        cleaned_name = decoded[self.username_claim]
        cleaned_name = cleaned_name.replace(',', '')
        cleaned_name = cleaned_name.replace(' ', '')
        cleaned_name = cleaned_name.replace('@', '__')
        cleaned_name = cleaned_name.replace('.', '_')

        if len(cleaned_name) > 31:
            # we need to shorten this because it won't work with the system's useradd!
            splitpos = cleaned_name.find("__")
            before = cleaned_name[0:splitpos]
            after = cleaned_name[splitpos:]
            remaining = 31 - len(after)
            shortened = before[0:remaining]
            cleaned_name = shortened + after

        userdict = {"name": cleaned_name}

        userdict["auth_state"] = auth_state = {}
        auth_state['access_token'] = access_token
        # results in a decoded JWT for the user data
        auth_state['user'] = decoded

        return userdict


class LocalMyAzureAdOAuthenticator(LocalAuthenticator, MyAzureAdOAuthenticator):
    """A version that mixes in local system user creation"""
    pass
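
The name cleaning inside authenticate can be tried out on its own. The following is a minimal sketch that copies the replace and shorten steps into a standalone helper; the function name clean_username, the max_len parameter and the long example address are made up for illustration and are not part of oauthenticator.

```python
def clean_username(raw_name: str, max_len: int = 31) -> str:
    """Turn an Azure AD unique_name into a name usable as a system account.

    Hypothetical helper that mirrors the steps in MyAzureAdOAuthenticator.
    """
    name = raw_name.replace(',', '').replace(' ', '')
    name = name.replace('@', '__').replace('.', '_')

    if len(name) > max_len:
        # Keep the "__domain" suffix and trim only the part before it.
        splitpos = name.find('__')
        before, after = name[:splitpos], name[splitpos:]
        name = before[:max_len - len(after)] + after
    return name


print(clean_username('balthasar.hofer@gbsl.ch'))

# Made-up long address to exercise the shortening branch.
long_name = clean_username('a.very.long.firstname.lastname@gbsl.ch')
print(long_name, len(long_name))
```

The first call yields balthasar_hofer__gbsl_ch, which is the form the hub sees as the user name.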

jupyterhub config

Most configuration can be done with tljh-config or by editing config.yaml directly (by default the config is located at /opt/tljh/config/config.yaml). The users and https fields should already be set; I additionally added the per-user limits fields.

The new fields you need to add are the auth fields.

users:
  admin:
  - foobar
https:
  enabled: true
  letsencrypt:
    email: foo@bar
    domains:
    - jupyter.foo.bar
limits:
  memory: 1024M
  cpu: 1
auth:
  type: "oauthenticator.my_azuread.LocalMyAzureAdOAuthenticator"
  LocalMyAzureAdOAuthenticator:
    tenant_id: "xxxxxx-xxxxxx-xxxxxxx"
    client_id: "xxxxxx-xxxxxx-xxxxxxx"
    client_secret: "xxxxxx-xxxxxx-xxxxxxx"
    oauth_callback_url: "https://jupyter.XXXXXX.XXXXX/hub/oauth_callback"

systemd service

JupyterHub is started by systemd. Edit the service file /etc/systemd/system/jupyterhub.service and add the required environment variables (and, if needed, change the ExecStart command):

Environment=LOGIN_SERVICE="Office365 GBSL"
# This is our Azure AD Tenant ID 
Environment=AAD_TENANT_ID=xxxxx-xxxxxxx-xxxxxxxx-xxxxxxx

tljh/config/jupyterhub_config.d

All Python files within `tljh/config/jupyterhub_config.d` are automatically included in the JupyterHub config. Add a new file user_configs.py:

c.LocalAuthenticator.create_system_users = True
c.SystemdSpawner.unit_name_template = '{USERNAME}'
c.Authenticator.username_map = {
    "balthasar_hofer__gbsl_ch" : "lebalz",
}
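
To see what the username_map entry does, here is a rough sketch of the lookup. It assumes that JupyterHub normalizes a name by lowercasing it and then looking it up in username_map, which is how I understand Authenticator.normalize_username; the standalone function below is only an illustration, not JupyterHub's code.

```python
# Same mapping as in user_configs.py above.
USERNAME_MAP = {
    "balthasar_hofer__gbsl_ch": "lebalz",
}


def normalize_username(username: str, username_map: dict) -> str:
    """Illustrative stand-in: lowercase first, then apply the map if present."""
    username = username.lower()
    return username_map.get(username, username)


print(normalize_username("Balthasar_Hofer__gbsl_ch", USERNAME_MAP))
print(normalize_username("someone_else__gbsl_ch", USERNAME_MAP))
```

Names that are not in the map pass through unchanged, so only the listed accounts get a custom system user name.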

Change the unit_name_template in the default UserCreatingSpawner

Locate the raw jupyterhub_config.py with

find / -iname jupyterhub_config.py

Then edit UserCreatingSpawner(SystemdSpawner) so that the system username no longer gets the jupyter- prefix and matches our configured unit_name_template:

# system_username = generate_system_username('jupyter-' + self.user.name)
system_username = generate_system_username(self.user.name)

Fix the username length limit in tljh/normalize.py

By default, tljh/normalize.py already rewrites usernames once they reach 26 characters. This is a bit too restrictive; the limit can be raised to 32 characters.

Locate the raw normalize.py with

find / -iname normalize.py

and edit generate_system_username so that names of up to 32 characters are returned unchanged:

def generate_system_username(username):
    """
    Generate a posix username from given username.

    If username <= 32 char, we just return it.
    Else, we hash the username, truncate username at
    26 char, append a '-' and then the first 5 char of the hash.
    This makes sure our usernames are never longer than 32 char.
    """

    if len(username) <= 32:
        return username

    userhash = hashlib.sha256(username.encode('utf-8')).hexdigest()
    return '{username_trunc}-{hash}'.format(
        username_trunc=username[:26],
        hash=userhash[:5]
    )
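
A quick way to check the edited limit is to run the function outside of tljh. The sketch below repeats the edited function so that it runs on its own; the two sample names are invented for the test.

```python
import hashlib


def generate_system_username(username):
    """Copy of the edited function: names up to 32 chars pass through."""
    if len(username) <= 32:
        return username

    userhash = hashlib.sha256(username.encode('utf-8')).hexdigest()
    return '{username_trunc}-{hash}'.format(
        username_trunc=username[:26],
        hash=userhash[:5]
    )


short_input = 'balthasar_hofer__gbsl_ch'  # 24 chars, returned unchanged
long_input = 'x' * 40                     # too long, gets truncated and hashed

print(generate_system_username(short_input))
result = generate_system_username(long_input)
print(result, len(result))
```

A rewritten name is 26 characters, a '-' and 5 hash characters, so it is 32 characters long.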

Reload config

tljh-config reload

Use Dummy Authenticator

tljh-config set auth.type dummyauthenticator.DummyAuthenticator
tljh-config set auth.DummyAuthenticator.password foobar