@dcmcand dcmcand commented Oct 20, 2025

This commit implements comprehensive security hardening across all container images while maintaining full functionality and package compatibility.

Security Improvements:

  • Non-root execution: All containers run as nebari user (UID 1000, GID 100)
  • Pinned base images: Ubuntu 24.04 and CUDA images pinned to manifest digests
  • Verified binaries: Pixi v0.32.1 with SHA256 verification (amd64/arm64)
  • Multi-stage builds: Separate builder and runtime stages for each component
  • Minimal runtime images: Only essential libraries in final images
  • Secure installation: Direct binary downloads replace curl|bash patterns
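As an illustration of the "verified binaries" point, here is a minimal sketch of a checksum gate that could run right after a direct binary download. `verify_sha256` is a hypothetical helper name and is not taken from this PR's Dockerfile; it assumes `sha256sum` and `awk` are available in the build stage.

```shell
# Hypothetical helper (not from the PR): fail unless FILE hashes to EXPECTED.
verify_sha256() {
    file=$1
    expected=$2
    # Refuse an empty expectation so a missing build ARG cannot pass silently.
    [ -n "$expected" ] || { echo "no expected checksum given" >&2; return 1; }
    # sha256sum prints "<hex digest>  <filename>"; keep only the digest.
    actual=$(sha256sum "$file" | awk '{print $1}')
    if [ "$actual" != "$expected" ]; then
        echo "checksum mismatch for $file: got $actual" >&2
        return 1
    fi
}
```

In a build stage this would sit between the download and the install step, so a tampered or truncated binary stops the build instead of being copied into the image.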

Supply Chain Security:

  • SBOM generation: Docker buildx SBOM attestation for all images
  • Provenance tracking: Build provenance attached to images
  • Can be inspected with: docker buildx imagetools inspect IMAGE --format "{{ json .SBOM }}"

Reproducibility:

  • Manifest list digests for multi-arch builds (linux/amd64, linux/arm64)
  • Pinned pixi version with checksums
  • OCI image labels with base digest and component metadata
  • Build ARGs for easy version updates
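To make the digest-pinning requirement checkable, a small guard like the following could be used in CI. This is only a sketch: `is_digest_pinned` is a hypothetical name, and it checks the shape of the reference (a trailing `@sha256:` plus 64 hex characters), not whether the digest exists in a registry.

```shell
# Hypothetical helper (not from the PR): succeed only if the image reference
# ends in an @sha256:<64 lowercase hex chars> digest.
is_digest_pinned() {
    printf '%s\n' "$1" | grep -Eq '@sha256:[0-9a-f]{64}$'
}
```

A tag-only reference such as `ubuntu:24.04` fails the check, while the same image with a manifest list digest appended passes.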

Functionality:

  • External environment mounting at /opt/envs preserved
  • All current packages maintained (no removals)
  • Code-server installation in jupyterlab (with wget/curl in builder only)
  • Proper file permissions via updated fix-permissions script

Infrastructure:

  • Upgraded from Ubuntu 20.04 to 24.04
  • 11-stage multi-stage build architecture:
    1. pixi-installer: Secure pixi binary installation
    2. builder: Base with nebari user
    3-4. dask-worker-builder + runtime
    5-6. jupyterhub-builder + runtime
    7. intermediate: Common base for jupyterlab/workflow-controller
    8-9. jupyterlab-builder + runtime
    10-11. workflow-controller-builder + runtime

Technical Details:

  • Ubuntu 24.04 default ubuntu user (UID 1000): removed and replaced with nebari
  • Users group (GID 100) already exists in Ubuntu 24.04
  • PostBuild scripts run as root when creating /opt directories
  • BuildKit cache mounts for faster builds
  • fix-permissions updated for nebari:users ownership

All images successfully built and verified to run as nebari user.
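One way to spot-check the "runs as nebari user" claim is to compare the effective UID and GID inside a running container against the expected values. The sketch below assumes it is executed inside the container; `check_identity` is a hypothetical name, and the expected values are passed in rather than hard-coded.

```shell
# Hypothetical helper (not from the PR): succeed only if the current process
# runs with the expected numeric UID and primary GID.
check_identity() {
    expected_uid=$1
    expected_gid=$2
    actual_uid=$(id -u)
    actual_gid=$(id -g)
    if [ "$actual_uid" != "$expected_uid" ] || [ "$actual_gid" != "$expected_gid" ]; then
        echo "running as $actual_uid:$actual_gid, expected $expected_uid:$expected_gid" >&2
        return 1
    fi
}
```

For the values stated in the description this would be called as `check_identity 1000 100`.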

@dcmcand dcmcand requested a review from marcelovilla October 20, 2025 13:32
Dockerfile Outdated
Comment on lines 59 to 62
RUN userdel -r ubuntu && \
useradd -r -m -u 1000 -g 100 -s /bin/bash nebari && \
mkdir -p /opt/envs && \
chown -R nebari:users /home/nebari /opt/envs
Member

What is the benefit of deleting user ubuntu and creating user nebari? Are we sure this is not going to mess with permissions on the pods (e.g. mounting the filesystem)?

Contributor Author

@dcmcand dcmcand Oct 28, 2025

My thought was to avoid leaving unused users lying around, but you are correct that it may lead to some issues, so the nebari user now has UID 1001.

Member

Did you test these images on a Nebari deployment (as in using JupyterLab or running our E2E tests)? I would like to test that everything works as expected before merging.

@marcelovilla
Member

@dcmcand can you resolve the conflicts?

Remove BuildKit cache mounts from pixi-installer stage to resolve QEMU
emulation issues when building ARM64 images on AMD64 runners. The cache
mounts cause apt-get operations to fail with exit code 100 in emulated
environments.

Also remove unused BASE_IMAGE build argument from CI workflow since the
hardened Dockerfile now uses digest-pinned base images.

Changes:
- Remove --mount=type=cache flags from pixi-installer stage
- Remove BASE_IMAGE and GPU_BASE_IMAGE env vars from workflow
- Remove build-args: BASE_IMAGE from build step
@dcmcand dcmcand force-pushed the pixi-lockfiles-hardened branch from 64e788e to 3839a25 Compare October 28, 2025 13:06
Remove unused BASE_IMAGE build arg infrastructure from CI workflow.
Both CPU and GPU images now use the same unified Dockerfile with
ubuntu:24.04 base. GPU support is provided via NVIDIA environment
variables and runtime library mounting rather than separate CUDA
base images.

Benefits:
- Smaller images (no CUDA toolkit baked in)
- Single image works for both CPU and GPU deployments
- Easier maintenance (no separate base image logic needed)
- GPU variants differentiated only by -gpu tag suffix

- Change nebari user group from GID 1001 to GID 100 (users group)
  for better compatibility with JupyterHub and multi-user environments
- Add bzip2 to pixi-installer stage dependencies
- Update fix-permissions script comment to reflect correct UID 1001
- Normalize indentation in fix-permissions script
- Upgrade pixi from 0.32.1 to 0.58.0 to support lockfile version 6
- Update SHA256 checksums for pixi 0.58.0 binaries (amd64 and arm64)
- Add DEFAULT_ENV as global ARG with default value "default"
- Redeclare DEFAULT_ENV, UBUNTU_DIGEST, and PIXI_VERSION ARGs in all
  runtime stages (dask-worker, jupyterhub, jupyterlab, workflow-controller)
  for proper variable scoping in multi-stage builds
- Fix pixi install --manifest-path to point to pixi.toml file instead of
  directory (pixi 0.58.0 is stricter about this)
- Standardize pixi clean cache to always use --yes flag

This resolves build failures where:
1. Lockfiles generated with newer pixi couldn't be read by pixi 0.32.1
2. Variables were undefined in stages that didn't redeclare them
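Since the stricter `--manifest-path` handling was one source of the build failures, a wrapper can normalize the argument before it reaches pixi. The following is a sketch under that assumption; `manifest_path` is a hypothetical helper, and it only resolves a directory to the `pixi.toml` inside it.

```shell
# Hypothetical helper (not from the PR): accept either a project directory or
# a manifest file, and print the path to the manifest file itself.
manifest_path() {
    target=$1
    if [ -d "$target" ]; then
        # Strip one trailing slash, then append the manifest file name.
        target="${target%/}/pixi.toml"
    fi
    if [ ! -f "$target" ]; then
        echo "no manifest at $target" >&2
        return 1
    fi
    printf '%s\n' "$target"
}
```

A call site would then look like `pixi install --manifest-path "$(manifest_path "$PROJECT_DIR")"`, where `PROJECT_DIR` is a placeholder for wherever the component's files were copied.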
pixi clean cache --yes

# Run postBuild as root (creates files in /opt)
USER root
Member

Let's make sure to switch back to user nebari after creating the files in /opt
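A quick way to catch a stage that forgets to switch back is to list the last `USER` instruction of every stage. The script below is a rough sketch, not a full Dockerfile parser: `final_users` is a hypothetical name, and it ignores line continuations, heredocs, and `USER` values supplied through build ARGs.

```shell
# Hypothetical helper (not from the PR): print "<stage> <final user>" for each
# stage in a Dockerfile. Stages without a USER line show "(inherited)".
final_users() {
    awk '
        function emit() { if (stage != "") print stage, user }
        toupper($1) == "FROM" {
            emit()
            # Skip flags such as --platform=... that precede the image.
            i = 2
            while (i <= NF && $i ~ /^--/) i++
            # Prefer the "AS <name>" alias; fall back to the image reference.
            stage = (toupper($(i + 1)) == "AS") ? $(i + 2) : $i
            user = "(inherited)"
            next
        }
        toupper($1) == "USER" { user = $2 }
        END { emit() }
    ' "$1"
}
```

Any runtime stage reported with `root` as its final user would be a candidate for an added `USER nebari` after the postBuild step.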

This change migrates system utilities from apt packages to conda-forge
packages managed by pixi, improving reproducibility and fixing the
JupyterHub crashloop caused by missing git executable.

Changes:
- JupyterHub: Add git to fix crashloop when jhub-apps tries to import GitPython
- JupyterLab: Add htop, tree, zip, unzip, openssh, tmux, nano, vim, zsh, neovim, wget, curl
- Dask-Worker: Add git-lfs
- Dockerfile: Remove moved packages from apt install commands

Packages kept as apt:
- emacs: Not available for linux-aarch64 in conda-forge
- gnupg, pinentry-curses: Kept in apt for system GPG functionality
- All lib* packages and system libraries: Must remain as apt packages

Benefits:
- Fixes JupyterHub crashloop (git now available at runtime)
- Better reproducibility with cryptographic hashes in lock files
- Cross-platform consistency (same versions on linux-64 and linux-aarch64)
- Unified dependency management via pixi
- Smaller Docker images with fewer apt packages

The postBuild script needs wget to download code-server installer.
Since wget was moved from apt to pixi, we need to run postBuild
through 'pixi run' to ensure wget is available in the PATH.

Fixes build error:
  /opt/scripts/install-code-server.sh: 14: wget: not found
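The `wget: not found` failure only surfaced at line 14 of the installer. A guard at the top of such a script could fail earlier and name the missing tool. This is a sketch only; `require_cmd` is a hypothetical helper and is not part of the PR's scripts.

```shell
# Hypothetical helper (not from the PR): fail fast if any named command is
# missing from PATH, naming the first one that cannot be found.
require_cmd() {
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "required command not found on PATH: $cmd" >&2
            return 1
        fi
    done
}
```

With `require_cmd wget || exit 1` as the first step, running the script outside `pixi run` would report the missing dependency up front instead of partway through the install.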
JupyterLab extensions like jupyterlab-git and nbgitpuller require
git to be available at runtime. This adds git to the jupyterlab
pixi.toml dependencies.

This was an oversight from the previous commit that moved system
dependencies to pixi - we added git-lfs but forgot to add git itself.

Replace the default ubuntu user (UID 1000) with the nebari user
running as UID 1000 instead of 1001. This ensures compatibility
with Kubernetes security contexts and volume permissions that
expect UID 1000.

Changes:
- Delete ubuntu user (UID 1000) from base image
- Create nebari user with UID 1000 (GID 100/users)
- Maintains non-root security posture
- Home directory: /home/nebari

Pixi creates local cache directories (.pixi/) that should not be
committed to version control. These contain installed packages
and are environment-specific.
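Assuming the fix is a `.gitignore` entry (the commit message does not say which file changed), the rule can be added idempotently with a sketch like this. `ignore_pixi_dir` is a hypothetical helper name.

```shell
# Hypothetical helper (not from the PR): add ".pixi/" to DIR/.gitignore once.
ignore_pixi_dir() {
    gitignore="${1:-.}/.gitignore"
    # -x matches whole lines, -F treats the pattern as a literal string.
    if grep -qxF '.pixi/' "$gitignore" 2>/dev/null; then
        return 0
    fi
    # If the file lacks a final newline, add one so the new rule starts on
    # its own line.
    if [ -s "$gitignore" ] && [ -n "$(tail -c 1 "$gitignore")" ]; then
        echo >> "$gitignore"
    fi
    echo '.pixi/' >> "$gitignore"
}
```

Running it twice leaves a single `.pixi/` line in the file.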