UPF Scientific Computing Core Facility

Coming soon

Containers & Kubernetes for Scientific Workflows

This upcoming service will support reproducible, interactive and scalable scientific applications in the Correfoc environment, connecting containers, GPUs, storage, visualization and modern research workflows.

The key idea

Kubernetes does not replace the HPC scheduler. It complements it when research workflows become interactive, service-based or dynamically orchestrated.

Service status

Containers & Kubernetes is not available yet. The SCC team is preparing this service for advanced scientific applications and reproducible platforms.

The mental model

Three layers, three different jobs

The Correfoc HPC ecosystem can combine familiar schedulers, web entry points and application orchestration without forcing every workload into one model.

Slurm

Runs computational jobs

  • Ideal for MPI, batch, job arrays and simulations
  • The user submits a job and gets an output
  • Best for classic HPC execution patterns

Kubernetes

Runs scientific applications

  • Ideal for services, APIs, dashboards and inference
  • Coordinates dynamic workers and workflows
  • Applications can react, scale and combine components

Open OnDemand

Provides the user-facing portal

  • Ideal for desktops, notebooks and interactive apps
  • Gives researchers a browser-based entry point
  • Hides infrastructure complexity behind useful tools

User → Open OnDemand → Scientific App → Kubernetes → Services / Workers → Results / Visualization

Slurm runs jobs. Kubernetes runs scientific applications. Open OnDemand gives users an easy web entry point.

Why containers?

Reproducible software environments matter in research

Many scientific workflows depend on exact versions of Python, R, CUDA, scientific libraries, bioinformatics tools or AI frameworks.

Containers package software, libraries and runtime assumptions together so that projects are easier to share, rerun and move between environments.

Reproducibility

Keep software versions close to the analysis.

Collaboration

Share the same environment across groups.

Cleaner dependencies

Reduce conflicts between tools and libraries.

Portable workflows

Move from laptop development to HPC-style execution.

AI and data science

Support modern frameworks and fast-changing stacks.

When does Kubernetes make sense?

When the work is more than submit, wait and download

Kubernetes becomes useful when a workflow needs services, state, interaction or components that coordinate while the analysis is running.

Traditional HPC workflow

Submit script → Wait in queue → Run job → Read output

Interactive scientific workflow

Open app → Select dataset → Launch analysis → See progress → Refine region → Update visualization

A practical example

Scientific imaging as an interactive HPC application

A tomography, 3D microscopy or digital pathology viewer can start from the browser and coordinate computation behind the scenes.

How the workflow feels to the researcher

The researcher opens a visualization app from Open OnDemand and loads a 3D dataset or a very large image.

They select a region, launch an automatic detection step and watch partial segmentations, heatmaps or detections appear progressively.

When an area looks promising, they request a more expensive GPU refinement only for that region.

What happens underneath

The app sends a request to a backend. Kubernetes starts multiple workers to process image tiles or volume blocks.

Partial results are stored on shared storage or published through an API.

For heavier stages, the application can coordinate with Slurm so the largest HPC computations still use the scheduler efficiently.
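As a rough illustration of the tile-and-worker pattern described above, the following Python sketch splits an image into tiles and yields each result as soon as a worker finishes. It is not the service's implementation: a local thread pool stands in for Kubernetes workers, and `split_into_tiles`, `process_tile` and `stream_partial_results` are hypothetical names chosen for this example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def split_into_tiles(width, height, tile):
    """Return (x, y, w, h) boxes that together cover a width x height image."""
    boxes = []
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            boxes.append((x, y, min(tile, width - x), min(tile, height - y)))
    return boxes


def process_tile(box):
    """Hypothetical stand-in for a detection step: reports the tile's pixel count."""
    x, y, w, h = box
    return {"box": box, "pixels": w * h}


def stream_partial_results(width, height, tile, workers=4):
    """Yield each tile result as soon as its worker finishes (order not guaranteed)."""
    boxes = split_into_tiles(width, height, tile)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(process_tile, box) for box in boxes]
        for future in as_completed(futures):
            yield future.result()


if __name__ == "__main__":
    for partial in stream_partial_results(1000, 600, 256):
        print(partial)
```

In a real deployment the thread pool would be replaced by separate worker processes or pods, and each partial result would be written to shared storage or returned through an API rather than printed.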

1. Open dataset
2. Detect regions of interest
3. Launch distributed workers
4. Stream partial results
5. Refine selected regions
6. Visualize final output

Why this matters: the system does not need to process the whole dataset at maximum cost. It can start with a light pass, refine only promising regions and let the user make decisions during the analysis.
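The light-pass-then-refine idea can be sketched in a few lines of Python. This is a toy example under stated assumptions: tiles are plain lists of numbers, the cheap pass is a mean, the "expensive" step is a placeholder, and the names `coarse_score`, `refine` and `coarse_to_fine` are hypothetical.

```python
def coarse_score(tile):
    """Cheap first pass: mean intensity of a tile (a list of numbers)."""
    return sum(tile) / len(tile)


def refine(tile):
    """Hypothetical expensive step; here it simply returns the peak value."""
    return max(tile)


def coarse_to_fine(tiles, threshold):
    """Score every tile cheaply, then refine only tiles scoring at or above threshold."""
    results = {}
    for name, tile in tiles.items():
        score = coarse_score(tile)
        refined = refine(tile) if score >= threshold else None
        results[name] = {"score": score, "refined": refined}
    return results


tiles = {"a": [0.1, 0.2, 0.1], "b": [0.7, 0.9, 0.8], "c": [0.4, 0.5, 0.3]}
out = coarse_to_fine(tiles, threshold=0.6)
refined_names = [name for name, r in out.items() if r["refined"] is not None]
print(refined_names)
```

Only tile "b" passes the threshold here, so the expensive step runs once instead of three times. In an interactive application the threshold could be replaced or overridden by the researcher's own selection.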

More use cases

Where application orchestration can help

These are examples of scientific platforms that often need more than a single batch job.

AI

AI inference services

Run model endpoints, batch inference, LLMs, image analysis or protein models as reusable services.

Kubernetes helps keep services available and connected to workers.
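To make "keeping a service available" more concrete, the sketch below builds a Kubernetes Deployment manifest as a Python dictionary and prints it as JSON, a format Kubernetes accepts alongside YAML. The service name, image reference and port are hypothetical, and the `nvidia.com/gpu` resource assumes a cluster running the NVIDIA device plugin. Since the service is not available yet, none of this reflects actual Correfoc configuration.

```python
import json


def inference_deployment(name, image, replicas=2, gpus=1, port=8080):
    """Build a Deployment manifest that keeps `replicas` copies of a model server running."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels below.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,  # hypothetical image reference
                            "ports": [{"containerPort": port}],
                            # Assumes the NVIDIA device plugin exposes this resource.
                            "resources": {"limits": {"nvidia.com/gpu": gpus}},
                        }
                    ]
                },
            },
        },
    }


manifest = inference_deployment("segmenter", "registry.example.org/segmenter:1.0")
print(json.dumps(manifest, indent=2))
```

The `replicas` field is what expresses availability: if a pod fails, the Deployment controller starts a replacement to restore the requested count.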

BIO

Bioinformatics pipelines

Build reproducible workflows using containers, APIs, reports and shared execution environments.

Kubernetes helps expose pipeline components as managed services.

DOCK

Molecular docking and virtual screening

Coordinate CPU and GPU stages, queues, dashboards and candidate refinement.

Kubernetes helps manage the application logic around the computations.

VIEW

Interactive visualization

Use web-based viewers, remote desktops, notebooks and live dashboards connected to HPC data.

Kubernetes helps connect viewers, APIs and background workers.

LEARN

Active learning workflows

Let models identify uncertain regions, request more computation or human validation, and retrain.

Kubernetes helps workflows react to intermediate results.
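One common way to "identify uncertain regions" is entropy-based uncertainty sampling. The sketch below is a minimal illustration of that technique, not a description of any specific Correfoc workflow. The region identifiers and probabilities are made-up example data, and `entropy` and `most_uncertain` are hypothetical helper names.

```python
import math


def entropy(probs):
    """Shannon entropy, in bits, of a predicted class distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def most_uncertain(predictions, k):
    """Return the k region ids whose predictions have the highest entropy."""
    ranked = sorted(predictions, key=lambda rid: entropy(predictions[rid]), reverse=True)
    return ranked[:k]


# Made-up class probabilities for three regions (illustrative only).
predictions = {
    "region-1": [0.98, 0.02],
    "region-2": [0.55, 0.45],
    "region-3": [0.80, 0.20],
}

selected = most_uncertain(predictions, k=2)
print(selected)
```

The selected regions are the ones a workflow might send for extra computation or human validation before retraining.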

HYBRID

Hybrid workflows

Combine services, batch jobs, GPUs, storage, APIs and external resources in a single workflow.

Kubernetes helps organize the application while HPC handles heavy execution.
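As one possible shape for the hand-off to HPC, an application could build an `sbatch` command line for the heavy stage and submit it to Slurm. The sketch below only constructs the command and does not run it. The script name, region identifier and time limit are hypothetical, and the GPU request syntax (`--gres=gpu:N`) is an assumption that varies between sites.

```python
import shlex


def sbatch_command(script, region, gpus=1, time_limit="01:00:00"):
    """Build (but do not run) an sbatch command line for a refinement job."""
    return [
        "sbatch",
        "--parsable",                    # machine-readable output: the job id
        f"--job-name=refine-{region}",
        f"--gres=gpu:{gpus}",            # assumed GPU request syntax; site-dependent
        f"--time={time_limit}",
        script,                          # hypothetical batch script
        region,                          # passed to the script as its first argument
    ]


cmd = sbatch_command("refine_region.sh", "tile-042")
print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting problems if it is later passed to `subprocess.run`.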

How it fits with Correfoc

A conceptual architecture for modern scientific platforms

In Correfoc, Containers & Kubernetes can provide the application layer for modern scientific platforms, while the HPC scheduler continues to provide efficient access to large-scale computational resources.

Open OnDemand: User access layer
Kubernetes: Application orchestration layer
Slurm: HPC execution layer

Data, services and compute resources

Shared storage: Data and results layer
GPUs / CPUs: Compute layer
APIs and dashboards: Scientific application interface

What researchers get

Benefits described from the user side

Launch complex applications without managing infrastructure details.

Use reproducible software environments.

Access interactive scientific tools from the browser.

Combine visualization and computation in the same workflow.

Run services close to the data.

Build workflows that react to intermediate results.

Move from isolated jobs to integrated research platforms.

What this service is not

Clear boundaries help choose the right tool

Not a Slurm replacement

Batch HPC, MPI jobs and large scheduled workloads still belong in the scheduler.

Not for every HPC workload

Kubernetes is most useful when the workflow behaves like an application.

Not generic web hosting

The service is intended for scientific applications and research platforms.

Not a policy bypass

Scheduling, security and resource policies still apply.

Containers & Kubernetes is intended for scientific applications, reproducible workflows, research platforms and advanced interactive computing.

Is this service right for your project?

If your group needs to turn a workflow into an application, deploy a viewer, serve a model, run dynamic workers, or combine notebooks, APIs and visualization with HPC computation, Containers & Kubernetes can help shape the right architecture once the service becomes available.