Introduction

The demand for computing power has never been higher. AI workloads continue to grow, data pipelines become more complex, and both production and research users rely on increasingly large HPC platforms to run more numerous and larger-scale computations. Yet despite their impressive capabilities, computing centers, whether cloud providers, on-prem infrastructures, or HPC facilities, still face a persistent challenge: users lack a straightforward way to select and use the right environment for their workloads.

Ryax addresses this challenge by providing a unified platform that makes hybrid infrastructures feel consistent and effortless to use. Today, AI workflows on the cloud typically rely on Kubernetes, while HPC workflows follow very different practices, using container tools such as Apptainer (formerly Singularity) and relying on batch schedulers. In most cases, users prepare their applications for a single environment because adapting them to both cloud and HPC demands extra skills and effort.

Ryax removes this barrier. Users describe their workflow steps once, and Ryax takes care of building the required containers for each target environment: Docker images for Kubernetes on the cloud, and Apptainer (formerly Singularity) images on HPC. Ryax's orchestrator then selects the most suitable infrastructure for each step based on objectives such as performance, cost, and energy efficiency, and automatically uses the appropriate tooling for each environment, so users do not have to learn or manage these details.

This approach is central to how Ryax simplifies hybrid execution and is the focus of our new weekly blog series where we reveal the techniques that let you manage your infrastructure with confidence and flexibility. If you missed the first post, you can read it here: How We Manage Multi-Cloud Infrastructure with Ryax.

Adding an HPC cluster into Ryax

AI workflows are typically deployed on cloud platforms, while HPC systems remain powerful but separate compute resources. Bridging these two environments allows AI workloads to leverage high-performance parallel computing alongside cloud infrastructure, unlocking resources that were previously unavailable.

Typically, running workflows on HPC requires some configuration from administrators. But here comes the good news: if you already know how to connect to your HPC system, then you already know how to connect it to Ryax. A single configuration file is enough to expose the parts of your cluster you want to make available, including your cluster name, connection details, and the default parameters for the partitions you choose to allow.

That’s it. No extra complexity. The example YAML file below shows just how simple this setup really is:

config:
  site:
    name: hpc-on-prem-1
    type: SLURM_SSH
    spec:
      partitions:
        - name: debug
          cpu: 16
          memory: 24G
          gpu: 1
          time: 2H
      credentials:
        server: my.hpc-site.com
        username: ryax-username

Here we show how to add a single HPC cluster to Ryax, but the process is exactly the same for any number of clusters. Adding more resources does not change the way infrastructure decisions are made, and it does not add complexity to your workflows.
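To make the partition defaults more concrete, here is a minimal Python sketch of how values like the ones in the example could translate into standard Slurm sbatch options. This is not Ryax's actual implementation: the mapping of cpu to --cpus-per-task, the "<number>H" or "<number>M" duration format, and the helper names to_slurm_time and sbatch_options are all assumptions made for illustration.

```python
import re

# Partition defaults copied from the example site configuration above.
partition = {"name": "debug", "cpu": 16, "memory": "24G", "gpu": 1, "time": "2H"}


def to_slurm_time(value: str) -> str:
    """Convert a duration such as '2H' or '30M' to Slurm's HH:MM:SS format.

    The '<number><H|M>' input format is an assumption based on the example.
    """
    match = re.fullmatch(r"(\d+)([HM])", value.strip().upper())
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    amount, unit = int(match.group(1)), match.group(2)
    minutes = amount * 60 if unit == "H" else amount
    return f"{minutes // 60:02d}:{minutes % 60:02d}:00"


def sbatch_options(part: dict) -> list[str]:
    """Render partition defaults as sbatch command-line options (hypothetical mapping)."""
    options = [
        f"--partition={part['name']}",
        f"--cpus-per-task={part['cpu']}",
        f"--mem={part['memory']}",
        f"--time={to_slurm_time(part['time'])}",
    ]
    # Only request a GPU when the partition defaults ask for at least one.
    if part.get("gpu"):
        options.append(f"--gres=gpu:{part['gpu']}")
    return options


print(" ".join(sbatch_options(partition)))
```

Seen this way, the defaults in the site configuration are simply the resource limits a job would otherwise have to spell out by hand on every submission.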

Activating advanced usage for parallel jobs

Once your HPC cluster is connected to Ryax, enabling large-scale parallel execution is surprisingly straightforward. Ryax provides an HPC addon that activates advanced capabilities on your cluster without requiring any deep knowledge of schedulers, modules, or container tooling.

When this addon is enabled on a workflow action, the Ryax builder automatically produces an image for Apptainer or Singularity instead of a Docker image, ensuring full compatibility with HPC environments.

The standard dependencies field can then be used to declare HPC-specific packages, including the MPI library you want to rely on.

apiVersion: "ryax.tech/v2.0"
kind: Processor
spec:
  # ...
  addons:
    hpc: {}
  # ...
  dependencies:
    - openmpi
    - openssh

The addon also exposes HPC-specific parameters such as scheduler directives, resource requests, and the information needed to submit jobs through the cluster frontend. This means advanced HPC users can keep the level of control they are used to, while AI users benefit from a single, simple configuration model that works for both cloud and HPC execution.

User Benefits Unlocked

Adding HPC access has an immediate effect for users. Any step in a workflow can now leverage HPC resources, either by explicit selection or by letting Ryax automatically choose the optimal infrastructure based on constraints and objectives. AI researchers, data analysts, and HPC engineers alike can take advantage of these resources without worrying about setup, scheduling, or dependencies.
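The difference between explicit selection and automatic choice can be illustrated with a toy Python sketch. It is not Ryax's scheduling algorithm: the Site fields, the weighted-score rule, the pick_site helper, and all the numbers are hypothetical, chosen only to show how constraints can filter candidates before objectives rank them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str
    has_gpu: bool
    runtime_s: float   # estimated runtime of the step on this site
    cost_eur: float    # estimated monetary cost of the step
    energy_kwh: float  # estimated energy use of the step


def pick_site(sites, weights, needs_gpu=False, pinned=None):
    """Filter sites by constraints, then return the lowest weighted score."""
    # Constraints first: drop sites that cannot run the step at all.
    candidates = [s for s in sites if s.has_gpu or not needs_gpu]
    # Explicit selection: the user pins the step to one named site.
    if pinned is not None:
        candidates = [s for s in candidates if s.name == pinned]
    if not candidates:
        raise ValueError("no site satisfies the constraints")

    def score(site):
        total = 0.0
        for metric, weight in weights.items():
            # Normalise by the best candidate so units are comparable.
            # Assumes every estimate is strictly positive.
            best = min(getattr(c, metric) for c in candidates)
            total += weight * getattr(site, metric) / best
        return total

    return min(candidates, key=score)


# Made-up estimates for three sites.
sites = [
    Site("cloud-k8s", True, runtime_s=1200, cost_eur=4.0, energy_kwh=1.5),
    Site("hpc-on-prem-1", True, runtime_s=300, cost_eur=6.0, energy_kwh=0.9),
    Site("cloud-cpu", False, runtime_s=3600, cost_eur=1.0, energy_kwh=1.2),
]

print(pick_site(sites, {"runtime_s": 1.0}, needs_gpu=True).name)
print(pick_site(sites, {"cost_eur": 1.0}).name)
print(pick_site(sites, {"cost_eur": 1.0}, pinned="hpc-on-prem-1").name)
```

With these numbers, optimizing for runtime sends a GPU step to the HPC cluster, optimizing for cost alone picks the cheapest cloud site, and pinning overrides the objective entirely.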

Ryax’s logging and debugging system works seamlessly for all actions, including those executed through HPC offloading on external clusters. Logs from scheduler batch scripts are periodically collected and displayed directly in the workflow’s execution view, providing a unified and consistent way to monitor and debug actions, whether they run on cloud or HPC resources.
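Periodic log collection of this kind is often built on a simple idea: remember how far you have read, and fetch only what was appended since. The Python sketch below shows that idea on a local file. It is an illustration under stated assumptions, not Ryax's implementation: the read_new_output helper and the slurm-12345.out file name are hypothetical, and a real setup would fetch the file from the cluster rather than read it locally.

```python
import tempfile
from pathlib import Path


def read_new_output(log_path: Path, offset: int) -> tuple[str, int]:
    """Return the text appended to log_path since offset, plus the new offset."""
    if not log_path.exists():
        # The scheduler may not have created the file yet.
        return "", offset
    with log_path.open("rb") as handle:
        handle.seek(offset)
        chunk = handle.read()
    # Offsets are tracked in bytes, so the next poll resumes exactly here.
    return chunk.decode("utf-8", errors="replace"), offset + len(chunk)


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "slurm-12345.out"  # hypothetical batch output file

    log.write_bytes(b"step 1 done\n")
    first, offset = read_new_output(log, 0)

    # Simulate the job writing more output between two polls.
    with log.open("ab") as handle:
        handle.write(b"step 2 done\n")
    second, offset = read_new_output(log, offset)

print(repr(first))
print(repr(second))
```

Each poll returns only the new lines, which is what lets an execution view be refreshed without downloading the whole log again.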

Users gain flexibility, efficiency, and a smoother experience, enabling them to focus on results rather than infrastructure. By connecting HPC and cloud seamlessly, Ryax also makes it easier to run hybrid workflows that combine computation, cloud services, and AI processing, opening the door to new types of experiments and projects without added complexity.

Conclusion

AI and HPC workloads are not rivals: they are different steps within the same family of workflows. With Ryax, moving between cloud platforms and HPC clusters becomes effortless for any user, whether an AI researcher, a data analyst, or an HPC engineer. Adding new infrastructure is just a few clicks away, and Ryax immediately integrates it alongside existing resources.

By simplifying access to computing centers for the AI ecosystem, Ryax makes advanced compute resources more usable and accessible. AI teams can now take full advantage of HPC and cloud infrastructures without friction, enabling new types of workflows and users that were previously out of reach. This wider adoption brings more engagement. It also naturally opens new revenue opportunities, creating a win-win: AI users gain seamless access to powerful resources, and computing centers see better utilization without disrupting existing HPC workloads, as all jobs continue to run through the same scheduler.

Ryax also reduces vendor lock-in for both users and computing centers. Workflows are no longer restricted to a single cloud provider, a single HPC scheduler, or a single type of environment. Users can allow Ryax to automatically select the best platform for each step, or take control by specifying constraints and preferences. This flexibility enables workloads to run across cloud and HPC resources seamlessly, while infrastructure providers remain competitive and can support more diverse users and use cases without locking them into rigid environments.

Finally, Ryax enables new AI and HPC use cases. Each step of a workflow can run on the most suitable infrastructure, whether cloud or HPC, and users can let Ryax choose the optimal platform or set constraints to guide execution. This allows combining real-time cloud services, HPC computation, and AI processing in a single workflow, unlocking experiments and workflows that were previously too complex or constrained to deploy.

Stay tuned for upcoming posts, where we will continue to explore how Ryax orchestrates workflows, optimizes execution, and helps you keep full control of your infrastructure.
