Configure Embedded Cluster (Beta)

This topic describes how to configure and use Replicated Embedded Cluster with your application. For more information about Embedded Cluster, see Embedded Cluster Overview. For information about updating an existing release from Embedded Cluster v2 to v3, see Migrate from Embedded Cluster v2.

Create a release with Embedded Cluster v3

To create an application release that supports installation with Embedded Cluster v3:

If you use the Replicated proxy registry, update all references to private or third-party images to use the Replicated proxy registry domain. See the Embedded Cluster v3 steps in Configure your application to use the proxy registry.
In your application Helm chart's Chart.yaml file, add the Replicated SDK as a dependency. If your application uses multiple charts, declare the SDK as a dependency of the chart that customers install first. Do not declare the SDK in more than one chart.
```
# Chart.yaml
dependencies:
- name: replicated
  repository: oci://registry.replicated.com/library
  version: 1.19.6
```
For the latest version information for the Replicated SDK, see the replicated-sdk repository in GitHub.
Package each chart into a .tgz chart archive. See Package a Helm chart for a release.

For each chart archive, add a unique HelmChart v2 custom resource (version kots.io/v1beta2).

# HelmChart custom resource
apiVersion: kots.io/v1beta2
kind: HelmChart
metadata:
  name: samplechart
spec:
  # chart identifies a matching chart from a .tgz
  chart:
    name: samplechart
    chartVersion: 3.1.7

(Optional) To conditionally include or exclude Helm charts or resources based on the install method or user configuration, see Conditionally include or exclude resources.
If you support air gap installations, update all image references so they resolve correctly in both online and air gap installations. See Add support for air gap installations on this page.
Add an Embedded Cluster Config manifest to the release. At minimum, the Config must specify the Embedded Cluster version to use.
```
apiVersion: embeddedcluster.replicated.com/v1beta1
kind: Config
spec:
  version: 3.0.0-beta.1+k8s-1.34
```
If you use custom domains for the Replicated proxy registry or Replicated app service, add them to the Embedded Cluster Config domains key. See Configure Embedded Cluster to use custom domains in Use custom domains.
If you need Embedded Cluster to deploy certain components to the cluster before it deploys your application, add the Helm charts for those components to the Embedded Cluster Config extensions key. See (Optional) Add Helm chart extensions on this page.
Save the release and promote it to the channel that you use for testing internally.
Install with Embedded Cluster in a development environment to test. See Online installation with Embedded Cluster or Air gap installation with Embedded Cluster.
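
As a rough sketch of the custom domains step in this procedure, the Embedded Cluster Config might look like the following. The domain names are placeholders, and the subkey names proxyRegistryDomain and replicatedAppDomain are assumptions here; confirm them in Use custom domains before relying on this.

```yaml
# Embedded Cluster Config (sketch; domains and subkey names are assumptions)
apiVersion: embeddedcluster.replicated.com/v1beta1
kind: Config
spec:
  version: 3.0.0-beta.1+k8s-1.34
  domains:
    # Hypothetical custom domain for the Replicated proxy registry
    proxyRegistryDomain: proxy.example.com
    # Hypothetical custom domain for the Replicated app service
    replicatedAppDomain: updates.example.com
```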

Add preflight checks

Embedded Cluster v3 uses Troubleshoot v1beta3 for application preflight checks. You package preflight specs as a release-level YAML file outside your Helm charts. During installation and upgrades, Embedded Cluster renders the spec through the Helm template engine and runs the checks automatically.

How preflight rendering works in v3

Embedded Cluster renders your preflight spec through the same Helm engine it uses to install your charts. The rendering pipeline works as follows:

Embedded Cluster takes the external preflight spec from your release.
For each Helm chart in the release, Embedded Cluster injects the spec into the chart as a template and renders it with helm template using the same values that Helm uses during installation (SDK values, chart defaults from values.yaml, and user overrides from the HelmChart custom resource).
Embedded Cluster merges the per-chart rendered outputs into a single resolved spec.
Embedded Cluster runs the Troubleshoot preflight binary against the fully resolved spec. No additional flags or values are necessary because all Helm expressions are already evaluated.

This per-chart rendering means your preflight spec has access to the full Helm context for each chart:

Chart values and defaults: .Values.* resolves against each chart's values.yaml defaults merged with any overrides from the HelmChart custom resource values key.
Helm helpers: {{ include "mychart.fullname" . }} resolves using the _helpers.tpl from the chart being rendered.
Chart metadata: .Chart.Name, .Chart.Version, .Release.Name, and .Release.Namespace are all available.

If a conditional in your spec evaluates to false for a given chart (for example, {{- if .Values.featureNotEnabled }}), Embedded Cluster excludes those checks from that chart's rendered output. No error occurs.
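
To make the value resolution concrete, the following sketch shows three separate files side by side. The minKubernetesVersion key is hypothetical and used only for illustration; it is not a field that Embedded Cluster or Troubleshoot defines.

```yaml
# File 1: values.yaml in the chart (hypothetical default)
minKubernetesVersion: "1.28.0"
---
# File 2: HelmChart custom resource; the values key overrides the default
apiVersion: kots.io/v1beta2
kind: HelmChart
metadata:
  name: samplechart
spec:
  chart:
    name: samplechart
    chartVersion: 3.1.7
  values:
    minKubernetesVersion: "1.29.0"
---
# File 3: fragment of the external preflight spec (Helm template syntax)
spec:
  analyzers:
    - clusterVersion:
        outcomes:
          - fail:
              when: '< {{ .Values.minKubernetesVersion }}'
              message: 'Kubernetes {{ .Values.minKubernetesVersion }} or later is required.'
```

Under these assumptions, the override in the HelmChart custom resource takes precedence over the chart default, so the rendered fail condition would read < 1.29.0.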

Create a preflight spec

Add a troubleshoot.sh/v1beta3 Preflight YAML file to your release at the top level (not inside any Helm chart):

apiVersion: troubleshoot.sh/v1beta3
kind: Preflight
metadata:
  name: my-app-preflights
spec:
  collectors:
    - clusterResources: {}
  analyzers:
    - clusterVersion:
        outcomes:
          - fail:
              when: "< 1.28.0"
              message: "Kubernetes 1.28.0 or later is required."
          - pass:
              when: ">= 1.28.0"
              message: "Kubernetes version is sufficient."

For releases with a single Helm chart, you do not need additional gating. The spec renders in that chart's context.

Multi-chart releases

For releases with multiple Helm charts, gate each collector and analyzer to the chart it belongs to using .Chart.Name conditionals. Because Embedded Cluster renders the spec one time per chart, a {{ if eq .Chart.Name "postgresql" }} block only produces output when Embedded Cluster renders it in the context of the postgresql chart:

apiVersion: troubleshoot.sh/v1beta3
kind: Preflight
metadata:
  name: my-app-preflights
spec:
  collectors:
  {{- if eq .Chart.Name "my-app" }}
    - clusterResources: {}
  {{- end }}
  {{- if eq .Chart.Name "postgresql" }}
    - run:
        name: pg-version
        image: '{{ include "postgresql.v1.image" . }}'
        command: ["psql", "--version"]
  {{- end }}
  analyzers:
  {{- if eq .Chart.Name "my-app" }}
    - clusterVersion:
        outcomes:
          - fail:
              when: "< 1.28.0"
              message: "Kubernetes 1.28.0 or later is required."
          - pass:
              when: ">= 1.28.0"
              message: "Kubernetes version is sufficient."
  {{- end }}
  {{- if eq .Chart.Name "postgresql" }}
    - textAnalyze:
        checkName: PostgreSQL Version
        fileName: pg-version
        outcomes:
          - fail:
              when: "< 14"
              message: "PostgreSQL 14 or later is required."
          - pass:
              message: "PostgreSQL version is compatible."
  {{- end }}

A helper like postgresql.v1.image is only evaluated in the context of the postgresql chart where it is defined, so it resolves correctly even though other charts in the release do not define that helper.

Conditional checks with chart values

You can use chart values to conditionally include or exclude preflight checks. For example, if your chart has an optional component that can be enabled or disabled through values, you can gate the related preflight checks on that value:

spec:
  analyzers:
    - clusterVersion:
        outcomes:
          - fail:
              when: "< 1.28.0"
              message: "Kubernetes 1.28.0 or later is required."
          - pass:
              message: "Kubernetes version is sufficient."
    {{- if .Values.gpu.enabled }}
    - nodeResources:
        checkName: GPU Nodes Available
        filters:
          allocatable:
            nvidia.com/gpu: "1"
        outcomes:
          - fail:
              when: "count() < 1"
              message: "At least one node with an NVIDIA GPU is required when GPU support is enabled."
          - pass:
              message: "GPU node available."
    {{- end }}

If the chart's values.yaml has gpu.enabled: false and the customer has not overridden it, the GPU node check is excluded from the rendered spec.
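
Conversely, an override in the HelmChart custom resource values key would bring the check back into the rendered spec. A minimal sketch, reusing the samplechart name from earlier on this page and assuming the chart exposes the gpu.enabled value:

```yaml
# HelmChart custom resource (sketch)
apiVersion: kots.io/v1beta2
kind: HelmChart
metadata:
  name: samplechart
spec:
  chart:
    name: samplechart
    chartVersion: 3.1.7
  values:
    gpu:
      # Overrides the chart default of false, so the
      # {{- if .Values.gpu.enabled }} block is rendered
      enabled: true
```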

Run connectivity checks inside the cluster

The http, postgres, mysql, mssql, redis, and clickhouse collectors run wherever the preflight process runs. In Embedded Cluster v3, that is the host, not inside the cluster. As a result, a check such as "is the application's database reachable" tests connectivity from the host rather than from within the cluster, which can produce misleading results.

To run one of these checks from inside the cluster, wrap it in a runPod collector that uses the Troubleshoot image and the matching collect subcommand. The Pod runs the collector from within the cluster and prints its result, which you evaluate with a textAnalyze analyzer. The following example checks that a PostgreSQL database is reachable from inside the cluster:

apiVersion: troubleshoot.sh/v1beta3
kind: Preflight
metadata:
  name: my-app-preflights
spec:
  collectors:
    - runPod:
        name: postgres-check
        namespace: '{{ .Release.Namespace }}'
        podSpec:
          restartPolicy: Never
          containers:
            - name: check
              image: proxy.replicated.com/library/troubleshoot:v0.131.0
              command: ["collect", "postgres", "--uri", "postgres://{{ .Values.db.user }}:{{ .Values.db.password }}@{{ .Values.db.host }}:5432/{{ .Values.db.name }}"]
  analyzers:
    - textAnalyze:
        checkName: PostgreSQL reachable
        collectorName: postgres-check
        fileName: "*.log"
        regex: '"isConnected": *true'
        outcomes:
          - pass:
              when: "true"
              message: "Connected to PostgreSQL from inside the cluster."
          - fail:
              when: "false"
              message: "Could not connect to PostgreSQL from inside the cluster."

Each collector documents its collect subcommand and flags on its own page in the Troubleshoot documentation: http, postgres, mysql, mssql, redis, and clickhouse.

Include preflight images in air gap bundles

In air gap installations, images referenced by a v1beta3 preflight spec are not automatically included in the air gap bundle. This includes the Troubleshoot image used to run checks inside the cluster (see Run connectivity checks inside the cluster) and any other images your collectors reference, such as a runPod container image. Unlike application workload images, which are detected from your Helm charts, preflight images must be declared explicitly.

Add each image to the additionalImages field of the Application custom resource so that it is included in the air gap bundle:

apiVersion: kots.io/v1beta1
kind: Application
metadata:
  name: my-app
spec:
  additionalImages:
    - proxy.replicated.com/library/troubleshoot:v0.131.0

For more information, see Define Additional Images.

Limitations

A release may contain at most one external preflight spec.
The preflight spec uses Helm template syntax, not repl{{ }} Replicated template syntax.
The spec is not valid YAML before rendering. Standard YAML linters will not validate it directly.
Support bundle specs are separate from preflight specs. Support bundles continue to use v1beta2 and are not affected by this requirement.
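
As an illustration of the template syntax limitation, the fragment below contrasts the two styles for the same field. The Namespace function in the commented-out line appears only as an example of Replicated template syntax and is not expected to render in this spec.

```yaml
# Fragment of a runPod collector in the external preflight spec
- runPod:
    name: postgres-check
    # Helm template syntax: rendered by Embedded Cluster
    namespace: '{{ .Release.Namespace }}'
    # Replicated template syntax: not supported in this spec
    # namespace: 'repl{{ Namespace }}'
```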

For more information about host preflight checks that Embedded Cluster runs automatically, see Embedded Cluster host preflight checks.

Set up the wizard's Configure screen

During installation and upgrades, the Embedded Cluster wizard includes a Configure step where end customers provide configuration values for your application. This screen is defined by the Config custom resource in your release.

The Config custom resource lets you define groups of configuration fields that appear in the wizard. Each group becomes a section in the sidebar, and customers navigate between groups to complete the configuration.

How it works

The Config custom resource defines one or more groups, each containing one or more items. During installation, the wizard renders these as form fields. The values that customers provide are available to your Helm chart through Replicated template functions.

For example, the following Config custom resource defines two groups:

apiVersion: kots.io/v1beta1
kind: Config
metadata:
  name: config-sample
spec:
  groups:
    - name: database
      title: Database Configuration
      description: Choose 'Embedded' to deploy PostgreSQL within the cluster, or 'External' to connect to your own PostgreSQL instance.
      items:
        - name: database_type
          title: PostgreSQL Type
          type: select_one
          default: embedded
          items:
            - name: embedded
              title: Embedded PostgreSQL
            - name: external
              title: External PostgreSQL
        - name: external_host
          title: PostgreSQL Host
          type: text
          when: '{{repl ConfigOptionEquals "database_type" "external"}}'
    - name: features
      title: Features
      items:
        - name: enable_feature_x
          title: Enable Feature X
          type: bool
          default: "0"

In the wizard, this renders as two groups in the sidebar ("Database Configuration" and "Features"). Customers select their database type, and if they choose "External PostgreSQL", the PostgreSQL Host field appears so that they can enter the connection details.

Key concepts

Groups define the sidebar sections. Each group has a title that appears in the sidebar and a list of items.
Items define individual form fields. Supported types include text, password, bool, select_one, textarea, and more. See the Config custom resource reference for all available types.
Conditional fields use the when property with template functions to show or hide fields based on other values. This lets you create dynamic forms that adapt to customer choices.
Default values and generated values (such as auto-generated passwords) reduce the number of fields customers need to fill in manually.
Validation can be added using the validation property to check field values against regex patterns before the customer proceeds.
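
A minimal sketch of the validation and generated-value concepts, extending the database group from the earlier example. The db_password item, the regex pattern, and the use of RandomString are illustrative assumptions; check the Config custom resource reference for the fields and template functions that Embedded Cluster v3 supports.

```yaml
apiVersion: kots.io/v1beta1
kind: Config
metadata:
  name: config-sample
spec:
  groups:
    - name: database
      title: Database Configuration
      items:
        - name: external_host
          title: PostgreSQL Host
          type: text
          # Reject values that do not look like a hostname
          validation:
            regex:
              pattern: '^[a-zA-Z0-9.-]+$'
              message: Enter a valid hostname.
        - name: db_password
          title: Database Password
          type: password
          # Generated value so customers do not need to type one
          value: '{{repl RandomString 32}}'
```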

During upgrades, the wizard pre-populates the Configure screen with the customer's existing values, allowing them to review and update their configuration before proceeding.

For the complete field reference, see Config custom resource.

To learn how to map customer-provided config values to your Helm chart using the HelmChart CR and template functions, see Use config values in your Helm chart.
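
As a brief sketch of that mapping, a HelmChart custom resource could pass the external_host field from the Config example on this page into the chart with the ConfigOption template function. The postgresql.externalHost value key is hypothetical and depends on your chart.

```yaml
# HelmChart custom resource (sketch)
apiVersion: kots.io/v1beta2
kind: HelmChart
metadata:
  name: samplechart
spec:
  chart:
    name: samplechart
    chartVersion: 3.1.7
  values:
    postgresql:
      # Hypothetical chart value; set from the customer's input
      externalHost: '{{repl ConfigOption "external_host" }}'
```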

Customize the wizard branding

You can customize the appearance of the Embedded Cluster install and upgrade wizard using the Application custom resource in your release. The Application custom resource lets you set a custom title and icon that appear throughout the wizard.

Set the title and icon

Add an Application custom resource to your release with the title and icon fields:

apiVersion: kots.io/v1beta1
kind: Application
metadata:
  name: acme-app
spec:
  title: Acme Application
  icon: https://acme.com/icon.png

The title appears in the wizard header on every screen, along with "Installation Wizard" or "Upgrade Wizard" as the subtitle. The icon appears next to the title. If no icon is provided, a default letter avatar is generated from the first character of the title.

Note: For air gap installations, use a Base64 encoded image for the icon field because remote URLs are not accessible in air gap environments.

For more information about the Application custom resource fields, see Application custom resource.

Add application links to the Finish page

To display access links on the Installation Complete page, include a SIG Application custom resource (app.k8s.io/v1beta1) in your release with spec.descriptor.links:

apiVersion: app.k8s.io/v1beta1
kind: Application
metadata:
  name: my-app
spec:
  descriptor:
    links:
      - description: Open My App
        url: "http://my-app-url"

Links defined here appear on the Finish page after a successful install and on the dashboard in the persistent console.

Add support for air gap installations

This section describes how to support air gap installations with Embedded Cluster v3. It includes information about how to configure your release and lists the limitations and known issues of air gap installations with Embedded Cluster v3.

Configure the release

To support air gap installations with Embedded Cluster v3:

Configure each HelmChart custom resource's builder key. This ensures that all the required and optional images for your application are available in environments without internet access. See builder in HelmChart v2.

My chart's default values already expose all images. Do I still need to configure the builder key?
If the default values in your Helm chart already expose all the images for air gap installations, then you do not need to configure the builder key.
When building an air gap bundle, the Vendor Portal runs helm template on each Helm chart to detect which images to include. The bundle includes all images that helm template yields.
For many applications, running helm template with the default values would not yield all the images required to install. In these cases, vendors can pass the additional values in the builder key to ensure that the air gap bundle includes all the necessary images.
Configure each HelmChart custom resource to ensure that all image references resolve correctly in both online and air gap installations. You do this in the HelmChart custom resource's values key using the ReplicatedImageName and ReplicatedImageRegistry template functions. See the following examples for more information:
Example (Single value for full image name)
For charts that expect the full image reference in a single field, use the ReplicatedImageName template function in the HelmChart custom resource. ReplicatedImageName returns the full image name, including both the repository and registry.
For example:
# values.yaml
initImage: proxy.replicated.com/proxy/my-app/docker.io/library/busybox:1.36

# HelmChart custom resource
apiVersion: kots.io/v1beta2
kind: HelmChart
spec:
  values:
    initImage: '{{repl ReplicatedImageName (HelmValue ".initImage") true }}'
In this example, the second argument to ReplicatedImageName sets noProxy to true because the image reference in values.yaml already contains the proxy path prefix (proxy.replicated.com/proxy/my-app/...).
Example (Separate values for image registry and repository)
If a chart uses separate registry and repository fields for image references, use the ReplicatedImageRegistry template function to rewrite the registry field. You do not need to template the repository field.
# values.yaml
postgresql:
  image:
    # proxy.replicated.com or your custom domain
    registry: proxy.replicated.com/proxy/app-slug/docker.io
    repository: bitnami/postgresql

# HelmChart custom resource
apiVersion: kots.io/v1beta2
kind: HelmChart
spec:
  values:
    postgresql:
      image:
        registry: '{{repl ReplicatedImageRegistry (HelmValue ".postgresql.image.registry") }}'
Example (References to public images)
For public images that don't go through the Replicated proxy registry, set the upstream reference directly in the chart's values.yaml. Use noProxy so that ReplicatedImageName leaves the reference unchanged in online installations. When you include noProxy, ReplicatedImageName still rewrites the image to the local registry in air gap installations.
# values.yaml
publicImage: docker.io/library/busybox:1.36

# HelmChart custom resource
apiVersion: kots.io/v1beta2
kind: HelmChart
spec:
  values:
    publicImage: '{{repl ReplicatedImageName (HelmValue ".publicImage") true }}'

In the HelmChart resource that corresponds to the chart where you included the Replicated SDK as a dependency, rewrite the Replicated SDK image registry using the ReplicatedImageRegistry template function:

# HelmChart custom resource
apiVersion: kots.io/v1beta2
kind: HelmChart
spec:
  values:
    replicated:
      image:
        registry: '{{repl ReplicatedImageRegistry (HelmValue ".replicated.image.registry") }}'

If you added any Helm chart extensions in the Embedded Cluster Config, rewrite image references in each extension using either the ReplicatedImageName template function (if the chart uses a single field for the full image reference) or the ReplicatedImageRegistry template function (if the chart uses separate fields for registry and repository).

Example (Extension for a Helm chart that you own)

# Embedded Cluster Config
apiVersion: embeddedcluster.replicated.com/v1beta1
kind: Config
spec:
  extensions:
    helmCharts:
      - chart:
          name: ingress
          chartVersion: "1.2.3"
        releaseName: ingress
        namespace: ingress
        values: |
          controller:
            image:
              registry: 'repl{{ ReplicatedImageRegistry (HelmValue ".controller.image.registry") }}'

Example (Extension for a third-party Helm chart)

# Embedded Cluster Config
apiVersion: embeddedcluster.replicated.com/v1beta1
kind: Config
spec:
  extensions:
    helmCharts:
      - chart:
          name: ingress-nginx
          chartVersion: "4.11.3"
        releaseName: ingress-nginx
        namespace: ingress-nginx
        values: |
          controller:
            image:
              registry: 'repl{{ ReplicatedImageRegistry "registry.k8s.io" }}'

The template functions add the proxy prefix in online installations and rewrite to the local registry in air gap installations.

In the Vendor Portal, go to the channel where you promoted the release and build the air gap bundle. Do one of the following:
- If you enabled the Automatically create airgap builds for newly promoted releases in this channel setting for the channel, watch for the build status to complete.
- If automatic air gap builds are not enabled, go to the Release history page for the channel and build the air gap bundle manually.
Create or edit a customer with the Air Gap Installation Option (Replicated Installers only) entitlement enabled so that you can test air gap installations. See Create and Manage Customers.
(Optional) Create a VM with Compatibility Matrix and set its network policy to airgap to block outbound network access:
```
replicated vm create --distribution ubuntu 
```
```
replicated network update NETWORK_ID --policy airgap
```
Where NETWORK_ID is the ID of the network from the output of the vm create command.
Install in your development environment to test. See Air gap installation with Embedded Cluster.
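
For the builder key in the first step of this procedure, a minimal sketch follows. It assumes a chart whose default values disable an optional PostgreSQL component (the postgresql.enabled key is hypothetical), so helm template with the defaults would not yield the PostgreSQL image.

```yaml
# HelmChart custom resource (sketch)
apiVersion: kots.io/v1beta2
kind: HelmChart
metadata:
  name: samplechart
spec:
  chart:
    name: samplechart
    chartVersion: 3.1.7
  builder:
    # Values passed to helm template when the Vendor Portal builds the
    # air gap bundle, so the optional component's images are detected
    postgresql:
      enabled: true
```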

Limitations and known issues

Embedded Cluster installations in air gap environments have the following limitations and known issues:

If you pass ?airgap=true to the replicated.app endpoint but no air gap bundle has been built for the latest release, the API does not return a 404. Instead, it returns the same tarball as for online installations, which contains the installer and the license but no air gap bundle.
Images used by Helm extensions must not refer to a multi-architecture image by digest. Air gap bundles include only x64 images, and the digest for the x64 image differs from the digest for the multi-architecture image, which prevents Kubernetes from locating the image in the bundle. An example of a chart that does this is the ingress-nginx/ingress-nginx chart. For an example of how to set digests to empty strings and pull by tag only, see extensions in Embedded Cluster Config.
Embedded Cluster loads images for Helm extensions directly into containerd so that they are available without internet access. However, if an image used by a Helm extension has Always set as the image pull policy, Kubernetes tries to pull the image from the internet. If necessary, use the Helm values to configure IfNotPresent as the image pull policy so that the extension works in air gap environments.
On the channel release history page, the links for Download air gap bundle, Copy download URL, and View bundle contents pertain to the application air gap bundle only, not the Embedded Cluster bundle.
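
As a sketch of the digest and pull policy limitations, the extension below clears the digests and sets IfNotPresent for the ingress-nginx chart. The value keys reflect the upstream chart's values.yaml as best understood at the time of writing and should be checked against the chart version you ship.

```yaml
# Embedded Cluster Config (sketch)
apiVersion: embeddedcluster.replicated.com/v1beta1
kind: Config
spec:
  extensions:
    helmCharts:
      - chart:
          name: ingress-nginx
          chartVersion: "4.11.3"
        releaseName: ingress-nginx
        namespace: ingress-nginx
        values: |
          controller:
            image:
              # Empty digests so the image is pulled by tag only
              digest: ""
              digestChroot: ""
              # Use the image already loaded into containerd
              pullPolicy: IfNotPresent
            admissionWebhooks:
              patch:
                image:
                  digest: ""
                  pullPolicy: IfNotPresent
```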

Add Helm chart extensions

If your application requires certain components deployed before the application and as part of the cluster itself, add them as extensions in the Embedded Cluster Config. For example, you can add a Helm extension to deploy an ingress controller. You can add extensions for Helm charts that you own or for third-party charts.

To add Helm extensions:

In the Embedded Cluster Config, add the Helm chart to the extensions key.
If you support air gap installations, configure each of your extensions so that they resolve correctly for both online and air gap installations. See Add support for air gap installations on this page.
Save the release and promote it to the channel that you use for testing internally.
Install with Embedded Cluster in a development environment to test. See Online installation with Embedded Cluster or Air gap installation with Embedded Cluster.

Serve installation assets using the Vendor API

To install with Embedded Cluster, your end customers need to download the Embedded Cluster installer binary and their license. Air gap installations also require an air gap bundle. End customers can download all these installation assets using a curl command by following the installation steps available in the Replicated Enterprise Portal.

However, some vendors already have a portal where their customers can log in to access documentation or download artifacts. In cases like this, you can serve the Embedded Cluster installation assets yourself using the Replicated Vendor API. This removes the need for customers to download assets from the Replicated app service using a curl command during installation.

To serve Embedded Cluster installation assets with the Vendor API:

If you have not done so already, create an API token for the Vendor API. See Use the Vendor API v3.
Call the Get an Embedded Cluster release endpoint to download the assets needed to install your application with Embedded Cluster. Your customers must take this binary and their license and copy them to the machine where they will install your application.

Note the following:
- (Recommended) Provide the customerId query parameter so that the downloaded tarball includes the customer’s license. This mirrors what the Replicated app service returns when a customer downloads the binary directly and is the most useful option. Excluding the customerId is useful if you plan to distribute the license separately.
- If you do not provide any query parameters, this endpoint downloads the Embedded Cluster binary for the latest release on the specified channel. You can provide the channelSequence query parameter to download the binary for a particular release.

Distribute the NVIDIA GPU Operator with Embedded Cluster

Note: Distributing the NVIDIA GPU Operator with Embedded Cluster is not officially supported by Replicated. However, it is a common use case.

The NVIDIA GPU Operator uses the operator framework within Kubernetes to automate the management of all NVIDIA software components needed to provision GPUs. For more information about this operator, see the NVIDIA GPU Operator documentation.

Include the NVIDIA GPU Operator and configure containerd options

You can include the NVIDIA GPU Operator in your release as an additional Helm chart, or using Embedded Cluster Helm extensions. For information about adding Helm extensions, see extensions in Embedded Cluster Config.

To use the NVIDIA GPU Operator as a Helm extension, package the GPU Operator chart as a .tgz archive and include it in your release. Then configure the containerd options in the Embedded Cluster Config as follows:

# Embedded Cluster Config
apiVersion: embeddedcluster.replicated.com/v1beta1
kind: Config
spec:
  extensions:
    helmCharts:
      - chart:
          name: gpu-operator
          chartVersion: "v24.9.1"
        releaseName: gpu-operator
        namespace: gpu-operator
        values: |
          # configure the containerd options
          toolkit:
            env:
            - name: CONTAINERD_CONFIG
              value: /etc/k0s/containerd.d/nvidia.toml
            - name: CONTAINERD_SOCKET
              value: /run/k0s/containerd.sock

containerd known issue

When you configure the containerd options as shown earlier on this page, the NVIDIA GPU Operator automatically creates the required configurations in the /etc/k0s/containerd.d/nvidia.toml file. It is not necessary to create this file manually, or modify any other configuration on the hosts.

If you include the NVIDIA GPU Operator as a Helm extension, remove any existing containerd services from the host before installing with Embedded Cluster. This includes services deployed by Docker. If any containerd services are present on the host, the NVIDIA GPU Operator will generate an invalid containerd config, causing the installation to fail. For more information, see Installation failure when NVIDIA GPU Operator is included as Helm extension in Troubleshooting Embedded Cluster.

This is the result of a known issue with v24.9.x of the NVIDIA GPU Operator. For more information about the known issue, see the issue "container-toolkit does not modify the containerd config correctly when there are multiple instances of the containerd binary" in the nvidia-container-toolkit repository in GitHub.
