Access and manage embedded clusters (Beta)

This topic describes managing nodes in clusters created with Replicated Embedded Cluster.

Access the cluster

You can use the CLI to access the cluster. This is useful for development or troubleshooting.

To access the cluster and use other included binaries:

  1. SSH into a controller node.

    note

    You cannot run the shell command on worker nodes.

  2. Use the Embedded Cluster shell command to start a shell with access to the cluster:

    sudo ./APP_SLUG shell

    Where APP_SLUG is the unique slug for the application.

    The output looks similar to the following:

       __4___
    _  \ \ \ \    Welcome to APP_SLUG debug shell.
   <'\  /_/_/_/   This terminal is now configured to access your cluster.
    ((____!___/)  Type 'exit' (or CTRL+d) to exit.
     \0\0\0\0\/   Happy hacking.
    ~~~~~~~~~~~
    root@alex-ec-1:/home/alex# export KUBECONFIG="/var/lib/embedded-cluster/k0s/pki/admin.conf"
    root@alex-ec-1:/home/alex# export PATH="$PATH:/var/lib/embedded-cluster/bin"
    root@alex-ec-1:/home/alex# source <(k0s completion bash)
    root@alex-ec-1:/home/alex# source <(cat /var/lib/embedded-cluster/bin/kubectl_completion_bash.sh)
    root@alex-ec-1:/home/alex# source /etc/bash_completion

    The appropriate kubeconfig is exported, and the location of useful binaries like kubectl and Replicated’s preflight and support-bundle plugins is added to PATH.

  3. Use the available binaries as needed.

    Example:

    kubectl version
    Client Version: v1.29.1
    Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
    Server Version: v1.29.1+k0s
  4. Type exit or Ctrl + D to exit the shell.
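After starting the shell, you can sanity-check that the environment is wired up. A minimal sketch, using the default paths shown in the example shell output above:

```shell
# Sanity checks for the debug shell environment. The paths below are the
# defaults from the example shell output above; adjust if your data
# directory differs.
export KUBECONFIG="/var/lib/embedded-cluster/k0s/pki/admin.conf"
export PATH="$PATH:/var/lib/embedded-cluster/bin"

# Confirm the kubeconfig variable is set and the bin dir is on PATH.
[ -n "$KUBECONFIG" ] && echo "KUBECONFIG is set"
case ":$PATH:" in
  *":/var/lib/embedded-cluster/bin:"*) echo "bin dir on PATH" ;;
  *) echo "bin dir missing from PATH" ;;
esac
```

If both checks pass, commands such as kubectl, preflight, and support-bundle resolve without full paths.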

Configure multi-node clusters

This section describes how to join nodes to a cluster with Embedded Cluster.

Limitations

Multi-node clusters with Embedded Cluster have the following limitations:

  • All nodes joined to the cluster use the same Embedded Cluster data directory as the installation node. You cannot choose a different data directory for Embedded Cluster when joining nodes.

  • You should not join more than one controller node at the same time. When joining a controller node, Embedded Cluster prints a warning explaining that you should not attempt to join another node until the controller node joins successfully.

  • You cannot change a node's role (controller or worker) after you join the node. If you need to change a node’s role, reset the node and add it again with the new role.

Requirement

To deploy multi-node clusters with Embedded Cluster, you must enable the Multi-node Cluster (Embedded Cluster only) license field for the customer. For more information about managing customer licenses, see Create and Manage Customers.

Join nodes

To join a node:

  1. SSH into a controller node.

  2. Run the following command to generate the .tar.gz bundle for joining a node:

    sudo ./APP_SLUG create-join-bundle --role [controller | worker]

    Where:

    • APP_SLUG is the unique slug for the application.

    • --role is the role to assign the node (controller or worker).

      note

      You cannot change the role after you add a node. If you need to change a node’s role, reset the node and add it again with the new role.

  3. Use scp to copy the .tar.gz bundle to the node that you want to join.

  4. Extract the .tar.gz.

  5. Run the join command to add the node to the cluster:

    sudo ./APP_SLUG node join
  6. Repeat these steps for each node you want to add.
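Put together, the join workflow looks like the following dry-run sketch. It only prints the commands you would run; the app slug, bundle filename, and node hostname are hypothetical placeholders:

```shell
#!/bin/sh
# Dry-run sketch of the join workflow above. Nothing here contacts a real
# cluster; the script only prints each command. APP_SLUG, BUNDLE, and NODE
# are hypothetical placeholders.
APP_SLUG="myapp"
ROLE="worker"
NODE="node2.example.com"
BUNDLE="join-bundle.tar.gz"

# 1. On a controller node: generate the join bundle.
echo "sudo ./$APP_SLUG create-join-bundle --role $ROLE"
# 2. Copy the bundle to the node that you want to join.
echo "scp $BUNDLE user@$NODE:~"
# 3. On the joining node: extract the bundle, then join.
echo "tar -xzf $BUNDLE"
echo "sudo ./$APP_SLUG node join"
```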

High availability for multi-node clusters

Embedded Cluster automatically enables high availability (HA) when at least three controller nodes are present in the cluster.

In HA installations, Embedded Cluster deploys multiple replicas of the OpenEBS and image registry built-in extensions. Additionally, whether any Helm extensions that you include in the Embedded Cluster Config are deployed with high availability depends on the given chart and how it is configured.

Best practices for high availability

Consider the following best practices and recommendations for HA clusters:

  • HA requires at least three controller nodes that run the Kubernetes control plane. This is because clusters use a quorum system, in which more than half the nodes must be up and reachable. In clusters with three controller nodes, the Kubernetes control plane can continue to operate if one node fails because the remaining two nodes can still form a quorum.

  • Always use an odd number of controller nodes in HA clusters. Using an odd number of controller nodes ensures that the cluster can make decisions efficiently with quorum calculations. Clusters with an odd number of controller nodes also avoid split-brain scenarios, where the cluster runs as two independent groups of nodes, resulting in inconsistencies and conflicts.

  • You can have any number of worker nodes in HA clusters. Worker nodes do not run the Kubernetes control plane, but they can run other workloads, such as the application itself.
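The quorum arithmetic behind these recommendations can be sketched in a few lines of shell: a majority quorum is floor(n/2) + 1, so an even-numbered cluster tolerates no more failures than the odd-numbered cluster one node smaller.

```shell
# Majority quorum math: a cluster of n controllers needs floor(n/2) + 1
# reachable members, so it tolerates n - (floor(n/2) + 1) failures.
quorum()    { echo $(( $1 / 2 + 1 )); }
tolerated() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 1 3 4 5; do
  echo "controllers=$n quorum=$(quorum $n) failures_tolerated=$(tolerated $n)"
done
```

Note that four controllers tolerate only one failure, the same as three, which is why adding an even node buys no extra resilience.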

Create a multi-node cluster with HA

To create a multi-node cluster with HA:

  • During installation with Embedded Cluster, follow the steps in the Embedded Cluster UI to join a total of three controller nodes to the cluster. For more information about joining nodes, see Join nodes on this page.

    Embedded Cluster automatically converts the installation to HA when three or more controller nodes are present.

Enable HA for an existing cluster

To enable HA for an existing Embedded Cluster installation with three or more controller nodes:

  • On one of the controller nodes, run this command:

    sudo ./APP_SLUG enable-ha

    Where APP_SLUG is the unique slug for the application.
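Before running enable-ha, it can help to confirm that at least three controller nodes have joined. A minimal sketch, where the NODES variable stands in for hypothetical kubectl get nodes output (in the debug shell you would pipe the real command instead):

```shell
# Count control-plane nodes before enabling HA. NODES is a hypothetical
# sample of `kubectl get nodes` output used for illustration only.
NODES="controller-1   Ready   control-plane   10m
controller-2   Ready   control-plane   8m
controller-3   Ready   control-plane   5m
worker-1       Ready   <none>          3m"

COUNT=$(printf '%s\n' "$NODES" | grep -c 'control-plane')
echo "controller nodes: $COUNT"
if [ "$COUNT" -ge 3 ]; then echo "ready for enable-ha"; fi
```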

Reset nodes and remove clusters

This section describes how to reset individual nodes and how to delete an entire multi-node cluster using the Embedded Cluster reset command.

About the reset command

Resetting a node with Embedded Cluster removes the cluster and your application from that node. This is useful for iteration and development, and for recovering from mistakes, because you can reuse the machine instead of having to procure a new one.

The reset command performs the following steps:

  1. Run safety checks. For example, reset does not remove a controller node when there are worker nodes available, and it does not remove a node when the etcd cluster is unhealthy.
  2. Drain the node and gracefully evict all Pods.
  3. Delete the node from the cluster.
  4. Stop and reset k0s.
  5. Remove all Embedded Cluster files.
  6. Reboot the node.

For more information about the command, see reset.

Limitations and best practices

Before you reset a node or remove a cluster, consider the following limitations and best practices:

  • When you reset a node, Embedded Cluster deletes OpenEBS PVCs on that node. Kubernetes automatically recreates only PVCs created as part of a StatefulSet on another node in the cluster. To recreate other PVCs, redeploy the application in the cluster.

  • If you need to reset one controller node in a three-node cluster, first join a fourth controller node to the cluster before removing the target node. This ensures that you maintain a minimum of three nodes for the Kubernetes control plane. You can add and remove worker nodes as needed because they do not have any control plane components.

  • When resetting a single node or deleting a test environment, you can include the --force flag with the reset command to ignore any errors.

  • When removing a multi-node cluster, run reset on each of the worker nodes first. Then, run reset on controller nodes. Controller nodes also remove themselves from etcd membership.

Reset a node

To reset a node:

  1. SSH into the node. Ensure that the Embedded Cluster binary is still available on the machine.

  2. Run the following command to remove the node and reboot the machine:

    sudo ./APP_SLUG reset

    Where APP_SLUG is the unique slug for the application.

Remove a multi-node cluster

To remove a multi-node cluster:

  1. SSH into a worker node.

    note

    The safety checks for the reset command prevent you from removing a controller node when there are still worker nodes available in the cluster.

  2. Remove the node and reboot the machine:

    sudo ./APP_SLUG reset

    Where APP_SLUG is the unique slug for the application.

  3. After removing all the worker nodes in the cluster, SSH into a controller node and run the reset command to remove the node.

  4. Repeat the previous step on the remaining controller nodes in the cluster.
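The full teardown order can be summarized with a dry-run sketch: reset every worker node first, then the controller nodes. Node names and the app slug are hypothetical placeholders, and each printed line represents a command you would run over SSH on that node:

```shell
#!/bin/sh
# Dry-run sketch of removing a multi-node cluster, per the safety checks
# described above: workers first, then controllers. Node names and
# APP_SLUG are hypothetical placeholders; nothing is actually reset.
APP_SLUG="myapp"
WORKERS="worker-1 worker-2"
CONTROLLERS="controller-1 controller-2 controller-3"

for node in $WORKERS $CONTROLLERS; do
  echo "[$node] sudo ./$APP_SLUG reset"
done
```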