Container Compute Service (ACS) is integrated into Container Service for Kubernetes (ACK), which allows you to use the computing power of ACS in ACK Pro clusters. This topic describes how to do so.
How to use the computing power of ACS in ACK Pro clusters
Container Compute Service (ACS) is a cloud computing service that provides container compute resources that comply with the container specifications of Kubernetes. ACS adopts a layered architecture to implement Kubernetes control and computing power. The compute resources layer schedules and allocates resources to pods. The Kubernetes control layer manages workloads, such as Deployments, Services, StatefulSets, and CronJobs.
The computing power of ACS can be implemented in Kubernetes clusters by using virtual nodes. This way, Kubernetes clusters are empowered with high elasticity and are no longer limited by the computing capacity of cluster nodes. After you use ACS to take over infrastructure management for pods, the Kubernetes cluster no longer needs to schedule or launch individual pods. In addition, the Kubernetes cluster no longer needs to be concerned about the resources of underlying VMs. ACS can meet the resource requirements of pods at any time.
Container Service for Kubernetes (ACK) is one of the first services in the world to participate in the Certified Kubernetes Conformance Program. ACK provides a high-performance management service for containerized applications. ACK is integrated with the virtualization, storage, network, and security capabilities provided by Alibaba Cloud. ACK simplifies cluster setup and scaling and allows you to focus on developing and managing containerized applications.
Before you can create ACS pods in an ACK Pro cluster, you must deploy virtual nodes in the cluster. If you need to scale out your ACK Pro cluster, you can create ACS pods on virtual nodes, without the need to plan the resource capacities of the virtual nodes. ACS pods can communicate with pods on physical nodes in the cluster. We recommend that you deploy long-lived applications whose workloads periodically fluctuate on virtual nodes. This improves resource utilization, reduces resource costs, and accelerates the scaling process. When the workload of your application decreases, you can remove pods from virtual nodes to reduce resource costs. Pods on virtual nodes run in a secure and isolated environment that is built on top of ACS. In this case, a pod is referred to as an ACS pod. For more information, see ACK cluster overview.
Prerequisites
To use the computing power of ACS in ACK Pro clusters, you must first activate the required cloud services and grant the required permissions.
Activate Container Service for Kubernetes, assign default roles to ACS, and activate other required cloud services. For more information, see Create an ACK managed cluster.
Log on to the ACS console. Follow the on-screen instructions to activate ACS.
An ACK Pro cluster that runs Kubernetes 1.26 or later is created. For more information, see Create an ACK managed cluster. For more information about how to update a cluster, see Manually upgrade ACK clusters.
You must install a specific version of the ACK Virtual Node component based on the Kubernetes version of your ACK Pro cluster. The following table describes the version mapping details.
Kubernetes version | ACK Virtual Node version |
≥ 1.26 | ≥ v2.13.0 |
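The version requirements in the table can be checked with a short script. The following Python sketch is illustrative only: the function names and the sample version strings are assumptions, and the real values come from your cluster and from the installed component.

```python
import re

# Minimum versions taken from the table above.
MIN_KUBERNETES = (1, 26)
MIN_VIRTUAL_NODE = (2, 13, 0)


def parse_version(text):
    """Turn a string such as 'v2.13.0' or '1.26.3-aliyun.1' into a tuple of ints."""
    match = re.match(r"v?(\d+(?:\.\d+)*)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_requirements(kubernetes_version, virtual_node_version):
    """Return True if both versions satisfy the minimums in the table."""
    k8s = parse_version(kubernetes_version)
    # Pad to three components so that 'v2.13' compares equal to 'v2.13.0'.
    vnode = (parse_version(virtual_node_version) + (0, 0, 0))[:3]
    return k8s[:2] >= MIN_KUBERNETES and vnode >= MIN_VIRTUAL_NODE


# Hypothetical sample values.
print(meets_requirements("1.26.3-aliyun.1", "v2.13.0"))  # True
print(meets_requirements("1.24.6", "v2.13.0"))           # False
```

Comparing versions as integer tuples avoids the string-comparison pitfall in which "1.9" sorts after "1.26".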
Install ACK Virtual Node to implement the computing power of ACS
The computing power of ACS can be implemented in ACK clusters by using virtual nodes. This way, Kubernetes clusters are empowered with high elasticity and are no longer limited by the computing capacity of cluster nodes. The following steps describe how to install the ACK Virtual Node component.
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, click Add-ons.
On the Core Components tab, select ACK Virtual Node and click Install to install the component or click Update to update the component to the required version.
If the console prompts you to activate and grant permissions to ACS when you install ACK Virtual Node, follow the on-screen instructions to activate and grant permissions to ACS. Click OK.
After you install the component, open the node list from the left-side navigation pane of the cluster details page to view the virtual nodes. By default, the names of virtual nodes are prefixed with virtual-kubelet-.
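The virtual nodes can also be identified from the command line by their name prefix. The following Python sketch assumes that you pass it the output of kubectl get nodes -o json; the function name and the sample node names are hypothetical.

```python
import json

# Default name prefix of virtual nodes, as stated above.
VIRTUAL_NODE_PREFIX = "virtual-kubelet-"


def virtual_node_names(node_list_json):
    """Return the names of virtual nodes found in a `kubectl get nodes -o json` document."""
    node_list = json.loads(node_list_json)
    return [
        item["metadata"]["name"]
        for item in node_list.get("items", [])
        if item["metadata"]["name"].startswith(VIRTUAL_NODE_PREFIX)
    ]


# Hypothetical sample input with one virtual node and one regular node.
sample = json.dumps({
    "items": [
        {"metadata": {"name": "virtual-kubelet-cn-hangzhou-k"}},
        {"metadata": {"name": "cn-hangzhou.192.168.0.10"}},
    ]
})
print(virtual_node_names(sample))  # ['virtual-kubelet-cn-hangzhou-k']
```

An empty result suggests that the ACK Virtual Node component is not installed or that the virtual nodes are not ready yet.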
Example: use ACS CPU compute power
After you install the required version of ACK Virtual Node or update the component to the required version as described in the Prerequisites section, you can create ACS pods and elastic container instances.
When you schedule pods to virtual nodes, if you do not specify the compute class of the pods, elastic container instances are prioritized for pod scheduling by default.
To implement the computing power of ACS in an ACK cluster, perform the following steps:
Use one of the following methods to schedule pods to virtual nodes: node selectors, affinity and anti-affinity rules, ResourcePolicies, or the alibabacloud.com/acs: "true" pod label. For more information, see Node affinity scheduling.
Note: The alibabacloud.com/acs: "true" label does not apply to ACK Serverless clusters. It applies to the following clusters: ACK managed clusters, ACK dedicated clusters, ACK One registered clusters, and ACK Edge clusters.
When you create an ACS pod, add the alibabacloud.com/compute-class: <compute class> label to the pod to specify the compute class of the pod. For more information about the compute classes of ACS pods, see ACS pod overview.
The following steps describe how to deploy a sample NGINX application on ACS compute power.
Create a Deployment.
Important: If you schedule pods to virtual nodes by using the alibabacloud.com/acs: "true" pod label, StorageClasses that use the WaitForFirstConsumer volume binding mode are not supported. Therefore, when you use ACS pods that have disks mounted in ACK clusters, use a nodeSelector or create a ResourcePolicy to schedule the pods to virtual nodes. For more information about how to configure a ResourcePolicy, see ACK Pro clusters support colocated scheduling of ECS instances and ACS computing power.
NodeSelector
Run the following command to query the labels of a virtual node. Replace virtual-kubelet-cn-hangzhou-k in the command with the actual name of your virtual node.
kubectl get node virtual-kubelet-cn-hangzhou-k -oyaml
The following expected output displays only the content related to labels:
apiVersion: v1
kind: Node
metadata:
  labels:
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: virtual-kubelet-cn-hangzhou-k
    kubernetes.io/os: linux
    kubernetes.io/role: agent
    service.alibabacloud.com/exclude-node: "true"
    topology.diskplugin.csi.alibabacloud.com/zone: cn-hangzhou-k
    topology.kubernetes.io/region: cn-hangzhou
    topology.kubernetes.io/zone: cn-hangzhou-k
    type: virtual-kubelet # Each virtual node has this label. If you want to schedule a pod to a virtual node, you can configure this label as the node selector of the pod.
  name: virtual-kubelet-cn-hangzhou-k
spec:
  taints:
  - effect: NoSchedule
    key: virtual-kubelet.io/provider
    value: alibabacloud
Create a YAML file named nginx.yaml based on the following content to provision two pods:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      name: nginx
      labels:
        app: nginx
        alibabacloud.com/compute-class: general-purpose # The compute class of the ACS pod. Default value: general-purpose.
        alibabacloud.com/compute-qos: default # The QoS class of the ACS pod. Default value: default.
    spec:
      nodeSelector:
        type: virtual-kubelet # The node selector used to select a virtual node.
      tolerations:
      - key: "virtual-kubelet.io/provider" # The toleration used to tolerate virtual nodes.
        operator: "Exists"
        effect: "NoSchedule"
      containers:
      - name: nginx
        image: registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
        resources:
          limits:
            cpu: 2
          requests:
            cpu: 2
Deploy an NGINX application and query the pods.
Run the following command to deploy an NGINX application:
kubectl apply -f nginx.yaml
Run the following command to check whether the NGINX application is deployed:
kubectl get pods -o wide
Expected results:
NAME                    READY   STATUS    RESTARTS   AGE   IP          NODE                            NOMINATED NODE   READINESS GATES
nginx-9cdf7bbf9-s****   1/1     Running   0          36s   10.0.6.68   virtual-kubelet-cn-hangzhou-j   <none>           <none>
nginx-9cdf7bbf9-v****   1/1     Running   0          36s   10.0.6.67   virtual-kubelet-cn-hangzhou-k   <none>           <none>
The result shows that the two pods are deployed on nodes that have the type=virtual-kubelet label, which is specified by the nodeSelector parameter in the Deployment configuration.
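Node affinity, which this topic lists as another scheduling method, can express the same placement as the nodeSelector. The following Python sketch builds the corresponding pod spec fields with the standard Kubernetes nodeAffinity schema and prints them as JSON, which kubectl also accepts. It assumes the type: virtual-kubelet node label and the virtual-kubelet.io/provider taint shown in the node output in this topic; the function name is hypothetical.

```python
import json


def virtual_node_scheduling():
    """Return pod spec fields that pin a pod to virtual nodes by using node affinity."""
    return {
        "affinity": {
            "nodeAffinity": {
                # Hard requirement: only nodes labeled type=virtual-kubelet qualify.
                "requiredDuringSchedulingIgnoredDuringExecution": {
                    "nodeSelectorTerms": [
                        {
                            "matchExpressions": [
                                {
                                    "key": "type",
                                    "operator": "In",
                                    "values": ["virtual-kubelet"],
                                }
                            ]
                        }
                    ]
                }
            }
        },
        # The toleration is still needed because virtual nodes carry a NoSchedule taint.
        "tolerations": [
            {
                "key": "virtual-kubelet.io/provider",
                "operator": "Exists",
                "effect": "NoSchedule",
            }
        ],
    }


pod_spec = {
    "containers": [
        {
            "name": "nginx",
            "image": "registry.openanolis.cn/openanolis/nginx:1.14.1-8.6",
        }
    ]
}
pod_spec.update(virtual_node_scheduling())
print(json.dumps(pod_spec, indent=2))
```

A required affinity rule behaves like the nodeSelector. A preferred rule would instead allow pods to fall back to other nodes, which may not be what you want for ACS workloads.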
Schedule pods based on pod labels
Create a file named nginx.yaml and copy the following content to the file:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
        alibabacloud.com/acs: "true" # Use the compute power of ACS.
        alibabacloud.com/compute-class: general-purpose # The compute class of the ACS pod. Default value: general-purpose.
        alibabacloud.com/compute-qos: default # The QoS class of the ACS pod. Default value: default.
    spec:
      containers:
      - name: nginx
        image: registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
        resources:
          limits:
            cpu: 2
          requests:
            cpu: 2
Deploy an NGINX application and query the pods.
Run the following command to deploy an NGINX application:
kubectl apply -f nginx.yaml
Run the following command to check whether the NGINX application is deployed:
kubectl get pods -o wide
Expected results:
NAME                    READY   STATUS    RESTARTS   AGE   IP          NODE                            NOMINATED NODE   READINESS GATES
nginx-9cdf7bbf9-s****   1/1     Running   0          36s   10.0.6.68   virtual-kubelet-cn-hangzhou-j   <none>           <none>
nginx-9cdf7bbf9-v****   1/1     Running   0          36s   10.0.6.67   virtual-kubelet-cn-hangzhou-k   <none>           <none>
The result shows that the two pods are deployed on virtual nodes, which have the type=virtual-kubelet label, as requested by the alibabacloud.com/acs: "true" pod label in the Deployment configuration.
Check whether ACS pods are created for the NGINX applications.
Run the following command to query the details of a pod created for the NGINX application:
kubectl describe pod nginx-9cdf7bbf9-s****
The following expected output displays only the key information:
Annotations:
  ProviderCreate: done
  alibabacloud.com/client-token: edf29202-54ac-438e-9626-a1ca007xxxxx
  alibabacloud.com/instance-id: acs-2ze008giupcyaqbxxxxx
  alibabacloud.com/pod-ephemeral-storage: 30Gi
  alibabacloud.com/pod-use-spec: 2-4Gi
  alibabacloud.com/request-id: A0EF3BF3-37E7-5A07-AC2D-68A0CFCxxxxx
  alibabacloud.com/schedule-result: finished
  alibabacloud.com/user-id: 14889995898xxxxx
  kubernetes.io/pod-stream-port: 10250
  kubernetes.io/preferred-scheduling-node: virtual-kubelet-cn-hangzhou-j/1
  kubernetes.io/resource-type: serverless
The output shows that the configurations of the pod include the alibabacloud.com/instance-id: acs-2ze008giupcyaqbxxxxx annotation, which indicates that the pod is an ACS pod.
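The same check can be scripted. The following Python sketch assumes that you load a pod object from the output of kubectl get pod <name> -o json and looks for the alibabacloud.com/instance-id annotation described above; the function name and the sample pod are hypothetical.

```python
# Annotation that marks an ACS pod, as described above.
INSTANCE_ID_ANNOTATION = "alibabacloud.com/instance-id"


def acs_instance_id(pod):
    """Return the ACS instance ID recorded on the pod, or None if the annotation is absent."""
    annotations = pod.get("metadata", {}).get("annotations") or {}
    return annotations.get(INSTANCE_ID_ANNOTATION)


# Hypothetical pod object trimmed to the relevant fields.
sample_pod = {
    "metadata": {
        "name": "nginx-9cdf7bbf9-s****",
        "annotations": {
            "alibabacloud.com/instance-id": "acs-2ze008giupcyaqbxxxxx",
            "alibabacloud.com/schedule-result": "finished",
        },
    }
}
print(acs_instance_id(sample_pod))             # acs-2ze008giupcyaqbxxxxx
print(acs_instance_id({"metadata": {}}))       # None
```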
Example: use ACS GPU compute power
The procedure for using ACS GPU compute power is similar to that for using ACS CPU compute power. However, you also need to ensure that the scheduling components meet the version requirements and add some additional configurations.
Configure the component
You must install a specific version of the kube-scheduler component based on the Kubernetes version of your ACK Pro cluster. The following table describes the version mapping details.
Kubernetes version | Scheduler version |
≥ 1.26 | Scheduler versions for different Kubernetes versions: |
How to activate
The feature of using ACS GPU compute power in ACK clusters is in invitational preview. To use this feature, submit a ticket.
How to use this feature
...
labels:
  # Add labels to request ACS GPU resources.
  alibabacloud.com/compute-class: gpu # Set to gpu if GPU compute power is used.
  alibabacloud.com/compute-qos: default # The QoS class, which is the same as for regular ACS compute power.
  alibabacloud.com/gpu-model-series: GN8IS # The GPU model. Specify the actual model that you use.
...
For more information about the relationship between ACS compute classes and QoS classes, see Relationship between compute classes and QoS classes.
For more information about the GPU models supported by gpu-model-series, see Specify GPU models and driver versions for ACS GPU-accelerated pods.
The alibabacloud.com/acs: "true" label does not apply to ACK Serverless clusters. It applies to the following clusters: ACK managed clusters, ACK dedicated clusters, ACK One registered clusters, and ACK Edge clusters.
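The labels described above can be assembled programmatically when you generate manifests. The following Python sketch is a hypothetical helper: the function name and its parameters are assumptions, and only the label keys and values come from this topic.

```python
def acs_pod_labels(compute_class="general-purpose", qos="default",
                   gpu_model_series=None, use_acs_label=False):
    """Build the ACS-related pod labels described in this topic."""
    labels = {
        "alibabacloud.com/compute-class": compute_class,
        "alibabacloud.com/compute-qos": qos,
    }
    if gpu_model_series is not None:
        # Only meaningful for GPU compute power; pass the model that you actually use.
        labels["alibabacloud.com/gpu-model-series"] = gpu_model_series
    if use_acs_label:
        # Pod-label scheduling; does not apply to ACK Serverless clusters.
        labels["alibabacloud.com/acs"] = "true"
    return labels


# "example-model" is a placeholder, not a real GPU model name.
print(acs_pod_labels("gpu", gpu_model_series="example-model", use_acs_label=True))
```

Label values must be strings in a manifest, which is why the helper stores "true" rather than a Boolean.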
The following section shows three examples of using GPU compute power.
NodeSelector
Create a GPU-accelerated workload based on the following content.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dep-node-selector-demo
  labels:
    app: node-selector-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: node-selector-demo
  template:
    metadata:
      labels:
        app: node-selector-demo
        # The ACS attributes.
        alibabacloud.com/compute-class: gpu
        alibabacloud.com/compute-qos: default
        alibabacloud.com/gpu-model-series: example-model # The GPU model. Specify the actual model that you use, such as T4.
    spec:
      # The specified label.
      nodeSelector:
        type: virtual-kubelet
      # The taint to be tolerated.
      tolerations:
      - key: "virtual-kubelet.io/provider" # The toleration used to tolerate virtual nodes.
        operator: "Exists"
        effect: "NoSchedule"
      containers:
      - name: node-selector-demo
        image: registry-cn-hangzhou.ack.aliyuncs.com/acs/stress:v1.0.4
        command:
        - "sleep"
        - "1000h"
        resources:
          limits:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1"
          requests:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1"
ResourcePolicy
Create a GPU-accelerated workload based on the following content.
apiVersion: scheduling.alibabacloud.com/v1alpha1
kind: ResourcePolicy
metadata:
  name: dep-rp-demo
  namespace: default
spec:
  selector:
    app: dep-rp-demo
  units:
  - resource: acs
    podLabels:
      alibabacloud.com/compute-class: gpu
      alibabacloud.com/compute-qos: default
      alibabacloud.com/gpu-model-series: example-model # The GPU model. Specify the actual model that you use, such as T4.
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dep-rp-demo
  labels:
    app: dep-rp-demo
  annotations:
    resourcePolicy: "dep-rp-demo" # The name of the ResourcePolicy.
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dep-rp-demo
  template:
    metadata:
      labels:
        app: dep-rp-demo
    spec:
      containers:
      - name: demo
        image: registry-cn-hangzhou.ack.aliyuncs.com/acs/stress:v1.0.4
        command:
        - "sleep"
        - "1000h"
        resources:
          limits:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1"
          requests:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1"
For more information about how to use ResourcePolicies to schedule resources, see Resource scheduling based on custom priorities.
Schedule pods based on pod labels
Create a GPU-accelerated workload based on the following content.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dep-node-selector-demo
  labels:
    app: node-selector-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: node-selector-demo
  template:
    metadata:
      labels:
        app: node-selector-demo
        # The ACS attributes.
        alibabacloud.com/acs: "true" # Use the compute power of ACS.
        alibabacloud.com/compute-class: gpu
        alibabacloud.com/compute-qos: default
        alibabacloud.com/gpu-model-series: example-model # The GPU model. Specify the actual model that you use, such as T4.
    spec:
      containers:
      - name: node-selector-demo
        image: registry-cn-hangzhou.ack.aliyuncs.com/acs/stress:v1.0.4
        command:
        - "sleep"
        - "1000h"
        resources:
          limits:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1"
          requests:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1"
Run the following command to query the status of the GPU-accelerated workload:
kubectl get pod node-selector-demo-9cdf7bbf9-s**** -oyaml
The following expected output displays only the key information:
phase: Running
resources:
  limits:
    #other resources
    nvidia.com/gpu: "1"
  requests:
    #other resources
    nvidia.com/gpu: "1"
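To check the phase and the GPU allocation without reading the full YAML output, you can summarize a pod object in a few lines. The following Python sketch assumes a pod loaded from kubectl get pod <name> -o json; the function name and the sample pod are hypothetical, and nvidia.com/gpu is the resource name used in the examples above.

```python
def gpu_summary(pod, resource_name="nvidia.com/gpu"):
    """Return (phase, total GPU limit across all containers) for a pod object."""
    phase = pod.get("status", {}).get("phase")
    total = 0
    for container in pod.get("spec", {}).get("containers", []):
        limits = (container.get("resources") or {}).get("limits") or {}
        # Extended resources are whole numbers, usually serialized as strings.
        total += int(limits.get(resource_name, 0))
    return phase, total


# Hypothetical pod object trimmed to the relevant fields.
sample_pod = {
    "status": {"phase": "Running"},
    "spec": {
        "containers": [
            {
                "name": "node-selector-demo",
                "resources": {
                    "limits": {"cpu": "1", "memory": "1Gi", "nvidia.com/gpu": "1"},
                    "requests": {"cpu": "1", "memory": "1Gi", "nvidia.com/gpu": "1"},
                },
            }
        ]
    },
}
print(gpu_summary(sample_pod))  # ('Running', 1)
```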
Examples on using ACS GPU-HPN pods in ACK
You can use ACS GPU-HPN pods in the same way as using ACS CPU-accelerated pods. Make sure that the following requirements are met:
You can use ACS GPU-HPN pods only in ACK managed clusters, ACK One registered clusters, and ACK One Kubernetes clusters for distributed Argo workflows.
You must first purchase GPU-HPN capacity reservations and associate them with your clusters.
You must update and configure the kube-scheduler and ACK Virtual Node components. The required component versions are in invitational preview. To update to these versions, submit a ticket.
Procedure
...
labels:
  # Add labels to request ACS GPU resources.
  alibabacloud.com/compute-class: gpu-hpn # Set to gpu-hpn.
  alibabacloud.com/compute-qos: default # The QoS class.
...
For more information about compute classes and QoS classes, see Mappings between compute classes and computing power QoS classes.
For more information about other ACS pod parameters, see ACS Pod.
You can configure the Kubernetes NodeSelector to schedule pods to GPU-HPN nodes.
Important: When you configure ACS GPU-HPN pods, take note of the following parameters:
alibabacloud.com/compute-class: gpu-hpn specifies the compute class.
alibabacloud.com/node-type: reserved specifies reserved nodes.
Configure requests and limits based on the actual model.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dep-node-selector-demo
  labels:
    app: node-selector-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: node-selector-demo
  template:
    metadata:
      labels:
        app: node-selector-demo
        # ACS attributes.
        alibabacloud.com/compute-class: gpu-hpn
        alibabacloud.com/compute-qos: default
    spec:
      # Specify GPU-HPN reserved nodes.
      nodeSelector:
        alibabacloud.com/node-type: reserved
      containers:
      - name: node-selector-demo
        image: registry-cn-hangzhou.ack.aliyuncs.com/acs/stress:v1.0.4
        command:
        - "sleep"
        - "1000h"
        resources:
          limits:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1" # Specify the resource name based on the actual model.
          requests:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1" # Specify the resource name based on the actual model.
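The parameters that this section marks as important can be checked before you apply a manifest. The following Python sketch is a hypothetical pre-flight check on a Deployment loaded as a dictionary (for example, from a JSON manifest). It verifies only the three points named above and is not a complete validator.

```python
def gpu_hpn_problems(deployment):
    """Return a list of problems found; an empty list means that none were found."""
    template = deployment["spec"]["template"]
    labels = template.get("metadata", {}).get("labels") or {}
    pod_spec = template.get("spec", {})
    node_selector = pod_spec.get("nodeSelector") or {}

    problems = []
    if labels.get("alibabacloud.com/compute-class") != "gpu-hpn":
        problems.append("pod label alibabacloud.com/compute-class is not gpu-hpn")
    if node_selector.get("alibabacloud.com/node-type") != "reserved":
        problems.append("nodeSelector alibabacloud.com/node-type is not reserved")
    for container in pod_spec.get("containers", []):
        resources = container.get("resources") or {}
        if not resources.get("requests") or not resources.get("limits"):
            problems.append(f"container {container.get('name')} lacks requests or limits")
    return problems


# Dictionary form of the relevant parts of the Deployment in this section.
deployment = {
    "spec": {
        "template": {
            "metadata": {
                "labels": {
                    "app": "node-selector-demo",
                    "alibabacloud.com/compute-class": "gpu-hpn",
                    "alibabacloud.com/compute-qos": "default",
                }
            },
            "spec": {
                "nodeSelector": {"alibabacloud.com/node-type": "reserved"},
                "containers": [
                    {
                        "name": "node-selector-demo",
                        "resources": {
                            "limits": {"cpu": 1, "memory": "1Gi", "nvidia.com/gpu": "1"},
                            "requests": {"cpu": 1, "memory": "1Gi", "nvidia.com/gpu": "1"},
                        },
                    }
                ],
            },
        }
    }
}
print(gpu_hpn_problems(deployment))  # []
```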
Run the following command to query the status of the GPU-accelerated workload:
kubectl get pod node-selector-demo-9cdf7bbf9-s**** -oyaml
Expected output (key information):
phase: Running
resources:
  limits:
    #other resources
    nvidia.com/gpu: "1"
  requests:
    #other resources
    nvidia.com/gpu: "1"