# AWS Cloud Provider
In Kubernetes 1.27 and later, you must use an out-of-tree AWS cloud provider. In-tree cloud providers have been deprecated, and the Amazon in-tree cloud provider has been removed completely; it won't work after an upgrade to Kubernetes 1.27. The steps listed below are still required to set up an Amazon cloud provider. After creating an appropriate IAM role and configuring the ClusterID, you can set up an out-of-tree cloud provider for RKE.

You can also migrate from an in-tree to an out-of-tree AWS cloud provider on Kubernetes 1.26 and earlier. All existing clusters must migrate prior to upgrading to v1.27 in order to stay functional.
To enable the AWS cloud provider, there are no RKE configuration options; you only need to set the name to `aws`. In order to use the AWS cloud provider, all cluster nodes must already be configured with an appropriate IAM role, and your AWS resources must be tagged with a cluster ID.
```yaml
cloud_provider:
  name: aws
```
## IAM Requirements
In a cluster with the AWS cloud provider enabled, nodes must have at least the `ec2:Describe*` permissions.

In order to use Elastic Load Balancers (ELBs) and EBS volumes with Kubernetes, the node(s) will need an IAM role with the appropriate permissions.
IAM policy for nodes with the `controlplane` role:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeTags",
        "ec2:DescribeInstances",
        "ec2:DescribeRegions",
        "ec2:DescribeRouteTables",
        "ec2:DescribeSecurityGroups",
        "ec2:DescribeSubnets",
        "ec2:DescribeVolumes",
        "ec2:CreateSecurityGroup",
        "ec2:CreateTags",
        "ec2:CreateVolume",
        "ec2:ModifyInstanceAttribute",
        "ec2:ModifyVolume",
        "ec2:AttachVolume",
        "ec2:AuthorizeSecurityGroupIngress",
        "ec2:CreateRoute",
        "ec2:DeleteRoute",
        "ec2:DeleteSecurityGroup",
        "ec2:DeleteVolume",
        "ec2:DetachVolume",
        "ec2:RevokeSecurityGroupIngress",
        "ec2:DescribeVpcs",
        "elasticloadbalancing:AddTags",
        "elasticloadbalancing:AttachLoadBalancerToSubnets",
        "elasticloadbalancing:ApplySecurityGroupsToLoadBalancer",
        "elasticloadbalancing:CreateLoadBalancer",
        "elasticloadbalancing:CreateLoadBalancerPolicy",
        "elasticloadbalancing:CreateLoadBalancerListeners",
        "elasticloadbalancing:ConfigureHealthCheck",
        "elasticloadbalancing:DeleteLoadBalancer",
        "elasticloadbalancing:DeleteLoadBalancerListeners",
        "elasticloadbalancing:DescribeLoadBalancers",
        "elasticloadbalancing:DescribeLoadBalancerAttributes",
        "elasticloadbalancing:DetachLoadBalancerFromSubnets",
        "elasticloadbalancing:DeregisterInstancesFromLoadBalancer",
        "elasticloadbalancing:ModifyLoadBalancerAttributes",
        "elasticloadbalancing:RegisterInstancesWithLoadBalancer",
        "elasticloadbalancing:SetLoadBalancerPoliciesForBackendServer",
        "elasticloadbalancing:AddTags",
        "elasticloadbalancing:CreateListener",
        "elasticloadbalancing:CreateTargetGroup",
        "elasticloadbalancing:DeleteListener",
        "elasticloadbalancing:DeleteTargetGroup",
        "elasticloadbalancing:DescribeListeners",
        "elasticloadbalancing:DescribeLoadBalancerPolicies",
        "elasticloadbalancing:DescribeTargetGroups",
        "elasticloadbalancing:DescribeTargetHealth",
        "elasticloadbalancing:ModifyListener",
        "elasticloadbalancing:ModifyTargetGroup",
        "elasticloadbalancing:RegisterTargets",
        "elasticloadbalancing:SetLoadBalancerPoliciesOfListener",
        "iam:CreateServiceLinkedRole",
        "kms:DescribeKey"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
```
IAM policy for nodes with the `etcd` or `worker` role:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstances",
        "ec2:DescribeRegions",
        "ecr:GetAuthorizationToken",
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:GetRepositoryPolicy",
        "ecr:DescribeRepositories",
        "ecr:ListImages",
        "ecr:BatchGetImage"
      ],
      "Resource": "*"
    }
  ]
}
```
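For illustration, one way to grant nodes these permissions is through an IAM role attached via an instance profile. The role, policy, and file names below are hypothetical, and `ec2-trust-policy.json` is assumed to be a trust policy that allows `ec2.amazonaws.com` to assume the role:

```bash
# Hypothetical names and local policy files; adjust to your environment.
aws iam create-role --role-name rke-controlplane \
  --assume-role-policy-document file://ec2-trust-policy.json
aws iam put-role-policy --role-name rke-controlplane \
  --policy-name rke-controlplane-policy \
  --policy-document file://controlplane-policy.json
aws iam create-instance-profile --instance-profile-name rke-controlplane
aws iam add-role-to-instance-profile --instance-profile-name rke-controlplane \
  --role-name rke-controlplane
# Attach the instance profile to an existing node instance:
aws ec2 associate-iam-instance-profile --instance-id i-0abc123 \
  --iam-instance-profile Name=rke-controlplane
```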
## Tagging AWS Resources
The AWS cloud provider uses tagging to discover and manage resources. The following resources are not automatically tagged by Kubernetes or RKE:

- **VPC**: The VPC used by the cluster
- **Subnet**: The subnets used by the cluster
- **EC2 instances**: All nodes launched for the cluster
- **Security Groups**: The security group(s) used by nodes in the cluster
**Note:** If you create a `LoadBalancer` service and there is more than one security group attached to the nodes, you must tag only one of the security groups as `owned` so that Kubernetes knows which group to add and remove rules from. A single untagged security group is allowed; however, sharing it between clusters is not recommended.
AWS Documentation: Tagging Your Amazon EC2 Resources
You must tag with one of the following:

| Key | Value |
| --- | --- |
| `kubernetes.io/cluster/<CLUSTERID>` | `shared` |
`<CLUSTERID>` can be any string you choose. However, the same string must be used on every resource you tag. Setting the tag value to `owned` informs the cluster that all resources tagged with the `<CLUSTERID>` are owned and managed by this cluster only.
If you do not share resources between clusters, you can change the tag to:

| Key | Value |
| --- | --- |
| `kubernetes.io/cluster/<CLUSTERID>` | `owned` |
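For illustration, these tags can be applied with the AWS CLI; the resource IDs and cluster ID below are placeholders:

```bash
# Placeholder resource IDs and cluster ID; replace with your own values.
aws ec2 create-tags \
  --resources vpc-0abc123 subnet-0abc123 sg-0abc123 i-0abc123 \
  --tags Key=kubernetes.io/cluster/my-cluster-id,Value=shared
```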
### Tagging for Load Balancers
When provisioning a `LoadBalancer` service, Kubernetes attempts to discover the correct subnets. This is also achieved with tags and requires adding additional subnet tags to ensure that internet-facing and internal ELBs are created in the correct subnets.
AWS Documentation: Subnet tagging for load balancers
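For reference, the subnet tags commonly documented for this purpose are `kubernetes.io/role/elb` for public (internet-facing) subnets and `kubernetes.io/role/internal-elb` for private (internal) subnets. The command below is a sketch with placeholder subnet IDs; check the linked AWS documentation for the authoritative keys and values:

```bash
# Placeholder subnet IDs; replace with your own.
# Public subnets, used for internet-facing load balancers:
aws ec2 create-tags --resources subnet-0aaa111 \
  --tags Key=kubernetes.io/role/elb,Value=1
# Private subnets, used for internal load balancers:
aws ec2 create-tags --resources subnet-0bbb222 \
  --tags Key=kubernetes.io/role/internal-elb,Value=1
```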
## Using the Out-of-Tree AWS Cloud Provider for RKE
1. Node name conventions and other prerequisites must be followed so that the cloud provider can find the instance. RKE-provisioned clusters don't support configuring `providerID`.

   If you use IP-based naming, the nodes must be named after the instance, followed by the regional domain name (`ip-xxx-xxx-xxx-xxx.ec2.<region>.internal`). If you have a custom domain name set in the DHCP options, you must set `--hostname-override` on `kube-proxy` and `kubelet` to match this naming convention, for example as sketched below.
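   A minimal sketch of one way to do this with RKE's node-level `hostname_override` option in `cluster.yml` (the address, user, and hostname below are placeholders; this assumes `hostname_override` is appropriate for your setup):

   ```yaml
   nodes:
     - address: 10.0.1.25                # placeholder node IP
       user: ubuntu                      # placeholder SSH user
       role: [controlplane, etcd, worker]
       # Must match the node naming convention described above;
       # substitute your region for <region>.
       hostname_override: ip-10-0-1-25.ec2.<region>.internal
   ```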
2. Select the cloud provider.

   Selecting `external-aws` sets `--cloud-provider=external` and allows setting `use_instance_metadata_hostname`. Enabling `use_instance_metadata_hostname` will query the EC2 metadata service and set `http://169.254.169.254/latest/meta-data/hostname` as `hostname-override` for `kubelet` and `kube-proxy`.

   Enabling `use_instance_metadata_hostname` is required if `hostname-override` is empty or if `hostname-override` doesn't meet the node naming conventions mentioned above in step 1.
   ```yaml
   cloud_provider:
     name: external-aws
     use_instance_metadata_hostname: true/false
   ```
   Existing clusters that use the `external` cloud provider will set `--cloud-provider=external` for Kubernetes components but won't set the `hostname-override` by querying the EC2 metadata service.
3. Install the AWS cloud controller manager after the cluster finishes provisioning. Note that the cluster isn't successfully provisioned, and nodes remain in an `uninitialized` state, until you deploy the cloud controller manager. One way to check for this is shown below.
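   As an informal check (not part of the official steps), you can look for the `node.cloudprovider.kubernetes.io/uninitialized` taint, which the cloud controller manager removes once it initializes a node:

   ```bash
   kubectl get nodes -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'
   ```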
### Helm Chart Installation from CLI
Official upstream docs for Helm chart installation can be found on GitHub.
1. Add the Helm repository:

   ```bash
   helm repo add aws-cloud-controller-manager https://kubernetes.github.io/cloud-provider-aws
   helm repo update
   ```
2. Create a `values.yaml` file with the following contents to override the default `values.yaml`:
   ```yaml
   # values.yaml
   hostNetworking: true
   tolerations:
     - effect: NoSchedule
       key: node.cloudprovider.kubernetes.io/uninitialized
       value: 'true'
     - effect: NoSchedule
       value: 'true'
       key: node-role.kubernetes.io/controlplane
   nodeSelector:
     node-role.kubernetes.io/controlplane: 'true'
   args:
     - --configure-cloud-routes=false
     - --use-service-account-credentials=true
     - --v=2
     - --cloud-provider=aws
   clusterRoleRules:
     - apiGroups:
         - ""
       resources:
         - events
       verbs:
         - create
         - patch
         - update
     - apiGroups:
         - ""
       resources:
         - nodes
       verbs:
         - '*'
     - apiGroups:
         - ""
       resources:
         - nodes/status
       verbs:
         - patch
     - apiGroups:
         - ""
       resources:
         - services
       verbs:
         - list
         - patch
         - update
         - watch
     - apiGroups:
         - ""
       resources:
         - services/status
       verbs:
         - list
         - patch
         - update
         - watch
     - apiGroups:
         - ""
       resources:
         - serviceaccounts
       verbs:
         - create
         - get
     - apiGroups:
         - ""
       resources:
         - persistentvolumes
       verbs:
         - get
         - list
         - update
         - watch
     - apiGroups:
         - ""
       resources:
         - endpoints
       verbs:
         - create
         - get
         - list
         - watch
         - update
     - apiGroups:
         - coordination.k8s.io
       resources:
         - leases
       verbs:
         - create
         - get
         - list
         - watch
         - update
     - apiGroups:
         - ""
       resources:
         - serviceaccounts/token
       verbs:
         - create
   ```
3. Install the Helm chart:

   ```bash
   helm upgrade --install aws-cloud-controller-manager -n kube-system aws-cloud-controller-manager/aws-cloud-controller-manager --values values.yaml
   ```

   Verify that the Helm chart installed successfully:

   ```bash
   helm status -n kube-system aws-cloud-controller-manager
   ```
4. If present, edit the DaemonSet to remove the default node selector `node-role.kubernetes.io/control-plane: ""`:

   ```bash
   kubectl edit daemonset aws-cloud-controller-manager -n kube-system
   ```
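   Alternatively (a sketch, not from the upstream docs), the same node selector key can be removed non-interactively with a JSON patch:

   ```bash
   kubectl -n kube-system patch daemonset aws-cloud-controller-manager --type=json \
     -p='[{"op": "remove", "path": "/spec/template/spec/nodeSelector/node-role.kubernetes.io~1control-plane"}]'
   ```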
5. (Optional) Verify that the cloud controller manager update succeeded:

   ```bash
   kubectl rollout status daemonset -n kube-system aws-cloud-controller-manager
   ```
## Migrating to the Out-of-Tree AWS Cloud Provider for RKE
To migrate from an in-tree cloud provider to the out-of-tree AWS cloud provider, you must stop the existing cluster's kube controller manager and install the AWS cloud controller manager. There are many ways to do this. Refer to the official AWS documentation on the external cloud controller manager for details.
If it's acceptable to have some downtime, you can switch to an external cloud provider, which removes the in-tree components, and then deploy charts to install the AWS cloud controller manager.
If your setup can't tolerate any control plane downtime, you must enable leader migration. This facilitates a smooth transition from the controllers in the kube controller manager to their counterparts in the cloud controller manager. Refer to the official AWS documentation on Using Leader Migration for more details.
The Kubernetes cloud controller migration documentation mentions that it is possible to migrate with the same Kubernetes version, but assumes that migration is part of a Kubernetes upgrade.
Refer to the Kubernetes documentation on migrating to use the cloud controller manager to see if you need to customize your setup before migrating. Confirm your migration configuration values. If your cloud provider provides an implementation of the Node IPAM controller, you also need to migrate the IPAM controller.
1. Update the cluster config to enable leader migration in `cluster.yml`:

   ```yaml
   services:
     kube-controller:
       extra_args:
         enable-leader-migration: "true"
   ```
   Note that the cloud provider is still `aws` at this step:

   ```yaml
   cloud_provider:
     name: aws
   ```
2. Cordon the control plane nodes, so that the AWS cloud controller pods run on nodes only after upgrading to the external cloud provider:

   ```bash
   kubectl cordon -l "node-role.kubernetes.io/controlplane=true"
   ```
3. To install the AWS cloud controller manager, you must enable leader migration in `values.yaml` and follow the same steps as when installing the chart on a new cluster. To enable leader migration, add the following to the container arguments in `values.yaml`:

   ```yaml
   - '--enable-leader-migration=true'
   ```
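   For context, this is how the `args` section of the `values.yaml` shown earlier looks with the flag added (the position of the flag within the list is not significant):

   ```yaml
   args:
     - --configure-cloud-routes=false
     - --use-service-account-credentials=true
     - --v=2
     - --cloud-provider=aws
     - '--enable-leader-migration=true'
   ```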
4. Confirm that the chart is installed but that the new pods aren't running yet, due to the cordoned control plane nodes. After updating the cluster in the next step, RKE will uncordon each node after upgrading it, and the `aws-cloud-controller-manager` pods will be scheduled.
5. Update `cluster.yml` to change the cloud provider and remove the leader migration arguments from the kube-controller.

   Selecting `external-aws` sets `--cloud-provider=external` and allows setting `use_instance_metadata_hostname`. Enabling `use_instance_metadata_hostname` will query the EC2 metadata service and set `http://169.254.169.254/latest/meta-data/hostname` as `hostname-override` for `kubelet` and `kube-proxy`.

   Enabling `use_instance_metadata_hostname` is required if `hostname-override` is empty or if `hostname-override` doesn't meet the node naming conventions.
   ```yaml
   cloud_provider:
     name: external-aws
     use_instance_metadata_hostname: true/false
   ```
   Remove `enable-leader-migration` from:

   ```yaml
   services:
     kube-controller:
       extra_args:
         enable-leader-migration: "true"
   ```
6. If you're upgrading the cluster's Kubernetes version, set the Kubernetes version as well.
7. Update the cluster. The `aws-cloud-controller-manager` pods should now be running.
8. (Optional) After the upgrade, leader migration is no longer required because only one cloud controller manager is running, and it can be removed. Upgrade the chart and remove the following section from the container arguments:

   ```yaml
   - --enable-leader-migration=true
   ```
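   To apply the updated `values.yaml`, you can re-run the same Helm command used during installation:

   ```bash
   helm upgrade --install aws-cloud-controller-manager -n kube-system aws-cloud-controller-manager/aws-cloud-controller-manager --values values.yaml
   ```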
   Verify that the cloud controller manager update was successfully rolled out with the following command:

   ```bash
   kubectl rollout status daemonset -n kube-system aws-cloud-controller-manager
   ```
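   As an optional sanity check (not part of the official steps), you can confirm that the cloud controller manager has initialized the nodes by verifying that each node now reports a `providerID`:

   ```bash
   kubectl get nodes -o custom-columns='NAME:.metadata.name,PROVIDER-ID:.spec.providerID'
   ```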