Published: Mar 28, 2020 by martoc
This shows the steps required to configure the control and data plane of a Kubernetes deployment on AWS using EKS and the autoscaling service.
First, you have to create an EKS cluster, this is pretty straight forward, however, bear in mind that some features have not been implemented on Cloudformation, logging and tagging have to be done directory calling APIs, this is an example of how the control plane can be created using Cloudformation
AWSTemplateFormatVersion: '2010-09-09'
Description: EKS Cluster
Parameters:
Product:
Description: Name of the product
Type: String
Default: product
Owner:
Type: String
Description: Owner of the cluster
Default: owner
Stack:
Type: String
Description: Stack name
Default: infrastructure
Application:
Type: String
Description: Application name
Default: eks-cluster
Environment:
Type: String
Description: Environment name
Default: prod
Name:
Type: String
Description: Name of the cluster
Default: main
Version:
Type: String
Description: Version of the cluster
Default: 0.0.0
VpcId:
Description: Virtual Private Network ID
Type: String
Subnets:
Type: List<AWS::EC2::Subnet::Id>
Description: List of subnets IDs
Resources:
Key:
Type: AWS::KMS::Key
Properties:
Description: EKS encryption key
EnableKeyRotation: true
PendingWindowInDays: 7
KeyPolicy:
Version: '2012-10-17'
Id: key-default
Statement:
- Sid: Enable IAM User Permissions
Effect: Allow
Principal:
AWS: !Join
- ''
- - 'arn:aws:iam::'
- !Ref 'AWS::AccountId'
- :root
Action: kms:*
Resource: '*'
- Sid: Allow administration of the key
Effect: Allow
Principal:
AWS:
- !Join
- ''
- - 'arn:aws:iam::'
- !Ref 'AWS::AccountId'
- :role/sso/fed.admin.user
Action:
- kms:Create*
- kms:Describe*
- kms:Enable*
- kms:List*
- kms:Put*
- kms:Update*
- kms:Revoke*
- kms:Disable*
- kms:Get*
- kms:Delete*
- kms:ScheduleKeyDeletion
- kms:CancelKeyDeletion
Resource: '*'
- Sid: Allow use of the key
Effect: Allow
Principal:
AWS:
- !Join
- ''
- - 'arn:aws:iam::'
- !Ref 'AWS::AccountId'
- :root
Action:
- kms:Encrypt
- kms:Decrypt
- kms:ReEncrypt*
- kms:GenerateDataKey*
- kms:DescribeKey
Resource: '*'
- Sid: Allow attachment of persistent resources
Effect: Allow
Principal:
AWS:
- !Join
- ''
- - 'arn:aws:iam::'
- !Ref 'AWS::AccountId'
- :root
Action:
- kms:CreateGrant
- kms:ListGrants
- kms:RevokeGrant
Resource: '*'
Condition:
Bool:
kms:GrantIsForAWSResource: 'true'
Tags:
- Key: corp:Owner
Value: !Ref 'Owner'
- Key: corp:Environment
Value: !Ref 'Environment'
- Key: corp:Stack
Value: !Ref 'Stack'
- Key: corp:Application
Value: !Ref 'Application'
- Key: Product
Value: !Ref 'Product'
- Key: Name
Value: !Ref 'Name'
- Key: Version
Value: !Ref 'Version'
Role:
Type: AWS::IAM::Role
Properties:
AssumeRolePolicyDocument:
Version: '2012-10-17'
Statement:
- Effect: Allow
Principal:
Service:
- eks.amazonaws.com
Action:
- sts:AssumeRole
ManagedPolicyArns:
- arn:aws:iam::aws:policy/AmazonEKSServicePolicy
- arn:aws:iam::aws:policy/AmazonEKSClusterPolicy
Tags:
- Key: corp:Owner
Value: !Ref Owner
- Key: corp:Stack
Value: !Ref Stack
- Key: corp:Application
Value: !Ref Application
- Key: corp:Environment
Value: !Ref Environment
- Key: Name
Value: !Ref Name
- Key: Version
Value: !Ref Version
SecurityGroup:
Type: AWS::EC2::SecurityGroup
Properties:
GroupDescription: Limits security group ingress traffic
SecurityGroupIngress:
- Description: Data plane to control plane
CidrIp: 10.0.0.0/8
IpProtocol: tcp
FromPort: 443
ToPort: 443
VpcId: !Ref VpcId
Tags:
- Key: corp:Owner
Value: !Ref Owner
- Key: corp:Stack
Value: !Ref Stack
- Key: corp:Application
Value: !Ref Application
- Key: corp:Environment
Value: !Ref Environment
- Key: Product
Value: !Ref 'Product'
- Key: Name
Value: !Ref Name
- Key: Version
Value: !Ref Version
Cluster:
Type: AWS::EKS::Cluster
Properties:
Name: !Ref Name
Version: "1.15"
EncryptionConfig:
- Provider:
KeyArn: !GetAtt Key.Arn
Resources:
- secrets
RoleArn: !GetAtt Role.Arn
ResourcesVpcConfig:
SecurityGroupIds:
- !Ref SecurityGroup
SubnetIds: !Ref Subnets
The worker nodes are a simple Autoscaling group, the following template
configures it, the AMI was created using this repository
https://github.com/awslabs/amazon-eks-ami, I did not use the same
configuration because I had to encrypt the AMI using KMS, when the AMI is
encrypted you must give access to the Autoscaling service to use this key. Add
this role arn:aws:iam::<ACCOUNT_ID>:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling
in the KMS policy, remember to replace the placeholder ACCOUNT_ID
.
AWSTemplateFormatVersion: "2010-09-09"
Description: EKS Node Group
Parameters:
Product:
Description: Name of the product
Type: String
Default: product
Owner:
Type: String
Description: Owner of the cluster
Default: owner
Stack:
Type: String
Description: Stack name
Default: infrastructure
Application:
Type: String
Description: Application name
Default: eks-cluster
Environment:
Type: String
Description: Environment name
Default: prod
Name:
Type: String
Description: Name of the cluster
Default: worker
Version:
Type: String
Description: Version of the cluster
Default: 0.0.0
BootstrapArguments:
Type: String
Default: ""
Description: "Arguments to pass to the bootstrap script. See files/bootstrap.sh in https://github.com/awslabs/amazon-eks-ami"
ClusterControlPlaneSecurityGroup:
Type: "AWS::EC2::SecurityGroup::Id"
Description: The security group of the cluster control plane.
ClusterName:
Type: String
Description: The cluster name provided when the cluster was created. If it is incorrect, nodes will not be able to join the cluster.
NodeAutoScalingGroupDesiredCapacity:
Type: Number
Default: 3
Description: Desired capacity of Node Group ASG.
NodeAutoScalingGroupMaxSize:
Type: Number
Default: 4
Description: Maximum size of Node Group ASG. Set to at least 1 greater than NodeAutoScalingGroupDesiredCapacity.
NodeAutoScalingGroupMinSize:
Type: Number
Default: 1
Description: Minimum size of Node Group ASG.
NodeImageId:
Type: AWS::EC2::Image::Id
Default: ""
Description: (Optional) Specify your own custom image ID. This value overrides any AWS Systems Manager Parameter Store value specified above.
NodeInstanceType:
Type: String
Default: t3.medium
AllowedValues:
- a1.medium
- a1.large
- a1.xlarge
- a1.2xlarge
- a1.4xlarge
- c1.medium
- c1.xlarge
- c3.large
- c3.xlarge
- c3.2xlarge
- c3.4xlarge
- c3.8xlarge
- c4.large
- c4.xlarge
- c4.2xlarge
- c4.4xlarge
- c4.8xlarge
- c5.large
- c5.xlarge
- c5.2xlarge
- c5.4xlarge
- c5.9xlarge
- c5.12xlarge
- c5.18xlarge
- c5.24xlarge
- c5.metal
- c5d.large
- c5d.xlarge
- c5d.2xlarge
- c5d.4xlarge
- c5d.9xlarge
- c5d.12xlarge
- c5d.18xlarge
- c5d.24xlarge
- c5d.metal
- c5n.large
- c5n.xlarge
- c5n.2xlarge
- c5n.4xlarge
- c5n.9xlarge
- c5n.18xlarge
- cc2.8xlarge
- cr1.8xlarge
- d2.xlarge
- d2.2xlarge
- d2.4xlarge
- d2.8xlarge
- f1.2xlarge
- f1.4xlarge
- f1.16xlarge
- g2.2xlarge
- g2.8xlarge
- g3s.xlarge
- g3.4xlarge
- g3.8xlarge
- g3.16xlarge
- h1.2xlarge
- h1.4xlarge
- h1.8xlarge
- h1.16xlarge
- hs1.8xlarge
- i2.xlarge
- i2.2xlarge
- i2.4xlarge
- i2.8xlarge
- i3.large
- i3.xlarge
- i3.2xlarge
- i3.4xlarge
- i3.8xlarge
- i3.16xlarge
- i3.metal
- i3en.large
- i3en.xlarge
- i3en.2xlarge
- i3en.3xlarge
- i3en.6xlarge
- i3en.12xlarge
- i3en.24xlarge
- m1.small
- m1.medium
- m1.large
- m1.xlarge
- m2.xlarge
- m2.2xlarge
- m2.4xlarge
- m3.medium
- m3.large
- m3.xlarge
- m3.2xlarge
- m4.large
- m4.xlarge
- m4.2xlarge
- m4.4xlarge
- m4.10xlarge
- m4.16xlarge
- m5.large
- m5.xlarge
- m5.2xlarge
- m5.4xlarge
- m5.8xlarge
- m5.12xlarge
- m5.16xlarge
- m5.24xlarge
- m5.metal
- m5a.large
- m5a.xlarge
- m5a.2xlarge
- m5a.4xlarge
- m5a.8xlarge
- m5a.12xlarge
- m5a.16xlarge
- m5a.24xlarge
- m5ad.large
- m5ad.xlarge
- m5ad.2xlarge
- m5ad.4xlarge
- m5ad.12xlarge
- m5ad.24xlarge
- m5d.large
- m5d.xlarge
- m5d.2xlarge
- m5d.4xlarge
- m5d.8xlarge
- m5d.12xlarge
- m5d.16xlarge
- m5d.24xlarge
- m5d.metal
- m5dn.large
- m5dn.xlarge
- m5dn.2xlarge
- m5dn.4xlarge
- m5dn.8xlarge
- m5dn.12xlarge
- m5dn.16xlarge
- m5dn.24xlarge
- m5n.large
- m5n.xlarge
- m5n.2xlarge
- m5n.4xlarge
- m5n.8xlarge
- m5n.12xlarge
- m5n.16xlarge
- m5n.24xlarge
- p2.xlarge
- p2.8xlarge
- p2.16xlarge
- p3.2xlarge
- p3.8xlarge
- p3.16xlarge
- p3dn.24xlarge
- g4dn.xlarge
- g4dn.2xlarge
- g4dn.4xlarge
- g4dn.8xlarge
- g4dn.12xlarge
- g4dn.16xlarge
- g4dn.metal
- r3.large
- r3.xlarge
- r3.2xlarge
- r3.4xlarge
- r3.8xlarge
- r4.large
- r4.xlarge
- r4.2xlarge
- r4.4xlarge
- r4.8xlarge
- r4.16xlarge
- r5.large
- r5.xlarge
- r5.2xlarge
- r5.4xlarge
- r5.8xlarge
- r5.12xlarge
- r5.16xlarge
- r5.24xlarge
- r5.metal
- r5a.large
- r5a.xlarge
- r5a.2xlarge
- r5a.4xlarge
- r5a.8xlarge
- r5a.12xlarge
- r5a.16xlarge
- r5a.24xlarge
- r5ad.large
- r5ad.xlarge
- r5ad.2xlarge
- r5ad.4xlarge
- r5ad.12xlarge
- r5ad.24xlarge
- r5d.large
- r5d.xlarge
- r5d.2xlarge
- r5d.4xlarge
- r5d.8xlarge
- r5d.12xlarge
- r5d.16xlarge
- r5d.24xlarge
- r5d.metal
- r5dn.large
- r5dn.xlarge
- r5dn.2xlarge
- r5dn.4xlarge
- r5dn.8xlarge
- r5dn.12xlarge
- r5dn.16xlarge
- r5dn.24xlarge
- r5n.large
- r5n.xlarge
- r5n.2xlarge
- r5n.4xlarge
- r5n.8xlarge
- r5n.12xlarge
- r5n.16xlarge
- r5n.24xlarge
- t1.micro
- t2.nano
- t2.micro
- t2.small
- t2.medium
- t2.large
- t2.xlarge
- t2.2xlarge
- t3.nano
- t3.micro
- t3.small
- t3.medium
- t3.large
- t3.xlarge
- t3.2xlarge
- t3a.nano
- t3a.micro
- t3a.small
- t3a.medium
- t3a.large
- t3a.xlarge
- t3a.2xlarge
- u-6tb1.metal
- u-9tb1.metal
- u-12tb1.metal
- x1.16xlarge
- x1.32xlarge
- x1e.xlarge
- x1e.2xlarge
- x1e.4xlarge
- x1e.8xlarge
- x1e.16xlarge
- x1e.32xlarge
- z1d.large
- z1d.xlarge
- z1d.2xlarge
- z1d.3xlarge
- z1d.6xlarge
- z1d.12xlarge
- z1d.metal
ConstraintDescription: Must be a valid EC2 instance type
Description: EC2 instance type for the node instances
NodeVolumeSize:
Type: Number
Default: 20
Description: Node volume size
Subnets:
Type: "List<AWS::EC2::Subnet::Id>"
Description: The subnets where workers can be created.
VpcId:
Type: "AWS::EC2::VPC::Id"
Description: The VPC of the worker instances
Mappings:
PartitionMap:
aws:
EC2ServicePrincipal: "ec2.amazonaws.com"
aws-cn:
EC2ServicePrincipal: "ec2.amazonaws.com.cn"
Resources:
Role:
Type: AWS::IAM::Role
Properties:
RoleName: !Sub ${ClusterName}
AssumeRolePolicyDocument:
Statement:
- Effect: Allow
Principal:
AWS: !GetAtt 'NodeInstanceRole.Arn'
Action: sts:AssumeRole
Path: /
Tags:
- Key: corp:Owner
Value: !Ref 'Owner'
- Key: corp:Environment
Value: !Ref 'Environment'
- Key: corp:Stack
Value: !Ref 'Stack'
- Key: corp:Application
Value: !Ref 'Application'
- Key: Product
Value: !Ref 'Product'
- Key: Name
Value: !Sub ${ClusterName}-${Name}-node
- Key: Version
Value: !Ref 'Version'
NodeInstanceRole:
Type: "AWS::IAM::Role"
Properties:
RoleName: !Sub ${ClusterName}-service
AssumeRolePolicyDocument:
Version: "2012-10-17"
Statement:
- Effect: Allow
Principal:
Service:
- !FindInMap [PartitionMap, !Ref "AWS::Partition", EC2ServicePrincipal]
Action:
- "sts:AssumeRole"
ManagedPolicyArns:
- !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonEKSWorkerNodePolicy"
- !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonEKS_CNI_Policy"
- !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
Path: /
Tags:
- Key: corp:Owner
Value: !Ref 'Owner'
- Key: corp:Environment
Value: !Ref 'Environment'
- Key: corp:Stack
Value: !Ref 'Stack'
- Key: corp:Application
Value: !Ref 'Application'
- Key: Product
Value: !Ref 'Product'
- Key: Name
Value: !Sub ${ClusterName}-${Name}-node
- Key: Version
Value: !Ref 'Version'
NodeInstanceProfile:
Type: "AWS::IAM::InstanceProfile"
Properties:
InstanceProfileName: !Sub ${ClusterName}-service
Path: /
Roles:
- Ref: NodeInstanceRole
NodeSecurityGroup:
Type: "AWS::EC2::SecurityGroup"
Properties:
GroupDescription: Security group for all nodes in the cluster
VpcId: !Ref VpcId
Tags:
- Key: !Sub kubernetes.io/cluster/${ClusterName}
Value: owned
- Key: corp:Owner
Value: !Ref 'Owner'
- Key: corp:Environment
Value: !Ref 'Environment'
- Key: corp:Stack
Value: !Ref 'Stack'
- Key: corp:Application
Value: !Ref 'Application'
- Key: Product
Value: !Ref 'Product'
- Key: Name
Value: !Sub ${ClusterName}-${Name}-node
- Key: Version
Value: !Ref 'Version'
NodeSecurityGroupIngress:
Type: "AWS::EC2::SecurityGroupIngress"
Properties:
Description: Allow node to communicate with each other
FromPort: 0
GroupId: !Ref NodeSecurityGroup
IpProtocol: "-1"
SourceSecurityGroupId: !Ref NodeSecurityGroup
ToPort: 65535
ClusterControlPlaneSecurityGroupIngress:
Type: "AWS::EC2::SecurityGroupIngress"
Properties:
Description: Allow pods to communicate with the cluster API Server
FromPort: 443
GroupId: !Ref ClusterControlPlaneSecurityGroup
IpProtocol: tcp
SourceSecurityGroupId: !Ref NodeSecurityGroup
ToPort: 443
ControlPlaneEgressToNodeSecurityGroup:
Type: "AWS::EC2::SecurityGroupEgress"
Properties:
Description: Allow the cluster control plane to communicate with worker Kubelet and pods
DestinationSecurityGroupId: !Ref NodeSecurityGroup
FromPort: 1025
GroupId: !Ref ClusterControlPlaneSecurityGroup
IpProtocol: tcp
ToPort: 65535
ControlPlaneEgressToNodeSecurityGroupOn443:
Type: "AWS::EC2::SecurityGroupEgress"
Properties:
Description: Allow the cluster control plane to communicate with pods running extension API servers on port 443
DestinationSecurityGroupId: !Ref NodeSecurityGroup
FromPort: 443
GroupId: !Ref ClusterControlPlaneSecurityGroup
IpProtocol: tcp
ToPort: 443
NodeSecurityGroupFromControlPlaneIngress:
Type: "AWS::EC2::SecurityGroupIngress"
Properties:
Description: Allow worker Kubelets and pods to receive communication from the cluster control plane
FromPort: 1025
GroupId: !Ref NodeSecurityGroup
IpProtocol: tcp
SourceSecurityGroupId: !Ref ClusterControlPlaneSecurityGroup
ToPort: 65535
NodeSecurityGroupFromControlPlaneOn443Ingress:
Type: "AWS::EC2::SecurityGroupIngress"
Properties:
Description: Allow pods running extension API servers on port 443 to receive communication from cluster control plane
FromPort: 443
GroupId: !Ref NodeSecurityGroup
IpProtocol: tcp
SourceSecurityGroupId: !Ref ClusterControlPlaneSecurityGroup
ToPort: 443
ClusterBastionSecurityGroupIngress:
Type: "AWS::EC2::SecurityGroupIngress"
Properties:
Description: Allow bastion to communicate with the cluster SSH daemon
FromPort: 22
GroupId: !Ref NodeSecurityGroup
IpProtocol: tcp
CidrIp : !ImportValue bastion-infra-BastionVPCCidrBlock
ToPort: 22
NodeLaunchConfig:
Type: "AWS::AutoScaling::LaunchConfiguration"
Properties:
AssociatePublicIpAddress: true
BlockDeviceMappings:
- DeviceName: /dev/xvda
Ebs:
DeleteOnTermination: true
VolumeSize: !Ref NodeVolumeSize
VolumeType: gp2
IamInstanceProfile: !Ref NodeInstanceProfile
ImageId: !Ref NodeImageId
InstanceType: !Ref NodeInstanceType
SecurityGroups:
- Ref: NodeSecurityGroup
UserData: !Base64
"Fn::Sub": |
#!/bin/bash
set -o xtrace
/etc/eks/bootstrap.sh ${ClusterName} ${BootstrapArguments}
/opt/aws/bin/cfn-signal --exit-code $? \
--stack ${AWS::StackName} \
--resource NodeGroup \
--region ${AWS::Region}
NodeGroup:
Type: "AWS::AutoScaling::AutoScalingGroup"
Properties:
DesiredCapacity: !Ref NodeAutoScalingGroupDesiredCapacity
LaunchConfigurationName: !Ref NodeLaunchConfig
MaxSize: !Ref NodeAutoScalingGroupMaxSize
MinSize: !Ref NodeAutoScalingGroupMinSize
ServiceLinkedRoleARN: !Sub arn:aws:iam::${AWS::AccountId}:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling
Tags:
- Key: !Sub kubernetes.io/cluster/${ClusterName}
PropagateAtLaunch: true
Value: owned
- Key: corp:Owner
PropagateAtLaunch: true
Value: !Ref 'Owner'
- Key: corp:Environment
PropagateAtLaunch: true
Value: !Ref 'Environment'
- Key: corp:Stack
PropagateAtLaunch: true
Value: !Ref 'Stack'
- Key: corp:Application
PropagateAtLaunch: true
Value: !Ref 'Application'
- Key: Product
PropagateAtLaunch: true
Value: !Ref 'Product'
- Key: Name
PropagateAtLaunch: true
Value: !Sub ${ClusterName}-${Name}-node
- Key: Version
PropagateAtLaunch: true
Value: !Ref 'Version'
VPCZoneIdentifier: !Ref Subnets
UpdatePolicy:
AutoScalingRollingUpdate:
MaxBatchSize: 1
MinInstancesInService: !Ref NodeAutoScalingGroupDesiredCapacity
PauseTime: PT5M
Outputs:
NodeInstanceRole:
Description: The node instance role
Value: !GetAtt NodeInstanceRole.Arn
NodeSecurityGroup:
Description: The security group for the node group
Value: !Ref NodeSecurityGroup
NodeAutoScalingGroup:
Description: The autoscaling group
Value: !Ref NodeGroup
Before of after the cluster is created the worker nodes must have access to the
control plane APIs by default the role that created the control plane cluster
has access to administrate it, it grants the system system:masters
.
On your local, you will need to install this tool
brew install aws-iam-authenticator
This tool translates AWS IAM roles into Kubernetes Tokens, you configure kubeclt to use the previous tool running this command
aws eks update-kubeconfig --region <REGION> --name <CLUSTER_NAME>
Now you can execute kubectl
kubestl get services
Your nodes must have these roles system:bootstrappers
and system:nodes
granted on the control plane, the following configuration ties your AWS instance
profile role with these Kubernetes roles.
Create this file and replace the rolearn
by the instance role ARN not the
instance profile.
apiVersion: v1
kind: ConfigMap
metadata:
name: aws-auth
namespace: kube-system
data:
mapRoles: |
- rolearn: <ARN of instance role (not instance profile)>
username: system:node:
groups:
- system:bootstrappers
- system:nodes
Apply the changes on the cluster
kubectl apply -f aws-auth-cm.yaml
At this moment the cluster and the worker nodes should be fully configured
kubectl get nodes