Files
gitops-ack-kro-argocd/README.md
T

16 KiB

Amazon EKS cluster management using kro & ACK

This example demonstrates how to manage a fleet of Amazon EKS clusters using kro, ACK (AWS Controllers for Kubernetes), and Argo CD across multiple regions and accounts. You'll learn how to create EKS clusters and bootstrap them with required add-ons.

The solution implements a hub-spoke model where a management cluster (hub) is created during initial setup, with EKS capabilities (kro, ACK and Argo CD) enabled for provisioning and bootstrapping workload clusters (spokes) via a GitOps flow.

EKS cluster management using kro & ACK

Prerequisites

  1. AWS account for the management cluster, and optional AWS accounts for spoke clusters (management account can be reused for spokes)
  2. AWS IAM Identity Center (IdC) is enabled in the management account
  3. GitHub account and a valid GitHub Token
  4. GitHub cli
  5. Argo CD cli
  6. Terraform cli
  7. AWS cli v2.32.27+

Instructions

Configure workspace

  1. Create variables

    First, set these environment variables that typically don't need modification:

    export KRO_REPO_URL="https://github.com/kubernetes-sigs/kro.git"
    export WORKING_REPO="eks-cluster-mgmt" # Try to keep this default name as it is referenced in terraform and gitops configs
    export TF_VAR_FILE="terraform.tfvars" # the name of terraform configuration file to use
    

    Then customize these variables for your specific environment:

    export MGMT_ACCOUNT_ID=$(aws sts get-caller-identity --output text --query Account) # Or update to the AWS account to use for your management cluster
    export AWS_REGION="us-west-2" # change to your preferred region
    export WORKSPACE_PATH="$HOME" # the directory where repos will be cloned
    export GITHUB_ORG_NAME="iamahgoub" # your Github username or organization name
    
  2. Clone kro repository

    git clone $KRO_REPO_URL $WORKSPACE_PATH/kro
    
  3. Create your working GitHub repository

    Create a new repository using the GitHub CLI or through the GitHub website:

    gh repo create $WORKING_REPO --private
    
  4. Clone the working empty git repository

    gh repo clone $WORKING_REPO $WORKSPACE_PATH/$WORKING_REPO
    
  5. Populate the repository

    cp -r $WORKSPACE_PATH/kro/examples/aws/eks-cluster-mgmt/* $WORKSPACE_PATH/$WORKING_REPO/
    
  6. Update the Spoke accounts

    If deploying EKS clusters across multiple AWS accounts, update the configuration below. Even for single account deployments, you must specify the AWS account for each namespace.

    code $WORKSPACE_PATH/$WORKING_REPO/addons/tenants/tenant1/default/addons/multi-acct/values.yaml
    

    Values:

    clusters:
       workload-cluster1: "012345678910" # AWS account for workload cluster 1
       workload-cluster2: "123456789101" # AWS account for workload cluster 2
    

    Note: If you only want to use 1 AWS account, reuse the AWS account of your management cluster for the other workload clusters.

  7. Add, Commit and Push changes

    cd $WORKSPACE_PATH/$WORKING_REPO/
    git status
    git add .
    git commit -s -m "initial commit"
    git push
    

Create the Management cluster

  1. Update the terraform.tfvars with your values

    Modify the terraform.tfvars file with your GitHub working repo details:

    • Set git_org_name
    • Update any gitops_xxx values if you modified the proposed setup (git path, branch...)
    • Confirm gitops_xxx_repo_name is "eks-cluster-mgmt" (or update if modified)
    • Configure accounts_ids with the list of AWS accounts for spoke clusters (use management account ID if creating spoke clusters in the same account)
    # edit: terraform.tfvars
    code $WORKSPACE_PATH/$WORKING_REPO/terraform/hub/terraform.tfvars
    
  2. Log in to your AWS management account

    Connect to your AWS management account using your preferred authentication method:

    export AWS_PROFILE=management_account # use your own profile or ensure you're connected to the appropriate account
    
  3. Apply the terraform to create the management cluster:

    cd $WORKSPACE_PATH/$WORKING_REPO/terraform/hub
    ./install.sh
    

    Review the proposed changes and accept to deploy.

    Note: EKS capabilities are not supported yet by Terraform AWS provider. So, we will create them manually using CLI commands.

  4. Retrieve terraform outputs and set into environment variables:

    export CLUSTER_NAME=$(terraform output -raw cluster_name)
    export ACK_CONTROLLER_ROLE_ARN=$(terraform output -raw ack_controller_role_arn)
    export KRO_CONTROLLER_ROLE_ARN=$(terraform output -raw kro_controller_role_arn)
    export ARGOCD_CONTROLLER_ROLE_ARN=$(terraform output -raw argocd_controller_role_arn)
    
  5. Create ACK capability

    aws eks create-capability \
    --region ${AWS_REGION} \
    --cluster-name ${CLUSTER_NAME} \
    --capability-name ${CLUSTER_NAME}-ack \
    --type ACK \
    --role-arn ${ACK_CONTROLLER_ROLE_ARN} \
    --delete-propagation-policy RETAIN
    
  6. Now, we need to create the Argo CD capability -- IdC has to be enabled in the management account for that. So before creating the capability let's store the IdC instance details and the user that will be used for accessing Argo CD in environment variables:

    export IDC_INSTANCE_ARN='<replace with IdC instance ARN>'
    export IDC_USER_ID='<replace with IdC user id>'
    export IDC_REGION='<replace with IdC region>'
    
  7. Create Argo CD capability

    aws eks create-capability \
    --region ${AWS_REGION} \
    --cluster-name ${CLUSTER_NAME} \
    --capability-name ${CLUSTER_NAME}-argocd \
    --type ARGOCD \
    --role-arn ${ARGOCD_CONTROLLER_ROLE_ARN} \
    --delete-propagation-policy RETAIN \
    --configuration '{
       "argoCd": {
          "awsIdc": {
          "idcInstanceArn": "'${IDC_INSTANCE_ARN}'",
          "idcRegion": "'${IDC_REGION}'"
          },
          "rbacRoleMappings": [{
          "role": "ADMIN",
          "identities": [{
             "id": "'${IDC_USER_ID}'",
             "type": "SSO_USER"
          }]
          }]
       }
    }'
    
  8. Create kro capability

    aws eks create-capability \
    --region ${AWS_REGION} \
    --cluster-name ${CLUSTER_NAME} \
    --capability-name ${CLUSTER_NAME}-kro \
    --type KRO \
    --role-arn ${KRO_CONTROLLER_ROLE_ARN} \
    --delete-propagation-policy RETAIN
    
  9. Make sure all the capabilities are now enabled by checking status using the console or the describe-capability command. For example:

    aws eks describe-capability \
    --region ${AWS_REGION} \
    --cluster-name ${CLUSTER_NAME} \
    --capability-name ${CLUSTER_NAME}-argocd \
    --query 'capability.status' \
    --output text
    

    Modify/run the commands above for other capabilities to make sure they are all ACTIVE.

  10. Retrieve the ArgoCD server URL and log on using the user provided during the capability creation:

    export ARGOCD_SERVER=$(aws eks describe-capability \
    --cluster-name ${CLUSTER_NAME} \
    --capability-name ${CLUSTER_NAME}-argocd \
    --query 'capability.configuration.argoCd.serverUrl' \
    --output text \
    --region ${AWS_REGION})
    
    echo ${ARGOCD_SERVER}
    export ARGOCD_SERVER=${ARGOCD_SERVER#https://}
    
  11. Generate an account token from the Argo CD UI (Settings → Accounts → admin → Generate New Token), then set it as an environment variable:

    export ARGOCD_AUTH_TOKEN="<your-token-here>"
    export ARGOCD_OPTS="--grpc-web"
    
  12. Configure GitHub repository access (if using private repository). We automate this process using the Argo CD CLI. You can also configure this in the Web interface under "Settings / Repositories"

    export GITHUB_TOKEN="<your-token-here>"
    argocd repo add https://github.com/$GITHUB_ORG_NAME/$WORKING_REPO.git --username iamahgoub --password $GITHUB_TOKEN --upsert --name github
    

    Note: If you encounter the error "Failed to load target state: failed to generate manifest for source 1 of 1: rpc error: code = Unknown desc = authentication required", verify your GitHub token settings.

  13. Connect to the cluster

    aws eks update-kubeconfig --name hub-cluster
    
  14. Install Argo CD App of App:

    kubectl apply -f $WORKSPACE_PATH/$WORKING_REPO/terraform/hub/bootstrap/applicationsets.yaml
    

Bootstrap Spoke accounts

For the management cluster to create resources in the spoke AWS accounts, we need to create an IAM roles in the spoke accounts to be assumed by the ACK capability in the management account for that purpose.

Note: Even if you're only testing this in the management account, you still need to perform this procedure, replacing the list of spoke account numbers with the management account number.

We provide a script to help with that. You need to first connect to each of your spoke accounts and execute the script.

  1. Log in to your AWS Spoke account

    Connect to your AWS spoke account. This example uses specific profiles, but adapt this to your own setup:

    export AWS_PROFILE=spoke_account1 # use your own profile or ensure you're connected to the appropriate account
    
  2. Execute the script to configure IAM roles

    cd $WORKSPACE_PATH/$WORKING_REPO/scripts
    ./create_ack_workload_roles.sh
    

Repeat this step for each spoke account you want to use with the solution

Create a Spoke cluster

Update $WORKSPACE_PATH/$WORKING_REPO

  1. Add cluster creation by kro

    Edit the file:

    code $WORKSPACE_PATH/$WORKING_REPO/fleet/kro-values/tenants/tenant1/kro-clusters/values.yaml
    

    Configure the AWS accounts for management and spoke accounts:

       workload-cluster1:
       managementAccountId: "012345678910"    # replace with your management cluster AWS account ID
       accountId: "123456789101"              # replace with your spoke workload cluster AWS account ID (can be the same)
       tenant: "tenant1"                      # We have only configured tenant1 in the repo. If you change it, you need to duplicate all tenant1 directories
       k8sVersion: "1.30"
       workloads: "true"                      # Set to true if you want to deploy the workloads namespaces and applications
       gitops:
          addonsRepoUrl: "https://github.com/XXXXX/eks-cluster-mgmt"    # replace with your github account
          fleetRepoUrl: "https://github.com/XXXXX/eks-cluster-mgmt"
          platformRepoUrl: "https://github.com/XXXXX/eks-cluster-mgmt"
          workloadRepoUrl: "https://github.com/XXXXX/eks-cluster-mgmt"
    
  2. Add, Commit and Push

    cd $WORKSPACE_PATH/$WORKING_REPO/
    git status
    git add .
    git commit -s -m "initial commit"
    git push
    
  3. After some time, the cluster should be created in the spoke account.

    kubectl get EksClusterwithvpcs -A
    
    NAMESPACE   NAME                STATE    SYNCED   AGE
    argocd      workload-cluster1   ACTIVE   True     36m
    

    If you see STATE=ERROR, this may be normal as it will take some time for all dependencies to be ready. Check the logs of kro and ACK controllers for possible configuration errors.

    You can also list resources created by kro to validate their status:

    kubectl get vpcs.kro.run -A
    kubectl get vpcs.ec2.services.k8s.aws -A -o yaml # check for errors
    

    If you see errors, double-check the multi-cluster account settings and verify that IAM roles in both management and workload AWS accounts are properly configured.

    When VPCs are ready, check EKS resources:

    kubectl get eksclusters.kro.run -A
    kubectl get clusters.eks.services.k8s.aws -A -o yaml # Check for errors
    
  4. Connect to the spoke cluster

    export AWS_PROFILE=spoke_account1 # use your own profile or ensure you're connected to the appropriate account
    

    Get kubectl configuration (update name and region if needed):

    aws eks update-kubeconfig --name workload-cluster1 --region us-west-2
    

    View deployed resources:

    kubectl get pods -A
    

    Output:

    NAMESPACE          NAME                                                READY   STATUS    RESTARTS      AGE
    external-secrets   external-secrets-679b98f996-74lsb                   1/1     Running   0          70s
    external-secrets   external-secrets-cert-controller-556d7f95c5-h5nvq   1/1     Running   0          70s
    external-secrets   external-secrets-webhook-7b456d589f-6bjzr           1/1     Running   0          70s
    

    This output shows that our GitOps solution has successfully deployed our addons in the cluster

You can repeat these steps for any additional clusters you want to manage.

Each cluster is created by its kro RGD, deployed to AWS using ACK controllers, and then automatically registered to Argo CD which can install addons and workloads automatically.

Conclusion

This solution demonstrates a powerful way to manage multiple EKS clusters across different AWS accounts and regions using three key components:

  1. kro (Kubernetes Resource Orchestrator)

    • Manages complex multi-resource deployments
    • Handles dependencies between resources
    • Provides a declarative way to define EKS clusters and their requirements
  2. AWS Controllers for Kubernetes (ACK)

    • Enables native AWS resource management from within Kubernetes
    • Supports cross-account operations through namespace isolation
    • Manages AWS resources like VPCs, IAM roles, and EKS clusters
  3. Argo CD

    • Implements GitOps practices for cluster configuration
    • Automatically bootstraps new clusters with required add-ons
    • Manages workload deployments across the cluster fleet

Key benefits of this architecture:

  • Scalability: Easily add new clusters by updating Git configuration
  • Consistency: Ensures uniform configuration across all clusters
  • Automation: Reduces manual intervention in cluster lifecycle management
  • Separation of Concerns: Clear distinction between infrastructure and application management
  • Audit Trail: All changes are tracked through Git history
  • Multi-Account Support: Secure isolation between different environments or business units

To expand this solution, you can:

  • Add more clusters by replicating the configuration pattern
  • Customize add-ons and workloads per cluster
  • Implement different configurations for different environments (dev, staging, prod)
  • Add monitoring and logging solutions across the cluster fleet
  • Implement cluster upgrade strategies using the same tooling

The combination of kro, ACK, and Argo CD provides a robust, scalable, and maintainable approach to EKS cluster fleet management.

Clean-up

  1. Delete the workload clusters by editing the following file:

    code $WORKSPACE_PATH/$WORKING_REPO/fleet/kro-values/tenants/tenant1/kro-clusters/values.yaml
    

    In the Argo CD UI, synchronize the cluster Applicationset with the prune option enabled, or use the CLI:

    argocd app sync clusters --prune
    

    Known issue: We noticed that some of the VPCs resources (route tables) do not get properly deleted when workload clusters are removed from manifests. If this occurred to you, delete the VPC resources manually to allow for the clean-up to complete till the issue is identified/resolved.

  2. Delete Argo CD App of App:

    kubectl delete -f $WORKSPACE_PATH/$WORKING_REPO/terraform/hub/bootstrap/applicationsets.yaml
    
  3. Delete the EKS capabilities on the management cluster

    aws eks delete-capability \
    --cluster-name ${CLUSTER_NAME} \
    --capability-name ${CLUSTER_NAME}-argocd
    
    aws eks delete-capability \
    --cluster-name ${CLUSTER_NAME} \
    --capability-name ${CLUSTER_NAME}-kro
    
    aws eks delete-capability \
    --cluster-name ${CLUSTER_NAME} \
    --capability-name ${CLUSTER_NAME}-ack
    
  4. Make sure all the capabilities are deleted by checking the console or using the describe-capability command. For example:

  5. Delete Management Cluster

    After successfully de-registering all spoke accounts, remove the workload cluster created with Terraform:

    cd $WORKSPACE_PATH/$WORKING_REPO/terraform/hub
    ./destroy.sh
    
  6. Remove ACK IAM Roles in workload accounts

    Finally, connect to each workload account and delete the IAM roles and policies created initially:

    cd $WORKSPACE_PATH/$WORKING_REPO/
    ./scripts/delete_ack_workload_roles.sh ack