The Problem
When I joined this project as Senior DevOps Engineer, there were 10+ enterprise Java Spring Boot applications sitting in Git repos with zero deployment automation. Every release was a manual, nerve-wracking process — SSH into a server, pull the JAR, restart the service, pray.
The goal: full CI/CD from code push to AKS pod, zero manual steps, across 5 environments.
Architecture Overview
Developer pushes code
    ↓
Azure DevOps YAML Pipeline triggers
    ↓
[Build Stage]
    mvn clean package → JAR
    Docker build → image
    Push to Azure Container Registry (ACR)
    ↓
[Deploy Stage - per environment]
    Dev → IST → UAT → Stage → Prod
    (approval gates between Stage and Prod)
    ↓
AKS pod running, health checks pass
Step 1 — Dockerizing a Spring Boot App
The key insight most people miss: build the JAR first on the pipeline agent, then COPY it into the image. Don't run Maven inside Docker — it's slower and harder to cache.
FROM eclipse-temurin:17-jre-alpine
WORKDIR /app
# Copy pre-built JAR (built by pipeline, not here)
COPY target/*.jar app.jar
# Non-root user for security
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]
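If you want to sanity-check that build order on an agent before wiring up the full pipeline, the same sequence can be expressed as a single inline script step (a sketch — the real pipeline in Step 2 uses the dedicated Maven and Docker tasks instead, and `my-app:local` is just an illustrative tag):

```yaml
# Debug-only equivalent of the Maven@4 + Docker@2 tasks below
- script: |
    mvn -q clean package -DskipTests   # build the JAR on the agent first
    docker build -t my-app:local .     # then COPY it into the image
  displayName: "Build JAR, then image (debug equivalent)"
```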
Step 2 — The YAML Pipeline
Here's the full pipeline structure for one application. The pattern scales to any number of apps.
trigger:
  branches:
    include: [main, develop]
  paths:
    include: ["src/**", "pom.xml"]

variables:
  ACR_NAME: "myacr.azurecr.io"
  IMAGE_NAME: "$(ACR_NAME)/$(Build.Repository.Name)"
  IMAGE_TAG: "$(Build.BuildId)"

stages:
  - stage: Build
    displayName: "Build & Containerize"
    jobs:
      - job: BuildJob
        pool:
          vmImage: ubuntu-latest
        steps:
          - task: Maven@4
            inputs:
              mavenPomFile: "pom.xml"
              goals: "clean package -DskipTests"
          - task: Docker@2
            displayName: "Build & Push to ACR"
            inputs:
              command: buildAndPush
              # The service connection supplies the registry host, so this is
              # the bare repository name — NOT $(IMAGE_NAME), which would
              # double up the registry prefix
              repository: "$(Build.Repository.Name)"
              dockerfile: "Dockerfile"
              containerRegistry: "acr-service-connection"
              tags: |
                $(IMAGE_TAG)
                latest

  - stage: DeployDev
    displayName: "Deploy → Dev"
    dependsOn: Build
    jobs:
      - deployment: DeployToDev
        environment: "dev-aks"
        strategy:
          runOnce:
            deploy:
              steps:
                # Deployment jobs don't check out the repo by default,
                # so fetch it to get k8s/deployment.yaml
                - checkout: self
                - task: KubernetesManifest@1
                  inputs:
                    action: deploy
                    kubernetesServiceConnection: "aks-dev-connection"
                    namespace: "dev"
                    manifests: "k8s/deployment.yaml"
                    containers: "$(IMAGE_NAME):$(IMAGE_TAG)"

  # IST, UAT, and Stage follow the same pattern — omitted for brevity

  - stage: DeployProd
    displayName: "Deploy → Prod"
    dependsOn: [DeployDev]
    condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/main'))
    jobs:
      - deployment: DeployToProd
        environment: "prod-aks" # ← approval gate configured here in Azure DevOps
        strategy:
          runOnce:
            deploy:
              steps:
                - checkout: self
                - task: KubernetesManifest@1
                  inputs:
                    action: deploy
                    kubernetesServiceConnection: "aks-prod-connection"
                    namespace: "production"
                    manifests: "k8s/deployment.yaml"
                    containers: "$(IMAGE_NAME):$(IMAGE_TAG)"
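With five environments, copy-pasting deploy stages gets unwieldy. One way to factor the repetition out is a stage template — a sketch with hypothetical file and parameter names, not the exact layout used on this project:

```yaml
# templates/deploy-stage.yml (hypothetical file name)
parameters:
  - name: envName            # e.g. dev, ist, uat, stage, prod
    type: string
  - name: serviceConnection
    type: string
  - name: namespace
    type: string
  - name: dependsOn
    type: string

stages:
  - stage: Deploy_${{ parameters.envName }}
    displayName: "Deploy → ${{ parameters.envName }}"
    dependsOn: ${{ parameters.dependsOn }}
    jobs:
      - deployment: Deploy
        environment: "${{ parameters.envName }}-aks"
        strategy:
          runOnce:
            deploy:
              steps:
                - checkout: self
                - task: KubernetesManifest@1
                  inputs:
                    action: deploy
                    kubernetesServiceConnection: ${{ parameters.serviceConnection }}
                    namespace: ${{ parameters.namespace }}
                    manifests: "k8s/deployment.yaml"
                    containers: "$(IMAGE_NAME):$(IMAGE_TAG)"
```

The main pipeline then becomes a short list of `- template: templates/deploy-stage.yml` entries, one per environment, each passing its own service connection and namespace.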
Step 3 — Kubernetes Manifests
Keep your manifests simple. One Deployment, one Service, one HPA. Don't over-engineer until you need to.
# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  # no namespace here — the pipeline's namespace input sets it per environment,
  # which is what keeps one manifest reusable across Dev/IST/UAT/Stage/Prod
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0 # zero-downtime rolling update
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          # tag is substituted at deploy time by the task's containers input
          image: myacr.azurecr.io/my-app:$(IMAGE_TAG)
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 20
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          env:
            - name: SPRING_PROFILES_ACTIVE
              value: "production"
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: app-secrets
                  key: db-password
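The Service and HPA mentioned above round out the trio. A minimal sketch — the port mapping and scaling thresholds here are illustrative defaults, not the project's actual values:

```yaml
# k8s/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app        # must match the Deployment's pod labels
  ports:
    - port: 80
      targetPort: 8080 # the Spring Boot container port
---
# k8s/hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 3       # matches the Deployment's baseline replicas
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Note the HPA requires CPU requests to be set on the Deployment (they are, above) — utilization is computed as a percentage of the request.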
Step 4 — Rollback Strategy
Don't rely on manual rollback. Build it into the pipeline.
- task: Kubernetes@1
  displayName: "Verify rollout"
  inputs:
    connectionType: Kubernetes Service Connection
    kubernetesServiceEndpoint: "aks-prod-connection"
    command: rollout
    arguments: "status deployment/my-app -n production --timeout=5m"
# If that fails, the pipeline fails and you can trigger:
# kubectl rollout undo deployment/my-app -n production
Better: add an Azure DevOps gate (for example, an Invoke REST API check polling your health endpoint) so the stage fails automatically if the app starts returning errors within 5 minutes of deployment, and pair it with a rollback step that runs on failure.
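One way to make the undo automatic instead of a manual `kubectl` command is a step conditioned on failure — a sketch reusing the same Kubernetes@1 task and service connection as the verification step:

```yaml
- task: Kubernetes@1
  displayName: "Auto-rollback on failed rollout"
  condition: failed()   # runs only if a previous step in the job failed
  inputs:
    connectionType: Kubernetes Service Connection
    kubernetesServiceEndpoint: "aks-prod-connection"
    command: rollout
    arguments: "undo deployment/my-app -n production"
```

With this in place, a deployment that never passes its readiness probe fails the `rollout status` check, triggers the undo, and leaves the previous ReplicaSet serving traffic.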
Results
After implementing this for all 10+ apps:
- Deployment time: 45+ minutes manual → ~12 minutes automated
- Failed deployments: dropped by 80% (no more "works on my machine")
- Environment parity: Dev/IST/UAT/Stage/Prod all use identical manifests, different configs only
- On-call incidents: down 60% in the first month
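The environment-parity point is worth one concrete illustration: identical manifests, with only per-environment values swapped. A hypothetical per-environment ConfigMap (names and values are illustrative, not from the original setup):

```yaml
# k8s/configmap-dev.yaml — one small file per environment,
# while deployment.yaml stays byte-identical everywhere
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  SPRING_PROFILES_ACTIVE: "dev"
  DB_HOST: "dev-db.internal"   # hypothetical host name
```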
The pattern is reusable. Once you've done it for one app, the next one takes 30 minutes — swap the image name, adjust resource limits, done.