[WIP] DevOps Learning Path

A comprehensive learning path for DevOps with practical projects, time estimates, and clear milestones. This is my ongoing list which I will update as I progress in this learning path.

Tags: DevOps, learning, path

Last updated on: 10/8/2025


Imporrtant Note: This page is an on going work in progress, and changes every few weeks. If you see any issues, or have suggestions feel free to contact me.

Table of Contents

PhaseKey AreasEst. TimeDifficultyJump to
Phase 1Linux, Programming, Git, Networking, Cloud Fundamentals2-3 monthsBeginner to IntermediateGo to Phase 1
Phase 2Docker, IaC (Terraform), Secrets, CI/CD3-4 monthsNovice to AdvancedGo to Phase 2
Phase 3Kubernetes, GitOps, Observability, API Management4-5 monthsAdvancedGo to Phase 3
Phase 4SRE, Troubleshooting, Security, Chaos Engineering6+ monthsAdvanced to ExpertGo to Phase 4

Phase 1: Core Refresh & Cloud Infrastructure

Time Estimate: 2-3 months | Difficulty: Beginner to Intermediate


1. Operating System (Linux) and Terminal

Time: 3-4 weeks (60-80 hours) | Difficulty: Novice

Key Areas: Permissions, services, logging, systemd, bash scripting, Vim, Nano

Learn at:

Certifications:

You should be able to:

  • Navigate the filesystem (cd, ls, pwd, find)
  • Use file operations (cp, mv, rm, cat, less, head, tail)
  • Edit files with nano or vim
  • Manage users and permissions (chmod, chown, groups, useradd)
  • Use process tools (ps, top, kill, htop)
  • Work with services (systemctl, service)
  • Use package managers (apt, yum, or dnf)
  • Read and understand basic log files in /var/log/
  • Write simple bash scripts with conditionals and loops
  • Understand basic systemd concepts and cron jobs

Milestone Project: Set up a Linux server, configure users, install services, and write automation scripts


2. Learning a Programming Language

Time: 4-6 weeks (80-120 hours) | Difficulty: Novice

Key Areas: Python, Go

  • Learn at:
  • You should at minimum be able to:
    • Write basic scripts/functions
    • Understand data structures and control flow
    • Use libraries for automation or REST API calls
    • For Go, write simple programs with concurrency concepts

Milestone Project: Build a CLI tool that automates a common DevOps task (file processing, API calls, system monitoring)


3. Version Control (Git)

Time: 2-3 weeks (40-60 hours) | Difficulty: Beginner

Key Areas: Git workflows, branching strategies, collaboration

  • Learn at:
  • You should at minimum be able to:
    • Use basic Git commands (clone, add, commit, push, pull)
    • Create and merge branches
    • Resolve merge conflicts
    • Use GitHub/GitLab for collaboration (PRs, issues, reviews)
    • Understand Git workflows (GitFlow, GitHub Flow)

Milestone Project: Contribute to an open-source project or create a portfolio repository with proper branching strategy


4. Networking Fundamentals

Time: 2-3 weeks (40-60 hours) | Difficulty: Novice

Key Areas: TCP/IP, DNS, load balancing, firewalls, cloud networking

Milestone Project: Set up a multi-tier network with load balancers, configure DNS, and implement security groups


5. Cloud Fundamentals

Time: 4-6 weeks (80-120 hours) | Difficulty: Novice

  • Ideally you should have a good understanding of a cloud provider. It really doesn’t matter between the large providers, as their market share isn’t too drastic. My recommendation is AWS.

A. General Cloud Infrastructure

Key Areas: IAM, VPC, EC2, S3, CloudWatch, Lambda, CLI, CloudFormation or Terraform

  • Learn at:
  • Certs: AWS Cloud Practitioner (optional), AWS SAA (core), AWS DevOps Pro (later) You should at minimum be able to:
    • Navigate AWS Management Console and AWS CLI
    • Create/manage IAM users, groups, roles with least privilege
    • Launch, SSH into, and manage EC2 instances
    • Create and configure S3 buckets with policies
    • Understand and configure VPC components: subnets, route tables, security groups, NACLs
    • Use CloudWatch for basic monitoring and alarms
    • Run simple Lambda functions and understand event triggers
    • Write and deploy basic infrastructure as code (IaC) via CloudFormation or Terraform

C. GCP Fundamentals (Alternative)

Key Areas: IAM, Compute Engine, Cloud Functions, GCS, VPC, Cloud Logging

Milestone Project: Deploy a full-stack application to the cloud with proper networking, security, and monitoring


6. CLI Practice & Scripting

Time: 2-3 weeks (40-60 hours) | Difficulty: Novice

Key Areas: Bash, pipes, grep/awk/sed, cron, jq, aws/gcloud CLI

  • Learn at:
  • You should at minimum be able to:
    • Use Unix shell utilities and piping to manipulate text and files
    • Write bash scripts automating common tasks
    • Schedule jobs with cron
    • Use jq for JSON parsing and filtering in CLI workflows
    • Use AWS CLI and gcloud CLI for routine cloud tasks

Milestone Project: Create a comprehensive automation script that manages cloud resources


7. Database Management

Time: 2-3 weeks (Est hours: 40-60) Difficulty: Novice

Key Areas: SQL basics, database migrations, backup/restore, NoSQL options

  • Learn at:
  • You should at minimum be able to:
    • Write basic SQL queries (SELECT, INSERT, UPDATE, DELETE)
    • Design simple database schemas
    • Perform database backups and restores
    • Understand database migrations and versioning
    • Use NoSQL databases (Redis, MongoDB) for caching and document storage

Milestone Project: Set up a database cluster with backup/restore procedures and monitoring.


Phase 1 Completion

Phase 1 Checkpoint - Build a simple web app, deploy to cloud, set up monitoring

  • Can you deploy a full-stack application to the cloud with proper networking, security, monitoring, and database management?

Phase 1 Project

  • Build and deploy a complete web application to the cloud with a database, proper networking, security groups, and monitoring.
  • The application should demonstrate your understanding of Linux administration, programming, Git workflows, networking concepts, and cloud fundamentals.

Project Deliverables:

  • Working web application deployed to cloud
  • Database integration with proper security
  • Network configuration with load balancers
  • Monitoring and logging setup
  • Git repository with proper branching
  • Automation scripts for deployment

Phase 2: Core DevOps Tools, CI/CD & Automation

Time Estimate: 3-4 months Difficulty: Novice to Advanced

1. Docker

Time: 3-4 weeks (60-80 hours) | Difficulty: Novice

Key Areas: Dockerfile, volumes, multi-stage builds, networking

Milestone Project: Containerize a full-stack application with multi-stage builds and proper networking


2. Infrastructure as Code (IaC) with Terraform

Time: 4-5 weeks (80-100 hours) | Difficulty: Intermediate

Key Areas: Providers, resources, variables, modules, state, backends

  • Learn at:
  • Certs:
  • You should at minimum be able to:
    • Write Terraform configurations for common resources (VMs, networking, storage)
    • Use variables and outputs effectively
    • Initialize, plan, apply, and destroy Terraform infrastructure
    • Understand state files and how to manage them (local and remote backends)
    • Organize code into modules for reuse and clarity

Milestone Project: Create a complete infrastructure setup using Terraform with modules and remote state


3. Secrets Management

Time: 1-2 weeks (20-40 hours) | Difficulty: Novice

Tools: AWS Secrets Manager, Vault, GitHub secrets

  • Securely store/rotate secrets and inject into CI/CD
  • You should at minimum be able to:
    • Store secrets securely outside code repositories
    • Inject secrets into pipelines and runtime environments securely
    • Rotate and audit secrets regularly

Milestone Project: Implement a complete secrets management strategy for your infrastructure


4. CI/CD Concepts & Tools

Time: 4-6 weeks (80-120 hours) | Difficulty: Advanced

Key Areas: Writing Jenkinsfiles, GitHub workflows, automating tests, builds, and artifact management.

Milestone Project: Build a complete CI/CD pipeline that automatically tests, builds, and deploys your application


Phase 2 Completion

Phase 2 Checkpoint - Build, test, and deploy with containers and automation

  • Can you build, test, and deploy applications using containers, infrastructure as code, and automated pipelines?

Phase 2 Project

  • Create a complete DevOps pipeline that automatically builds, tests, and deploys a containerized application using Docker, Terraform for infrastructure, and CI/CD tools.
  • The pipeline should include proper secrets management and demonstrate your mastery of core DevOps automation practices.

Project Deliverables:

  • Containerized application with Docker
  • Infrastructure defined with Terraform
  • Automated CI/CD pipeline
  • Secure secrets management
  • Automated testing and deployment

Phase 3: More Containers & Orchestration

Time Estimate: 4-5 months Difficulty: Advanced

1. Kubernetes

Time: 6-8 weeks (120-160 hours) | Difficulty: Advanced

Key Areas: Pods, Deployments, Services, ConfigMaps, Ingress

Milestone Project: Deploy a microservices application to Kubernetes with proper networking, storage, and configuration management


2. GitOps

Time: 2-3 weeks (40-60 hours) | Difficulty: Advanced

Key Areas: ArgoCD, Flux, declarative deployment patterns

  • Learn at:
  • You should at minimum be able to:
    • Set up GitOps workflows with ArgoCD or Flux
    • Implement declarative deployment patterns
    • Manage application deployments through Git

Milestone Project: Implement a complete GitOps workflow for your Kubernetes applications


3. Helm & Observability

Time: 3-4 weeks (60-80 hours) | Difficulty: Advanced

Tools: Helm, Prometheus, Grafana

  • Templating, dashboards, alerts, metrics
  • You should at minimum be able to:
    • Use Helm to package and deploy Kubernetes applications
    • Install and configure Prometheus for metrics collection
    • Create Grafana dashboards for visualizing metrics
    • Set up alerts based on key performance indicators

Milestone Project: Set up a complete observability stack with monitoring, alerting, and dashboards


4. API Management

Time: 2-3 weeks (40-60 hours) | Difficulty: Intermediate

Key Areas: REST/GraphQL, API gateways, rate limiting

  • Learn at:
  • You should at minimum be able to:
    • Design RESTful APIs and understand GraphQL basics
    • Configure API gateways and rate limiting
    • Implement API versioning and documentation

Milestone Project: Design and implement a complete API management solution


Phase 3 Completion

Phase 3 Checkpoint - Orchestrate applications at scale with monitoring

  • Can you orchestrate containerized applications at scale with proper monitoring and API management?

Phase 3 Project

  • Deploy a complete microservices architecture to Kubernetes with GitOps workflows, comprehensive monitoring, and API management.
  • The project should demonstrate your ability to manage complex distributed systems at scale with proper observability and deployment automation.

Project Deliverables:

  • Kubernetes microservices deployment
  • GitOps workflow implementation
  • Comprehensive monitoring and alerting
  • API management and gateway configuration
  • Scalable distributed system architecture

Phase 4: SRE Mindset, Advanced Ops and Specialization

Time Estimate: 6+ months Difficulty: Advanced to Expert

1. Troubleshooting & Debugging

Time: 3-4 weeks (60-80 hours) | Difficulty: Advanced

Key Areas: Log analysis, performance profiling, network debugging

  • Learn at:
  • You should at minimum be able to:
    • Analyze application and system logs effectively
    • Use performance profiling tools (htop, iostat, netstat)
    • Debug network connectivity issues
    • Troubleshoot container and Kubernetes problems
    • Use debugging tools for distributed systems

Milestone Project: Create a comprehensive troubleshooting playbook and demonstrate debugging skills


2. Documentation & Communication

Time: 2-3 weeks (40-60 hours) | Difficulty: Intermediate

Key Areas: Technical writing, runbooks, incident post-mortems

Milestone Project: Create a complete documentation suite for your infrastructure and processes


3. Monitoring & Logging

Time: 4-5 weeks (80-100 hours) | Difficulty: Advanced

Key Areas: SLIs/SLOs, alerting, log pipelines, incident response

Milestone Project: Implement a complete SRE monitoring and alerting system with SLIs/SLOs


4. Advanced Infrastructure Automation and Configuration Management

Time: 3-4 weeks (60-80 hours) | Difficulty: Advanced

Tools: Ansible, Chef, Puppet Key Areas: Configuration Management

  • Learn at:
    • TBD
  • Certs:
    • Puppet Certified Professional
    • Red Hat Certified Specialist in Ansible Automation

Milestone Project: Automate the configuration of a complete infrastructure stack


5. Service Mesh & Microservices

Time: 3-4 weeks (60-80 hours) | Difficulty: Advanced

Key Areas: Istio, Linkerd, microservices communication patterns

  • Learn at:
  • You should at minimum be able to:
    • Understand service mesh concepts and benefits
    • Deploy and configure Istio or Linkerd
    • Implement traffic management and security policies
    • Monitor microservices communication

Milestone Project: Implement a service mesh for a microservices architecture


6. Security & Cost Optimization

Time: 3-4 weeks (60-80 hours) | Difficulty: Advanced

Key Areas: IAM, MFA, budgets, encryption, tagging

  • Learn at:
    • TBD
  • You should at minimum be able to:
    • Apply least privilege principles to IAM roles and policies
    • Enable and enforce MFA for critical accounts
    • Tag cloud resources for cost allocation and tracking
    • Use encryption at rest and in transit
    • Monitor and optimize cloud costs regularly

Milestone Project: Implement a comprehensive security and cost optimization strategy


7. Chaos Engineering

Time: 2-3 weeks (40-60 hours) | Difficulty: Expert

Tools: LitmusChaos, Gremlin

  • Learn failure injection and resiliency testing
  • Learn at:
    • TBD
  • You should at minimum be able to:
    • Understand principles of failure injection and resiliency testing
    • Use tools to simulate failures and validate system behavior
    • Analyze results and improve system fault tolerance

Milestone Project: Design and implement a chaos engineering strategy for your systems


Phase 4 Completion

Phase 4 Checkpoint - Design and maintain reliable, secure systems at scale

  • Can you design, implement, and maintain highly reliable, secure, and cost-optimized systems at scale?

Phase 4 Project

  • Design and implement a complete SRE system with comprehensive monitoring, incident response procedures, security hardening, cost optimization, and chaos engineering practices.
  • This project should demonstrate your mastery of advanced DevOps and SRE principles at an expert level.

Project Deliverables:

  • Complete SRE monitoring and alerting system
  • Incident response procedures and documentation
  • Security hardening and compliance measures
  • Cost optimization strategies and implementation
  • Chaos engineering testing framework
  • Expert-level system design and architecture

Ongoing Practice & Interview Prep

  • Practice explaining:
    • CI/CD pipeline from commit to prod
    • Docker to K8s rollout process
    • Terraform state and modules
  • Mock interviews: deployment scenarios, incident response, YAML debugging
  • Build a public GitHub repo: pipelines, IaC, dashboards, diagrams

Why This Order?

Phase 1 (Foundation): Linux + Programming + Cloud fundamentals create the essential foundation. You need these core skills before automating anything. This phase builds the fundamental knowledge required for all subsequent phases.

Phase 2 (Core DevOps): Docker + IaC + CI/CD form the core DevOps workflow. These tools represent the essential automation pipeline that defines modern DevOps practices. You learn to package, provision, and deploy applications systematically.

Phase 3 (Scale & Orchestration): Kubernetes + Observability handle production-scale systems. Orchestration and monitoring become critical when moving from single applications to distributed systems. This phase prepares you for real-world production environments.

Phase 4 (SRE & Specialization): SRE practices + Advanced certifications differentiate you in the market. This phase focuses on reliability, security, cost optimization, and advanced system design - the skills that separate senior engineers from junior ones.

Dependencies: Each phase builds on the previous one. You cannot effectively learn Kubernetes without understanding containers (Phase 2). You cannot implement SRE practices without understanding monitoring and orchestration (Phase 3). The progression ensures you have the necessary context and skills for each subsequent phase.


TODOs: Work to add to this page

Advanced Improvements (TODO)

  • Add missing topics:
    • Security Fundamentals (move earlier)
    • Testing Strategies (unit, integration, e2e)
    • Performance Optimization
    • Disaster Recovery & Backup Strategies
  • Create learning paths by role:
    • DevOps Engineer path
    • SRE path
  • Include soft skills:
    • Communication
    • Collaboration
    • Problem-solving
    • Leadership
  • Add cost considerations:
    • Free tier limitations
    • Cost effective learning strategies
    • Local alternatives (minikube, kind, etc.)
  • Add practical projects for each phase:
    • Phase 1: Build a simple web app, deploy to cloud, set up monitoring
    • Phase 2: Create a full CI/CD pipeline with Docker + Terraform
    • Phase 3: Deploy a microservices app to Kubernetes with observability
    • Phase 4: Build a complete SRE setup with SLIs/SLOs
  • Add interview preparation section:
    • Interview questions for each phase
    • Scenario-based questions
    • “Tell me about a time when…” examples
  • Add certification strategy:
    • Which certs to prioritize based on career goals
    • Certification timelines and study plans
    • Cost-benefit analysis of each cert
  • Role-specific paths (SRE vs DevOps Engineer vs Platform Engineer)