Yet Another List of DevOps Resources - jrott

Yet Another List of DevOps Resources

Table of Contents

Big Picture Resources

Security

Databases and Data Modeling

Continuous Delivery

Containers

Networking

Cloud

Linux and Operating Systems

Debugging and Reliability

Systems Thinking and Design

Human Factors

Code Review

Agile

Automation

Incident Response

Monitoring and Observability

Introduction

This is a list of DevOps resources that have influenced me. It’s more focused on AWS and data, because that is where I’ve spent the majority of my career, but I’ve included content on a variety of topics.

Big Picture Resources

  • DevOps Handbook: A great overview of the DevOps landscape. It focuses on workflows and how to make them smooth.
  • DORA Report: What an elite software delivery team looks like and does.
  • DevOps Culture: What really matters, in many ways, is having people and a culture that cares about operations. Rouan Wilsenach explains what that means.
  • SRE Vs DevOps: The differences between SRE and DevOps, and how they’re related.
  • Platform Engineering: Where ops is headed, and my favorite article explaining platform engineering.

Security

  • STRIDE model: I’ve found the STRIDE model really helpful for thinking about security questions. It provides a basic framework for what could go wrong.
  • OWASP Top 10: The most common security vulnerabilities in web applications.

Databases and Data Modeling

  • Use The Index Luke: Database performance is often treated like a dark art. It isn’t. Use The Index Luke is an excellent start to actually understanding how indexes work, and how to improve database performance for your application.
  • Modeling Data In DynamoDB: A step by step walkthrough of how to model your data in DynamoDB. This also applies to other key value stores.
  • Advanced Design Patterns for DynamoDB: The talk that the previous article is based on. Absolutely worth watching in its own right, as it shows the real wizardry that is possible.
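
To make the point about indexes concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `users` table, the `idx_users_email` index, and the `query_plan` helper are made up for illustration and are not taken from Use The Index Luke. It asks SQLite how it plans to run the same lookup before and after an index exists.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
# Hypothetical table, used only for this illustration.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

lookup = "SELECT name FROM users WHERE email = 'user500@example.com'"

# Without an index on email, SQLite has to scan the whole table.
plan_before = query_plan(conn, lookup)

# With an index on the filtered column, it can search the index instead.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn, lookup)

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan varies between SQLite versions, but the first plan should describe a scan of `users` and the second a search using `idx_users_email`. Other databases expose the same idea through their own `EXPLAIN` variants.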

Continuous Delivery

Containers

Networking

Cloud

  • Load Shedding: An important technique for designing systems that maintain high throughput under overload, while allowing availability to degrade gracefully instead of rapidly sliding to 0.
  • AWS White Papers: AWS’s white papers are an excellent source on how things in AWS work, and of ideas for improving your own cloud deployments.
  • So You Want To Migrate To Another Region: Techniques to make migrating to another AWS region possible.
  • Amazon Builder’s Library: More resources from Amazon for building high scale applications.
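
As a rough illustration of load shedding, the sketch below caps the number of requests being worked on at once and rejects the rest immediately with a 503. This is one simple approach under stated assumptions, not the method from the linked article: the `LoadShedder` class, the `handle` function, and the limit are all hypothetical names chosen for the example.

```python
import threading


class LoadShedder:
    """Admit at most max_in_flight concurrent requests; shed the rest."""

    def __init__(self, max_in_flight):
        self._max = max_in_flight
        self._in_flight = 0
        self._lock = threading.Lock()

    def try_acquire(self):
        # Decide quickly and never queue: queued work is what makes an
        # overloaded service slide toward zero useful throughput.
        with self._lock:
            if self._in_flight >= self._max:
                return False
            self._in_flight += 1
            return True

    def release(self):
        with self._lock:
            self._in_flight -= 1


def handle(shedder, work):
    """Run work() if there is capacity, otherwise reject cheaply."""
    if not shedder.try_acquire():
        return 503, "overloaded, retry later"
    try:
        return 200, work()
    finally:
        shedder.release()


shedder = LoadShedder(max_in_flight=2)
print(handle(shedder, lambda: "ok"))
```

Rejecting early keeps the requests that are admitted fast, so the service keeps doing useful work at its capacity rather than timing out on everything. Real systems often shed based on richer signals such as queue time or request priority.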

Linux and Operating Systems

Debugging and Reliability

  • How I got better at debugging: Julia Evans’s excellent advice on debugging and how to get better at it.
  • Computers Can Be Understood: Because computers don’t actually require you to sacrifice a goat in a pentagram to make them work - it just feels that way sometimes.
  • SLI SLA and SLO: All three of those acronyms get thrown around far too often in a confused manner. Here is what they mean.
  • Debugging Under Fire: An amazing talk by Bryan Cantrill about debugging a major production outage.
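
For the SLI and SLO distinction, a small worked example may help. In the usual usage, the SLI is the number you measure, the SLO is the target you set for it, and the SLA is the agreement with customers built around that target. The sketch below assumes a simple availability SLI (good requests divided by total requests); the function names and the sample numbers are made up for illustration.

```python
def availability_sli(good_requests, total_requests):
    """SLI: the measured fraction of requests that succeeded."""
    if total_requests == 0:
        return 1.0  # no traffic means nothing failed
    return good_requests / total_requests


def error_budget_remaining(sli, slo):
    """Fraction of the error budget still unspent (negative if overspent).

    The error budget is the failure rate the SLO tolerates: 1 - slo.
    """
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be strictly between 0 and 1")
    budget = 1.0 - slo
    spent = 1.0 - sli
    return 1.0 - spent / budget


# Hypothetical numbers: 100,000 requests, 50 of them failed.
sli = availability_sli(good_requests=99_950, total_requests=100_000)
slo = 0.999  # target: 99.9% of requests succeed

print(f"SLI: {sli:.4%}")
print(f"SLO met: {sli >= slo}")
print(f"Error budget remaining: {error_budget_remaining(sli, slo):.0%}")
```

With these numbers the service is inside its SLO and has spent about half of its error budget for the period.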

Systems Thinking and Design

Human Factors

Code Review

Agile

Automation

Incident Response

  • Being On Call: Being on call is stressful; this is an excellent guide to setting teams up for success.
  • Incident Management: From Atlassian, on the steps to take when managing an incident, as well as some good cultural practices for incident management.

Monitoring and Observability