Implementing Service Level Objectives
I picked up a new book recently that is a great companion to the other SRE books published by O’Reilly. Continue reading “Implementing Service Level Objectives”
NOTAM for SREs
In aviation, NOTAMs are “Notices to Airmen” for conditions that are generally temporary and hence not information published in the usual places. Continue reading “NOTAM for SREs”
Command Line Interface Guidelines
Anyone who knows me knows that I am most comfortable and at home on the Unix/Linux command line. Continue reading “Command Line Interface Guidelines”
The Tail at Scale Approximation
This article demonstrates a quick and easy approximation for the probability formulae which I described in two previous articles. Continue reading “The Tail at Scale Approximation”
Riddler: Can you solve the not-so-corn maze?
I love AWK and I’ve written about it before… Continue reading “Riddler: Can you solve the not-so-corn maze?”
The Tail at Scale Revisited
My last article discussed some of the missing math related to setting back-end objectives. This article presents a chart that is useful in understanding the relationship between those objectives and the user experience, and examines ways to dramatically improve overall performance. Continue reading “The Tail at Scale Revisited”
The Tail at Scale
The landmark “Tail at Scale”[1] article was missing some of the math. We’re diving into it a bit here to show how the math can be used in setting objectives for latency budgets in back-end systems. Continue reading “The Tail at Scale”
Hiring Questions, Problem 3
This was an interesting question, so I thought I’d share it here.. Continue reading “Hiring Questions, Problem 3”
Remote Work
Many companies and individuals are contemplating remote work now. Embrace it! Continue reading “Remote Work”
What is SRE?
The current state of confusion around what a “Site Reliability Engineer” (SRE) role is.. Continue reading “What is SRE?”
BPF Performance Tools
BPF is one of the Swiss Army Knife tools for Performance Engineering on Linux. Continue reading “BPF Performance Tools”
Event Logs and A.I.
Many companies in the logging/monitoring space will try to sell you on AI and ML (Artificial Intelligence and Machine Learning) to find abnormal. Continue reading “Event Logs and A.I.”
Event Logs and K.I.S.S.
I’ve worked with event logs for, well, decades. There are quite a few companies that offer services for managing logs and, afaik, only a few doing it right. Continue reading “Event Logs and K.I.S.S.”
SPOFs and Partial Panel
In both aviation and systems we build in redundancies wherever practical to avoid unpleasantness when components or subsystems fail. Continue reading “SPOFs and Partial Panel”
Traffic At 2 O’clock!
Up in the air, your eyes can’t be everywhere, all the time. You’re trained to scan the skies for “traffic” (other flying machines) as well as scanning instrumentation in the cockpit. Continue reading “Traffic At 2 O’clock!”
Own It !!
We were heading back from the practice area to the airport. I didn’t have my pilot’s license yet, and my instructor said: “Push the throttle to Rental Speed!” Continue reading “Own It !!”
Systems and Gardening
It’s been a while since I’ve written. I’ve been busy coming up to speed as an SRE with an awesome new team!
It’s gardening season up here in the Northern Hemisphere, and while I was dealing with some trees and bushes that had died in a recent ice storm, I thought about the similarities in dealing with systems. Continue reading “Systems and Gardening”
Reading Week #5
Large numbers are difficult to comprehend. The national debt, for example, is kind of mind-numbing. The fun article this week is about insane numbers; imagining the unimaginable.. Continue reading “Reading Week #5”
Reading Week #4
Monitoring the SRE Golden Signals, an excellent overview by Steve Mushero.. Continue reading “Reading Week #4”
Reading Week #3
Here are some interesting reads if you’re fortunate in having some extra time off this Holiday Season.. Continue reading “Reading Week #3”
Works For Me #1
Hacking thy self. Sharing some productivity tips that I use .. Continue reading “Works For Me #1”
Reading Week #2
First of all, Merry Christmas if you celebrate it, Happy Holidays if you don’t! This week’s interesting read is about a subject I love.. Continue reading “Reading Week #2”
Reading Week #1
I’d like to start a new series of articles based on interesting articles to read for the week.. Continue reading “Reading Week #1”
WordCamp US 2018
#WCUS is happening this weekend. WordPress 5.0 and Gutenberg news.. Continue reading “WordCamp US 2018”
Last Week’s Security Earthquake
Last week there were two earthshaking security events. Yes, the Marriott data breach was big, but I’d like to talk about the one you might not have heard of.. Continue reading “Last Week’s Security Earthquake”
Off Topic #1
This is off-topic, but I thought I’d share an image from a recent vacation. Continue reading “Off Topic #1”
Checklists and Runbooks
We’ve been flying planes much longer than we’ve been running systems in production, so it might be instructive to learn what we can from our fellow aviators.. Continue reading “Checklists and Runbooks”
A Steal on O’Reilly DevOps/SRE Books!
This is a limited time deal on 15 O’Reilly books for $15. Go. Buy. Right. Now! Continue reading “A Steal on O’Reilly DevOps/SRE Books!”
Geolocating Your Users
I first became interested in geolocating a few decades ago while designing an email filtering system for some customers and noticing that most of the “spam and malware” originated in half a dozen countries.. Continue reading “Geolocating Your Users”
All Day DevOps 2018
I was lucky to catch most of the SRE track and the keynote speakers at the All Day DevOps event this year. Fortunately, if you missed it or want to watch some of the other tracks, the videos have been made available.