Resources for Getting Started with Distributed Systems

Posted on 13.02.2024 by caiti335

I’m often asked how to get started with Distributed Systems, so this post documents my path and some of the resources I found most helpful. It is by no means meant to be an exhaustive list.

It is worth noting that I am not classically trained in Distributed Systems. I am mostly self-taught through independent study and on-the-job experience. I do have a B.S. in Computer Science from Cornell, but I focused mostly on graphics and security in my specialization classes. My love of Distributed Systems, and my education in it, came once I entered industry. The moral of this story is that you don’t need formal academic training to learn Distributed Systems and excel at them.

Books on Theory & Background

Introduction to Reliable and Secure Distributed Programming: This book is an excellent introduction to the fundamentals of distributed computing. It definitely takes an academic approach, but it is a good place to start to understand the terminology and challenges in the field.

Replication: Theory and Practice: This book is a summary of 30 years of distributed systems research on replication up to 2007. It’s a great starter and contains all the references to the original work. Each chapter is incredibly dense, and led me down multiple paper rabbit holes.

Papers

This is by no means an exhaustive list, but these are the papers I keep coming back to, and they have significantly shaped the way I think about Distributed Systems.
Time, Clocks, and the Ordering of Events in a Distributed System
Impossibility of Distributed Consensus with One Faulty Process
Unreliable Failure Detectors for Reliable Distributed Systems
CAP Twelve Years Later: How the “Rules” Have Changed
Harvest, Yield, and Scalable Tolerant Systems
Dynamo: Amazon’s Highly Available Key-value Store
The Chubby Lock Service for Loosely-Coupled Distributed Systems
Fallacies of Distributed Computing

A Note on Reading Papers

I start with the Abstract. If I find it interesting, I’ll proceed to the Introduction, then the Conclusion. Only then, if I am incredibly interested in the implementation or details, will I read the whole thing. The References are also a gold mine, since they cite related and foundational work.

Reading papers is often a recursive process. I’ll start on one, find a concept I’m unfamiliar with or don’t understand, read the referenced paper, and so on. This often results in going down paper rabbit holes, and one time resulted in me reading a dissertation from the 1980s, but it is a great way to learn.

I also highly recommend Michael Bernstein’s blog post “Should I Read Papers?” for more on the motivations for reading academic papers and how to read one.

Blog Posts & Talks

Below is a list of some of my favorite blog posts and talks that shaped how I think about building Distributed Systems. Most of these are old, but I keep coming back to them and still find them relevant today.

Notes on Distributed Systems for Young Bloods by Jeff Hodges
Jepsen Blog Posts by Kyle Kingsbury
Everything Will Flow: Distributed Queues & Backpressure by Zach Tellman
Bad As I Wanna Be: Coordination and Consistency in Distributed Systems by Peter Bailis

Learning from Industry

The art of building, operating, and running distributed systems in industry is orthogonal to the theory of Distributed Systems.
I truly believe that the best way to learn about Distributed Systems is to get hands-on experience working on one. Post mortems are another great source of information. Large tech companies, like Amazon, Netflix, Google, and Microsoft, often publish a post mortem after a major outage. These are usually pretty dry to read, but they contain some hard-learned lessons.