On Systems Administration

February 9, 2024


Important: I originally published this article on Medium. If you have a Medium subscription, please consider supporting my writing by reading it on Medium.

Note: Any essay that is titled “On X” is traditionally guaranteed to be arrogant, dismissive, stuffy, and self-absorbed. These types of essays usually portray an opinion as canon, and often leave very little room for spirited debate. I’ll do my best not to buck that trend with this work.

Background

When I was 16 years old, I wanted to be a “systems administrator.” I thought the title sounded cool, and I loved playing with server software and seeing networks come together. I’ll never forget my excitement learning about the very basics of Windows domains. I installed some long-forgotten, GUI-based, OSS domain controller software on an old computer and joined another computer to the domain. I can still remember how I felt watching my network login succeed on a computer connected to that topology.

I was so sold on this systems administration thing that I pursued a degree in “Applied Networking and Systems Administration” within the now-defunct “Networking, Security, and Systems Administration” department at the Rochester Institute of Technology. At some point on this journey, I discovered a passion for layers 1 through 4 of the OSI model, and I pursued networking instead. I landed a job in network operations for a large defense contractor as a “network planning engineer.” I did absolutely no planning and no engineering, but title inflation is a key tool that the defense contracting industry uses to separate taxpayers from their money, and the position sounded cool. I eventually returned to Rochester, where I took a job as a network engineer with a local consulting company. I started doing less operational work and more “engineering,” whatever that means in I.T.

At this point, I knew that I wanted to go back to school to get an M.S. degree, and a position conveniently opened up at RIT for a systems administrator. Systems administrator. But I was an engineer! My friends were taking jobs in DevOps, Site Reliability Engineering, Cloud Engineering, and other fancy-sounding domains. Did I really want to be a “sysadmin,” a term that felt like it ossified as it came out of my mouth?

I ended up taking the job. About 6 months in, I went to lunch with a colleague who taught at RIT and was a few years older than me. We spoke about career aspirations on the drive back to campus, and as we pulled into the parking lot I said: “After my degree, I’d really like to end up doing some kind of DevOps work.”

We’re all just (prisoners) systems administrators here of our own device

Fast forward a few years, and I’ve had the opportunity to see the industry come up with DevOps, SRE, DevSecOps, MLOps, Systems Engineering, Infrastructure Engineering, Cloud Engineering, Platform Engineering, and others. I’ve heard most of these disciplines promise to deliver some type of developer and product nirvana, only to later demonstrate that they were over-hyped fads with a few good ideas. As I’ve watched this happen, I’ve started to realize something: this is all just systems administration. Really, this is all just what good systems administration should be about.

Good systems administrators want to offer secure, robust, scalable, and highly available systems to their internal customers via interfaces that are force multipliers, allowing teams to work quickly and boosting their productivity. Good sysadmins automate, and good automation must include the full lifecycle of a system or service, from provisioning through operations and troubleshooting, right into deprovisioning. Indeed, many sysadmins in “boring” corporate IT jobs do just this: account creation and deprovisioning in most environments are fully automated, and the common “day 2 operation” of performing a password reset is fully self-service. SCCM, Munki, and other self-service end-user computing tools are all examples of robust automation systems that probably started out as a collection of scripts on someone’s desktop.

These might not be JSON-API-first, written-in-Golang, 40,000-lines-of-code Kubernetes Operators. But they are often very robust automation tools, and the average sysadmin has probably written some scripts that would put anything on OperatorHub to shame. Similarly, the field of systems administration has spent decades figuring out how to monitor systems. Many of the advances made in SRE are a natural evolution of the arguably coarse monitoring that sysadmins have traditionally performed.

Likewise, for all the gum-flapping about security and how sysadmins do stupid things to make their systems insecure, thus justifying enormous budgets for security teams and products that often produce mysteriously difficult-to-quantify results, the average sysadmin often does care about security. The reason your crusty app running Log4j still isn’t patched in 2024 probably has nothing to do with your sysadmin’s desire to patch, and everything to do with some other kind of mismatched business decision that places priorities elsewhere. Your sysadmin probably knows all of the vulnerable systems and has a script ready to patch them at a moment’s notice.

Each mysterious and new title that has appeared in the systems space still focuses on the underlying importance of robust systems. DevOps introduced the idea of breaking down the wall between operations and developers. Site Reliability Engineering heavily focuses on monitoring, metrics, and mature automation that often resembles software. This new Platform Engineering discipline is all about providing easily consumed interfaces for infrastructure services.

These paradigms all present themselves as specialties that feel very far removed from traditional systems administration. However, they all contain ideas and sub-specialties that fall under the discipline of systems administration. Robust systems administration cuts across these ideas, and it’s particularly bizarre to me that, with all the focus on tearing down silos, we keep building silos around systems administration titles. You won’t find a neurosurgeon who gets upset at being called a doctor, but good luck trying to refer to a cloud engineer as a “systems administrator.”

Why are we distancing?

It’s obvious that the IT field is distancing itself from the title of systems administration. When I was standing on this particular soapbox and ranting to a colleague at a former job, they said something along the lines of “Wow, you’d offend a lot of people by just calling them SysAdmins.” But it obviously raises the question: why are we doing this?

My suspicion is that the idea of a sysadmin conjures up a vision of an old, poorly groomed male sitting in a broom closet, making an art of awkwardness, perfecting the mechanics of saying “no” to product teams, and slowing productivity to a halt. In my experience, this concern is grossly overblown. Setting aside the social characteristics, it’s important to better understand the “sysadmin as a blocker” stereotype that seems so common. Even if this were true (and I’d posit it’s not), this is a silly reason to re-title an entire profession every 6 months.

To a degree, some of the sysadmin stigma has been earned by its practitioners. I have met plenty of ossified admins who refuse to learn anything new and view saying “no” to everything as a point of pride. Again, good admins want to help their customers, and the ideas proffered by the new buzzword-first approach to job titles are often very sound. However, we should also consider why these classical admins tend to reach for “no” instead of “yes” or “how,” and we should view that as a competitive advantage of hiring “old school” sysadmins.

Indeed, many of the sysadmins that I know reach for “no” instead of “yes” for very good reasons. They are thoughtful about the decision-making process, unwilling to be rushed, and deeply skeptical of new technology that promises to solve all of their problems. You’d be surprised at how often a salty sysadmin will eventually say “yes” when given the time and supporting evidence to properly consider a choice.

Classical systems administrators are also skeptical of change because they know they will ultimately be the ones supporting the result. The promises of DevOps and “tearing down the wall between dev and ops” have largely failed to bear fruit. While the situation in most organizations is much better than it once was, the reality is that the ops team still tends to receive the page, and they are often still the ones to actually resolve outages (or to determine that the software, and not the systems, is the problem).

This is all compounded by one of the big fallacies of Agile development: “Working software over comprehensive documentation.” A traditional sysadmin will rule out an entire tool because the man page isn’t good enough. They know that working software without documentation is great…until it stops working. The traditionally salty sysadmin is often an excellent writer of documentation, in many cases going into far too much detail, which can actually be a problem.

These tendencies are built upon decades of shared experience that systems administration, as a discipline, has amassed. When I mention a new, shiny tool to my traditionalist colleagues and they say “That looks neat, but it might not be ideal because it’s written in Python (presenting a packaging nightmare), relies on another fancy tool (introducing unnecessary dependencies), and has no man page or -h menu (suggesting overall product immaturity),” they are providing very useful feedback that can guide decisions in a way that our current practices do not. Between Agile and Kubernetes, technology decision-making has become an exercise in blind faith that can otherwise only be witnessed in some mystic religions (with the accompanying speaking in tongues, if you ever listen to a conference talk these days). Skepticism has gone out the window in favor of speed, and I have a suspicion that more alpha and beta technologies have become core parts of production systems than ever before. We could use some of that traditional sysadmin “no” in our organizations.

What are we missing out on?

This entire diatribe can be interpreted as “old man yells at cloud,” but I think there are enormous benefits behind building a cohesive job title for systems and operational work. Many professions find a great deal of pride in their job titles, and we should endeavor to do the same in the systems space. A clearly defined, recognizable job title would go a long way toward building professional solidarity (including the ability to compare salaries across similar job functions) and general recognition of what we do in the systems space.

The current arrangement, where we retitle our profession on a regular six-month cadence, usually to add another three-letter phrase between “Dev” and “Ops,” isn’t serving anyone well. Nobody can look at a person’s resume and have any idea what they actually do. People are being paid largely on the basis of title inflation: “platform engineering” is modern, while “systems administration” is old, even if both teams end up configuring Linux-based web services at their respective organizations. To be clear: we can (and almost certainly must) still have specialties within the systems space, but I disagree with eschewing the umbrella of systems administration.

Many will chafe at the idea of being called anything except for an “engineer,” even though no real standards exist for that term in the realm of I.T. That’s fine. We can be called “systems engineers” instead. But the point still stands: let’s stop retitling our entire profession every few months to account for natural developments in the way we work. Let’s just call it what it is, but let’s decide on something consistent.