Paul's Internet Landfill/ 2015/ Sysadmin Skills Mulligan

Sysadmin Skills Mulligan

I kind of hate my sysadmin skills entry. On the one hand, I feel strongly about the points I tried to make, and feel they are relevant and useful. On the other, the entry itself is a rambly mess, and the rambly story about troubleshooting DFS makes me cringe every time I read it. So I am going to try again, with the hope that I can write something I can link to without feeling bad. So you might want to skip this entry if you read the other one.

I believe that technically proficient "computer people" have the following:

  1. The ability to model systems and explain how they work. As people with this skill gain knowledge about a domain, they can refine and improve their models. They can be shown some system behaviour and come up with stories about what is going on. They can integrate new information into their models so that as they learn more about these systems, their stories become more correct.

  2. The ability to troubleshoot problems by observing systems they don't understand, coming up with hypotheses about what might be going on, and then devising concrete experiments to see whether their hypotheses are correct. People with these skills can break down complex situations and isolate issues. They have enough judgement to know when they do not understand something, and can get some sense of what they would need to learn in order to improve their understanding.

  3. A solid grasp of domain-specific knowledge about one or more technical areas. These are the details and computer trivia that make computer people sound smart. They are collections of tools that people can use to observe systems and devise tests. These include implementation details about computer systems that are necessary to get computers working in the real world.
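
To make the second skill a little more concrete, here is a minimal Python sketch of the hypothesize-and-test loop. Everything in it is hypothetical and invented for illustration: the `Hypothesis` class, the `troubleshoot` function, and the made-up observations about a mail server that cannot deliver mail. It is not a real diagnostic tool, just one way to picture the process.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Hypothesis:
    """A guess about what is wrong, plus an experiment that can confirm it."""
    description: str
    experiment: Callable[[], bool]  # returns True if the guess is confirmed


def troubleshoot(hypotheses: List[Hypothesis]) -> Optional[str]:
    """Run each experiment in order and return the first confirmed guess.

    Returns None when every guess is refuted, which is the signal that
    the model of the system is incomplete and more learning is needed.
    """
    for h in hypotheses:
        if h.experiment():
            return h.description
    return None


# Hypothetical observations (assumed values, not real measurements).
observations = {
    "service_running": True,
    "disk_free_percent": 42,
    "dns_resolves": False,
}

hypotheses = [
    Hypothesis("the mail service is stopped",
               lambda: not observations["service_running"]),
    Hypothesis("the disk is full",
               lambda: observations["disk_free_percent"] < 5),
    Hypothesis("DNS lookups are failing",
               lambda: not observations["dns_resolves"]),
]

diagnosis = troubleshoot(hypotheses)
print(diagnosis)  # prints: DNS lookups are failing
```

The `None` return is the interesting case: it corresponds to knowing that you do not understand something yet, and that you need better hypotheses rather than more guessing.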

My primary argument is that although all of these are necessary to be an effective systems administrator, modelling and troubleshooting skills are much more important than domain-specific knowledge. As people model systems and troubleshoot them, they fill in gaps in their knowledge, and in so doing they develop domain-specific knowledge. Thus, if you hire somebody who has poor domain-specific knowledge but is strong in modelling and troubleshooting, that person will probably be able to learn on the job.

In contrast, hiring somebody who has a lot of domain-specific knowledge but poor modelling or troubleshooting skills is foolish, because computer technology changes so fast that any domain-specific knowledge will be obsolete within a few years. Domain-specific knowledge has to be continually updated and refreshed.

My secondary argument is that we do a terrible job of hiring technical people for these kinds of jobs, and that certain institutions (notably private colleges, but others as well) do a terrible job of training people for these positions.

The fundamental problem is that domain-specific knowledge is easy to test and evaluate in automated ways, while modelling and troubleshooting skills are not. Therefore we use domain-specific knowledge as a proxy for technical proficiency, reasoning that people with good domain knowledge got that knowledge because they have good modelling and troubleshooting skills. This reasoning is catastrophically incorrect, because it is possible to obtain lots of factoids about a domain through memorization or other shallow learning.

Hiring is Awful

In many cases, organizations hire technical people to fill specific roles. Say that an organization runs an Exchange mailserver, and that the existing sysadmin is moving to China. The organization is under pressure to hire a replacement sysadmin, and because organizations want to be efficient, they are not willing to train somebody from scratch. Thus they make job ads looking for specific skills, and since more experience is better, they inflate the qualifications they are looking for: "The ideal candidate will have 15 years of progressive experience administrating Windows Exchange Server 2013." Job ads become lists of technology buzzwords. This makes life easier for human resources departments, which are largely staffed by people who "aren't computer people", and do not have technical proficiency in the fields they are hiring for. So they filter resumes based on the technology buzzwords in the advertisement.

Then come the hiring interviews. It is easy to grade people on domain-specific factoids, so often the interview process consists of a bunch of domain-specific questions. Some companies go further, coming up with interview questions intended to test creative thinking, algorithm design, and problem solving skills.

All of this is awful:

What about companies that design interview questions to exercise creative thinking skills? Companies such as Google and Microsoft do this, but their tests are standardized, and therefore potential applicants can learn the questions and memorize the answers, thus gaming the system. In order to be effective, these kinds of tests have to be obscure.

How do you fix the hiring problem? I don't know how to do so via greater automation. Rather, all the solutions I know of require time and patience:

None of these techniques are foolproof. None of them address problems with resume inflation. In my experience job interviews are deeply misleading and perhaps actively unhelpful; the true character of candidates makes itself clear within days of a new hire.

Although I strongly believe that looking for candidates with particular domain-specific skillsets is largely a mistake, I also believe that candidates should demonstrate that they are proficient in some domain. If people have no technical knowledge in any domain, then how do I know they are capable of gaining domain knowledge? If the only domain knowledge people have was gained decades ago in some obsolete technology, how do I know that they have kept their modelling and troubleshooting skills sharp?

Education is (Often) Awful

The worst IT educations come from programs that are buzzword-heavy, where the curriculum consists of following cookbooks of instructions about technology after technology. Such curricula can push domain knowledge into people's brains (especially if memorization is involved) but the information is often context-free and unapplied. Furthermore, following cookbook IT recipes does nothing to develop modelling or troubleshooting skills.

There may be some certifications that are worthwhile, but certifications that involve memorizing a bunch of factoids are stupid and counterproductive.

I feel that my undergraduate education in Computer Science did a good job of teaching me modelling and troubleshooting skills, even though the technologies in question were never used in industry. Some of the things that helped develop these skills included:

None of the technologies I used during my undergrad are useful to me today. My degree was focused on software development, not systems administration. Nonetheless, so much of the knowledge was transferable.

This is probably part of the reason "an undergraduate degree in computer science" is used as a resume screening credential by employers, but this is a mistake. Firstly, good candidates without undergraduate educations exist. Secondly, university degrees are expensive, stressful, and time-consuming, and not everybody is in a position to obtain one. (As a data point: I almost did not get through my undergrad.) Thirdly, although having an undergrad degree is probably correlated with having good sysadmin skills, it is not a guarantee.

Developing Sysadmin Skills

This raises some uncomfortable questions:

I desperately want to believe that these skills can be taught, and that they can be taught outside of a university context. I have coworkers and former coworkers who never went to university but have these skills, so I do not think university is strictly necessary. But I am worried about whether these skills can be taught; I am shocked by how scarce they seem to be among those whom we interview.

If you are somebody who feels your modelling and troubleshooting skills are weak, you might try the following:

I don't know whether these techniques will work, because I do not know whether modelling and troubleshooting can be taught. However, I feel that some of these techniques have helped me develop my modelling and troubleshooting skills, so I hope they might be useful to others as well.