Jason Harvey (aka alienth) is the senior sysadmin at a small site you might have heard of called reddit. He's been there for 2.5 years, and prior to that he cut his sysadmin teeth at Rackspace, an HR software provider, and an ISP in Alaska. He graciously agreed to be interviewed about what it's like being a sysadmin for such a popular site.

How do you deal with the stress of keeping a site that large up 24/7?

By bottling it all up, ensuring I will explode at some point in the future. To be honest I don't really take conscious steps to address the stress. While I have been under some extremely stressful circumstances in my career, they don't tend to bother me, and the stress fades quickly with time. I tend to think that each person will handle stress in their own unique way. You can learn some mitigation methods, but in the end your brain is going to do what it wants.

How do you handle disaster recovery - do you have multiple Amazon Web Services (AWS) sites?

The first step in addressing this is splitting the site across multiple AWS availability zones. Unfortunately there are many pieces of the app that make such a split very difficult (mostly centered around the need for global locking and some layer of a low-latency globally persistent state). Some of those pieces need to be refactored or completely re-implemented. We're still working towards it, but we have a ways to go. In the event of a large disaster, such as the AWS US-East region completely going away, the site data can be restored from the regionally redundant S3 backups. This of course incurs some considerable time in which the people will be deprived of a crucial cat-picture outlet. In the event of a planetary cataclysm, we'll just need to wait a bit for a Poincaré recurrence to occur. I don't expect many complaints will occur in the ensuing downtime.

Anyone that I can learn from. I don't really see myself as having one specific role model in mind. Like my hobby fixation, I tend to focus on specific individuals whom I respect and learn as much about them as I can. Throughout the years this has included Linus Torvalds (even though he can be an asshole), Larry Wall, Mike Krahulik and Jerry Holkins (of Penny Arcade fame), and John Carmack, to name a notable few.

Where do you go for ongoing career development? Favorite conferences?

I try to keep tinkering on a personal project or two in order to further my learning. I love finding something technical I know very little about and completely tearing it down till I understand how every bell rings and whistle blows. The one that sticks out in my mind as the most enjoyable is PgCon.

I have yet to cause any major disasters in my career (knock on wood), although I know that day will inevitably dawn. There was an incident where I was adjusting the mount flags on a slew of systems and mistakenly mounted filesystems that were in an active MySQL RHCS cluster. This resulted in some MySQL data becoming irrecoverably corrupted. While I wouldn't characterize it as a disaster, it did sting considerably.

Biggest triumph has probably been getting reddit to a more stable state. When I joined, the infrastructure was in considerable trouble. The state of things was no fault of the existing team; they simply lacked the resources and time to address technical debt. There was a long period of time where we had 8+ hours of downtime *a week*. With the help of the dev team (which at the time consisted of one person) and a lot of lost sleep, I managed to get things to a relatively stable state during my first year here. We still have a long way to go on stability, but we've made considerable progress since the outage-filled months of 2011.

What is the most satisfying thing about being a reddit sysadmin?

There are a few aspects of the position which are very satisfying. The first is that I work with people who are extremely talented, respectful of each other, and dedicated to helping one another. If there is a site issue, we don't waste time assigning blame (except maybe in jest); we simply work as a team to get it fixed. Places that I've worked at in the past have often had an environment where the development and sysadmin teams were each in their own respective silos. After experiencing how things work at reddit I can't imagine going back to such an environment.

Another amazing thing about reddit is that we are in direct control of the future of the site. As a team, we make the calls on how the site should be run from a technical, community, and business perspective. There are obviously scenarios where an individual will need to take the responsibility of making a call, but even then those calls are discussed with the entire team first. Of course, being in control of our destiny means that we are also directly and wholly responsible for our failures. To top it off, the freedom which we are given to work on whatever we can help with is extremely refreshing.