TWC:Sysadmin info

From TWC Wiki
Revision as of 18:06, 27 June 2010 by Simetrical (talk | contribs) (Hardware: Servers have names)

I was bored and decided to put together this page of handy info for our active sysadmins, currently GED and me. I (Simetrical) started this page on September 20, 2008, and I cynically speculate that it will get updated approximately never after that, so I've tried to keep things relatively nonspecific as to exact versions and stuff, and to provide verification procedures where possible. But take it all with a grain of salt.

If you don't have at least shell access to the TWC server, this page is almost certainly useless to you. If you at least know what shell access is, you're probably capable of understanding at least some of it. Maybe you'll find it interesting; if so, good for you.

Rewritten on January 24, 2010. (It indeed was updated approximately never.)

Updated and expanded on June 27, 2010. (This is getting to be slightly more often than never.)

Hardware

We currently have one dedicated server, known as thor. We own all of the hardware, bought with hard-earned pennies mostly donated by the membership in a 2009 donation drive. I've provided the command necessary to get info about the component in parentheses, where applicable.

Our bandwidth is $50/Mbps/month at the 95th percentile, and we're paying for 10 Mbps uncapped. In principle, if we use more than 10 Mbps at the 95th percentile, we get charged more than $500 for the month.
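
As a rough illustration of how 95th-percentile billing works, here is a minimal shell sketch. It assumes the common convention of sorting the month's bandwidth samples, throwing away the top 5%, and billing on the highest remaining sample. It also assumes overage is billed at the same $50/Mbps rate as the 10 Mbps commit. Neither detail is confirmed anywhere on this page, and the function names percentile95 and monthly_bill are made up for the example.

```shell
# percentile95: read one Mbps sample per line on stdin, print the 95th percentile.
# Assumption: "discard the top 5% of samples, bill on the highest one left".
percentile95() {
    sort -n | awk '
        { v[NR] = $1 }
        END {
            if (NR == 0) exit 1
            i = int((NR * 95 + 99) / 100)   # ceil(NR * 0.95) without float fuzz
            print v[i]
        }'
}

# monthly_bill P95 [COMMIT_MBPS] [DOLLARS_PER_MBPS]
# Assumption: we pay for the commit even if we use less, and overage costs
# the same per-Mbps rate as the commit.
monthly_bill() {
    awk -v p="$1" -v c="${2:-10}" -v r="${3:-50}" 'BEGIN {
        p += 0; c += 0; r += 0              # force numeric comparison
        billable = (p > c) ? p : c
        printf "%.2f\n", billable * r
    }'
}
```

For example, monthly_bill 12.5 prints 625.00, while anything at or under 10 Mbps stays at 500.00.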

Software

Linux

The "L" in "LAMP". We use Linux because, as Simetrical will tell you, it is both technically and morally superior to Windows in every conceivable way. (Some people who currently have root access might hold different opinions.) In fact, the old server (loki) ran Linux, and it was all Simetrical knew how to administer when he was picking out the new server (odin), so it would have been a fairly pragmatic choice even if he weren't a penguin-hugging open-source hippie. The same logic applied to the new new server (thor): although GrnEyedDvl was around by that point, Simetrical was the one familiar with the existing setup.

To be more precise, we use Ubuntu, mainly because it has a huge and up-to-date package repository, and also because Simetrical happens to be familiar with it from using it at home. Both of the old servers were RHEL5, and we have no regrets about switching to Ubuntu. The output of lsb_release -a is currently:

No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 10.04 LTS
Release:	10.04
Codename:	lucid

We were originally on 8.04 LTS (Hardy), but switched to 8.10 to fix kernel problems. Later we upgraded to 10.04 LTS, although under rather unfortunate circumstances. We kind of had to upgrade because support on the non-LTS version 8.10 expired, but a major goal was also to improve disk performance by switching from ext3 to ext4. Ironically, I/O on the new OS version was slower.

lighttpd

The "A" in "LAMP". Yes, this is fairly weird, alphabetically speaking, but nobody wants to try to pronounce "LLMP". We used to use Apache, but it used way too much memory. We switched in March 2008, saved 1.5G of RAM, got perceptibly faster page load times, and never looked back. Fiddling around with random stuff for fun can pay off sometimes.

lighttpd's config file is at /etc/lighttpd/lighttpd.conf. Documentation is at the lighttpd website (config file docs are the most useful). The binary is at /usr/sbin/lighttpd. lighttpd runs as user www-data. Access and error logs are in /var/log/lighttpd/, automagically rotated by logrotate. A handy web-based server status thing is available in the secret stuff thread in the Tech Cathedral, for those with access there (it gives IP addresses of all current connections, so not fit for public consumption).

To restart lighttpd in our current setup, for instance due to new lighttpd.conf or php.ini, use this:

sudo service lighttpd restart

Actually, replace "lighttpd" with the service name (mysql, sphinxsearch, memcached, . . .) and that's basically how all service restarts work.

Restarting lighttpd usually takes a few seconds, so no perceptible downtime for most users, but it will shut down any active connections, so anyone doing a big upload/download will get an error or bogus file or something. So don't get too trigger-happy. If you're feeling really kind, you could check the server status and see if anyone's downloading anything large, but personally I don't bother. Restarts of lighttpd are fairly rare anyway.
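
Since a restart with a broken lighttpd.conf would leave the site down, one option is to have lighttpd parse the config first and only restart if that succeeds. The sketch below uses lighttpd's -t (test config and exit) and -f (config file) options. The function name and the LIGHTTPD_BIN and RESTART_CMD override variables are hypothetical, added only so the helper can be tried out without touching the real server. Note that this checks lighttpd.conf only, not php.ini.

```shell
# restart_lighttpd_checked [CONF]: restart lighttpd only if its config parses.
# LIGHTTPD_BIN and RESTART_CMD are hypothetical overrides for dry runs.
restart_lighttpd_checked() {
    conf="${1:-/etc/lighttpd/lighttpd.conf}"
    # -t only parses the config and exits; the running server is untouched.
    if "${LIGHTTPD_BIN:-/usr/sbin/lighttpd}" -t -f "$conf" >/dev/null 2>&1; then
        # Intentionally unquoted so the command splits into words.
        ${RESTART_CMD:-sudo service lighttpd restart}
    else
        echo "lighttpd config check failed for $conf; not restarting" >&2
        return 1
    fi
}
```

Running restart_lighttpd_checked with no arguments checks /etc/lighttpd/lighttpd.conf and then does the usual sudo service lighttpd restart.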

lighttpd usually uses a few percent CPU and a few hundred megs of memory, in my experience. It runs on a single thread and doesn't run any scripts itself, so this is pretty reasonable. I wonder what the few hundred megs are for, actually, but thinking back to Apache with mod_php I'm not even going to bother looking into it.

The version is currently:

$ lighttpd -v
lighttpd/1.4.26 (ssl) - a light and fast webserver
Build-Date: Apr  6 2010 11:42:30

We previously compiled the latest version from source to take advantage of better handling of out-of-FastCGI errors, but then we upgraded to Lucid and this became unnecessary.

MySQL

This is the "M" in LAMP. MySQL is nice in some ways: the server process is very resilient and tends not to crash or go into hysterics, and if it does, it automatically restarts. In terms of features, such as the ability to optimize queries or use indexes at all intelligently, it's probably worse than PostgreSQL. But vBulletin only supports MySQL, like lots of other web apps, so that's what we use.

Wherever possible, we use InnoDB (robust, sophisticated, ACID-compliant, high update concurrency) instead of MyISAM (fragile, simple, non-transactional, low update concurrency, default storage engine). "Wherever possible" means "everywhere except totalwar_vb.phrase, which is a single giant row and InnoDB doesn't like that, but that like never changes anyway so who cares". InnoDB provides lots of the lovely features that users of other database engines assume exist, like transactions, granular locking, and non-blocking reads. Unfortunately vBulletin is written with MyISAM in mind and doesn't actually use, for instance, transactions, but it benefits from the high concurrency anyway. (MyISAM has only table-level locks, so updates are serialized with all other queries, which becomes hellish if you have long-running selects.)

MySQL's config is in /etc/mysql (primary config file: /etc/mysql/my.cnf). The error logging goes to /var/log/mysql/error.log. The actual databases are in /var/lib/mysql, which is on its own logical volume in LVM.

To restart MySQL, you just need to do sudo service mysql restart. This is only necessary on config file changes, and even then it's usually not necessary, since you can change most settings live. Restarting MySQL takes several minutes, during which all users will get database errors and no pages requiring the database will load. InnoDB must write all changes from the transaction log to the actual data and index files before it shuts down, and this can take quite a while. Moreover, the site can be sluggish for as much as a couple of hours after a MySQL restart, because its caches will be cleared and have to repopulate.

MySQL, being a database, loves RAM, loves it very, very much. The InnoDB buffer pool (where InnoDB caches data and index pages, being a good DB engine and not trusting the fickle OS for such a sensitive task) is 5G (grep innodb_buffer_pool_size /etc/mysql/my.cnf). Miscellaneous other stuff means that MySQL usually uses around 5.5G or 6G. It will usually use a bit of CPU too, although not much, and of course lots and lots of disk space.
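
If you want the configured buffer pool size as a plain number (say, to compare against total RAM), the grep above can be extended a little. This is a minimal sketch: the function name buffer_pool_bytes is made up, and it assumes the setting is written with underscores and an optional K, M, or G suffix, which is how my.cnf values are normally given. It reads only the file, so it won't notice a value set some other way.

```shell
# buffer_pool_bytes [FILE]: print innodb_buffer_pool_size from a my.cnf-style
# file, converted to bytes. Exits with status 1 if the setting is absent.
buffer_pool_bytes() {
    awk -F= '
        /^[ \t]*innodb_buffer_pool_size[ \t]*=/ {
            v = $2
            gsub(/[ \t]/, "", v)                  # strip whitespace around the value
            n = v + 0                             # numeric prefix, e.g. 5 from "5G"
            u = toupper(substr(v, length(v)))     # last character is the unit, if any
            if (u == "K") n *= 1024
            else if (u == "M") n *= 1024 * 1024
            else if (u == "G") n *= 1024 * 1024 * 1024
            found = 1                             # if repeated, the last one wins
        }
        END {
            if (!found) exit 1
            printf "%.0f\n", n
        }
    ' "${1:-/etc/mysql/my.cnf}"
}
```

With the 5G setting described above, this would print 5368709120.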

We use the packaged version of MySQL, to wit:

$ mysql --version
mysql  Ver 14.14 Distrib 5.1.41, for debian-linux-gnu (x86_64) using readline 6.1

FastCGI

This is the "P" in LAMP. Unlike with lighttpd above, this isn't due to creative spelling. "P" stands for "PHP", and we run PHP using FastCGI. FastCGI is basically a bunch of daemons that hang around, which lighttpd asks to execute PHP scripts for it. In Apache this is typically done with mod_php instead, so Apache executes the scripts itself, but this is a terrible idea for a general-purpose web server.

I don't actually have much of any idea how FastCGI works. I just configured lighttpd to use FastCGI and it handles spawning the processes itself. To restart FastCGI, I just use the command above to restart lighttpd. I mentally categorize FastCGI with lighttpd as "the web server". So this is mostly voodoo magic to me.

The processes are /usr/bin/php-cgi. They run as the www-data user right now, since lighttpd spawns them after it setuid()s. PHP is configured in /etc/php5/cgi/php.ini. It logs warnings and errors to the syslog, /var/log/syslog. I don't know how to configure FastCGI itself; I just set all the relevant stuff (like number of processes) in lighttpd's config file. Some info on APC status (which caches PHP files) is in the secret stuff thread in the Tech Cathedral, since it allows viewing and even changing the values of the cached stuff.

FastCGI uses a crazy lot of CPU and memory. Currently it uses maybe 8G of memory and more than half our CPU at peak. There's some low-hanging fruit to be had in the CPU department, which I hope to pursue when feasible.

We use the packaged PHP version:

$ php --version
PHP 5.3.2-1ubuntu4.2 with Suhosin-Patch (cli) (built: May 13 2010 20:03:45) 
Copyright (c) 1997-2009 The PHP Group
Zend Engine v2.3.0, Copyright (c) 1998-2010 Zend Technologies

memcached

We started using memcached a little while ago, instead of XCache/APC cache. Actually we still use APC cache for one thing because we're lazy. Not much to say about it.

git

We have some git repositories lying around in various places: /etc, /usr/local/sbin, /var/www, /var/www/forums, and /var/www/fpss, for instance. The first two are owned by root; the next two are owned by www-data. The first contains site config info like the password file, so is secret; the last two contain copyrighted code, so are also secret. /usr/local/sbin and /var/www can be viewed via gitweb, so you can track all the exciting changes I make. We also have a repository in /usr/local/src/vbulletin that I move new versions into, so that the /var/www/forums git repo can use it as a reference point for the dark art of git rebase. And there are some repos in /home/aryeh . . . and maybe others lurking elsewhere. One never knows.

git is version control software. It records every change and when it was made, so 1) we can figure out who made each change (hint: me), 2) we can figure out when and why a change was made (my memory isn't perfect, and I might be hit by a bus), and 3) it's easy to undo changes if they prove problematic (this happens a lot). Every time some files are changed, I try to remember to record a commit at least briefly describing the changes I made. Everyone else with shell access should ideally do this too.

Some basic git commands are:

git log
This shows you a list of all commits, nicely paginated, with the most recent ones first. You can do git log -p to get the exact changes that each commit made (i.e., the diffs).
git diff
Lists what changes have been made, but not yet recorded in git. Normally this should be empty, since people are committing everything to git when they make a change, right? (Okay, kind of tricky if you only have FTP access, so I wind up committing that stuff.) This will ignore any newly-created files: those have to be explicitly added with git add and committed before later changes to them will show up.
git commit
Creates a new commit. You can do git commit -a to commit all the changes listed by git diff, or you can list exactly which files you want to commit, like git commit file1 file2 file3. (Actually you can commit parts of files too, using git add -i, but let's not go there.) Note that you have to run this command as the repository owner, so prefix it with "sudo" if the repo is owned by root, or "sudo -u www-data" if it's owned by www-data.
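
The "run it as the repository owner" rule can be wrapped up in a small helper. This is only a sketch: git_prefix_for and repo_commit are hypothetical names, not scripts that exist in /usr/local/sbin, and repo_commit assumes GNU stat (for stat -c %U) and that you're allowed to sudo to the owner.

```shell
# git_prefix_for OWNER [CURRENT_USER]: print the prefix needed to run git as OWNER.
# CURRENT_USER defaults to whoever is running the shell.
git_prefix_for() {
    owner="$1"
    me="${2:-$(id -un)}"
    if [ "$owner" = "$me" ]; then
        printf '\n'                       # already the owner, no prefix needed
    elif [ "$owner" = "root" ]; then
        printf 'sudo\n'
    else
        printf 'sudo -u %s\n' "$owner"
    fi
}

# repo_commit DIR MESSAGE: commit all changes to tracked files in DIR as DIR's owner.
repo_commit() {
    dir="$1"
    msg="$2"
    prefix=$(git_prefix_for "$(stat -c %U "$dir")")
    # $prefix is intentionally unquoted so "sudo -u www-data" splits into words.
    (cd "$dir" && $prefix git commit -a -m "$msg")
}
```

For instance, repo_commit /var/www/forums "Fix typo in template hack" would end up running git commit -a as www-data, given the ownership described above.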

vBulletin

The one piece of software we use that's not free and open-source. Website is vbulletin.com. Not too much needs to be said here, because it's much easier to use than the rest of our software, with a web interface and everything. The interesting point is that our vBulletin copy has lots of extra files added, and lots of existing files hacked. You can see for yourself by doing git log in /var/www/forums. There are dozens of changes going back to 2008. When we upgrade, git will semi-magically transfer all of the changes to the new versions, if coaxed by suitable incantations. /usr/local/sbin/vbupgrade does this, but if there's a conflict, you'll have to know what you're doing to resolve it. (Or, alternatively, just get me to do the upgrade.) Minor version changes shouldn't cause conflicts, so any root should be able to do those by following the procedure, although you'll still have to fix template conflicts.

Other stuff

Most of the rest of our software doesn't have much to discuss. We have no software installed from source right now (not counting web apps like vB); it's all packaged. I wrote some scripts in /usr/local/sbin:

$ ls /usr/local/sbin
7zdl           bwsumm-dl      dlswitch    profile-proc  vbupgrade
bwsumm         bwsumm-totals  groupedsum  reindex.sh
bwsumm-attach  dbbackup.sh    lockupmon   rotate.sh

Actually, a few of those are originally based on scripts I got from elsewhere, namely reindex.sh, rotate.sh, and dbbackup.sh.