
TWC:Sysadmin info

From TWC Wiki
Revision as of 17:01, 24 January 2010 by Simetrical (talk | contribs) (Update to reflect events of last decade)

I was bored and decided to put together this page of handy info for our active sysadmins, currently GED and me. I (Simetrical) started this page on September 20, 2008, and I cynically speculate that it will get updated approximately never after that, so I've tried to keep things relatively nonspecific as to exact versions and stuff, and to provide verification procedures where possible. But take it all with a grain of salt.

If you don't have at least shell access to the TWC server, this page is almost certainly useless to you. If you at least know what shell access is, you're probably capable of understanding at least some of it. Maybe you'll find it interesting; good for you.

Rewritten on January 24, 2010. (It indeed was updated approximately never.)

Hardware

We currently have one dedicated server. We own all of the hardware, bought with hard-earned pennies mostly donated by the membership in a 2009 donation drive. I've provided the command necessary to get info about the component in parentheses, where applicable.

Our bandwidth is $50/Mbps/month at the 95th percentile, and we're paying for 10 Mbps uncapped. In principle, if we use more than 10 Mbps at the 95th percentile, we get charged more than $500 for the month.
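For anyone unsure how 95th-percentile billing works, here is a rough sketch of the arithmetic. It is not our provider's actual method: I'm assuming they take regular throughput samples, sort them, throw away the top 5%, and bill the highest remaining sample at the same $50/Mbps rate, with the 10 Mbps commitment as a floor. The function names p95 and bill are made up for illustration.

```shell
#!/bin/sh
# p95: read one Mbps sample per line on stdin and print the 95th-percentile
# sample. Method (an assumption; providers differ on rounding): sort
# ascending and take the sample at 1-based index ceil(0.95 * n).
p95() {
    sort -n | awk '
        { v[NR] = $1 }
        END {
            if (NR == 0) exit 1
            idx = int((NR * 95 + 99) / 100)   # integer ceiling of 0.95 * NR
            print v[idx]
        }'
}

# bill: given the 95th-percentile rate in Mbps, print the monthly charge in
# dollars. Assumes a 10 Mbps commitment (the floor) and $50 per Mbps for the
# whole billed rate, including any overage.
bill() {
    awk -v p="$1" 'BEGIN { r = (p > 10) ? p : 10; printf "%.2f\n", r * 50 }'
}

# Usage, with a hypothetical file of samples, one per line:
#   bill "$(p95 < samples.txt)"
```

Under these assumptions, a month whose 95th-percentile sample is 12 Mbps would come to $600, and short spikes that fall in the top 5% of samples cost nothing extra.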

Software

Linux

The "L" in "LAMP". We use Linux because, as Simetrical will tell you, it is both technically and morally superior to Windows in every conceivable way. (Some people who currently have root access might hold different opinions.) In fact, the old server (loki) ran Linux, and it was all Simetrical knew how to administer when he was picking out the new server (odin), so it was a fairly pragmatic choice even if he wasn't a penguin-hugging open-source hippie. The same logic went for the new new server (thor), since although GrnEyedDvl was around by that point, Simetrical was the one familiar with the existing setup.

To be more precise, we use Ubuntu, mainly because it has a huge and up-to-date package repository, and also Simetrical happens to be familiar with it because he uses it at home. Both of the old servers were RHEL5, and we have no regrets about switching to Ubuntu. The output of lsb_release -a is currently

No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 8.10
Release:	8.10
Codename:	intrepid

We were originally on 8.04 LTS (Hardy), but switched to 8.10 to fix kernel problems. We might upgrade to a later version to get ext4 and improve disk performance.

lighttpd

The "A" in "LAMP". Yes, this is fairly weird, alphabetically speaking, but nobody wants to try to pronounce "LLMP". We used to use Apache, but it used way too much memory. We switched in March 2008, saved 1.5G of RAM, got perceptibly faster page load times, and never looked back. Fiddling around with random stuff for fun can pay off sometimes.

lighttpd's config file is at /etc/lighttpd/lighttpd.conf. Documentation is at the lighttpd website (config file docs are the most useful). The binary is at /usr/sbin/lighttpd. lighttpd runs as user lighttpd. Access and error logs are in /var/log/lighttpd/, automagically rotated by logrotate. A handy web-based server status thing is available in the secret stuff thread in the Tech Cathedral, for those with access there (it gives IP addresses of all current connections, so not fit for public consumption).

To restart lighttpd in our current setup, for instance due to new lighttpd.conf or php.ini, I use this:

sudo killall php-cgi && sudo /etc/init.d/lighttpd restart

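This 1) kills the FastCGI processes (not sure if this is necessary), and 2) restarts lighttpd using the system restart scripts. Note that because of the &&, step 2 only runs if killall actually found a php-cgi process to kill; if none are running, killall exits non-zero and the restart is skipped.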

Restarting lighttpd usually takes a few seconds, so no perceptible downtime for most users, but it will shut down any active connections, so anyone doing a big upload/download will get an error or bogus file or something. So don't get too trigger-happy. If you're feeling really kind, you could check the server status and see if anyone's downloading anything large, but personally I don't bother. Restarts of lighttpd are fairly rare anyway.
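Since a restart with a broken lighttpd.conf would mean real downtime, it may be worth checking the config first. The following is a sketch, not what I actually run: it uses lighttpd's -t flag (test the config file and exit) before the usual restart. The function name restart_web and the LIGHTTPD_BIN and LIGHTTPD_CONF override variables are made up for illustration; the default paths are the ones given above.

```shell
#!/bin/sh
# restart_web: restart lighttpd only if its config file parses cleanly.
# LIGHTTPD_BIN and LIGHTTPD_CONF are hypothetical overrides, handy for testing.
restart_web() {
    conf="${LIGHTTPD_CONF:-/etc/lighttpd/lighttpd.conf}"
    bin="${LIGHTTPD_BIN:-/usr/sbin/lighttpd}"

    # -t checks the config syntax and exits without starting a server.
    if ! "$bin" -t -f "$conf"; then
        echo "config check failed, not restarting" >&2
        return 1
    fi

    # Kill the FastCGI processes, but carry on even if none were running.
    sudo killall php-cgi
    sudo /etc/init.d/lighttpd restart
}

# Usage:
#   restart_web
```

This only catches syntax errors in lighttpd.conf. It says nothing about php.ini, and a config that parses can still be wrong.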

lighttpd usually uses a few percent CPU and a few hundred megs of memory, in my experience. It runs on a single thread and doesn't run any scripts itself, so this is pretty reasonable. I wonder what the few hundred megs are for, actually, but thinking back to Apache with mod_php I'm not even going to bother looking into it.

The version is currently:

$ lighttpd -v
lighttpd/1.4.24 - a light and fast webserver
Build-Date: Dec 23 2009 19:15:43

lighttpd is one of the few things where we've compiled from source and aren't using distro packages. We upgraded to the latest version to take advantage of better handling of out-of-FastCGI errors.

MySQL

This is the "M" in LAMP. MySQL is nice in some ways: the server process is very resilient and tends not to crash or go into hysterics, and if it does, it automatically restarts. In terms of features, such as the ability to optimize queries or use indexes in a halfway-sensible fashion, it's probably worse than PostgreSQL. But vBulletin only supports MySQL, like lots of other web apps, so that's what we use.

Wherever possible, we use InnoDB (robust, sophisticated, ACID-compliant, high update concurrency) instead of MyISAM (fragile, simple, non-transactional, low update concurrency, default storage engine). "Wherever possible" means "everywhere except totalwar_vb.phrase, which is a single giant row and InnoDB doesn't like that, but that like never changes anyway so who cares". InnoDB provides lots of the lovely features that users of other database engines assume exist, like transactions, granular locking, and non-blocking reads. Unfortunately vBulletin is written with MyISAM in mind and doesn't actually use, for instance, transactions, but it benefits from the high concurrency anyway. (MyISAM has only table-level locks, so updates are serialized with all other queries, which becomes hellish if you have long-running selects.)
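To verify which tables are not on InnoDB, something like the following should work. It is a sketch: the function name myisam_tables_sql is made up, and it assumes your MySQL credentials are already set up (say, in ~/.my.cnf) so that a bare mysql connects. The information_schema.tables view it queries is available in MySQL 5.0 and later.

```shell
#!/bin/sh
# myisam_tables_sql: print a query that lists every table whose storage
# engine is not InnoDB, skipping MySQL's own system schemas.
myisam_tables_sql() {
    cat <<'EOF'
SELECT table_schema, table_name, engine
FROM information_schema.tables
WHERE engine <> 'InnoDB'
  AND table_schema NOT IN ('mysql', 'information_schema');
EOF
}

# Usage (assumes credentials are configured for the mysql client):
#   myisam_tables_sql | mysql
```

If the setup is as described above, the only forum table this should turn up is totalwar_vb.phrase.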

MySQL's config is in /etc/mysql (primary config file: /etc/mysql/my.cnf). The logging goes to syslog, in /var/log/syslog. The actual databases are in /var/lib/mysql, which is on its own logical volume in LVM.

To restart MySQL, you just need to do /etc/init.d/mysql restart. This is only necessary on config file changes, and even then it's usually not necessary, since you can change most settings live. Restarting MySQL takes several minutes, during which all users will get database errors and no pages requiring the database will load. InnoDB must write all changes from the transaction log to the actual data and index files before it shuts down, and this can take quite a while. Moreover, the site can be sluggish for as much as a couple of hours after a MySQL restart, because its caches will be cleared and have to repopulate.
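Changing a setting live looks roughly like this. It is a sketch under assumptions: the function name set_live_sql is made up, max_connections and the value 500 are just examples and not our actual setting, and it assumes the mysql client connects with enough privileges to run SET GLOBAL.

```shell
#!/bin/sh
# set_live_sql NAME VALUE: print SQL that changes a global variable on the
# running server and then reads it back to confirm.
set_live_sql() {
    printf "SET GLOBAL %s = %s;\nSHOW GLOBAL VARIABLES LIKE '%s';\n" "$1" "$2" "$1"
}

# Usage (example variable and value only):
#   set_live_sql max_connections 500 | mysql
```

A live change is lost on the next restart, so put the same value in /etc/mysql/my.cnf as well. Not every variable is dynamic: in MySQL 5.0, innodb_buffer_pool_size for one can only be changed in the config file followed by a restart.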

MySQL, being a database, loves RAM, loves it very, very much. The InnoDB buffer pool (where InnoDB caches data and index pages, being a good DB engine and not trusting the fickle OS for such a sensitive task) is 5G. Miscellaneous other stuff means that MySQL usually uses around 5.5G or 6G. It will usually use a bit of CPU too, although not much, and of course lots and lots of disk space.

We use the packaged version of MySQL, to wit:

$ mysql --version
mysql  Ver 14.12 Distrib 5.0.67, for debian-linux-gnu (x86_64) using readline 5.2

FastCGI

This is the "P" in LAMP. Unlike with lighttpd above, this isn't due to creative spelling. "P" stands for "PHP", and we run PHP using FastCGI. FastCGI is basically a bunch of daemons that hang around, which lighttpd asks to execute PHP scripts for it. In Apache this is typically done with mod_php instead, so Apache executes the scripts itself, but this is a terrible idea for a general-purpose web server.

I don't actually have much of any idea how FastCGI works. I just configured lighttpd to use FastCGI and it handles spawning the processes itself. To restart FastCGI, I just use the command above to restart lighttpd. I mentally categorize FastCGI with lighttpd as "the web server". The processes don't seem to die when lighttpd does, but I'm not sure whether lighttpd reuses pre-existing ones when it's restarted or spawns new ones or what. I haven't tested. (I think it spawns new ones.) So this is mostly voodoo magic to me.

The processes are /usr/bin/php-cgi. They run as the lighttpd user right now, since lighttpd spawns them after it setuid()s. PHP is configured in /etc/php5/cgi/php.ini. It logs warnings and errors to the syslog, /var/log/syslog. I don't know how to configure FastCGI itself; I just set all the relevant stuff (like number of processes) in lighttpd's config file. Some info on the status of XCache (which caches compiled PHP code and PHP variables between requests) is in the secret stuff thread in the Tech Cathedral, since it allows viewing and even deleting or changing the values of the cached variables.

FastCGI uses a crazy lot of CPU and memory. Currently it uses maybe 8G of memory and more than half our CPU at peak. There's some low-hanging fruit to be had in the CPU department, which I hope to pursue when feasible.
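To check the memory figure yourself, you can add up the resident set size of all the php-cgi processes. This is a sketch: the function name sum_rss_mb is made up, and it relies on the Linux ps options -C (select by command name) and -o rss= (print RSS in KiB with no header).

```shell
#!/bin/sh
# sum_rss_mb: read RSS values in KiB (one per line) on stdin and print the
# number of processes and their combined RSS in MiB.
sum_rss_mb() {
    awk '{ kb += $1; n++ } END { printf "%d processes, %.1f MiB\n", n, kb / 1024 }'
}

# Usage:
#   ps -C php-cgi -o rss= | sum_rss_mb
```

Treat the total as an upper bound. RSS counts shared pages once per process, so anything the php-cgi processes share gets counted many times over.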

We use the packaged PHP version:

$ php --version
PHP 5.2.6-2ubuntu4.5 with Suhosin-Patch 0.9.6.2 (cli) (built: Nov 26 2009 14:16:15) 
Copyright (c) 1997-2008 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2008 Zend Technologies
    with XCache v1.2.2, Copyright (c) 2005-2007, by mOo

git

I should write about git here. I use it to store our configuration files, my custom administration scripts, and the source code to the forums, among other things. I'm not going to write a git tutorial here, but it could be helpful to have some basic commands.
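As a start on the basic commands, here is a sketch of the usual cycle for recording a change to a tracked file. The function name record_change is made up, and you need to be inside whichever repository holds the file; its location isn't given here, so the path in the usage line is a placeholder.

```shell
#!/bin/sh
# record_change FILE MESSAGE: show what is modified, show the diff for FILE,
# then stage and commit just that file. Stops at the first step that fails.
record_change() {
    git status --short &&
    git diff -- "$1" &&
    git add -- "$1" &&
    git commit -m "$2"
}

# Usage (hypothetical repository path and commit message):
#   cd /path/to/repo && record_change lighttpd.conf "Raise FastCGI process count"
#
# Other handy commands:
#   git log --oneline -- FILE    # history of one file
#   git checkout -- FILE         # throw away uncommitted edits to FILE
```

Committing one file at a time with a meaningful message makes it much easier to work out later which change broke something.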

vBulletin

The one piece of software we use that's not free and open-source. Website is vbulletin.com. Not too much needs to be said here, because it's much easier to use than the rest of our software, with a web interface and everything. Some notes on our particular setup (particularly how we store mounds of hacks in git) would be worth writing up at some point.

Other stuff

Most of the rest of our software doesn't have much to discuss. It's worth noting what we've installed that's not part of distro packages. ls /usr/local/bin /usr/local/sbin yields:

/usr/local/bin:
git  git-cvsserver  gitk  git-receive-pack  git-shell  git-upload-archive  git-upload-pack  indexer  search  searchd  spelldump
/usr/local/sbin:
7zdl    bwsumm-attach  bwsumm-totals  dlswitch    lighttpd        lockupmon     reindex.sh  vbupgrade
bwsumm  bwsumm-dl      dbbackup.sh    groupedsum  lighttpd-angel  profile-proc  rotate.sh

Most of the scripts in the latter directory are written by me. The upshot is that there are only three pieces of software where we aren't using the packaged version:

git
Due to a bug that was throwing weird error messages during rebases. We could switch back to the packaged version if we upgrade to the next OS version, or maybe we could do it right now. I don't remember what versions had the error.
lighttpd
Upgraded to the latest version so that the site would die a bit less under heavy I/O, as noted above.
Sphinx
This is indexer, search, searchd. We don't use the packaged version because there is none. There should be a Sphinx package in Ubuntu 10.04, but we're well behind that.