Relational databases: please no!

OK. It is completely obvious that any MMO is going to need a way to store data. I understand that the instinctive reaction is to use a relational database, because that’s what relational databases are for. However, I beg of you as the guy who needs to keep the things running fast and smooth, think twice.

Yes, you get the ease of writing code against a mature system. You also get slower response times and more fiddly parts. If you haven’t hired a really good DBA to work on schema design, you are going to find out all about the ways in which relational databases can be slow under load.

Relational databases are really good at transactional integrity. It’s probably worth thinking about whether or not that’s a key feature. Most games don’t implement it very well right now. If the game crashes five seconds after you kill a monster, do you really feel confident that the loot will be in your bags when you log back in? I don’t.

Besides, you can get transactional integrity out of non-relational databases too. I’m going to cite a bunch of systems with weaker integrity later on, but they’re weaker because they’re big distributed systems spanning multiple datacenters. Games typically do not have that problem.

Relational databases do not scale cheaply. They can scale well, particularly if you bite the bullet and pay for Microsoft SQL or Oracle or something. They don’t scale cheaply.

It’s abominably easy to write bad relational database code. The problem here is that the failure state is not obvious. Bad SQL code reveals problems under load when there’s a lock on one table because transaction A is in process, which blocks transaction B, which happens to be the transaction responsible for showing your player what she’s got in her mailbox. She now waits 30 seconds for the mailbox to load and bitches on the forums. These load problems happen to be one of the things that’s hard to test programatically. Experienced SQL programmers won’t make those mistakes, but inexperienced SQL programmers may not realize how easy they are to make.

Relational databases are not maximally fast. They can’t be, because one of the big concerns is the above-mentioned transactional integrity, and that takes time. If you’ve got some dataset that’s primarily read-only and isn’t too horrendously large, use an in-memory data store. For the tr

It is significant that the big players in the Web space all use non-relational databases heavily. I suspect this is in part the legacy of search — back at AltaVista, we’d have been somewhat horrified at the idea of using Oracle to serve up search queries. The smart people who did the original work at AltaVista, Google, and Yahoo wrote their own data stores, optimized for returning search results quickly.

This attitude stuck. I know a possible apocryphal story about Yahoo, which claims that user data was just stored in a standard filesystem. The code was supposedly hacked such that niceties like file update times weren’t stored anywhere. Speed was all. Whether or not that’s true, Yahoo’s still using speed-oriented database code (warning: PDF). So is Google. Amazon does the same sort of thing.

Now, OK, we’re not gonna write our own serious database systems. CouchDB and HBase probably aren’t there yet. I still really wish more people would ask if they need a relational DB. If you really do need one, make sure you’ve hired someone who’s worked with large systems before and remember that 95% of Web work is smaller than anything you’ll be doing.

Related Posts

8 thoughts on “Databases

  1. Interesting!

    If I were at an MMO startup, I’d probably want a relational database for some things. Account info, for one, and payment information. This is what they’re for.

    And I’d probably want some technology to dump from the custom hot-running dbs periodically to a reporting relational database. This is what could power web-based data. And also what you could run ad-hoc data mining queries on for R&D purposes.

    For rapid read-only data, like monster info – really, I’d be tempted to just run it all through memcached or something, flush the cash on server patches. And at that point, eh, you can store it in whatever makes it easier for your tools, since it’s getting served from the cache in active use and that’s _fast_.

    And then that just leaves character data, including inventory, as the lightning fast heavy read write element. And that you optimize like hell. Yeah, wouldn’t use a heavy relational database. I _would_ like to be able to dump it to one, and restore from it, because losing this stuff loses customers.

    Hmm, I’ve never worked on an MMO, but it’s fun to think about the challenge! Actually, this is a good interview question, how would you set up the database architecture for an MMO – candidates already know the space, and you can see how they think, which is what I want to know in an interview. 🙂

  2. Kirby makes a fair point about wanting some of your data in relational DBs, but you’re going to want to seperate those systems from your operational system anyway. There’s very little reason to keep your operational data in the same DB as your customer management data, even if the two are compatible.

    There’s a lot more R/W information then character data – basically anything that moves or can be interacted with. To follow your example, monster info isn’t static – the monster moves, it’s condition (HP, etc) changes, and its behavior changes based on that condition.

    Very interesting topic. I never bothered to think about the data model.

    1. Yep. The whole data collection and analysis task is a really important one that can and should be separated operationally from the critical path of the game. If you follow the sidebar link to Tiny Subversions, you’ll get a guy who does MMO data collection for a living. Really neat topic.

      Memcached is a good idea. I should have thought of that. It’s a cheap effective way to get the in-memory data store you want. I know of one MMO which stored everything including world data in SQL, if I’m remembering correctly, specifically for ease of updates. I was always a bit leery of the idea because of the speed issues and such. Memcached would work really well there, unless of course you wanted a user-modifiable world.

  3. This was a very enlightening post. Thanks for the advice, Bryant.

    Would it make sense for an asynchronous MMO-style game (like a browser game) to rely on something other than a relational database? Regardless of its purpose, I suppose that a browser game would be more like a website than an MMO in terms of database structure, but that might depend on the way the game works.

    1. Well, see the comments in the Elder Game trackback link — there are some good notes there on the perils of database access for a synchronous game. Some of the stuff I’ve said still makes sense for browser-based games, although you’re certainly not dealing with the need to be realtime. On the other hand, you can still shoot yourself in the foot with bad DB design. It’s just harder to do so.

  4. Meridian 59 stores everything in memory. The server technology is old, but it supported up to 200 simultaneous on a Pentium Pro 200 MHz with 253 MB of RAM. Not all that scalable, but it worked, it was super-reliable, and the system didn’t suffer from massive lag.

    The downside is that saves are written out to a file, and there was a long save cycle in the game when that happens. That could probably be avoided by duplicating the memory and having a separate thread write out the data. The other big problem is that a crash meant that the game state has to be rolled back to the last save file. But, as you point out, this isn’t exactly unique to M59.

    Especially if you are talking about the indie scale of things, a database can cause a lot of headaches. DBA isn’t something most people pick up as a hobby. 😉

  5. Not entirely. I’d say it beats knitting but I’m not sure it does. 😉

    Eric Heimburg makes a case for using a database to avoid extra work on the indie scale; for his purpose, yeah, Smart Fox may work out well. I would want to talk to someone who has had problems with their database layer and find out how bad it was. I hate reference customers who have nothing but good things to say — tell me about how tough it is to fix problems.

Leave a Reply to Bryant Cancel reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.