I have a fairly good feel for what MySQL replication can do. I’m wondering what other databases support replication, and how they compare to MySQL and to each other.
Some questions I have:
- Is replication built in, or an add-on/plugin?
- How does the replication work (high-level)? MySQL provides statement-based replication (and row-based replication in 5.1). I’m interested in how other databases compare. What gets shipped over the wire? How do changes get applied to the replicas?
- Is it easy to check consistency between master and slaves?
- How easy is it to get a failed replica back in sync with the master?
- Performance? One thing I hate about MySQL replication is that it’s single-threaded, and replicas often have trouble keeping up, since the master can be running many updates in parallel, but the replicas have to run them serially. Are there any gotchas like this in other databases?
- Any other interesting features…
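To make the single-threaded concern concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch under simplifying assumptions: every update costs the same, the master spreads a batch evenly over its threads, and the replica applies the same batch one update at a time. The function name and parameters are made up for illustration.

```python
def replica_lag_ms(updates, ms_per_update, master_threads):
    """Estimate how far a serial replica falls behind after one batch.

    Assumes (hypothetically) that every update takes the same time and that
    the master runs them in parallel across `master_threads` threads.
    """
    # The master needs ceil(updates / master_threads) rounds of parallel work.
    rounds = -(-updates // master_threads)
    master_ms = rounds * ms_per_update
    # The replica applies every update one after another.
    replica_ms = updates * ms_per_update
    return replica_ms - master_ms


if __name__ == "__main__":
    # 1000 updates of 10 ms each, 8 parallel threads on the master.
    print(replica_lag_ms(updates=1000, ms_per_update=10, master_threads=8))
```

With a single thread on the master the model gives zero lag; the gap only opens up once the master is doing work in parallel that the replica has to serialize.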
MySQL’s replication is weak inasmuch as one needs to sacrifice other functionality to get full master/master support (due to the restriction on supported backends).
PostgreSQL’s replication is weak inasmuch as only master/standby is supported built-in (using log shipping); more powerful solutions (such as Slony or Londiste) require add-on functionality. What gets shipped over the wire is archive log segments: the same records used to bring a standalone database back to a working, consistent state when it starts up after an unclean shutdown. This is what I’m using presently. None of these approaches is fully synchronous. More complete support will be built in as of PostgreSQL 8.5.

Log shipping does not allow databases to come out of synchronization, so there is no need for processes to test the synchronized status. Bringing a slave back into sync with the master involves setting the backup flag on the master, rsyncing to the slave (with the database still running; this is safe), and then unsetting the backup flag and restarting the slave process, with the archive logs generated during the backup made available to it. My shop has this process (like setup and all other administration tasks) fully automated.

Performance is a nonissue, since the master has to replay the log segments internally anyhow in addition to doing other work; thus, the slaves will always be under less load than the master.
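The resynchronization steps described above could be scripted roughly like this. This is a minimal sketch, not my shop's actual tooling: the host names, data directory, and backup label are placeholders, and the function only builds the commands (meant to be run on the master) rather than executing them. `pg_start_backup` and `pg_stop_backup` are the SQL functions that set and clear the backup flag in PostgreSQL 8.x; restarting the slave process is site-specific and left out.

```python
import shlex


def resync_commands(master_host, standby_host, data_dir, label="resync"):
    """Return, in order, the shell commands for re-seeding a standby.

    All arguments are hypothetical placeholders; nothing is executed here.
    """
    src = data_dir.rstrip("/") + "/"
    start_sql = f"SELECT pg_start_backup('{label}');"
    stop_sql = "SELECT pg_stop_backup();"
    return [
        # 1. Set the backup flag on the master.
        f"psql -h {master_host} -c {shlex.quote(start_sql)}",
        # 2. Copy the data directory while the master keeps running.
        f"rsync -a --delete {src} {standby_host}:{src}",
        # 3. Unset the backup flag; the archive logs written in between
        #    are what the standby replays to reach a consistent state.
        f"psql -h {master_host} -c {shlex.quote(stop_sql)}",
    ]


if __name__ == "__main__":
    for cmd in resync_commands("db-master", "db-standby", "/var/lib/pgsql/data"):
        print(cmd)
```

Returning the commands instead of running them makes it easy to review the sequence before wiring it into whatever automation is already in place.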
Oracle’s RAC (which isn’t properly replication, as there’s only one storage backend — but you have multiple frontends sharing the load, and can build redundancy into that shared storage backend itself, so it’s worthy of mention here) is a multi-master approach far more comprehensive than other solutions, but is extremely expensive. Database contents aren’t “shipped over the wire”; instead, they’re stored to the shared backend, which all the systems involved can access. Because there is only one backend, the systems cannot come out of sync.
Continuent offers a third-party solution which does fully synchronous statement-level replication with support for all three of the above databases; however, the commercially supported version of their product isn’t particularly cheap (though vastly less expensive). Last time I administered it, Continuent’s solution required manual intervention for bringing a cluster back into sync.