Paul Chilton
Rubyrep with PostgreSQL DB syncing
I’m using Rubyrep to sync databases between a pair of PostgreSQL instances for this blog. Thought I would share the process that I used with you.
Install Rubyrep
sudo gem install rubyrep
Configuration
Use the following of scripts/rubyrep.conf with:
RR::Initializer::run do |config|
config.left = {
:adapter => 'postgresql',
:database => 'example_development',
:username => 'example',
:password => 'example',
:host => '127.0.0.1',
:port => 5432,
}
config.right = {
:adapter => 'postgresql',
:database => 'example_production',
:username => 'example',
:password => 'example',
:host => '127.0.0.1',
:port => 5555,
}
config.options[:auto_key_limit] = 2
config.options[:rep_prefix] = 'rr_example'
config.include_tables 'schema_migrations', :key => 'version'
config.include_tables /./, :sync_conflict_handling => lambda { |sync_helper,conflict_type,rows|
time_left = rows[0]['updated_at'].to_i
time_right = rows[1]['updated_at'].to_i
if time_left > 0 and time_right > 0 and time_left != time_right
if time_left > time_right
sync_helper.update_record(:right, sync_helper.right_table(), rows[0])
else
sync_helper.update_record(:left, sync_helper.left_table(), rows[1])
end
end
}
end
This gives a local left connection, and a remote right connection when configured with SSH tunneling. Conflict resolution is based on the updated_at column of the database.
SSH Tunneling
I already have SSH keys setup on the remote host. Using the following to create a connection locally on port 5555 that connects to the remote PostgreSQL port of 5432
ssh -f -N -T -C -L 5555:localhost:5432 example@example.com
Scan and sync
I do the scan and sync manually, rather than continual replication.
rubyrep scan -c scripts/rubyrep.conf
rubyrep sync -c scripts/rubyrep.conf
Fun things with replication
It is possible to have a master-master replication setup where databases are not always connected. Requires:
- Application aware that databases are synced intermittently.
- Unique ID columns. I’ve used GUID values. Even/odd sequences may work.
- Use of updated_at column, for resolving conflicts.
- Do not delete rows. These will sync back into existence.
There are some significant constraints on what can be done, but it does enable some unique applications.