Mysql replication delay

September 29, 2011

For a project that I participate we wanted to evaluate the use of MySQL replication for horizontal scaling. The application wasn’t designed with scaling in mind, and make a heavy usage of the database (in a number of not so efficient ways).  Replication seemed like a good solution since there would be only a master server (actually in the setup there is a second server waiting (using heartbeat) in case the first one fails, with drbd to keep them in sync) where all the writes would go, and a number of slaves for the read access. The question that eventually came up was how long whould the replicatin take?

In order to give an answer a custom benchmark script was created that inserted rows on the master and then waited for them to appear on one of the slaves, measuring the time it took, and also the number of retries. The table that was choosen was the biggest table that was in use by the application, since I wanted to do a worst case scenario.
The script used is the following:

#!/usr/bin/perl -w
use DBI;
use DBD::mysql;
use Data::Dumper;
use Time::HiRes qw( gettimeofday );
use Parallel::ForkManager;

my %masterhost = ();
my @slavehost = ();

%masterhost = (
    host => "master",
    user => "user", 
    pass => "pass",

@slavehost = ( {
    host => "slave1",
    user => "user",
    pass => "pass", },
    host => "slave2",
    user => "user",
    pass => "pass", },

sub generate_random_string
        my $length_of_randomstring=shift;# the length of 
                         # the random string to generate

        my @chars=('a'..'z','A'..'Z','0'..'9','_');
        my $random_string;
        foreach (1..$length_of_randomstring) 
                # rand @chars will generate a random 
                # number between 0 and scalar @chars
                $random_string.=$chars[rand @chars];
        return $random_string;

my $numArgs = $#ARGV + 1 ; 

if ($numArgs != 3 ) {
    print "\n Usage: num_of_conc_clients num_of_iterations size_of_extra_data(in bytes)\n";
    exit 1;

my $num_of_clients = $ARGV[0];

my $num_of_iterations = $ARGV[1];

my $size_of_extra_data = $ARGV[2];

my $comments=&generate_random_string($size_of_extra_data);

my $pm = new Parallel::ForkManager($num_of_clients);

mkdir $num_of_clients ;#or die "Failed to mkdir $num_of_clients";

mkdir $num_of_clients . "/". $size_of_extra_data or die "Failed to mkdir $num_of_clients/$size_of_extra_data";

for (1..$num_of_clients) {

    $pm->start and next;

    my $filename=">".$num_of_clients."/". $size_of_extra_data ."/" .$$;
    open(WFILE, $filename) or die "unable to open file $filename";

    my $shost = $$ % @slavehost;
    my $mdsn="DBI:mysql:database=mytests;host=".$masterhost{'host'};
    my $sdsn="DBI:mysql:database=mytests;host=".$slavehost[$shost]{'host'};

    my $mdbh=DBI->connect($mdsn, $masterhost{'user'}, $masterhost{'pass'}) 
        or die "Unable to connect" . DBI->errstr ;

    my $sdbh=DBI->connect($sdsn, $slavehost[$shost]{'user'}, $slavehost[$shost]{'pass'}) 
        or die "Unable to connect" . DBI->errstr ;

   for($i = 0; $i do($statement);

       my $timestamp=gettimeofday();

       $statement = "SELECT * FROM mytests.test1 where test='" . $randvalue ."'"; 
       $rows = $sdbh->do($statement);
       while( $rows == 0 ) {
           $rows = $sdbh->do($statement);
       if( $rows ) {
           $elapsed = gettimeofday() - $timestamp;
           print WFILE $elapsed. ":" . $j. "\n";



my $mdsn="DBI:mysql:database=mytests;host=".$masterhost{'host'};
my $mdbh=DBI->connect($mdsn, $masterhost{'user'}, $masterhost{'pass'}) 
        or die "Unable to connect" . DBI->errstr ;

$statement = "DELETE FROM mytests.test1";

$mdbh->disconnect() or warn "Disconnection failed: $DBI::errstr\n";

print "Done for $num_of_clients $num_of_iterations $size_of_extra_data" . "\n" ;

The script was run for a number of 10 to 100 concurent processes (step value 10) and for data size from 1000 to 20000 (step value 1000) bytes.

for i in $(seq 10 10 100) ; do for j in $(seq 1000 1000 20000); do perl ./ $i 1000 $j; done; done

and produced the files with the values like

With a little bit of awk, shell scripting and octave, I analysed the results and I printed some graphs like the following:

Mean number of retries

Mean number of retries

mean value for waiting time

mean value for waiting time

standard deviation of number of retries

standard deviation of number of retries

standard deviation of waiting time

standard deviation of waiting time

It is clear from the test that mysql replication as a horizontal scaling mechanism under heavy load and an application that is not replication aware is not a good solution. It is not the time that takes for the replication to complete, but the standard deviation that makes the whole process quite unstable.

I understand that the way this test was performed there was a hammering to the database server(s). But I wanted to emulate a heavy usage scenario. I could also try to run the same test with smaller data size written, and also have a small sleep period before each write for each process. Finally all the tests were done using VMs, and I’m quite curious to run the same tests with real servers, to see if virtualization’s effect on replication.

Special thanks go to Apollon, for providing help with perl, GNU Octave to produce all these plots, and help in general 🙂


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: