Snapshot Restore versus pg_restore – Time Comparison


Summary

I’ll just cut to the chase here and give the detailed analysis immediately after.

Here is the time difference between pg_restoring an 11 GB PostgreSQL database from its 159 MB pg_dump file, versus restoring a snapshot of its t2.medium RDS Instance:

[time_table.PNG – table comparing the pg_restore and snapshot restore times]

Analysis

Important note! The time values are subject to change, because they depend on the network, other running processes, the intermediary EC2 Instance type, etc.

pg_restore

I used a t2.medium SSM Bastion Instance to execute the PostgreSQL commands. As I’ve mentioned before, provisioning a more powerful EC2 Instance might improve performance drastically. You can find information about SSM Bastions on this page.

Imagine you have a PostgreSQL database named “XXX” that is 11 GB in size. Don’t get distracted by the name; I don’t know why I named it “XXX” in the first place, but here we go anyway 🙂

[Screenshot: rds1 – the XXX database, 11 GB in size]

I take a dump of it with this pg_dump command (check the documentation here):

PGPASSWORD="${PGPASSWORD}" pg_dump -h "restore-test.xxxxxxxxxxxx.eu-central-1.rds.amazonaws.com" -p 5432 -U XXX --format=c XXX > xxx.sql

And to see how big the file is, I execute:

[Screenshot: rds3 – size of the xxx.sql dump file]

You can see that the xxx.sql dump file is 159 MB.
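The screenshot is not reproduced here, but a command along these lines would show the size; this is my assumption of what was run, not the exact command from the screenshot:

ls -lh xxx.sql    # -h prints a human-readable size, e.g. 159M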

The reason our dump file is smaller than our XXX database is that we take a logical backup of the database; therefore, all of those empty pages, old versions of rows, and index data aren’t going to be included in the dump file. It is also compressed by default, because we used the --format=c option. All that greatly reduced the size.
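By the way, if you want to see what a logical dump actually contains, you can list the archive’s table of contents with pg_restore; this command is just for illustration, it is not part of the original walkthrough:

pg_restore --list xxx.sql    # prints the logical objects (schemas, tables, indexes, ...) recorded in the custom-format archive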

Now, let’s restore our database from this dump file with the following pg_restore command (check the documentation here):

time PGPASSWORD="${PGPASSWORD}" pg_restore -v -h "restore-test.xxxxxxxxxxxx.eu-central-1.rds.amazonaws.com" -p 5432 -U XXX -d POSTGRESQL -j 4 -O -C "xxx.sql"

As you can see, I prefixed the actual command with the time command, so that we can see how long it took:

[Screenshot: rds6 – time output of the pg_restore command]

The “real” value here is what we should be looking at. Details over here. Not bad, huh? However, it is still 14 minutes of downtime, which I don’t recommend for your mental (and probably physical) health 🙂 Let’s note the value and proceed with the snapshot restore operation.

Snapshot Restore

I took a snapshot of our PostgreSQL RDS Instance, which holds our prey, the XXX database:

[Screenshot: rds4 – snapshot of the PostgreSQL RDS Instance]

At this point, if you’ve decided to use the snapshot restore operation instead of pg_restore, you’re going to need to change your corrupted RDS Instance’s identifier, because AWS doesn’t allow duplicate RDS identifiers, obviously. I mean, they’re identifiers; that’s what they do: identify. You can change the identifier to, let’s say, “original-old”. Here is how long it takes to change the identifier of a Multi-AZ RDS Instance (a scripted version of this rename is sketched right after the screenshot):

[Screenshot: rds10 – time to change the RDS identifier (2:21)]
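If you ever want to script this rename step instead of clicking through the console, the AWS CLI can do it. This is only a sketch under my own assumptions: the identifiers “original” and “original-old” are placeholders, not the actual ones from my setup.

# Free up the original identifier by renaming the corrupted instance.
# "original" and "original-old" are placeholder identifiers.
aws rds modify-db-instance \
  --db-instance-identifier original \
  --new-db-instance-identifier original-old \
  --apply-immediately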

Quite long, as you can see from the screenshot. After changing our corrupted RDS Instance’s identifier, we can now proceed to restore our snapshot and give it our original RDS identifier, so our applications can automatically connect to it. Let’s see how long it takes to restore a Multi-AZ PostgreSQL RDS Instance:

[Screenshot: rds8 – time to restore the Multi-AZ PostgreSQL RDS Instance from the snapshot (roughly 20 minutes)]

WOW! This is not good at all. 20 minutes is enough time to have you looking for a new job! Add the identifier change time (2:21) to this and you get 22:04 minutes! You might as well take a damn nap while this completes.
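If you would rather script the restore itself as well, the AWS CLI call below does roughly what the console steps above do. Again, this is only a sketch: the snapshot identifier and the instance identifier are placeholders I made up.

# Restore the snapshot under the original identifier so the applications reconnect automatically.
# "original" and "my-latest-snapshot" are placeholder names.
aws rds restore-db-instance-from-db-snapshot \
  --db-instance-identifier original \
  --db-snapshot-identifier my-latest-snapshot \
  --multi-az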

Conclusion

I highly recommend configuring read replicas (slaves) of the master RDS Instance if you are swimming in gold and don’t feel lazy at all, because in that case you can promote one of the slaves to master and keep your job in case of a disaster.
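For the record, promoting a replica is a one-liner with the AWS CLI; this is just a sketch and the replica identifier is a placeholder:

# Promote a read replica to a standalone, writable instance.
# "my-replica" is a placeholder identifier.
aws rds promote-read-replica --db-instance-identifier my-replica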

But if you aren’t swimming in gold, like normal people, I highly recommend taking daily dumps of your RDS Instances and uploading them to S3. You can write a restore script to deal with almost any disaster scenario as fast as possible; a minimal dump-and-upload sketch follows.
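Here is a minimal sketch of such a daily dump script, under my own assumptions: the host, database name, user, and bucket are placeholders, and the password is expected to already be exported as PGPASSWORD.

#!/usr/bin/env bash
# Minimal daily dump-and-upload sketch. Host, database, user, and bucket are placeholders.
set -euo pipefail

HOST="restore-test.xxxxxxxxxxxx.eu-central-1.rds.amazonaws.com"
DB="XXX"
DB_USER="XXX"
BUCKET="s3://my-backup-bucket"              # placeholder bucket name
DUMP="${DB}_$(date +%F).dump"

# Custom-format dump, same as in the article; PGPASSWORD must already be exported.
pg_dump -h "$HOST" -p 5432 -U "$DB_USER" --format=c "$DB" > "$DUMP"

# Ship it to S3 so a restore script can pull it back during a disaster.
aws s3 cp "$DUMP" "$BUCKET/$DUMP"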
