Sunday, May 25, 2025

๐Ÿ”„ PostgreSQL Database Cloning via Backup and Restore

 When working with PostgreSQL, there are many situations where you need to clone a database:

  • Creating a dev or staging environment from production

  • Testing migrations or data loads in isolation

  • Running experiments on a snapshot of real data

In this post, I’ll walk you through a clean, script-driven approach to PostgreSQL cloning using pg_dump and pg_restore — with a focus on multi-database supportdata-only restores, and sequence syncing.


๐Ÿงฉ Why Use Backup/Restore for Cloning?

Cloning using backup/restore is ideal when:

  • You can’t (or don’t want to) clone entire clusters

  • You're working across environments (e.g., prod → staging)

  • You want selective control over schema, data, or both

Compared to CREATE DATABASE ... TEMPLATE, this gives more flexibility and control.


๐Ÿš€ Introducing pg_multi_backup_restore_interactive.sh

We wrote a shell script that allows you to:

  • Back up multiple PostgreSQL databases

  • Restore them (with schema or just data)

  • Target new or existing databases

  • Handle foreign key and sequence issues

Let’s walk through the process.


๐Ÿ“ฆ Backup Multiple Databases

The script uses pg_dump in custom format (-Fc) to back up databases:

pg_dump -Fc -f ./pg_backups/mydb.dump mydb


This format is preferred because it supports parallel restore, filtering, and more control.

The script will prompt you for:

  • Host, port, and user

  • List of source databases

  • Backup directory

All dumps are saved to individual files like mydb.dumpanotherdb.dump, etc.



๐Ÿ” Restore into New or Existing Databases

The script then lets you choose:

  • Full restore (schema + data)

  • Data-only restore (e.g., when schema already exists)

✅ Data-Only Restore Use Case

This is great when you already have a DB created via migrations, but just want to inject live data from another environment.

It handles:

  • Truncating all user tables

  • Temporarily disabling foreign key constraints (session_replication_role = replica)

  • Using pg_restore --data-only


๐Ÿงน Foreign Key Constraints and Truncation

One big gotcha with data-only restores: foreign key constraint violations.

Example:


ERROR:  insert or update on table "employees" violates foreign key constraint "employees_job_id_fkey"


This happens when you insert data into a child table (employees) before its referenced parent table (jobs) has valid entries.

To avoid this:

  • We truncate all tables

  • Temporarily disable foreign keys and triggers during the restore



๐Ÿงช Real-World Use Case

Imagine you want to clone prod_customers and prod_orders into:

  • staging_customers

  • staging_orders


./pg_multi_backup_restore_interactive.sh

Provide:

  • prod_customers,prod_orders as the source

  • staging_customersstaging_orders as the target

  • Choose data-only restore

The script will:

  • Truncate all tables in target DBs

  • Load data from the corresponding .dump files

  • Reset all sequences to match the restored data

๐Ÿ›ก️ Security Tips

  • Use .pgpass to avoid password prompts securely

  • Ensure the target DB has a recent schema before data-only restore

  • Only superusers can disable triggers (--disable-triggers or session_replication_role = replica)


๐Ÿ“„ Full Script Available

The script is fully interactive and flexible. It’s designed to be used in devops pipelines or manually for one-off migrations.

Want to grab the script?

https://github.com/dev-sincode/postgres_clone


No comments:

Post a Comment

๐Ÿ”„ PostgreSQL Database Cloning via Backup and Restore

 When working with PostgreSQL, there are many situations where you need to  clone a database : Creating a dev or staging environment from pr...