Backup and Restore

Backup and Restore tools

In order to execute a backup or a restore operation, PGMob would issue a pg_dump/pg_restore command on a remote server spawned as a subprocess of the PostgreSQL process. The binary should be available on the server (you can specify an exact path) and should be able to connect to localhost without a password.

Example

Create a backup from one database and restore it as a different database

>>> old_db = "foo"  
>>> new_db = "bar"  
>>> file_backup = FileBackup(cluster=cluster)
>>> file_backup.backup(database=old_db, path="/tmp/db.bak")
>>> file_restore = FileRestore(cluster=cluster)
>>> file_restore.restore(database=new_db, path="/tmp/db.bak")

Backup/restore options

Backup and restore operations are controlled by the Options object, which is available as an .options attribute of the corresponding Backup or Restore object. Adjust settings as you would normally do in the command line, but benefit from being able to use Python syntax.

class pgmob.backup.BackupOptions(shell: _BaseShellEnv | None = None, **kwargs)

Parameters to be used for the pg_dump binary

Class attributes match closely to those displayed by the “pg_dump –help” command.

Parameters:
  • shell (_BaseShellEnv) – shell processor that defines pathing and escaping for the current environment

  • **kwargs (dict) – any supporter attribute can be submitted as kwargs

shell

shell processor that defines pathing and escaping for the current environment

Type:

_BaseShellEnv

format

(c|d|t|p) output file format (custom(default), directory, tar, plain text)

Type:

str

jobs

use this many parallel jobs to dump

Type:

int

verbose

verbose mode

Type:

bool

compress

(0-9) compression level for compressed formats

Type:

int

lock_wait_timeout

fail after waiting TIMEOUT for a table lock

Type:

int

data_only

dump only the data, not the schema

Type:

bool

clean

clean (drop) database objects before recreating

Type:

bool

create

include commands to create database in dump

Type:

bool

schemas

dump the named schemas only

Type:

List[str]

exclude_schemas

do NOT dump the named schemas

Type:

List[str]

no_owner

skip restoration of object ownership in plain-text format

Type:

bool

schema_only

dump only the schema, no data

Type:

bool

superuser

superuser user name to use in plain-text format

Type:

str

tables

dump the named tables only

Type:

List[str]

exclude_tables

do NOT dump the named tables

Type:

List[str]

no_privileges

do not dump privileges (grant/revoke)

Type:

bool

exclude_table_data

do NOT dump data for the named table(s)

Type:

List[str]

add_if_exists

use IF EXISTS when dropping objects

Type:

bool

as_inserts

dump data as INSERT commands, rather than COPY

Type:

bool

no_subscriptions

do not dump subscriptions

Type:

bool

no_publications

do not dump publications

Type:

bool

no_tablespaces

do not dump tablespace assignments

Type:

bool

section

dump named section (pre-data, data, or post-data)

Type:

str

strict_names

require table and/or schema include patterns to match at least one entity each

Type:

bool

set_role

invoke SET ROLE before dump

Type:

str

blobs

Include large objects in the dump

Type:

bool

render_args()

Renders options as command line arguments

Returns:

A list of command line arguments

Return type:

List[str]

class pgmob.backup.RestoreOptions(shell: _BaseShellEnv | None = None, **kwargs)

Parameters to be used for the pg_restore binary

Class attributes match closely to those displayed by the “pg_restore –help” command.

Parameters:
  • shell (_BaseShellEnv) – shell processor that defines pathing and escaping for the current environment

  • **kwargs (dict) – any supporter attribute can be submitted as kwargs

shell

shell processor that defines pathing and escaping for the current environment

Type:

_BaseShellEnv

format

(c|d|t) backup file format (should be automatic)

Type:

str

jobs

use this many parallel jobs to restore

Type:

int

verbose

verbose mode

Type:

bool

data_only

restore only the data, no schema

Type:

bool

clean

clean (drop) database objects before recreating

Type:

bool

create

create the target database

Type:

bool

exit_on_error

exit on error, default is to continue

Type:

bool

indexes

restore named indexes

Type:

List[str]

schemas

restore only objects in these schemas

Type:

List[str]

exclude_schemas

do not restore objects in these schemas

Type:

List[str]

use_list

use table of contents from this file for selecting/ordering output

Type:

str

no_owner

skip restoration of object ownership

Type:

bool

functions

(NAME(args)) restore named functions

Type:

List[str]

schema_only

restore only the schema, no data

Type:

bool

superuser

superuser user name to use for disabling triggers

Type:

str

tables

restore named relations (table, view, etc.)

Type:

List[str]

triggers

restore named triggers

Type:

List[str]

no_privileges

skip restoration of access privileges (grant/revoke)

Type:

bool

single_transaction

restore as a single transaction

Type:

bool

disable_triggers

disable triggers during data-only restore

Type:

bool

add_if_exists

use IF EXISTS when dropping objects

Type:

bool

no_subscriptions

do not restore subscriptions

Type:

bool

no_publications

do not restore publications

Type:

bool

no_tablespaces

do not restore tablespace assignments

Type:

bool

section

restore named section (pre-data, data, or post-data)

Type:

str

no_data_for_failed_tables

do not restore data of tables that could not be created

Type:

bool

strict_names

require table and/or schema include patterns to match at least one entity each

Type:

bool

set_role

invoke SET ROLE before dump

Type:

str

render_args()

Renders options as command line arguments

Returns:

A list of command line arguments

Return type:

List[str]

File backup/restore

File restore works with the files available on the PostgreSQL server. When specifying path, make sure PostgreSQL service user (postgres) has access to them.

You can specify a “base” path for your backup/restore operation, making all the subsequent paths relative to that base path.

class pgmob.backup.FileBackup(cluster: Cluster, base_path: str = '', binary_path: str = 'pg_dump', options: BackupOptions | None = None, shell: _BaseShellEnv | None = None)

File Backup class that performs a Backup operation to a local filesystem. Executes pg_dump on a target postgres cluster spawned as a subprocess of Postgres server process.

Parameters:
  • cluster (cluster.Cluster) – Postgres cluster object

  • base_path (str) – Base backup path. If specified, the backup path would be considered relative to it.

  • binary_path (str) – Path to the pg_dump binary. Use when a specific binary version is needed or the binary is not in PATH.

  • options (BackupOptions) – backup options represented by BackupOptions class

  • shell (_BaseShellEnv) – shell processor that defines pathing and escaping for the current environment

options

backup options represented by the BackupOptions class

Type:

BackupOptions

cluster

Postgres cluster object

Type:

cluster.Cluster

binary

Path to the pg_dump binary

Type:

str

shell

shell processor that defines pathing and escaping for the current environment

Type:

_BaseShellEnv

on_start_commands

commands to execute prior to launching the backup

Type:

List[str]

command

main backup command

Type:

str

on_finish_commands

commands to execute after backup is completed or failed

Type:

List[str]

Example

Backup a database schema for tables “a” and “b”

>>> db = "foo"  
>>> backup = FileBackup(cluster=cluster, base_path="/tmp")
>>> backup.options.schema_only = True
>>> backup.options.exclude_tables = ["a", "b"]
>>> backup.backup(database=db, path=f"{db}.bak")
backup(database, path)

Backup a database to the specified path

Parameters:
  • database (str) – name of the database to backup

  • path (str) – path (relative or absolute) to backup to

Returns:

pg_dump stdout and stderr output

Return type:

str

execute_command(database: str, path: str)

Executes backup/restore commands and replaces the formatted commands with actual values.

Parameters:
  • database (str) – database name

  • path (str) – backup path

Returns:

stdin and stdout of executed command

Return type:

str

class pgmob.backup.FileRestore(cluster: Cluster, base_path: str = '', binary_path: str = 'pg_restore', options: RestoreOptions | None = None, shell: _BaseShellEnv | None = None)

Restore class that performs a Restore operation from a local filesystem. Executes pg_restore on a target postgres cluster spawned as a subprocess of Postgres server process.

Parameters:
  • cluster (cluster.Cluster) – Postgres cluster object

  • base_path (str) – Base backup path. If specified, the backup path would be considered relative to it.

  • binary_path (str) – Path to the pg_restore binary. Use when a specific binary version is needed or the binary is not in PATH.

  • options (RestoreOptions) – restore options represented by the RestoreOptions class

  • shell (_BaseShellEnv) – shell processor that defines pathing and escaping for the current environment

options

restore options represented by the RestoreOptions class

Type:

RestoreOptions

cluster

Postgres cluster object

binary

Path to the pg_restore binary

on_start_commands

commands to execute prior to launching the restore

Type:

List[str]

command

main restore command

Type:

str

on_finish_commands

commands to execute after restore is completed or failed

Type:

List[str]

Example

Restore objects into database bar changing object ownership to “bar”

>>> old_db = "foo"  
>>> new_db = "bar"  
>>> role = "ownerrole"  
>>> backup = FileBackup(cluster=cluster, base_path="/tmp")
>>> backup.backup(database=old_db, path=f"{old_db}.bak")
>>> restore = FileRestore(cluster=cluster, base_path="/tmp")
>>> restore.options.schema_only = True
>>> restore.options.no_owner = True
>>> restore.options.tables = ["a", "b"]
>>> restore.options.set_role = role
>>> restore.restore(database=new_db, path=f"{old_db}.bak")
execute_command(database: str, path: str)

Executes backup/restore commands and replaces the formatted commands with actual values.

Parameters:
  • database (str) – database name

  • path (str) – backup path

Returns:

stdin and stdout of executed command

Return type:

str

restore(database: str, path: str)

Restore a backup into the specified database :param database: name of the database to connect to :type database: str :param path: path (relative or absolute) to the backup file :type path: str

Returns:

pg_restore full output

Return type:

str

GCP Bucket backup/restore

Similar to the File backup/restore, these classes offer to simplify restoration from a GCP Bucket. The postgres system user should be able to use gs_util tooling to access bucket contents. During a restore operation, the database backup is copied to a temporary location first, due to the way gs_util outputs the file into the pipeline, making it impossible to pass it to pg_restore. This folder location can be adjusted.

class pgmob.backup.GCPBackup(cluster: Cluster, bucket: str = '', binary_path: str = 'pg_dump', options: BackupOptions | None = None, shell: _BaseShellEnv | None = None)

GCP Backup class that uploads backups to a GCP bucket. Executes “pg_dump” on a target postgres cluster spawned as a subprocess of Postgres server process. Requires “gsutil” command available on the server host. To configure authentication, run “gcloud auth login” under postgres OS user.

Parameters:
  • cluster (cluster.Cluster) – Postgres cluster object

  • bucket (str) – (Optional) GCP bucket name to work with. If specified, the backup path would be considered relative of the bucket path

  • binary_path (str) – Path to the pg_dump binary. Use when a specific binary version is needed or the binary is not in PATH.

  • options (BackupOptions) – backup options represented by BackupOptions class

  • shell (_BaseShellEnv) – shell processor that defines pathing and escaping for the current environment

options

backup options represented by the BackupOptions class

Type:

BackupOptions

cluster

Postgres cluster object

Type:

cluster.Cluster

binary

Path to the pg_dump binary.

on_start_commands

commands to execute prior to launching the backup

Type:

List[str]

command

main backup command

Type:

str

on_finish_commands

commands to execute after backup is completed or failed

Type:

List[str]

Example

Backup schema “public” of database “foo” with no privileges into the bucket gs://my-bucket/

>>> backup = GCPBackup(cluster=cluster)
>>> backup.options.no_privileges = True
>>> backup.options.schemas = ["public"]
>>> backup.backup(database="foo", path="gs://my-bucket/foo")  
backup(database, path)

Backup a database to the specified path

Parameters:
  • database (str) – name of the database to backup

  • path (str) – path (relative or absolute) to backup to

Returns:

pg_dump stdout and stderr output

Return type:

str

execute_command(database: str, path: str)

Executes backup/restore commands and replaces the formatted commands with actual values.

Parameters:
  • database (str) – database name

  • path (str) – backup path

Returns:

stdin and stdout of executed command

Return type:

str

class pgmob.backup.GCPRestore(cluster: Cluster, bucket: str = '', binary_path: str = 'pg_restore', temp_path: str = '/tmp', options: RestoreOptions | None = None, shell: _BaseShellEnv | None = None)

GCP Restore class that manages restores from GCP buckets. Executes “pg_restore” on a target cluster spawned as a subprocess of Postgres server process. Requires “gsutil” command available on the server host. To configure authentication, run “gcloud auth login” under postgres OS user. GCP bucket file is first copied to temp_path, then restored from disk. The file is removed once restore is finished.

Parameters:
  • cluster (cluster.Cluster) – Postgres cluster object

  • bucket (str) – (Optional) GCP bucket name to work with. If specified, the backup path would be considered relative of the bucket path

  • binary_path (str) – Path to the pg_restore binary. Use when a specific binary version is needed or the binary is not in PATH.

  • temp_path (str) – Path to a temporary folder, to which the backup would be copied to. Direct restores from the bucket via pipe has proven to be unreliable.

options

restore options represented by RestoreOptions class

Type:

RestoreOptions

cluster

Postgres cluster to execute the restore

binary

Path to the pg_restore binary.

on_start_commands

commands to execute prior to launching the restore

Type:

List[str]

command

main restore command

Type:

str

on_finish_commands

commands to execute after restore is completed or failed

Type:

List[str]

temp_path

Path to a temporary folder

Type:

str

Example

Restore two tables into database “bar” using 4 parallel jobs and disregarding tablespaces

>>> restore = GCPRestore(cluster=cluster, bucket="gs://my-bucket/")
>>> restore.options.no_tablespaces = True
>>> restore.options.tables = ["a", "b"]
>>> restore.options.jobs = 4
>>> restore.restore(database="bar", path="foo")  
execute_command(database: str, path: str)

Executes backup/restore commands and replaces the formatted commands with actual values.

Parameters:
  • database (str) – database name

  • path (str) – backup path

Returns:

stdin and stdout of executed command

Return type:

str

restore(database: str, path: str)

Restore a backup into the specified database :param database: name of the database to connect to :type database: str :param path: path (relative or absolute) to the backup file :type path: str

Returns:

pg_restore full output

Return type:

str