Backup and Restore¶
Backup and Restore tools
In order to execute a backup or a restore operation, PGMob would issue a pg_dump
/pg_restore
command
on a remote server spawned as a subprocess of the PostgreSQL process. The binary should be available on the
server (you can specify an exact path) and should be able to connect to localhost without a password.
Example
Create a backup from one database and restore it as a different database
>>> old_db = "foo"
>>> new_db = "bar"
>>> file_backup = FileBackup(cluster=cluster)
>>> file_backup.backup(database=old_db, path="/tmp/db.bak")
>>> file_restore = FileRestore(cluster=cluster)
>>> file_restore.restore(database=new_db, path="/tmp/db.bak")
Backup/restore options¶
Backup and restore operations are controlled by the Options object, which is available as an .options
attribute
of the corresponding Backup or Restore object. Adjust settings as you would normally do in the command line, but
benefit from being able to use Python syntax.
- class pgmob.backup.BackupOptions(shell: _BaseShellEnv | None = None, **kwargs)¶
Parameters to be used for the pg_dump binary
Class attributes match closely to those displayed by the “pg_dump –help” command.
- Parameters:
shell (_BaseShellEnv) – shell processor that defines pathing and escaping for the current environment
**kwargs (dict) – any supporter attribute can be submitted as kwargs
- shell¶
shell processor that defines pathing and escaping for the current environment
- Type:
_BaseShellEnv
- format¶
(c|d|t|p) output file format (custom(default), directory, tar, plain text)
- Type:
str
- jobs¶
use this many parallel jobs to dump
- Type:
int
- verbose¶
verbose mode
- Type:
bool
- compress¶
(0-9) compression level for compressed formats
- Type:
int
- lock_wait_timeout¶
fail after waiting TIMEOUT for a table lock
- Type:
int
- data_only¶
dump only the data, not the schema
- Type:
bool
- clean¶
clean (drop) database objects before recreating
- Type:
bool
- create¶
include commands to create database in dump
- Type:
bool
- schemas¶
dump the named schemas only
- Type:
List[str]
- exclude_schemas¶
do NOT dump the named schemas
- Type:
List[str]
- no_owner¶
skip restoration of object ownership in plain-text format
- Type:
bool
- schema_only¶
dump only the schema, no data
- Type:
bool
- superuser¶
superuser user name to use in plain-text format
- Type:
str
- tables¶
dump the named tables only
- Type:
List[str]
- exclude_tables¶
do NOT dump the named tables
- Type:
List[str]
- no_privileges¶
do not dump privileges (grant/revoke)
- Type:
bool
- exclude_table_data¶
do NOT dump data for the named table(s)
- Type:
List[str]
- add_if_exists¶
use IF EXISTS when dropping objects
- Type:
bool
- as_inserts¶
dump data as INSERT commands, rather than COPY
- Type:
bool
- no_subscriptions¶
do not dump subscriptions
- Type:
bool
- no_publications¶
do not dump publications
- Type:
bool
- no_tablespaces¶
do not dump tablespace assignments
- Type:
bool
- section¶
dump named section (pre-data, data, or post-data)
- Type:
str
- strict_names¶
require table and/or schema include patterns to match at least one entity each
- Type:
bool
- set_role¶
invoke SET ROLE before dump
- Type:
str
- blobs¶
Include large objects in the dump
- Type:
bool
- render_args()¶
Renders options as command line arguments
- Returns:
A list of command line arguments
- Return type:
List[str]
- class pgmob.backup.RestoreOptions(shell: _BaseShellEnv | None = None, **kwargs)¶
Parameters to be used for the pg_restore binary
Class attributes match closely to those displayed by the “pg_restore –help” command.
- Parameters:
shell (_BaseShellEnv) – shell processor that defines pathing and escaping for the current environment
**kwargs (dict) – any supporter attribute can be submitted as kwargs
- shell¶
shell processor that defines pathing and escaping for the current environment
- Type:
_BaseShellEnv
- format¶
(c|d|t) backup file format (should be automatic)
- Type:
str
- jobs¶
use this many parallel jobs to restore
- Type:
int
- verbose¶
verbose mode
- Type:
bool
- data_only¶
restore only the data, no schema
- Type:
bool
- clean¶
clean (drop) database objects before recreating
- Type:
bool
- create¶
create the target database
- Type:
bool
- exit_on_error¶
exit on error, default is to continue
- Type:
bool
- indexes¶
restore named indexes
- Type:
List[str]
- schemas¶
restore only objects in these schemas
- Type:
List[str]
- exclude_schemas¶
do not restore objects in these schemas
- Type:
List[str]
- use_list¶
use table of contents from this file for selecting/ordering output
- Type:
str
- no_owner¶
skip restoration of object ownership
- Type:
bool
- functions¶
(NAME(args)) restore named functions
- Type:
List[str]
- schema_only¶
restore only the schema, no data
- Type:
bool
- superuser¶
superuser user name to use for disabling triggers
- Type:
str
- tables¶
restore named relations (table, view, etc.)
- Type:
List[str]
- triggers¶
restore named triggers
- Type:
List[str]
- no_privileges¶
skip restoration of access privileges (grant/revoke)
- Type:
bool
- single_transaction¶
restore as a single transaction
- Type:
bool
- disable_triggers¶
disable triggers during data-only restore
- Type:
bool
- add_if_exists¶
use IF EXISTS when dropping objects
- Type:
bool
- no_subscriptions¶
do not restore subscriptions
- Type:
bool
- no_publications¶
do not restore publications
- Type:
bool
- no_tablespaces¶
do not restore tablespace assignments
- Type:
bool
- section¶
restore named section (pre-data, data, or post-data)
- Type:
str
- no_data_for_failed_tables¶
do not restore data of tables that could not be created
- Type:
bool
- strict_names¶
require table and/or schema include patterns to match at least one entity each
- Type:
bool
- set_role¶
invoke SET ROLE before dump
- Type:
str
- render_args()¶
Renders options as command line arguments
- Returns:
A list of command line arguments
- Return type:
List[str]
File backup/restore¶
File restore works with the files available on the PostgreSQL server. When specifying path, make sure
PostgreSQL service user (postgres
) has access to them.
You can specify a “base” path for your backup/restore operation, making all the subsequent paths relative to that base path.
- class pgmob.backup.FileBackup(cluster: Cluster, base_path: str = '', binary_path: str = 'pg_dump', options: BackupOptions | None = None, shell: _BaseShellEnv | None = None)¶
File Backup class that performs a Backup operation to a local filesystem. Executes pg_dump on a target postgres cluster spawned as a subprocess of Postgres server process.
- Parameters:
cluster (cluster.Cluster) – Postgres cluster object
base_path (str) – Base backup path. If specified, the backup path would be considered relative to it.
binary_path (str) – Path to the pg_dump binary. Use when a specific binary version is needed or the binary is not in PATH.
options (BackupOptions) – backup options represented by BackupOptions class
shell (_BaseShellEnv) – shell processor that defines pathing and escaping for the current environment
- options¶
backup options represented by the BackupOptions class
- Type:
- cluster¶
Postgres cluster object
- Type:
- binary¶
Path to the pg_dump binary
- Type:
str
- shell¶
shell processor that defines pathing and escaping for the current environment
- Type:
_BaseShellEnv
- on_start_commands¶
commands to execute prior to launching the backup
- Type:
List[str]
- command¶
main backup command
- Type:
str
- on_finish_commands¶
commands to execute after backup is completed or failed
- Type:
List[str]
Example
Backup a database schema for tables “a” and “b”
>>> db = "foo" >>> backup = FileBackup(cluster=cluster, base_path="/tmp") >>> backup.options.schema_only = True >>> backup.options.exclude_tables = ["a", "b"] >>> backup.backup(database=db, path=f"{db}.bak")
- backup(database, path)¶
Backup a database to the specified path
- Parameters:
database (str) – name of the database to backup
path (str) – path (relative or absolute) to backup to
- Returns:
pg_dump stdout and stderr output
- Return type:
str
- execute_command(database: str, path: str)¶
Executes backup/restore commands and replaces the formatted commands with actual values.
- Parameters:
database (str) – database name
path (str) – backup path
- Returns:
stdin and stdout of executed command
- Return type:
str
- class pgmob.backup.FileRestore(cluster: Cluster, base_path: str = '', binary_path: str = 'pg_restore', options: RestoreOptions | None = None, shell: _BaseShellEnv | None = None)¶
Restore class that performs a Restore operation from a local filesystem. Executes pg_restore on a target postgres cluster spawned as a subprocess of Postgres server process.
- Parameters:
cluster (cluster.Cluster) – Postgres cluster object
base_path (str) – Base backup path. If specified, the backup path would be considered relative to it.
binary_path (str) – Path to the pg_restore binary. Use when a specific binary version is needed or the binary is not in PATH.
options (RestoreOptions) – restore options represented by the RestoreOptions class
shell (_BaseShellEnv) – shell processor that defines pathing and escaping for the current environment
- options¶
restore options represented by the RestoreOptions class
- Type:
- cluster¶
Postgres cluster object
- binary¶
Path to the pg_restore binary
- on_start_commands¶
commands to execute prior to launching the restore
- Type:
List[str]
- command¶
main restore command
- Type:
str
- on_finish_commands¶
commands to execute after restore is completed or failed
- Type:
List[str]
Example
Restore objects into database bar changing object ownership to “bar”
>>> old_db = "foo" >>> new_db = "bar" >>> role = "ownerrole" >>> backup = FileBackup(cluster=cluster, base_path="/tmp") >>> backup.backup(database=old_db, path=f"{old_db}.bak") >>> restore = FileRestore(cluster=cluster, base_path="/tmp") >>> restore.options.schema_only = True >>> restore.options.no_owner = True >>> restore.options.tables = ["a", "b"] >>> restore.options.set_role = role >>> restore.restore(database=new_db, path=f"{old_db}.bak")
- execute_command(database: str, path: str)¶
Executes backup/restore commands and replaces the formatted commands with actual values.
- Parameters:
database (str) – database name
path (str) – backup path
- Returns:
stdin and stdout of executed command
- Return type:
str
- restore(database: str, path: str)¶
Restore a backup into the specified database :param database: name of the database to connect to :type database: str :param path: path (relative or absolute) to the backup file :type path: str
- Returns:
pg_restore full output
- Return type:
str
GCP Bucket backup/restore¶
Similar to the File backup/restore, these classes offer to simplify restoration from a GCP Bucket. The postgres
system user should
be able to use gs_util
tooling to access bucket contents. During a restore operation, the database backup is copied to a temporary
location first, due to the way gs_util outputs the file into the pipeline, making it impossible to pass it to pg_restore
. This folder
location can be adjusted.
- class pgmob.backup.GCPBackup(cluster: Cluster, bucket: str = '', binary_path: str = 'pg_dump', options: BackupOptions | None = None, shell: _BaseShellEnv | None = None)¶
GCP Backup class that uploads backups to a GCP bucket. Executes “pg_dump” on a target postgres cluster spawned as a subprocess of Postgres server process. Requires “gsutil” command available on the server host. To configure authentication, run “gcloud auth login” under postgres OS user.
- Parameters:
cluster (cluster.Cluster) – Postgres cluster object
bucket (str) – (Optional) GCP bucket name to work with. If specified, the backup path would be considered relative of the bucket path
binary_path (str) – Path to the pg_dump binary. Use when a specific binary version is needed or the binary is not in PATH.
options (BackupOptions) – backup options represented by BackupOptions class
shell (_BaseShellEnv) – shell processor that defines pathing and escaping for the current environment
- options¶
backup options represented by the BackupOptions class
- Type:
- cluster¶
Postgres cluster object
- Type:
- binary¶
Path to the pg_dump binary.
- on_start_commands¶
commands to execute prior to launching the backup
- Type:
List[str]
- command¶
main backup command
- Type:
str
- on_finish_commands¶
commands to execute after backup is completed or failed
- Type:
List[str]
Example
Backup schema “public” of database “foo” with no privileges into the bucket gs://my-bucket/
>>> backup = GCPBackup(cluster=cluster) >>> backup.options.no_privileges = True >>> backup.options.schemas = ["public"] >>> backup.backup(database="foo", path="gs://my-bucket/foo")
- backup(database, path)¶
Backup a database to the specified path
- Parameters:
database (str) – name of the database to backup
path (str) – path (relative or absolute) to backup to
- Returns:
pg_dump stdout and stderr output
- Return type:
str
- execute_command(database: str, path: str)¶
Executes backup/restore commands and replaces the formatted commands with actual values.
- Parameters:
database (str) – database name
path (str) – backup path
- Returns:
stdin and stdout of executed command
- Return type:
str
- class pgmob.backup.GCPRestore(cluster: Cluster, bucket: str = '', binary_path: str = 'pg_restore', temp_path: str = '/tmp', options: RestoreOptions | None = None, shell: _BaseShellEnv | None = None)¶
GCP Restore class that manages restores from GCP buckets. Executes “pg_restore” on a target cluster spawned as a subprocess of Postgres server process. Requires “gsutil” command available on the server host. To configure authentication, run “gcloud auth login” under postgres OS user. GCP bucket file is first copied to temp_path, then restored from disk. The file is removed once restore is finished.
- Parameters:
cluster (cluster.Cluster) – Postgres cluster object
bucket (str) – (Optional) GCP bucket name to work with. If specified, the backup path would be considered relative of the bucket path
binary_path (str) – Path to the pg_restore binary. Use when a specific binary version is needed or the binary is not in PATH.
temp_path (str) – Path to a temporary folder, to which the backup would be copied to. Direct restores from the bucket via pipe has proven to be unreliable.
- options¶
restore options represented by RestoreOptions class
- Type:
- cluster¶
Postgres cluster to execute the restore
- binary¶
Path to the pg_restore binary.
- on_start_commands¶
commands to execute prior to launching the restore
- Type:
List[str]
- command¶
main restore command
- Type:
str
- on_finish_commands¶
commands to execute after restore is completed or failed
- Type:
List[str]
- temp_path¶
Path to a temporary folder
- Type:
str
Example
Restore two tables into database “bar” using 4 parallel jobs and disregarding tablespaces
>>> restore = GCPRestore(cluster=cluster, bucket="gs://my-bucket/") >>> restore.options.no_tablespaces = True >>> restore.options.tables = ["a", "b"] >>> restore.options.jobs = 4 >>> restore.restore(database="bar", path="foo")
- execute_command(database: str, path: str)¶
Executes backup/restore commands and replaces the formatted commands with actual values.
- Parameters:
database (str) – database name
path (str) – backup path
- Returns:
stdin and stdout of executed command
- Return type:
str
- restore(database: str, path: str)¶
Restore a backup into the specified database :param database: name of the database to connect to :type database: str :param path: path (relative or absolute) to the backup file :type path: str
- Returns:
pg_restore full output
- Return type:
str