Process changes to docs from: repo: EnterpriseDB/cloud-native-postgres ref: refs/tags/v1.25.0 #6370

Merged
11 changes: 6 additions & 5 deletions product_docs/docs/postgres_for_kubernetes/1/architecture.mdx
@@ -358,11 +358,12 @@ only write inside a single Kubernetes cluster, at any time.

However, for business continuity objectives it is fundamental to:

- reduce global **recovery point objectives** (RPO) by storing PostgreSQL backup data
in multiple locations, regions and possibly using different providers
(Disaster Recovery)
- reduce global **recovery time objectives** (RTO) by taking advantage of PostgreSQL
replication beyond the primary Kubernetes cluster (High Availability)
- reduce global **recovery point objectives** ([RPO](before_you_start.md#rpo))
by storing PostgreSQL backup data in multiple locations, regions and possibly
using different providers (Disaster Recovery)
- reduce global **recovery time objectives** ([RTO](before_you_start.md#rto))
by taking advantage of PostgreSQL replication beyond the primary Kubernetes
cluster (High Availability)

In order to address the above concerns, EDB Postgres for Kubernetes introduces the concept of
a PostgreSQL Topology that is distributed across different Kubernetes clusters
16 changes: 11 additions & 5 deletions product_docs/docs/postgres_for_kubernetes/1/backup.mdx
@@ -44,6 +44,11 @@ On the other hand, EDB Postgres for Kubernetes supports two ways to store physic
the supported [Container Storage Interface (CSI) drivers](https://kubernetes-csi.github.io/docs/drivers.html)
that provide snapshotting capabilities.

!!! Info
Starting with version 1.25, EDB Postgres for Kubernetes includes experimental support for
backup and recovery using plugins, such as the
[Barman Cloud plugin](https://github.com/cloudnative-pg/plugin-barman-cloud).

## WAL archive

The WAL archive in PostgreSQL is at the heart of **continuous backup**, and it
@@ -69,7 +74,8 @@ as they can simply rely on the WAL archive to synchronize across long
distances, extending disaster recovery goals across different regions.

When you [configure a WAL archive](wal_archiving.md), EDB Postgres for Kubernetes provides
out-of-the-box an RPO <= 5 minutes for disaster recovery, even across regions.
out-of-the-box an [RPO](before_you_start.md#rpo) <= 5 minutes for disaster
recovery, even across regions.

!!! Important
Our recommendation is to always set up the WAL archive in production.
@@ -121,9 +127,9 @@ including:
- availability of a trusted storage class that supports volume snapshots
- size of the database: with object stores, the larger your database, the
longer backup and, most importantly, recovery procedures take (the latter
impacts RTO); in presence of Very Large Databases (VLDB), the general
advice is to rely on Volume Snapshots as, thanks to copy-on-write, they
provide faster recovery
impacts [RTO](before_you_start.md#rto)); in presence of Very Large Databases
(VLDB), the general advice is to rely on Volume Snapshots as, thanks to
copy-on-write, they provide faster recovery
- data mobility and possibility to store or relay backup files on a
secondary location in a different region, or any subsequent one
- other factors, mostly based on the confidence and familiarity with the
@@ -190,7 +196,7 @@ In Kubernetes CronJobs, the equivalent expression is `0 0 * * *` because seconds
are not included.
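
To illustrate the six-field format (seconds first), here is a minimal sketch of a `ScheduledBackup` that runs daily at midnight. The resource name `backup-example` is hypothetical, the target `cluster-example` is assumed to be an existing cluster, and the `apiVersion` is the one assumed for EDB Postgres for Kubernetes; none of these values come from this change.

```yaml
apiVersion: postgresql.k8s.enterprisedb.io/v1
kind: ScheduledBackup
metadata:
  name: backup-example          # hypothetical name
spec:
  # Six fields: second minute hour day-of-month month day-of-week
  schedule: "0 0 0 * * *"
  backupOwnerReference: self
  cluster:
    name: cluster-example       # assumed to be an existing Cluster
```

The leading `0` is the seconds field. Dropping it gives the five-field CronJob form `0 0 * * *` mentioned above.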

!!! Hint
Backup frequency might impact your recovery time object (RTO) after a
Backup frequency might impact your recovery time objective ([RTO](before_you_start.md#rto)) after a
disaster which requires a full or Point-In-Time recovery operation. Our
advice is that you regularly test your backups by recovering them, and then
measuring the time it takes to recover from scratch so that you can refine
@@ -99,9 +99,9 @@ algorithms via `barman-cloud-backup` (for backups) and
- snappy

The compression settings for backups and WALs are independent. See the
[DataBackupConfiguration](https://pkg.go.dev/github.com/cloudnative-pg/barman-cloud/pkg/api#BarmanObjectStoreConfiguration) and
[DataBackupConfiguration](https://pkg.go.dev/github.com/cloudnative-pg/barman-cloud/pkg/api#DataBackupConfiguration) and
[WALBackupConfiguration](https://pkg.go.dev/github.com/cloudnative-pg/barman-cloud/pkg/api#WalBackupConfiguration) sections in
the API reference.
the barman-cloud API reference.
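
As a rough sketch of what independent settings can look like, the fragment below compresses base backups and WAL files with different algorithms. It assumes an object store configured under `backup.barmanObjectStore`. The destination path is a placeholder and credentials are omitted, so this is not a complete manifest.

```yaml
spec:
  backup:
    barmanObjectStore:
      destinationPath: "s3://BUCKET_NAME/path"   # placeholder, not a real bucket
      data:
        compression: gzip       # applies to base backups only
      wal:
        compression: snappy     # applies to WAL files only
```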

It is important to note that archival time, restore time, and size change
between the algorithms, so the compression algorithm should be chosen according
@@ -134,6 +134,14 @@ PVC group
belonging to the same PostgreSQL instance, namely the main volume containing
the PGDATA (`storage`) and the volume for WALs (`walStorage`).

<a id="rto"></a>RTO
: Acronym for "recovery time objective", the amount of time a system can be
unavailable without adversely impacting the application.

<a id="rpo"></a>RPO
: Acronym for "recovery point objective", a calculation of the level of
acceptable data loss following a disaster recovery scenario.

## Cloud terminology

Region
162 changes: 114 additions & 48 deletions product_docs/docs/postgres_for_kubernetes/1/bootstrap.mdx
@@ -32,7 +32,7 @@ For more detailed information about this feature, please refer to the
EDB Postgres for Kubernetes requires both the `postgres` user and database to
always exist. Using the local Unix Domain Socket, it needs to connect
as `postgres` user to the `postgres` database via `peer` authentication in
order to perform administrative tasks on the cluster.
order to perform administrative tasks on the cluster.
**DO NOT DELETE** the `postgres` user or the `postgres` database!!!

!!! Info
@@ -212,36 +212,87 @@ The user that owns the database defaults to the database name instead.
The application user is not used internally by the operator, which instead
relies on the superuser to reconcile the cluster with the desired status.

### Passing options to `initdb`
### Passing Options to `initdb`

The actual PostgreSQL data directory is created via an invocation of the
`initdb` PostgreSQL command. If you need to add custom options to that command
(i.e., to change the `locale` used for the template databases or to add data
checksums), you can use the following parameters:
The PostgreSQL data directory is initialized using the
[`initdb` PostgreSQL command](https://www.postgresql.org/docs/current/app-initdb.html).

EDB Postgres for Kubernetes enables you to customize the behavior of `initdb` to modify
settings such as default locale configurations and data checksums.

!!! Warning
EDB Postgres for Kubernetes acts only as a direct proxy to `initdb` for locale-related
options, due to the ongoing and significant enhancements in PostgreSQL's locale
support. It is your responsibility to ensure that the correct options are
provided, following the PostgreSQL documentation, and to verify that the
bootstrap process completes successfully.

To include custom options in the `initdb` command, you can use the following
parameters:

builtinLocale
: When `builtinLocale` is set to a value, EDB Postgres for Kubernetes passes it to the
`--builtin-locale` option in `initdb`. This option controls the builtin locale, as
defined in ["Locale Support"](https://www.postgresql.org/docs/current/locale.html)
from the PostgreSQL documentation (default: empty). Note that this option requires
`localeProvider` to be set to `builtin`. Available from PostgreSQL 17.

dataChecksums
: When `dataChecksums` is set to `true`, CNP invokes the `-k` option in
: When `dataChecksums` is set to `true`, EDB Postgres for Kubernetes invokes the `-k` option in
`initdb` to enable checksums on data pages and help detect corruption by the
I/O system - that would otherwise be silent (default: `false`).

encoding
: When `encoding` set to a value, CNP passes it to the `--encoding` option in `initdb`,
which selects the encoding of the template database (default: `UTF8`).
: When `encoding` is set to a value, EDB Postgres for Kubernetes passes it to the `--encoding`
option in `initdb`, which selects the encoding of the template database
(default: `UTF8`).

icuLocale
: When `icuLocale` is set to a value, EDB Postgres for Kubernetes passes it to the
`--icu-locale` option in `initdb`. This option controls the ICU locale, as
defined in ["Locale Support"](https://www.postgresql.org/docs/current/locale.html)
from the PostgreSQL documentation (default: empty).
Note that this option requires `localeProvider` to be set to `icu`.
Available from PostgreSQL 15.

icuRules
: When `icuRules` is set to a value, EDB Postgres for Kubernetes passes it to the
`--icu-rules` option in `initdb`. This option specifies additional collation
rules that customize the behavior of the ICU locale, as defined in ["Locale
Support"](https://www.postgresql.org/docs/current/locale.html) from the
PostgreSQL documentation (default: empty). Note that this option requires
`localeProvider` to be set to `icu`. Available from PostgreSQL 16.

locale
: When `locale` is set to a value, EDB Postgres for Kubernetes passes it to the `--locale`
option in `initdb`. This option controls the locale, as defined in
["Locale Support"](https://www.postgresql.org/docs/current/locale.html) from
the PostgreSQL documentation. By default, the locale parameter is empty. In
this case, environment variables such as `LANG` are used to determine the
locale. Be aware that these variables can vary between container images,
potentially leading to inconsistent behavior.

localeCollate
: When `localeCollate` is set to a value, CNP passes it to the `--lc-collate`
: When `localeCollate` is set to a value, EDB Postgres for Kubernetes passes it to the `--lc-collate`
option in `initdb`. This option controls the collation order (`LC_COLLATE`
subcategory), as defined in ["Locale Support"](https://www.postgresql.org/docs/current/locale.html)
from the PostgreSQL documentation (default: `C`).

localeCType
: When `localeCType` is set to a value, CNP passes it to the `--lc-ctype` option in
: When `localeCType` is set to a value, EDB Postgres for Kubernetes passes it to the `--lc-ctype` option in
`initdb`. This option controls the character classification (`LC_CTYPE` subcategory), as
defined in ["Locale Support"](https://www.postgresql.org/docs/current/locale.html)
from the PostgreSQL documentation (default: `C`).

localeProvider
: When `localeProvider` is set to a value, EDB Postgres for Kubernetes passes it to the `--locale-provider`
option in `initdb`. This option controls the locale provider, as defined in
["Locale Support"](https://www.postgresql.org/docs/current/locale.html) from the
PostgreSQL documentation (default: empty, which means `libc` for PostgreSQL).
Available from PostgreSQL 15.

walSegmentSize
: When `walSegmentSize` is set to a value, CNP passes it to the `--wal-segsize`
: When `walSegmentSize` is set to a value, EDB Postgres for Kubernetes passes it to the `--wal-segsize`
option in `initdb` (default: not set - defined by PostgreSQL as 16 megabytes).
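
The following sketch shows how several of these parameters might be combined under `bootstrap.initdb`. The cluster name, database and owner names, locale value, and storage size are illustrative assumptions rather than values from this change. Because locale options are passed straight through to `initdb`, check them against the PostgreSQL documentation for your version.

```yaml
apiVersion: postgresql.k8s.enterprisedb.io/v1
kind: Cluster
metadata:
  name: cluster-example-initdb   # hypothetical name
spec:
  instances: 3
  imageName: quay.io/enterprisedb/postgresql:17.2
  bootstrap:
    initdb:
      database: app              # illustrative database name
      owner: app                 # illustrative owner
      dataChecksums: true        # passes -k to initdb
      encoding: UTF8             # passes --encoding
      localeProvider: icu        # passes --locale-provider (PostgreSQL 15+)
      icuLocale: en-US           # passes --icu-locale; illustrative value
      walSegmentSize: 32         # passes --wal-segsize, in megabytes
  storage:
    size: 1Gi                    # illustrative size
```

If the bootstrap fails with these options, the `initdb` output in the pod logs is the first place to look, since the operator does not validate locale values itself.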

!!! Note
@@ -430,44 +481,59 @@ to the ["Recovery" section](recovery.md).

### Bootstrap from a live cluster (`pg_basebackup`)

The `pg_basebackup` bootstrap mode lets you create a new cluster (*target*) as
an exact physical copy of an existing and **binary compatible** PostgreSQL
instance (*source*), through a valid *streaming replication* connection.
The source instance can be either a primary or a standby PostgreSQL server.
The `pg_basebackup` bootstrap mode allows you to create a new cluster
(*target*) as an exact physical copy of an existing and **binary-compatible**
PostgreSQL instance (*source*) managed by EDB Postgres for Kubernetes, using a valid
*streaming replication* connection. The source instance can either be a primary
or a standby PostgreSQL server. It’s crucial to thoroughly review the
requirements section below, as the pros and cons of PostgreSQL physical
replication fully apply.

The primary use case for this method is represented by **migrations** to EDB Postgres for Kubernetes,
either from outside Kubernetes or within Kubernetes (e.g., from another operator).
The primary use cases for this method include:

!!! Warning
The current implementation creates a *snapshot* of the origin PostgreSQL
instance when the cloning process terminates and immediately starts
the created cluster. See ["Current limitations"](#current-limitations) below for details.
- Reporting and business intelligence clusters that need to be regenerated
periodically (daily, weekly)
- Test databases containing live data that require periodic regeneration
(daily, weekly, monthly) and anonymization
- Rapid spin-up of a standalone replica cluster
- Physical migrations of EDB Postgres for Kubernetes clusters to different namespaces or
Kubernetes clusters

Similar to the case of the `recovery` bootstrap method, once the clone operation
completes, the operator will take ownership of the target cluster, starting from
the first instance. This includes overriding some configuration parameters, as
required by EDB Postgres for Kubernetes, resetting the superuser password, creating
the `streaming_replica` user, managing the replicas, and so on. The resulting
cluster will be completely independent of the source instance.
!!! Important
Avoid using this method, based on physical replication, to migrate an
existing PostgreSQL cluster outside of Kubernetes into EDB Postgres for Kubernetes unless you
are completely certain that all requirements are met and the operation has been
thoroughly tested. The EDB Postgres for Kubernetes community does not endorse this approach
for such use cases and recommends using logical import instead. It is
exceedingly rare that all requirements for physical replication are met in a
way that seamlessly works with EDB Postgres for Kubernetes.

!!! Warning
In its current implementation, this method clones the source PostgreSQL
instance, thereby creating a *snapshot*. Once the cloning process has finished,
the new cluster is immediately started.
Refer to ["Current limitations"](#current-limitations) for more details.

Similar to the `recovery` bootstrap method, once the cloning operation is
complete, the operator takes full ownership of the target cluster, starting
from the first instance. This includes overriding certain configuration
parameters as required by EDB Postgres for Kubernetes, resetting the superuser password,
creating the `streaming_replica` user, managing replicas, and more. The
resulting cluster operates independently from the source instance.

!!! Important
Configuring the network between the target instance and the source instance
goes beyond the scope of EDB Postgres for Kubernetes documentation, as it depends
on the actual context and environment.
Configuring the network connection between the target and source instances
lies outside the scope of EDB Postgres for Kubernetes documentation, as it depends heavily on
the specific context and environment.

The streaming replication client on the target instance, which will be
transparently managed by `pg_basebackup`, can authenticate itself on the source
instance in any of the following ways:
The streaming replication client on the target instance, managed transparently
by `pg_basebackup`, can authenticate on the source instance using one of the
following methods:

1. via [username/password](#usernamepassword-authentication)
2. via [TLS client certificate](#tls-certificate-authentication)
1. [Username/password](#usernamepassword-authentication)
2. [TLS client certificate](#tls-certificate-authentication)

The latter is the recommended one if you connect to a source managed
by EDB Postgres for Kubernetes or configured for TLS authentication.
The first option is, however, the most common form of authentication to a
PostgreSQL server in general, and might be the easiest way if the source
instance is on a traditional environment outside Kubernetes.
Both cases are explained below.
Both authentication methods are detailed below.

#### Requirements

@@ -545,7 +611,7 @@ file on the source PostgreSQL instance:
host replication streaming_replica all md5
```

The following manifest creates a new PostgreSQL 17.0 cluster,
The following manifest creates a new PostgreSQL 17.2 cluster,
called `target-db`, using the `pg_basebackup` bootstrap method
to clone an external PostgreSQL cluster defined as `source-db`
(in the `externalClusters` array). As you can see, the `source-db`
@@ -560,7 +626,7 @@ metadata:
name: target-db
spec:
instances: 3
imageName: quay.io/enterprisedb/postgresql:17.0
imageName: quay.io/enterprisedb/postgresql:17.2

bootstrap:
pg_basebackup:
@@ -580,7 +646,7 @@ spec:
```

All the requirements must be met for the clone operation to work, including
the same PostgreSQL version (in our case 17.0).
the same PostgreSQL version (in our case 17.2).

#### TLS certificate authentication

@@ -595,7 +661,7 @@ in the same Kubernetes cluster.
This example can be easily adapted to cover an instance that resides
outside the Kubernetes cluster.

The manifest defines a new PostgreSQL 17.0 cluster called `cluster-clone-tls`,
The manifest defines a new PostgreSQL 17.2 cluster called `cluster-clone-tls`,
which is bootstrapped using the `pg_basebackup` method from the `cluster-example`
external cluster. The host is identified by the read/write service
in the same cluster, while the `streaming_replica` user is authenticated
@@ -610,7 +676,7 @@ metadata:
name: cluster-clone-tls
spec:
instances: 3
imageName: quay.io/enterprisedb/postgresql:17.0
imageName: quay.io/enterprisedb/postgresql:17.2

bootstrap:
pg_basebackup:
@@ -691,7 +757,7 @@ instance using a second connection (see the `--wal-method=stream` option for
Once the backup is completed, the new instance will be started on a new timeline
and diverge from the source.
For this reason, it is advised to stop all write operations to the source database
before migrating to the target database in Kubernetes.
before migrating to the target database.

!!! Important
Before you attempt a migration, you must test both the procedure
@@ -132,14 +132,14 @@ Given the following files:

Create a secret containing the CA certificate:

```
```sh
kubectl create secret generic my-postgresql-server-ca \
--from-file=ca.crt=./server-ca.crt
```

Create a secret with the TLS certificate:

```
```sh
kubectl create secret tls my-postgresql-server \
--cert=./server.crt --key=./server.key
```