Releases: PeerDB-io/peerdb
v0.11.2
Full Changelog: v0.11.1...v0.11.2
v0.11.1
What's Changed
This release includes important bug fixes in our CDC code, including fixes related to how Postgres records LSNs and to TID scanning in PostgreSQL 12.
- PullRecords: update replState.Offset in defer by @serprex in #1481
- Move PG/SF simple schema changes to generic test by @serprex in #1477
- Add environment variable to control parallelism of snowflake merge by @serprex in #1487
- Add environment variable to disable one-sync by @serprex in #1488
- Simplify pgtype.Time to time.Time conversion by @serprex in #1492
- UI: New default page and peer buttons by @Amogh-Bharadwaj in #1501
- [snapshot] fallback to full partitions in <PG12, cleanup by @heavycrystal in #1499
- Snowflake: remove ON_ERROR=CONTINUE by @serprex in #1494
Full Changelog: v0.11.0...v0.11.1
v0.11.0
What's Changed
In this version of PeerDB, we restructured our flow architecture, resulting in improved reliability (e.g., drop mirror) and performance. We added support for a brand new connector: Clickhouse! On the UI side, we upgraded validation checks across the board and added new features such as pausing and editing mirrors.
This release sees a major investment in the alerting and monitoring systems, with support for email and Slack alerts. We also added support for SSH connections to PostgreSQL peers. In the core ETL logic, we've solidified our data type handling, memory and storage management, logging, and testing.
- 🔄 Split cdc_flow into cdc_flow / sync_flow by @serprex in #1365
- 🛠️ Clickhouse cdc by @pankaj-peerdb in #1096
- 📧 feat(alerting): add email sender by @iamKunalGupta in #1433
- ⏸️ Pause and resume mirror buttons, along with state reflection by @heavycrystal in #1133
- ⚙️ Add Tables Feature: Added dynamic table addition to existing props signal by @heavycrystal in #1106
- 🚀 feat: add telemetry/alerts via sns by @iamKunalGupta in #1411
- 📦 Setup partitioning and clustering for raw table by @iskakaushik in #915
- 🗝️ SSH Support: Use SSHWrappedPool when querying the Postgres Peer by @iskakaushik in #1148
- ⚙️ Custom threshold support for slack alerts by @heavycrystal in #1277
- 🔄 Normalize concurrently with sync flows by @serprex in #893
- 💾 Spill to disk based on flow-worker memory usage by @heavycrystal in #1231
- 🔄 Update max batch size on signal by @iskakaushik in #910
- 🔄 drop_flow: retry until both source/destination succeed by @serprex in #1201
- ⚙️ Adds idletimeout to flow config, ui and temporal signal by @Amogh-Bharadwaj in #952
- ⚙️ Added capability for BQ CDC across datasets by @heavycrystal in #904
- 💾 Support dynamic numeric with defaults by @Amogh-Bharadwaj in #1194
- 🖥️ UI: Edit Page and Refactor Edit Mirror Route by @Amogh-Bharadwaj in #1156
- 🔄 Retry when WAL segment has not been found by @iskakaushik in #930
- 🗺️ HStore and Geospatial for Postgres by @Amogh-Bharadwaj in #1091
- ⚙️ Dynamically add new tables to CDC mirrors by @heavycrystal in #1084
- 🔄 cdc_flow: listen for shutdown request while sync flow in progress by @serprex in #1103
- 🔄 Go 1.22 by @serprex in #1219
- 🖥️ Clickhouse UI by @pankaj-peerdb in #1022
- 🛠️ Validate Mirror: PostgreSQL Checks by @Amogh-Bharadwaj in #1110
- ⚙️ Support specifying host key for ssh config by @serprex in #1125
- 🗑️ drop_flow: drop destination/source concurrently by @serprex in #1101
- 📦 BQ Peer: Support project dot dataset by @Amogh-Bharadwaj in #1073
- 🛠️ Validate peer: permission check for snowflake by @Amogh-Bharadwaj in #1126
- 🗑️ Removing deprecated params from QRep by @heavycrystal in #1154
- 💓 HeartbeatRoutine: use explicit ticker by @serprex in #1157
- 🛠️ Validate peer: check bigquery permissions by @Amogh-Bharadwaj in #1119
- 🖥️ UI: Show peer configuration on clicking peer by @Amogh-Bharadwaj in #1168
- 🔄 Always set application_name when connecting to postgres by @serprex in #1169
- 🖥️ UI: Resync button by @Amogh-Bharadwaj in #1178
- 🖥️ UI: Line chart for slot growth by @Amogh-Bharadwaj in #1184
- 🗑️ Remove configurable postgres metadata database by @serprex in #1189
- 🛠️ Snowflake: stop storing metadata on warehouse; store in catalog by @serprex in #1179
- 🛠️ BigQuery: stop storing metadata on warehouse; store in catalog by @serprex in #1191
- 🔄 Clickhouse cdc data types by @pankaj-peerdb in #1210
- 📝 Better CDC error logging by @iskakaushik in #1275
- 🛠️ Connectors: build on a single GetConnector using generic GetConnectorAs function by @serprex in #1281
Full Changelog: v0.10.2...v0.11.0
v0.10.2
v0.10.2 release
v0.10.1
v0.10.1 release
v0.10.0
What's Changed
In this version of PeerDB, we improved the way in which we pull rows from Postgres, and we now track the WAL extensively. This release includes support for Temporal Cloud. It also comes with exciting new features for PeerDB mirrors such as Mirror Resync, major UI upgrades, and a heavy focus on the development experience.
- 🔄 Postgres cdc: update mirror lsn_offset when wal processing raises consumedXLogPos by @serprex in #823
- ❄️ Snowflake: Run merges in parallel during normalize flow by @iskakaushik in #662
- 🔄 Replication should now work from PG 16 Standbys by @saisrirampur in #858
- 🚨 Basic alerting, refactored to use slack-go instead by @heavycrystal in #866
- 💼 Xmin rep by @serprex in #747
- ✨ Support Temporal Cloud by @Amogh-Bharadwaj in #692
- 🎯 Central slot collecting function for UI and monitoring by @Amogh-Bharadwaj in #771
- 💻 S3 Peer UI by @Amogh-Bharadwaj in #668
- 📋 Column Exclusion by @serprex in #601
- 🔄 full table resync for Snowflake QRep by @heavycrystal in #617
- 🔁 CDC/QRep full resync support for BigQuery by @heavycrystal in #639
- 🔄 schema changes for QRep BigQuery by @heavycrystal in #633
- 🔧 PG,BQ,SF CDC: PeerDB Columns by @Amogh-Bharadwaj in #845
- 📄 added metadata_schema option for SF and PG peers by @heavycrystal in #560
- 🔄 [eventhubs] Add more logs and retries by @iskakaushik in #622
- 🔁 RESYNC MIRROR for QRep Snowflake mirrors by @heavycrystal in #618
- 💻 UI for BigQuery Peer by @Amogh-Bharadwaj in #620
- ⏸️ PAUSE MIRROR support by @heavycrystal in #605
- 📝 Add screen to view mirror activity by @pankaj-peerdb in #62
- 🔄 Parallel Snowflake tests by @iskakaushik in #640
- 🛠️ Allow configurable column names for soft-delete and synced-at by @iskakaushik in #653
- 🖥️ Mirror Overview UI by @Amogh-Bharadwaj in #664
- 🔄 handling soft deleted rows during resync by @heavycrystal in #676
- ⏱️ Runs SendWalHeartbeat in parallel by @Amogh-Bharadwaj in #675
- 🗄️ Use catalog as external store by @Amogh-Bharadwaj in #680
- 🔍 [cdc] Flush at the beginning of CDC until consumedXLogPos by @iskakaushik in #688
- 🔄 SyncedAt Column for QRep by @Amogh-Bharadwaj in #854
- 📊 New Graph UI by @Amogh-Bharadwaj in #684
- 🌳 Heirarchical UI For CDC Table Picker And Refactoring by @Amogh-Bharadwaj in #700
- 🔒 UI auth by @serprex in #699
- 💓 WalHeartBeat across peers by @Amogh-Bharadwaj in #708
- 📚 Reads cert and key as base64 for Temporal Cloud by @Amogh-Bharadwaj in #725
- 🔄 support mixed case table names pg->pg by @iskakaushik in #722
- 🔄 Get task queues from a function based on deployment UID variable by @Amogh-Bharadwaj in #729
- 💻 Support for mirror name filter by @Amogh-Bharadwaj in #731
- 🔢 Use number of records synced as decider for CDC tests by @Amogh-Bharadwaj in #738
- 📝 add better logging for cdc flow by @iskakaushik in #740
- 🗄️ filtering system schemas from all tables and all schemas queries by @heavycrystal in #741
- 🏷️ added Dropdown to filter peers by type by @heavycrystal in #742
- 🛡️ Better peer checks by @Amogh-Bharadwaj in #572
- 🔄 add an endpoint to expose the PeerDB version from Flow API by @iskakaushik in #749
- 📊 Metrics For XMIN by @Amogh-Bharadwaj in #762
- ➕ Added spill to disk using Pebble for CDC records by @heavycrystal in #760
- 📊 Improve: slot size monitoring by @Amogh-Bharadwaj in #782
- 🛠️ Address GH code scanning by @serprex in #773
- ⚡ Make tests faster by @iskakaushik in #783
- 🔄 Track wal_status in slot info by @Amogh-Bharadwaj in #790
- 🔄 Update go modules by @serprex in #787
- 🚫 Avoid having multiple catalog connection pools by @serprex in #793
- 🔑 Allow the ability to connect to Postgres via an SSH tunnel by @iskakaushik in #800
- 🔄 Make sync batch size dynamic by @iskakaushik in #806
- ➕ Adding IF NOT EXISTS for pg and bq by @heavycrystal in #808
- 🛠️ BigQuery: Avro loader improvements by @iskakaushik in #810
- ⏱️ Changing idle timeout to 60 seconds by @heavycrystal in #791
- 🔄 Migrate to Go 1.21's native slog by @Amogh-Bharadwaj in #764
- 🔍 BigQuery: add wait for table by @iskakaushik in #817
- 🔧 Move configuration params to a central place by @iskakaushik in #818
- 🖥️ UI: Advanced section with pull batch size by @Amogh-Bharadwaj in #822
- 🖥️ Improve loading, mirrors page by @pankaj-peerdb in #841
- 📜 logs schema deltas to catalog as soon as they are read by @heavycrystal in #842
- 🔄 Update go dependencies: CVE-2023-48795 by @serprex in #850
- 🛠️ Only run nexus/flow/ui CI when PR affects their directory by @serprex in #848
- 🔧 Remove direct dependency on pkcs1/pkcs8 by @serprex in #853
- 🔄 Refactor replica identity type and primary key column retrieval in Postgres by @iskakaushik in #860
Full Changelog: v0.9.2...v0.10.0
v0.9.2
What's Changed
In this version of PeerDB, streaming performance has improved across the board by ~30%. This comes from moving to a push-while-pull architecture in place of the pull-then-push architecture we had before. We also spent a good chunk of time on the UI.
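To picture the difference between the two architectures named above, here is a minimal, language-agnostic sketch in Python. It is not PeerDB's code: the function names (`pull_then_push`, `push_while_pull`), the `push` callback, and the `batch_size` and `buffer_size` parameters are all hypothetical, and it assumes a single pusher thread fed through a bounded queue.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def pull_then_push(source, push, batch_size):
    """Baseline: pull every record first, then push them in batches."""
    records = list(source)  # the whole pull finishes before any push starts
    for i in range(0, len(records), batch_size):
        push(records[i:i + batch_size])


def push_while_pull(source, push, batch_size, buffer_size=4):
    """Push batches on a worker thread while the caller keeps pulling."""
    # Bounded queue: the puller blocks if the pusher falls too far behind.
    batches = queue.Queue(maxsize=buffer_size)
    errors = []

    def pusher():
        while True:
            batch = batches.get()
            if batch is _DONE:
                return
            if errors:
                continue  # keep draining so the puller never blocks forever
            try:
                push(batch)
            except Exception as exc:  # remember the first failure
                errors.append(exc)

    worker = threading.Thread(target=pusher)
    worker.start()
    try:
        batch = []
        for record in source:  # pulling overlaps with pushing
            batch.append(record)
            if len(batch) == batch_size:
                batches.put(batch)
                batch = []
        if batch:
            batches.put(batch)  # final partial batch
    finally:
        batches.put(_DONE)
        worker.join()
    if errors:
        raise errors[0]


if __name__ == "__main__":
    pushed = []
    push_while_pull(iter(range(7)), pushed.append, batch_size=3)
    print(pushed)  # [[0, 1, 2], [3, 4, 5], [6]]
```

Both functions deliver the same batches in the same order. The second one simply lets the destination start receiving data before the source has been fully read, which is where a throughput gain of this kind would come from.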
- 🚀 Either go through gRPC gateway or use prisma by @iskakaushik in PR#519
- 🔄 Removed ENABLE_STATS option, checking for catalog connectivity by @heavycrystal in PR#517
- 🐛 Fix docker builds for ui by @iskakaushik in PR#520
- 🌍 Geospatial support for Snowflake by @Amogh-Bharadwaj in PR#516
- 🛠️ Make qrep status more useful by @iskakaushik in PR#522
- 🔧 Add the ability to push to eventhubs in an asynchronous way by @iskakaushik in PR#523
- 📦 [ui] Minor bugfixes and improvements by @iskakaushik in PR#524
- 📝 Better COPY command quoting by @Amogh-Bharadwaj in PR#526
- 📊 Add tabs for cdc mirror status page by @iskakaushik in PR#525
- 🔑 Composite primary key support for SF, PG, and BQ by @heavycrystal in PR#499
- 📁 Optionally create watermark table on destination for qrep mirrors by @heavycrystal in PR#528
- 📈 Some refinements to the status pages by @iskakaushik in PR#529
- 🔗 Improve connection params for postgres connector by @iskakaushik in PR#530
- 💡 Optimize Avro Streaming with zstd Compression for Snowflake by @iskakaushik in PR#527
- 🖥️ UI for Create QRep Mirror by @Amogh-Bharadwaj in PR#532
Full Changelog: v0.8.1...v0.9.0
v0.8.1
🚀 Release Notes v0.8.1
🌟 Highlights:
- Major improvements and features added to EventHub GA.
- Introduction of GCS as a new destination.
- Important schema changes and several new features to boost your workflow.
- Miscellaneous changes for overall system improvements.
- Warm welcome to our new contributors!
🎉 EventHub GA:
- 📝 Parameters for EH CDC by @Amogh-Bharadwaj in #375
- 📊 Event Hub CDC Logs by @Amogh-Bharadwaj in #374
- ⏲ Metrics, timing for Eventhub CDC by @Amogh-Bharadwaj in #390
- 🗑️ Adding DROP MIRROR support for EventHub by @heavycrystal in #402
- ❤️ Update heartbeat for eventhub by @iskakaushik in #409
... [and more]
🌐 GCS as a destination:
- 📦 CDC to S3/GCS by @Amogh-Bharadwaj in #507
- 🔄 Support GCS via S3 API for query replication by @Amogh-Bharadwaj in #502
🔧 Basic Schema Changes:
- 📐 Basic ADD COLUMN replay support for PG, BQ, and SF by @heavycrystal in #368
✨ New Features:
- 🛑 Support DROP MIRROR for Query Replication by @iskakaushik in #481
- 🕑 XMIN for Query Replication by @Amogh-Bharadwaj in #403
- ✂️ Added QRep overwrite mode, to truncate destination table by @heavycrystal in #385
... [and more]
🐛 Bug Fixes:
- ⚠️ [Important] Return only at commit message in Postgres CDC by @iskakaushik in #503
- ⌛ Wait for the workflow to actually close before proceeding by @iskakaushik in #417
- ✅ Replica Identity Check by @Amogh-Bharadwaj in #392
... [and more]
🚀 Performance:
- ⬆️ Pass QValue by pointer to significantly reduce gc pressure by @iskakaushik in #381
- 📤 [postgres] Copy to destination not staging by @iskakaushik in #498
🔧 Miscellaneous:
- 🔄 Split Connector interface to better represent capabilities by @heavycrystal in #376
- 📊 Optimizing flow tests by separating and parallelizing test suites by @heavycrystal in #356
... [and more]
🆕 New Contributors:
- 🎉 @serprex made their first contribution in #451
- 🎉 @iamKunalGupta made their first contribution in #508
Full Changelog: v0.7.1...v0.8.1
v0.7.1
What's Changed
The major changes in this release are around supporting initial loads via CDC Mirrors. We have also added basic monitoring and metrics for CDC and query replication mirrors. Here are some of the highlights of this release.
Highlights
- 🚀 Blazing fast initial load supported for CDC Mirrors.
- 📦 Migrate Flow API to gRPC for better type support.
- 🌐 Add Azure Eventhub Peer support for CDC.
- 🗄️ Support CTID as a partition column for Postgres.
- 📋 Improved JSON and Array support across all the Peers.
- 📊 Add monitoring for Mirrors (Prometheus and Grafana).
- 📈 Add metrics on catalog for CDC and query replication.
- 🗂️ Add SQL Server as a source for query replication.
- 🚨 Add AVRO sync mode for CDC Mirrors - Snowflake and BigQuery.
- 🖥️ Improve logging - added partition and flow job information as applicable.
- ❤️ Add heartbeat support for all the activities.
- 🔒 Support TLS connections on Postgres.
New Contributors
- @arajkumar made their first contribution in #363
Full Changelog: v0.6.3...v0.7.1
v0.7.0
new release