<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Posts on Nikita Ryanov</title><link>https://nryanov.com/posts/</link><description>Recent content in Posts on Nikita Ryanov</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Wed, 31 Dec 2025 04:30:00 +0300</lastBuildDate><atom:link href="https://nryanov.com/posts/index.xml" rel="self" type="application/rss+xml"/><item><title>Different ways to setup CDC</title><link>https://nryanov.com/postgresql/cdc/</link><pubDate>Wed, 31 Dec 2025 04:30:00 +0300</pubDate><guid>https://nryanov.com/postgresql/cdc/</guid><description>&lt;p&gt;CDC is an important part of data processing. Using CDC, you can achieve many goals, from simple data replication to auditing and complex ETL jobs.
But implementing CDC is still a tough task (especially once you consider more than just the happy path).
In this article I want to show different ways of implementing CDC. I&amp;rsquo;ll try not only to show how to set up each variant, but also to compare them with each other and highlight the pros and cons of each option.&lt;/p&gt;</description></item><item><title>PostgreSQL: Log-based CDC using debezium</title><link>https://nryanov.com/postgresql/debezium-postgres/</link><pubDate>Fri, 10 May 2024 04:30:00 +0300</pubDate><guid>https://nryanov.com/postgresql/debezium-postgres/</guid><description>&lt;p&gt;In this short article I&amp;rsquo;ll show different ways to set up debezium for log-based CDC.
Before diving into the details of debezium, I&amp;rsquo;ll briefly describe CDC and why it may be helpful for some tasks.&lt;/p&gt;
&lt;h1 id="cdc-change-data-capture"&gt;CDC: Change Data Capture &lt;!-- raw HTML omitted --&gt;&lt;!-- raw HTML omitted --&gt;&lt;/h1&gt;
&lt;p&gt;On the Internet, &lt;code&gt;CDC&lt;/code&gt; is described as a design pattern that allows you to track data changes (deltas).
Let&amp;rsquo;s consider this approach using the table &lt;code&gt;user_balances&lt;/code&gt;. The initial state of the table is:&lt;/p&gt;</description></item><item><title>Delivery and processing semantics: overview</title><link>https://nryanov.com/overview/delivery-and-processing-semantics/</link><pubDate>Wed, 01 May 2024 04:30:00 +0300</pubDate><guid>https://nryanov.com/overview/delivery-and-processing-semantics/</guid><description>&lt;p&gt;In this article I want to give an overview of delivery semantics in messaging systems, describe delivery guarantees, and
add my own thoughts about all of this.&lt;/p&gt;
&lt;h1 id="delivery-semantics-overview"&gt;Delivery semantics: overview &lt;!-- raw HTML omitted --&gt;&lt;!-- raw HTML omitted --&gt;&lt;/h1&gt;
&lt;p&gt;So, what exactly are &lt;code&gt;delivery semantics&lt;/code&gt;, and why are they important? &lt;code&gt;Delivery semantics&lt;/code&gt; describe the guarantees provided by a messaging system or delivery protocol.
These guarantees concern message ordering (delivery and processing), delivery reliability, whether duplicates are allowed, and so on. In other words, &lt;code&gt;delivery semantics&lt;/code&gt; determine exactly how a message will be handled in terms of delivery.&lt;/p&gt;</description></item><item><title>Kafka-connect: overview</title><link>https://nryanov.com/kafka/kafka-connect-overview/</link><pubDate>Tue, 11 Jul 2023 04:30:00 +0300</pubDate><guid>https://nryanov.com/kafka/kafka-connect-overview/</guid><description>&lt;h1 id="kafka-connect-overview"&gt;Kafka-connect: overview&lt;/h1&gt;
&lt;p&gt;Imagine you have a task where you need to fetch data from a database and incrementally store it in Kafka, or to read data from Kafka and store it in a database.
You can solve both tasks using the plain Kafka consumer/producer API, or even the Kafka Streams library,
but if you don&amp;rsquo;t need comprehensive data transformations (e.g. enrichment, stream joins), then you can use Kafka Connect instead.&lt;/p&gt;</description></item><item><title>PostgreSQL: Log shipping Replication</title><link>https://nryanov.com/postgresql/postgresql-log-shipping-replication/</link><pubDate>Wed, 04 Jan 2023 22:50:00 +0300</pubDate><guid>https://nryanov.com/postgresql/postgresql-log-shipping-replication/</guid><description>&lt;h1 id="prerequisite"&gt;Prerequisite&lt;/h1&gt;
&lt;p&gt;All examples assume that PostgreSQL is already installed on your machine.
Also, all examples are created using &lt;code&gt;PostgreSQL 14.1 on aarch64-apple-darwin20.6.0, compiled by Apple clang version 13.0.0 (clang-1300.0.29.3), 64-bit&lt;/code&gt;.&lt;/p&gt;
&lt;h1 id="log-shipping-replication"&gt;Log shipping replication &lt;!-- raw HTML omitted --&gt;&lt;!-- raw HTML omitted --&gt;&lt;/h1&gt;
&lt;p&gt;Log shipping replication (I&amp;rsquo;ll use the short name &lt;code&gt;LSR&lt;/code&gt; for it) is another method of physically replicating data between multiple database clusters. As the name says, this method replicates data via WAL files (segments) transferred between instances. This is probably the simplest and most straightforward method of data replication, but this simplicity comes at a price, with compromises that should also be taken into account.&lt;/p&gt;</description></item><item><title>PostgreSQL: Streaming Replication</title><link>https://nryanov.com/postgresql/postgresql-streaming-replication/</link><pubDate>Thu, 10 Feb 2022 22:50:00 +0300</pubDate><guid>https://nryanov.com/postgresql/postgresql-streaming-replication/</guid><description>&lt;h1 id="prerequisite"&gt;Prerequisite&lt;/h1&gt;
&lt;p&gt;All examples assume that PostgreSQL is already installed on your machine.
Also, all examples are created using &lt;code&gt;PostgreSQL 14.1 on aarch64-apple-darwin20.6.0, compiled by Apple clang version 13.0.0 (clang-1300.0.29.3), 64-bit&lt;/code&gt;.&lt;/p&gt;
&lt;h1 id="streaming-replication"&gt;Streaming replication &lt;!-- raw HTML omitted --&gt;&lt;!-- raw HTML omitted --&gt;&lt;/h1&gt;
&lt;p&gt;Streaming replication is a built-in mechanism in PostgreSQL to replicate data between multiple servers.
It is a low-level replication mechanism: it streams WAL data from the primary server to the replica through a physical replication slot,
so it is highly recommended to use the same PostgreSQL major version on all servers (minor versions may differ).
Also, it is a good idea to use servers with equal configurations in terms of CPU, RAM and disks, especially if you plan to promote a replica to primary when the primary server goes down.&lt;/p&gt;</description></item><item><title>PostgreSQL: Logical Replication</title><link>https://nryanov.com/postgresql/postgresql-logical-replication/</link><pubDate>Fri, 04 Feb 2022 22:50:00 +0300</pubDate><guid>https://nryanov.com/postgresql/postgresql-logical-replication/</guid><description>&lt;h1 id="prerequisite"&gt;Prerequisite&lt;/h1&gt;
&lt;p&gt;All examples assume that PostgreSQL is already installed on your machine.
Also, all examples are created using &lt;code&gt;PostgreSQL 14.1 on aarch64-apple-darwin20.6.0, compiled by Apple clang version 13.0.0 (clang-1300.0.29.3), 64-bit&lt;/code&gt;.&lt;/p&gt;
&lt;h1 id="logical-replication"&gt;Logical replication &lt;!-- raw HTML omitted --&gt;&lt;!-- raw HTML omitted --&gt;&lt;/h1&gt;
&lt;p&gt;Logical replication is another method of replicating data between multiple nodes. It uses a publish-subscribe model.
Each publisher may have multiple subscribers, and each subscriber can subscribe to multiple publishers. Also, each subscriber may
itself be a publisher for another node, which makes it possible to create cascading replication.&lt;/p&gt;</description></item><item><title>How to create a small json lib using antlr and shapeless</title><link>https://nryanov.com/scala/json-lib-shapeless-antlr4/</link><pubDate>Sat, 08 May 2021 22:17:26 +0300</pubDate><guid>https://nryanov.com/scala/json-lib-shapeless-antlr4/</guid><description>&lt;p&gt;In this article I will show how antlr4 and shapeless can be used to create a small JSON library (not for production, of course ^_^)
with the ability to decode arbitrary JSON strings into case classes and encode them back with some Scala magic.&lt;/p&gt;
&lt;h2 id="project-setup"&gt;Project setup&lt;/h2&gt;
&lt;p&gt;Let&amp;rsquo;s begin with a project setup.&lt;/p&gt;
&lt;p&gt;Generally speaking, it doesn&amp;rsquo;t really matter which IDE you use, but I&amp;rsquo;ll use IntelliJ IDEA. The Community edition is more than enough. Also, I recommend installing the &lt;a href="https://plugins.jetbrains.com/plugin/7358-antlr-v4"&gt;antlr4 plugin&lt;/a&gt; for IntelliJ – it&amp;rsquo;s not necessary, but it really helps to create and debug ANTLR grammars.&lt;/p&gt;</description></item></channel></rss>