All You Need to Know About the Store and Forward System

Plural Online by Pine Labs

Applications where events cannot be lost, payments in particular, need a lossless messaging platform that delivers events reliably. While such a system can be built in any number of ways, the Transactional Outbox pattern makes this capability straightforward to implement.

Context

In the context of Payments within Plural, with multiple services in the ecosystem, there is a requirement to send events/messages to other systems reliably, following at-least-once delivery semantics. These events are processed in non-real time and do not affect the transactional flow. One real use case in Payments is processing payments in real time while reliably handing the payment records to the Settlement systems, which can process settlements later. Settlements are usually non-real-time in nature, so this use case can employ a store and forward system for reliable processing.

The Pattern

The above-mentioned use case can be solved using the Transactional Outbox pattern, which addresses the problem of reliably publishing a message to a message broker. Before discussing the pattern, we should understand the Dual Write problem, which results in data inconsistencies in distributed systems.

Dual Writes

In the monolithic world, distributed transactions were handled using the 2-Phase Commit (2PC) protocol. In a microservices architecture, 2PC is not an option, as it involves a complex locking mechanism that is difficult to operate at scale. That leaves two approaches, synchronous and asynchronous, to update data across multiple systems.

Synchronous updates through the invocation of other services' APIs are an option. The issue is that synchronous updates require all the participating systems to be available at all times, and in distributed systems, assuming the network is always reliable is a classic fallacy. Though this can be circumvented through retry policies, the resulting tight coupling is not recommended.

With an asynchronous approach, we can use a database and a message broker like Kafka. However, no single transaction spans both the database and Kafka, so ensuring the transactionality of a data write across the two is not possible.

Approach 1

1. Write to the database first, commit, and then publish to the message broker.

2. This seems an easy approach, but it does not ensure reliable message delivery: if Kafka is down during the publish step, the committed event is never delivered (see the sketch after this list).

3. Tracking committed-but-undelivered messages is not straightforward; it requires some offset to be maintained and adds complexity.
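
A minimal sketch of Approach 1 in Python, using sqlite3 for the database and a hypothetical `broker` object with a `publish` method standing in for a Kafka producer (the table schema and broker API are illustrative, not from the original article):

```python
import sqlite3

conn = sqlite3.connect("payments.db")
conn.execute("CREATE TABLE IF NOT EXISTS payments (id TEXT PRIMARY KEY, amount INTEGER)")

def record_payment_write_first(payment_id: str, amount: int, broker) -> None:
    # Step 1: write to the database and commit.
    conn.execute("INSERT INTO payments (id, amount) VALUES (?, ?)",
                 (payment_id, amount))
    conn.commit()
    # Step 2: publish to the message broker.
    # If the broker is down here, the row is already committed but the
    # event is lost; recovering it needs extra offset bookkeeping.
    broker.publish("payments", {"id": payment_id, "amount": amount})
```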

Approach 2

1. Publish to the message broker first and then write to the database.

2. If the database goes down after the publish to Kafka, data inconsistency creeps in: the message published to the broker may already have been consumed by Service B, while the corresponding data is not available in the database (see the sketch after this list).
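
The inverse ordering, reusing the `conn` connection and the illustrative `broker` stand-in from the previous sketch:

```python
def record_payment_publish_first(payment_id: str, amount: int, broker) -> None:
    # Step 1: publish to the message broker first.
    broker.publish("payments", {"id": payment_id, "amount": amount})
    # Step 2: write to the database.
    # If the database is down here, Service B may already have consumed
    # an event for a payment that never made it into the database.
    conn.execute("INSERT INTO payments (id, amount) VALUES (?, ?)",
                 (payment_id, amount))
    conn.commit()
```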

The Transactional Outbox

1. This pattern involves writing to an Outbox table along with the Domain table under a single transaction context (see the sketch after this list).

2. This ensures that data consistency guarantees are met.

3. A separate database log-tailing component tracks inserts into the Outbox table and publishes the messages to the message broker.
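
A minimal sketch of the outbox write, with an illustrative `outbox` schema alongside the `payments` domain table from the earlier sketches. Because both inserts share one transaction, either both rows exist or neither does, which removes the dual write:

```python
import json
import uuid

# Illustrative outbox table, living in the same database as the domain table.
conn.execute("""CREATE TABLE IF NOT EXISTS outbox (
    id        TEXT PRIMARY KEY,      -- unique message id
    topic     TEXT NOT NULL,         -- destination topic for the relay
    payload   TEXT NOT NULL,         -- serialized event
    published INTEGER DEFAULT 0      -- set by the relay after publishing
)""")

def record_payment_with_outbox(payment_id: str, amount: int) -> None:
    # Both inserts commit (or roll back) together: no dual write.
    with conn:  # the sqlite3 connection context manager wraps one transaction
        conn.execute("INSERT INTO payments (id, amount) VALUES (?, ?)",
                     (payment_id, amount))
        conn.execute("INSERT INTO outbox (id, topic, payload) VALUES (?, ?, ?)",
                     (str(uuid.uuid4()), "payments",
                      json.dumps({"id": payment_id, "amount": amount})))
```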

While the database and the log-tailing component are generic, off-the-shelf tools such as Debezium, which works through Change Data Capture (CDC), can be used for the tailing.
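Debezium tails the database's change log; where CDC is unavailable, a simple polling relay over the outbox table gives the same at-least-once behaviour. A sketch, reusing `conn` and the illustrative `broker` stand-in from above:

```python
import time

def run_outbox_relay(broker, poll_interval: float = 1.0) -> None:
    # Forward unpublished outbox rows to the broker, oldest first.
    while True:
        rows = conn.execute(
            "SELECT id, topic, payload FROM outbox "
            "WHERE published = 0 ORDER BY rowid"
        ).fetchall()
        for msg_id, topic, payload in rows:
            # Publish first, then mark as published. A crash between the
            # two steps re-publishes the row on restart: at-least-once.
            broker.publish(topic, payload, key=msg_id)
            with conn:
                conn.execute("UPDATE outbox SET published = 1 WHERE id = ?",
                             (msg_id,))
        time.sleep(poll_interval)
```

Passing the outbox `id` along with the event is a deliberate choice here: it gives the consumer a stable key for the idempotency check described next.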

Idempotency Management

The above pattern yields a reliable message delivery platform with an at-least-once delivery guarantee. The flip side is that the consuming service must handle duplicate deliveries, employing idempotency checks to prevent processing the same message twice.
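
A minimal consumer-side idempotency check, assuming each message carries the unique outbox id. The `processed_messages` table and `process_settlement` function are illustrative names, not part of the original article:

```python
import sqlite3

consumer_db = sqlite3.connect("settlements.db")  # hypothetical consumer-side store
consumer_db.execute(
    "CREATE TABLE IF NOT EXISTS processed_messages (id TEXT PRIMARY KEY)")

def process_settlement(payload: str) -> None:
    print("settling:", payload)  # stand-in for the real settlement logic

def handle_message(msg_id: str, payload: str) -> None:
    with consumer_db:
        try:
            # The primary key rejects a second insert of the same id, so a
            # duplicate delivery is detected before any processing happens.
            consumer_db.execute(
                "INSERT INTO processed_messages (id) VALUES (?)", (msg_id,))
        except sqlite3.IntegrityError:
            return  # already processed: skip the duplicate
        # If processing fails, the insert above rolls back too, so the
        # message is retried on the next delivery.
        process_settlement(payload)
```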
