(deprecated) Optimizing KafkaMirrorMaker2 translated offset resolution for cluster failover scenarios

  • application code needs to contain custom logic that decides when the consumer needs to use (normal) saved group offsets, or the data from checkpoint topic;
  • to resolve the translated offsets, the application needs to pull in connect-mirror-client dependency;
  • the process of figuring out offset for a group-partition pair involves reading all messages in the checkpoint topic, what might be wasteful in environments where there are multiple consumers with different group ids (as we would be reading the same messages over and over).
  1. stop all the consumers in source clusters,
  2. perform our work — rewrite data stored in checkpoint topics into __consumer_offsets in target cluster,
  3. start the consumers in target clusters, using the same group ids, consuming from replica partitions.
  1. in computeConsumerGroupOffsets we read all messages present in checkpoint topic, so we have the most recent offsets for each consumer group-partition pair;
  2. in updateConsumerGroupOffsets, we update the target cluster’s consumer group offsets using AdminClient’s alterConsumerGroupOffsets
  • the consumers in target cluster are NOT active when translation is happening (alterConsumerGroupOffsets requires groups to be empty to succeed);
  • the producers have stopped producing to topics and all messages have replicated before we run our tool (as we get endOffsets only once);
  • the replication policy ensures that replicated records end up in its own dedicated topic (otherwise us modifying consumer group offsets could also skip over locally-produced messages);
  • as a good practice, the consumers should remember to configure auto.offset.reset property — we have inserted the offset data, but the records in replica partitions might have been deleted in the meantime (e.g. by user action or due to Kafka’s time/volume-based retention policies).




Software developer, Java mostly, C & C++ sometimes (https://www.linkedin.com/in/adam-kotwasinski/)

Love podcasts or audiobooks? Learn on the go with our new app.

Recommended from Medium

Enabling IAM user and role access to your cluster Using RBAC

Create multi-cloud Kubernetes cluster

Accelerate Your Inner Dev Loop for Kubernetes Services

How to Create a Writing Portfolio When You Have Zero Experience

TryHackMe | Introductory Researching

Oracle Database Performance Tuning Problems- The Root Cause

FOSS Analytics Toolkit: Getting started

Hello everyone!!!

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Adam Kotwasinski

Adam Kotwasinski

Software developer, Java mostly, C & C++ sometimes (https://www.linkedin.com/in/adam-kotwasinski/)

More from Medium

Spring Events at VISO

Retryable Topics with Spring Kafka

Integrating NoSQL Database With Mule 4 (OOTB Cassandra Connector)

JVM part 01 — What is JVM?