Overview

A very nicely designed and well-abstracted library that solves message passing between bounded contexts. However, the performance of this library is insufficient if your service emits > 500 messages/sec to Kafka.

Design Analysis

  1. In the case of the MySQL backend, the library uses this MySQL CDC tool. However, our past experience with MySQL CDC is not overly positive: it is stable enough for data analytics pipelines, but debatable for latency-sensitive (<5 s) cross-service communication. This suggests using polling instead of CDC to detect messages to send; a polling sketch follows this item.
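    A minimal sketch of the polling alternative over JDBC. The message table and its id/payload/published columns are assumptions for illustration, not the library's actual schema:

      import java.sql.*;

      // Hypothetical poller: fetch unpublished rows in insertion order,
      // publish them, then mark each row published in a separate UPDATE.
      public final class MessagePoller {
          private static final String POLL_SQL =
              "SELECT id, payload FROM message WHERE published = 0 ORDER BY id ASC LIMIT 100";

          public static void pollOnce(Connection conn) throws SQLException {
              try (PreparedStatement ps = conn.prepareStatement(POLL_SQL);
                   ResultSet rs = ps.executeQuery()) {
                  while (rs.next()) {
                      long id = rs.getLong("id");
                      String payload = rs.getString("payload");
                      // publish(payload) to Kafka, then
                      // UPDATE message SET published = 1 WHERE id = ?
                  }
              }
          }
      }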
  2. The library uses a published column in its events and message tables. Right now it has no logic to evict/purge published records. Moreover, the code uses a SELECT * FROM %s WHERE %s = 0 ORDER BY %s ASC LIMIT :limit statement to find messages to publish, i.e., without an index on the published column this is a naive full table scan. A sketch of an index plus a purge job follows this item.
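    A minimal sketch of both fixes against the same hypothetical message table: a composite index so MySQL can serve the poll query without scanning, and a bounded DELETE that purges published rows in chunks:

      import java.sql.*;

      // Hypothetical maintenance tasks for the outbox table.
      public final class OutboxMaintenance {
          public static void addIndex(Connection conn) throws SQLException {
              try (Statement st = conn.createStatement()) {
                  // Covers "WHERE published = 0 ORDER BY id ASC LIMIT n" exactly.
                  st.execute("CREATE INDEX idx_unpublished ON message (published, id)");
              }
          }

          public static int purgePublished(Connection conn) throws SQLException {
              // Bounded delete; run repeatedly (e.g., from a cron job) until it returns 0.
              try (PreparedStatement ps = conn.prepareStatement(
                      "DELETE FROM message WHERE published = 1 LIMIT 10000")) {
                  return ps.executeUpdate();
              }
          }
      }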
  3. The library stores messages for all topics in the same events or message table, so every producing transaction inserts into one table and all writers contend on the tail of the same PK index. This lock contention will be a major problem; a sharding sketch follows this item.
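    One possible mitigation, sketched under the assumption that the schema can be changed: give each topic its own outbox table so writers for different topics stop contending on a single PK index (the naming scheme and columns are illustrative):

      import java.sql.*;

      // Hypothetical per-topic outbox: route each insert to message_<topic>.
      public final class ShardedOutbox {
          public static void insert(Connection conn, String topic, long id, String payload)
                  throws SQLException {
              String table = "message_" + topic.replaceAll("[^A-Za-z0-9_]", "_");
              try (PreparedStatement ps = conn.prepareStatement(
                      "INSERT INTO " + table + " (id, payload, published) VALUES (?, ?, 0)")) {
                  ps.setLong(1, id);
                  ps.setString(2, payload);
                  ps.executeUpdate();
              }
          }
      }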
  4. The `events or message` table uses a long VARCHAR as its id column type. In reality a compact fixed-width value is enough, e.g., a 128-bit UUID or even a 64-bit Snowflake-style ID; a generator sketch follows this item.
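    A minimal sketch of a Snowflake-style 64-bit generator (41-bit timestamp, 10-bit worker id, 12-bit sequence); the custom epoch and bit layout here are assumptions:

      // Generates time-ordered 64-bit ids; one instance per worker.
      public final class SnowflakeId {
          private static final long EPOCH = 1600000000000L; // custom epoch in ms, an assumption
          private final long workerId;                      // 0..1023
          private long lastTimestamp = -1L;
          private long sequence = 0L;

          public SnowflakeId(long workerId) {
              this.workerId = workerId & 0x3FF;
          }

          public synchronized long nextId() {
              long now = System.currentTimeMillis();
              if (now == lastTimestamp) {
                  sequence = (sequence + 1) & 0xFFF;   // up to 4096 ids per ms per worker
                  if (sequence == 0) {                 // exhausted: spin until the next ms
                      while ((now = System.currentTimeMillis()) <= lastTimestamp) { }
                  }
              } else {
                  sequence = 0;
              }
              lastTimestamp = now;
              return ((now - EPOCH) << 22) | (workerId << 12) | sequence;
          }
      }

    Such ids are also time-ordered, which keeps PK inserts append-only.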
  5. The producer uses the following code to send the message:
    this.producer.send(aggregateTopic, this.publishingStrategy.partitionKeyFor(event), json).get(10L, TimeUnit.SECONDS);
    

    This means each message is sent synchronously: the call blocks until the broker acknowledges. If we estimate the per-message latency:

    • one network call to the broker: > 0.5 ms
    • broker performs the majority (quorum) write: > 0.5 ms network + 0.1 ms SSD write
    • > 0.5 ms network + 0.1 ms SSD write for the DB to update the published column
    • Adding the above up gives 0.5 + 0.6 + 0.6 ≈ 2 ms of hard-floor latency per message, i.e., this synchronous path can support at most ~500 msg/sec per thread. An asynchronous alternative is sketched below.
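    A minimal sketch of the asynchronous alternative using the plain Kafka client; the producer settings and the markPublished helper are assumptions for illustration, not the library's API:

      import java.util.Properties;
      import org.apache.kafka.clients.producer.*;

      // Hypothetical async variant: pipeline sends instead of blocking on each,
      // so the ~2 ms round trip amortizes over a batch rather than gating every message.
      public final class AsyncPublisher {
          private final KafkaProducer<String, String> producer;

          public AsyncPublisher(String bootstrapServers) {
              Properties props = new Properties();
              props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
              props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                        "org.apache.kafka.common.serialization.StringSerializer");
              props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                        "org.apache.kafka.common.serialization.StringSerializer");
              props.put(ProducerConfig.ACKS_CONFIG, "all");  // still a quorum write
              props.put(ProducerConfig.LINGER_MS_CONFIG, 5); // allow client-side batching
              this.producer = new KafkaProducer<>(props);
          }

          public void publish(String topic, String key, String json, long messageId) {
              producer.send(new ProducerRecord<>(topic, key, json), (metadata, ex) -> {
                  if (ex == null) {
                      markPublished(messageId); // hypothetical: sets published = 1 for this row
                  }
                  // on failure the row stays unpublished and is retried by the poller
              });
          }

          private void markPublished(long messageId) { /* UPDATE message SET published = 1 ... */ }
      }

    With many sends in flight, throughput is bounded by batching and broker capacity rather than the per-message round trip.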