In distributed streaming data processing, most frameworks implement minor variations of the Publish-Subscribe pattern, where message passing happens directly between each Publisher and the group of its Subscribers. This work introduces a novel pattern, named Journal, that uses a so-called Editor to filter or modify the data stream in a principled manner. The Editor can be integrated into the Publish-Subscribe pattern with two different schemata, and has been used to implement multiple subsampling strategies that reduce the volume of forwarded data, create new communication channels, and match the ingestion capacity of the consumers. A test using Apache Kafka with a stream of simulated data confirmed the viability of integrating the Editor into Pub-Sub. We show that with the Journal pattern the risk of saturating a channel can be significantly lowered and the processing latency of clients can be notably reduced. We stress that the Journal pattern is very general and can be extended to many other purposes.
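As an illustration of the idea, the following minimal sketch places an Editor between a Publisher and its Subscribers using a toy in-memory broker rather than Apache Kafka. It shows one possible arrangement, not necessarily either of the two schemata defined in this work. The class names `Broker` and `Editor`, the helper `keep_every`, and the topic names `readings` and `readings.sampled` are all hypothetical, and the one-in-n rule is just one simple subsampling strategy chosen for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional

Message = dict
Handler = Callable[[Message], None]
# A transform returns the (possibly modified) message, or None to drop it.
Transform = Callable[[Message], Optional[Message]]


class Broker:
    """Toy in-memory Pub-Sub broker (assumption: stands in for a real one)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Message) -> None:
        # Deliver the message directly to every subscriber of the topic.
        for handler in list(self._subscribers[topic]):
            handler(message)


class Editor:
    """Hypothetical Editor: subscribes to a source topic, applies a
    transform, and republishes the surviving messages on a target topic."""

    def __init__(self, broker: Broker, source: str, target: str,
                 transform: Transform) -> None:
        self._broker = broker
        self._target = target
        self._transform = transform
        broker.subscribe(source, self._on_message)

    def _on_message(self, message: Message) -> None:
        edited = self._transform(message)
        if edited is not None:
            self._broker.publish(self._target, edited)


def keep_every(n: int) -> Transform:
    """Systematic subsampling: forward one message out of every n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    count = 0

    def transform(message: Message) -> Optional[Message]:
        nonlocal count
        keep = count % n == 0
        count += 1
        return message if keep else None

    return transform


broker = Broker()
received: List[Message] = []

# A slow consumer reads the reduced channel instead of the raw one.
broker.subscribe("readings.sampled", received.append)
Editor(broker, "readings", "readings.sampled", keep_every(10))

# The Publisher is unchanged: it keeps writing to the original topic.
for i in range(100):
    broker.publish("readings", {"seq": i})
```

In this sketch the Publisher and the consumer are unaware of each other and of the Editor. The Editor creates a new, lower-volume channel, so a consumer with limited ingestion capacity can subscribe to it while other Subscribers continue to read the full stream.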