Adaptable parsing of real-time data streams
COPPOLINO, Luigi; ROMANO, Luigi
2007-01-01
Abstract
Today's business processes are rarely carried out within a single company's domain. More often they involve geographically distributed entities that interact in loosely coupled cooperation. While cooperating, these entities generate transactional data streams, such as sequences of stock-market buy/sell orders, credit-card purchase records, Web server entries, and electronic fund transfer orders. Such streams are often collections of events stored and processed locally, and they therefore typically have ad-hoc, heterogeneous formats. On the other hand, elements in such data streams usually share common semantics, and they can be profitably mined to obtain combined global events. In this paper, we present an approach to the parsing of heterogeneous data streams based on the definition of format-dependent grammars and the automatic production of ad-hoc parsers. The stream-dependent parsers can be obtained dynamically and fully automatically, provided that the appropriate grammar, written in a common format, is fed into the system. We also present a fully working implementation that has been successfully integrated into a telecommunication environment for real-time processing of billing information flows.
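The idea of deriving a stream-specific parser from a format description can be illustrated with a minimal sketch. This is not the authors' system or grammar notation: the dictionary-based "grammar" layout, the field names (`caller`, `callee`, `seconds`), and the functions `build_parser` and `parse_stream` are all hypothetical, chosen only to show two differently formatted billing-like streams being mapped to the same event structure.

```python
from typing import Any, Callable, Dict, Iterable, Iterator

# Hypothetical declarative "grammars", one per stream format. Each names the
# field separator and the (field name, converter) pairs in record order.
CSV_GRAMMAR = {
    "separator": ",",
    "fields": [("caller", str), ("callee", str), ("seconds", int)],
}
PIPE_GRAMMAR = {
    "separator": "|",
    "fields": [("seconds", int), ("caller", str), ("callee", str)],
}


def build_parser(grammar: Dict[str, Any]) -> Callable[[str], Dict[str, Any]]:
    """Produce a record parser at runtime from a format description."""
    sep = grammar["separator"]
    fields = grammar["fields"]

    def parse(line: str) -> Dict[str, Any]:
        parts = line.strip().split(sep)
        if len(parts) != len(fields):
            raise ValueError(f"expected {len(fields)} fields, got {len(parts)}")
        # Convert each raw token with the converter declared in the grammar.
        return {name: conv(tok.strip()) for (name, conv), tok in zip(fields, parts)}

    return parse


def parse_stream(lines: Iterable[str], grammar: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Lazily parse a stream of records, skipping blank lines."""
    parser = build_parser(grammar)
    for line in lines:
        if line.strip():
            yield parser(line)


if __name__ == "__main__":
    for event in parse_stream(["alice,bob,42"], CSV_GRAMMAR):
        print(event)
    for event in parse_stream(["42|alice|bob"], PIPE_GRAMMAR):
        print(event)
```

In this sketch, supporting a new stream format means supplying a new description rather than writing a new parser by hand, which is the property the abstract emphasises. A real grammar-driven generator would handle far richer formats than delimiter-separated fields.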