Streaming I/O for scientific workflow engine acceleration

Perrotta S.; De Vita C. G.; Mellone G.; Montella Raffaele
2026-01-01

Abstract

Scientific workflows are increasingly characterized by complex task dependencies and large-scale data exchanges, which place significant pressure on the input/output (I/O) systems of traditional Workflow Engines (WFEs). These challenges are particularly evident in data-intensive and real-time processing contexts, where conventional disk-based I/O mechanisms often become performance bottlenecks. This paper presents an approach to enhancing the DAGonStar scientific workflow engine by integrating CAPIO, a middleware designed to support memory-based streaming I/O. The integration combines DAGonStar's orchestration capabilities with CAPIO's efficient data handling to better support workflows operating on continuous or large-scale datasets. We describe the architectural modifications introduced to enable this collaboration and provide an analysis of the resulting system. The proposed solution aims to improve the responsiveness and flexibility of scientific workflows by streamlining data transfers and simplifying task coordination. This work contributes to the evolution of workflow systems toward more efficient and scalable models for scientific computing.

Use this identifier to cite or link to this document: https://hdl.handle.net/11367/163187