iStreams: Searching and analyzing on-line high-volume industrial streams
This work is funded by VINNOVA.
In many industrial contexts, there is an increasing need to collect different kinds of measurement data from various technological systems for performance monitoring, diagnostics, fault detection, and usage pattern analysis. Using state-of-the-art communication technology, data can be transmitted from the embedded monitoring systems to another location for processing and analysis. As the volume of collected data grows, a high-performance dynamic search engine for searching and analyzing (data mining) real-time data streams becomes necessary.
For example, modern vehicles generate large volumes of digital data while in operation. This data is a valuable asset for vehicle manufacturers: it is used for diagnostics, for testing and verification, and as input to simulation models when designing next-generation products.
The goal of the iStreams project is to develop software infrastructures to efficiently monitor, filter, mine, and analyze large volumes of data originating in vehicles or other products and industrial equipment. The project investigates how large industrial data streams can be searched and analyzed directly in the streams as they flow from the embedded monitoring systems, via WLAN or other wireless access technologies, to computers at test sites, and further over broadband Internet to remote computers at engineering sites. A distributed Data Stream Management System (DSMS) is being developed in which the desired computations and subsequent filtering over such streams can be specified. Only the interesting stream sequences, transformed and filtered, are delivered immediately all the way to the remote analysis sites. The filters are expressed in terms of predefined filter functions that the engineer can use when searching on-line measurement data. The DSMS provides means for executing the filters at different locations in the distributed system, utilizing state-of-the-art hardware to obtain acceptable response times.
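As an illustration of the idea of predefined filter functions applied close to the data source, the following is a minimal Python sketch. It is not SCSQ code and does not reflect the DSMS query language; the `Reading` schema, the `window_mean_above` filter, and the `run_near_source` helper are all hypothetical names introduced here. The sketch only shows the principle that a filter runs over the stream as it arrives and that only the matching readings would be forwarded to a remote site.

```python
from collections import deque
from typing import Callable, Iterable, Iterator, List, NamedTuple


class Reading(NamedTuple):
    """One measurement from an embedded monitoring system (hypothetical schema)."""
    timestamp: float
    sensor: str
    value: float


StreamFilter = Callable[[Iterable[Reading]], Iterator[Reading]]


def window_mean_above(threshold: float, size: int) -> StreamFilter:
    """Build a filter that passes a reading when the mean of the last
    `size` values (including the current one) exceeds `threshold`."""
    def apply(stream: Iterable[Reading]) -> Iterator[Reading]:
        window: deque = deque(maxlen=size)
        for reading in stream:
            window.append(reading.value)
            # Only emit once the window is full, so early readings are dropped.
            if len(window) == size and sum(window) / size > threshold:
                yield reading
    return apply


def run_near_source(stream: Iterable[Reading],
                    filters: List[StreamFilter]) -> Iterator[Reading]:
    """Chain the filters lazily; in a distributed setting, only the
    readings that survive every filter would be sent onward."""
    for f in filters:
        stream = f(stream)
    return iter(stream)


# Hypothetical engine temperature samples, one per time unit.
samples = [80, 82, 95, 99, 101, 85, 70]
readings = [Reading(float(t), "engine_temp", float(v))
            for t, v in enumerate(samples)]

forwarded = list(run_near_source(readings, [window_mean_above(90.0, 3)]))
```

Because the filters are generators, nothing is buffered beyond the sliding window itself, which is the property that makes this style of filtering usable on an unbounded stream.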
The use of a system such as the one proposed here would enable industrial users to significantly cut test times and test-related costs. At the same time, such a system would make it possible to gather more data faster than before, and thereby to increase the quality of the developed products. The developed search engine will enable specialized companies to offer customized IT services to industrial enterprises. This in turn enables producers of advanced technical systems to offer customers a function rather than a piece of hardware or software.
As the technological basis for the iStreams project, we are extending the SCSQ prototype, developed at UDBL, to support search and analysis of high-volume industrial data streams.
New: SCSQ-PLR, the highly scalable parallel implementation of the Linear Road Benchmark for DSMSs, now achieves L=64, a substantial improvement in scalability over any previously published result for the benchmark:
E. Zeitler and T. Risch: Massive scale-out of expensive continuous queries, presented at the 37th International Conference on Very Large Data Bases, VLDB 2011, in Proceedings of the VLDB Endowment, Vol. 4, No. 11, 2011.
E. Zeitler and T. Risch: Scalable Splitting of Massive Data Streams, in Proc. 15th Conf. on Database Systems for Advanced Applications, DASFAA 2010, Tokyo, Japan, 1-4 April, 2010 (abstract).
Tore Risch is responsible for this project. It is a collaboration with Lennart Karlsson at Luleå University of Technology.
The following researchers from the Department of
Information Technology at Uppsala University are actively working on
the project:
Tore Risch, Professor, project leader. Main interests: Scalable and high-performing DSMS technology.
Erik Zeitler, MSc, Doctoral Student. Main interests: Scalable and parallel DSMS technology.
© 2008 Uppsala Universitet, Department of Information Technology, Box 337, 751 05 Uppsala, Sweden | This page is maintained by Tore Risch