Research at Uppsala DataBase Laboratory (UDBL)
Georgios Fakas
Department of Information Technology
Uppsala University, Uppsala, Sweden
Current active research directions include big data, keyword search and ranking on (semi-)structured data and (attributed) graphs, semantic data, spatial data, and online (geo-)social networks. We are also interested in workflow management.
Another research direction of the group concentrates on developing methods for the representation and scalable, extensible processing of queries that analyze different kinds of distributed data in terms of semantic 'NoSQL' data representations. Of particular interest is the scalable processing of high-level queries that analyze high-volume data streams. The challenge is to keep processing scalable as the data volume increases and the analyses become increasingly costly. Our approach is to develop smart query transformation techniques and distributed execution strategies in an extensible platform where external systems, algorithms, and data managers can be plugged in. More details and projects can be found here.
Our research system Amos II provides a platform for scalable processing of queries over many different kinds of heterogeneous data sources. The queries are expressed in terms of a high-level semantic data model. The system enables the integration of external storage managers, databases, and computational systems through APIs in several programming languages.
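The idea of plugging external data managers into a common query layer can be illustrated with a small Python sketch. This is not the Amos II API: the `Mediator` class, its `register` and `select` methods, the `csv_source` helper, and the sensor data are all hypothetical names invented for this example. It only shows how one filter-and-project query can run unchanged over an in-memory source and a CSV-backed source once each is wrapped as a scan function.

```python
import csv
import io
from typing import Callable, Dict, Iterable, Iterator, List, Optional

Row = Dict[str, object]


class Mediator:
    """Hypothetical mediator that routes queries to registered source wrappers."""

    def __init__(self) -> None:
        self._sources: Dict[str, Callable[[], Iterable[Row]]] = {}

    def register(self, name: str, scan: Callable[[], Iterable[Row]]) -> None:
        # A wrapper is any zero-argument function that yields rows as dicts.
        self._sources[name] = scan

    def select(
        self,
        source: str,
        where: Callable[[Row], bool] = lambda row: True,
        project: Optional[List[str]] = None,
    ) -> Iterator[Row]:
        # Filter and project rows lazily, independent of where they are stored.
        for row in self._sources[source]():
            if where(row):
                yield row if project is None else {k: row[k] for k in project}


def csv_source(text: str) -> Callable[[], Iterable[Row]]:
    """Wrap CSV text (standing in for an external store) as a scan function."""
    def scan() -> Iterable[Row]:
        return csv.DictReader(io.StringIO(text))
    return scan


mediator = Mediator()
# Invented example data: one in-memory source and one CSV-backed source.
mediator.register(
    "sensors_mem",
    lambda: [{"id": "s1", "temp": 71.5}, {"id": "s2", "temp": 64.0}],
)
mediator.register("sensors_csv", csv_source("id,temp\ns3,80.25\n"))


def hot_sensors(threshold: float) -> List[str]:
    """Run the same query over both heterogeneous sources."""
    result: List[str] = []
    for name in ("sensors_mem", "sensors_csv"):
        rows = mediator.select(
            name,
            where=lambda r: float(r["temp"]) > threshold,
            project=["id"],
        )
        result.extend(str(row["id"]) for row in rows)
    return result
```

Calling `hot_sensors(70.0)` returns the identifiers from both sources whose temperature exceeds the threshold. A real system would additionally optimize the query and push selections down into the sources; the sketch omits that.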
Within the combined fields of Database Technology and Data Analytics there are several research challenges that currently attract a lot of interest and that are part of our current research plan. Research in database technology commonly deals with high-level and scalable data management, and an important aspect of data analytics involves the analysis of large-scale data sets and streams. Key issues are to provide data processing both at the edge in cyber-physical systems (industrial equipment) and at an aggregated level in the cloud (or in a corresponding environment). In addition to processing performance, high-level access to data and data analytics capabilities (i.e. query-based numerical operators) are vital for the overall process performance. Another important aspect is to support efficient query optimization and indexing techniques for data, data streams, and the corresponding analytical queries, since these mechanisms can reduce the computational complexity and are thus important for supporting green computing.
We are investigating and developing the capability to efficiently handle data and data streams in industrial processes, which is critical for transforming the current manufacturing industry, with the overall goal of improving the productivity and quality of industrial processes and products. A critical area within this context is the scalable capability to collect, process, analyse, and visualize data streams to support cyber-physical systems, exemplified by machining and production processes, hydraulic power systems, and heavy vehicles in production. For this purpose, we work on enabling scalable query-based data analytics and visualization of data and streaming data, based on, for example, computational, array, and NoSQL data and data stream management systems that can be deployed in both edge and cloud environments. Figure 1 below illustrates the initial idea of the industrial internet, where (aggregated) data analytics is supplied in a cloud environment, whereas Figure 2 completes the overall data analytics process with an edge-based perspective, with a much tighter analytics loop that can react to sudden changes in industrial processes.
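The split between a tight edge loop and aggregated cloud analytics can be sketched in a few lines of Python. This is an illustrative assumption, not a description of our systems: the function `edge_monitor`, its parameters `window` and `max_jump`, and the event labels `"alert"` and `"aggregate"` are hypothetical. The sketch keeps a sliding window over a stream of sensor readings, raises a local alert as soon as a reading deviates sharply from the recent mean, and emits only one aggregated value per `window` readings for forwarding to the cloud.

```python
from collections import deque
from statistics import mean
from typing import Deque, Iterable, Iterator, Tuple

Event = Tuple[str, float]


def edge_monitor(
    readings: Iterable[float],
    window: int = 4,
    max_jump: float = 10.0,
) -> Iterator[Event]:
    """Hypothetical edge loop: local alerts plus periodic aggregates."""
    recent: Deque[float] = deque(maxlen=window)
    for count, value in enumerate(readings, start=1):
        # Tight loop: compare each new reading with the mean of the last
        # `window` readings and react immediately to a sudden change.
        if len(recent) == window and abs(value - mean(recent)) > max_jump:
            yield ("alert", value)
        recent.append(value)
        # Aggregated path: summarize every `window` readings for the cloud.
        if count % window == 0:
            yield ("aggregate", mean(recent))


# Invented example stream with one sudden spike.
stream = [20.0, 20.0, 20.0, 20.0, 20.0, 45.0, 20.0, 20.0]
events = list(edge_monitor(stream))
```

For the example stream, the spike produces an alert at the edge, while the cloud side would receive only two aggregated values instead of eight raw readings.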
Figure 1: Industrial internet (source: https://www.ge.com/reports/post/76430585563/new-industrial-internet-report-from-ge-finds/)
Figure 2: An example of an edge analytics architecture
This knowledge and these skills are also being passed on to our students through our teaching, and they are planned to be part of the proposed new educational programs in the area of industrial analytics (Civilingenjörsprogram and MSc program). PhD courses are also being developed on main-memory DBMSs, array DBMSs, and streaming DBMSs.