Big Data
How is it that the supermarket you shop at can give you deals on baby diapers even before your child has been born? Or that your favorite search engine often suggests the phrase you intended to search for after you've barely started typing? Data mining in large data sets, popularly known as Big Data, makes it possible.
What allows a computer to figure out what is going on in your life, or what you are interested in right now, from what you buy in a shop or start typing in a search field is the basic assumption that you are exactly the same as all the others. Comparing your data with other people's and performing increasingly sophisticated analyses has become possible thanks to research and development on how to store our ever-increasing amounts of data, algorithms that allow us to analyze the data with fewer computations, and computers that can perform these computations ever faster.
At the Department of Information Technology there is ongoing research both on the technical aspects of large-scale data mining and on the ethical issues arising from the fact that we now have the ability to perform analyses that many would perceive as intrusive or offensive. That a supermarket can "know" that someone in your family is pregnant because of changes in what products you buy is merely one example of this.
However, there are many application areas where the advantages of being able to analyze large amounts of data effectively are clear. Monitoring and analyzing climate change, global public health and community development are a few of these. This is an area of research that is moving very fast, and we can only speculate on what will be possible just ten years into the future.
Curious about this research?
- Research into the design and programming of the computers that perform this data management takes place at the Uppsala Programming for Multicore Architectures Research Center (UPMARC).
- The DCA Research Group is an interdisciplinary arena for researchers interested in large-scale distributed and data-intensive computing, data science and computational science and engineering software.
- The Uppsala University Information Laboratory (InfoLab) develops data mining methods to analyze large online human-generated data, and is responsible for advanced and graduate education in Data Mining and Network Analysis.