Department of Information Technology

Multicore Programming Frameworks

Motivation

The shift to universal parallelism has dramatically increased the complexity of developing software that can exploit modern hardware efficiently. The difficulty arises from the need to exploit parallel hardware through explicit program concurrency, but it is often most apparent in the resulting need to minimize communication by managing memory system and network usage. Unfortunately, today's programming methodologies do a very poor job of exposing optimization costs and opportunities to the programmer. The result is a severe lack of performance portability, which both increases the cost of software development and prevents existing software from leveraging emerging hardware to its fullest.

The goal of this project is to develop programming frameworks that can overcome this hurdle by leveraging the domain-specific, high-level application information available from the programmer together with detailed knowledge of the hardware. Combining the two will allow us to provide performance portability and efficient development through improved programmability and optimization.

Long Term Goals

  • Understand the interactions between programming models, application domains, and hardware.
  • Build frameworks for providing performance portability and efficient implementation of real-world applications.
  • Investigate how to leverage high-level program information for optimizing parallelization and communication.

Expected Results

  • Develop a suite of benchmarks across a range of programming models to enable comparisons of ease of implementation and optimization.
  • Build an efficient, performance-portable framework for solving PDEs using radial basis function approximation methods.
  • Develop a task-based framework to leverage high-level program structure to optimize parallelism and data movement.
  • Leverage runtime hardware performance information in a task-based programming framework to efficiently create and manage tasks.
  • Demonstrate optimization of regular (data-parallel, static) and irregular (data-dependent, runtime) application parallelism.

Approach

Our approach to investigating programming frameworks is highly application-centric and aims to identify the key issues in developing and optimizing performance-portable applications.

  • Identify and implement key benchmark applications.
  • Compare application implementations across different programming frameworks to gain insight into programming and optimization challenges.
  • Extend existing frameworks and develop new ones as needed to explore new optimizations.
  • Leverage tools from the architecture group to understand and optimize performance automatically.

Results

Software

  • SuperGlue. A library for data-dependency driven task parallelism.
  • DuctTeip. A library for distributed data-dependency driven task parallelism.
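
To illustrate what data-dependency driven task parallelism with data versioning can look like, here is a minimal Python sketch. It is not the SuperGlue or DuctTeip API: the names `Handle`, `Task`, `Runtime`, `submit`, and `run_all` are hypothetical, and the sketch runs tasks one at a time rather than on multiple threads. The assumed idea is that each piece of shared data carries a version counter, each task records the version it needs for every handle it accesses, and a task may start once all of its handles have reached those versions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Handle:
    """Shared data guarded by a version counter (hypothetical name)."""
    name: str
    version: int = 0      # accesses completed so far
    scheduled: int = 0    # accesses submitted so far
    write_done: int = 0   # version reached once the latest submitted write finishes


@dataclass
class Task:
    name: str
    run: Callable[[], None]
    # (handle, version the handle must reach before this task may start)
    requires: List[Tuple[Handle, int]] = field(default_factory=list)

    def ready(self) -> bool:
        return all(h.version >= v for h, v in self.requires)


class Runtime:
    """Sequential stand-in for a task scheduler (assumption: no threads)."""

    def __init__(self) -> None:
        self.pending: List[Task] = []

    def submit(self, name, run, reads=(), writes=()):
        task = Task(name, run)
        for h in reads:
            if any(h is w for w in writes):
                continue  # a written handle is treated as read-write
            # A read waits only for the latest earlier write.
            task.requires.append((h, h.write_done))
            h.scheduled += 1
        for h in writes:
            # A write waits for every earlier access to the handle.
            task.requires.append((h, h.scheduled))
            h.scheduled += 1
            h.write_done = h.scheduled
        self.pending.append(task)

    def run_all(self) -> List[str]:
        order = []
        while self.pending:
            # Pick the most recently submitted ready task, to show that the
            # ordering comes from versions, not from submission order.
            task = next(t for t in reversed(self.pending) if t.ready())
            self.pending.remove(task)
            task.run()
            for h, _ in task.requires:
                h.version += 1  # each completed access bumps the version
            order.append(task.name)
        return order


data = {"a": 0, "b": 0, "c": 0}
A, B, C = Handle("a"), Handle("b"), Handle("c")

def init_a(): data["a"] = 2
def make_b(): data["b"] = data["a"] + 1
def make_c(): data["c"] = data["a"] * 10
def update_a(): data["a"] = data["b"] + data["c"]

rt = Runtime()
rt.submit("init_a", init_a, writes=[A])
rt.submit("make_b", make_b, reads=[A], writes=[B])
rt.submit("make_c", make_c, reads=[A], writes=[C])
rt.submit("update_a", update_a, reads=[B, C], writes=[A])
order = rt.run_all()
```

In this sketch, `make_b` and `make_c` only read `a`, so either may run first once `init_a` has finished, while `update_a` must wait for both because it overwrites `a`. A real runtime would hand ready tasks to worker threads instead of looping sequentially.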

Refereed publications

  1. DuctTeip: An efficient programming model for distributed task-based parallel computing. Afshin Zafari, Elisabeth Larsson, and Martin Tillenius. In Parallel Computing, volume 90, 2019. (DOI, fulltext:postprint).
  2. SuperGlue: A shared memory framework using data versioning for dependency-aware task-based parallelization. Martin Tillenius. In SIAM Journal on Scientific Computing, volume 37, pp C617-C642, 2015. (DOI, fulltext:print).
  3. A scalable RBF–FD method for atmospheric flow. Martin Tillenius, Elisabeth Larsson, Erik Lehto, and Natasha Flyer. In Journal of Computational Physics, volume 298, pp 406-422, 2015. (DOI, fulltext:postprint).
  4. Resource-aware task scheduling. Martin Tillenius, Elisabeth Larsson, Rosa M. Badia, and Xavier Martorell. In ACM Transactions on Embedded Computing Systems, volume 14, number 1, pp 5:1-25, 2015. (DOI, fulltext).
  5. Programming models based on data versioning for dependency-aware task-based parallelisation. Afshin Zafari, Martin Tillenius, and Elisabeth Larsson. In Proc. 15th International Conference on Computational Science and Engineering, pp 275-280, IEEE Computer Society, Los Alamitos, CA, 2012. (DOI).
  6. Using hardware transactional memory for high-performance computing. Karl Ljungkvist, Martin Tillenius, David Black-Schaffer, Sverker Holmgren, Martin Karlsson, and Elisabeth Larsson. In Proc. 25th International Symposium on Parallel and Distributed Processing Workshops and PhD Forum, pp 1660-1667, IEEE, Piscataway, NJ, 2011. (DOI).
  7. Information quality testing. Anna Wingkvist, Morgan Ericsson, Welf Löwe, and Rüdiger Lincke. In Perspectives in Business Informatics Research, volume 64 of Lecture Notes in Business Information Processing, pp 14-26, Springer-Verlag, Berlin, 2010. (DOI).
  8. Analysis and visualization of information quality of technical documentation. Anna Wingkvist, Welf Löwe, Morgan Ericsson, and Rüdiger Lincke. In Proc. 4th European Conference on Information Management and Evaluation, pp 388-396, Academic Conferences, Reading, UK, 2010.
  9. An efficient task-based approach for solving the n-body problem on multicore architectures. Martin Tillenius and Elisabeth Larsson. PARA 2010: State of the Art in Scientific and Parallel Computing, University of Iceland, Reykjavík, 2010. (fulltext:postprint).
  10. Current practice in mobile learning: A survey of research method and purpose. Anna Wingkvist and Morgan Ericsson. In Proc. 8th World Conference on Mobile and Contextual Learning, pp 103-111, University of Central Florida, Orlando, FL, 2009.
  11. Sharing experience from three initiatives in mobile learning: Lessons learned. Anna Wingkvist and Morgan Ericsson. In Proc. 17th International Conference on Computers in Education, pp 613-617, Asia-Pacific Society for Computers in Education, Jhongli City, Taiwan, 2009.
  12. Thinking ahead in mobile learning projects: A survey on risk assessment. Anna Wingkvist and Morgan Ericsson. In Proc. 8th International Conference on Perspectives in Business Informatics Research, pp 57-66, Kristianstad Academic Press, Sweden, 2009.
  13. A meta-model describing the development process of mobile learning. Anna Wingkvist and Morgan Ericsson. In Advances in Web Based Learning – ICWL 2009, volume 5686 of Lecture Notes in Computer Science, pp 454-463, Springer-Verlag, Berlin, 2009. (DOI).

Theses

  1. Advances in Task-Based Parallel Programming for Distributed Memory Architectures. Afshin Zafari. Ph.D. thesis, Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology nr 1621, Acta Universitatis Upsaliensis, Uppsala, 2018. (fulltext, preview image).
  2. Scientific Computing on Multicore Architectures. Martin Tillenius. Ph.D. thesis, Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology nr 1139, Acta Universitatis Upsaliensis, Uppsala, 2014. (fulltext, preview image).
  3. Leveraging multicore processors for scientific computing. Martin Tillenius. Licentiate thesis, IT licentiate theses / Uppsala University, Department of Information Technology nr 2012-006, Uppsala University, 2012. (fulltext).

Other publications

  1. Distributed dynamic load balancing for task parallel programming. Afshin Zafari and Elisabeth Larsson. 2018. (arXiv:1801.04582).
  2. DuctTeip: A task-based parallel programming framework for distributed memory architectures. Afshin Zafari, Elisabeth Larsson, and Martin Tillenius. Technical report / Department of Information Technology, Uppsala University nr 2016-010, 2016. (fulltext).
  3. A task parallel implementation of an RBF-generated finite difference method for the shallow water equations on the sphere. Martin Tillenius, Elisabeth Larsson, Erik Lehto, and Natasha Flyer. Technical report / Department of Information Technology, Uppsala University nr 2014-011, 2014. (fulltext).
  4. SuperGlue: A shared memory framework using data versioning for dependency-aware task-based parallelization. Martin Tillenius. Technical report / Department of Information Technology, Uppsala University nr 2014-010, 2014. (fulltext).
  5. A task parallel implementation of a scattered node stencil-based solver for the shallow water equations. Martin Tillenius, Elisabeth Larsson, Erik Lehto, and Natasha Flyer. In Proc. 6th Swedish Workshop on Multi-Core Computing, pp 33-36, Halmstad University, Halmstad, Sweden, 2013.
  6. Resource-aware task scheduling. Martin Tillenius, Elisabeth Larsson, Rosa M. Badia, and Xavier Martorell. In 4th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures (PARMA), p 6, Tech. Univ. Berlin, Germany, 2013. (fulltext:postprint).
  7. A simple model for tuning tasks. Marcus Holm, Martin Tillenius, and David Black-Schaffer. In Proc. 4th Swedish Workshop on Multi-Core Computing, pp 45-49, Linköping University, Linköping, Sweden, 2011.
  8. Early results using hardware transactional memory for high-performance computing applications. Karl Ljungkvist, Martin Tillenius, Sverker Holmgren, Martin Karlsson, and Elisabeth Larsson. In Proc. 3rd Swedish Workshop on Multi-Core Computing, pp 93-97, Chalmers University of Technology, Göteborg, Sweden, 2010. (fulltext:postprint).
  9. Dealing with stakeholders in mobile learning: A study of three initiatives. Anna Wingkvist and Morgan Ericsson. In Proc. 32nd Information Systems Research Seminar in Scandinavia, pp A72:1-14, Molde University College, Norway, 2009.

Presentations

Updated 2017-12-14 18:38:11 by Elisabeth Larsson.