Performance and Programmability Comparison of the Thick Control Flow Architecture and Current Multicore Processors

News – 02/05/24

Abstract: Commercial multicore central processing units (CPUs) integrate a number of processor cores on a single chip to support parallel execution of computational tasks. For independent parallel tasks, multicore CPUs can improve performance nearly linearly over a single core, as long as sufficient bandwidth is available. Ideal speedup is, however, difficult to achieve when dense intercommunication between the cores or complex memory access patterns are required. This is caused by expensive synchronization and thread switching, and by insufficient latency tolerance. These facts guide programmers away from straightforward parallel processing patterns toward complex and error-prone programming techniques. To address these problems, we have introduced the Thick Control Flow (TCF) processor architecture. TCF is an abstraction of parallel computation that combines self-similar threads into computational entities. In this paper, we compare the performance and programmability of an entry-level TCF processor and two Intel Skylake multicore CPUs on commonly used parallel kernels to find out how well our architecture solves these issues, which greatly reduce the productivity of parallel software development. Code examples are given and programming experiences are recorded.

Reference: M. Forsell, S. Nikula, J. Roivainen, V. Leppänen and J. L. Träff, Performance and Programmability Comparison of the Thick Control Flow Architecture and Current Multicore Processors, Journal of Supercomputing 78, 3 (2022), 3152-3183.
Link: https://doi.org/10.1007/s11227-021-03985-0