References

[Bloch59] E. Bloch, The engineering design of the Stretch computer, Proc. of the Fall Joint Computer Conference, 1959, 48-59.

[Chang91] P. Chang, S. Mahlke, W. Chen, N. Warter and W. Hwu, IMPACT: An architectural framework for multiple-instruction-issue processors, Proceedings of the 18th Annual International Symposium on Computer Architecture, Association for Computing Machinery, New York, 1991, 266-275.

[Cook73] S. Cook and R. Reckhow, Time Bounded Random Access Machines, Journal of Computer and System Sciences 7 (1973), 354-375.

[Culler99] D. Culler, J. Singh, Parallel Computer Architecture: A Hardware/Software Approach, Morgan Kaufmann Publishers Inc, San Francisco, 1999.

[Denard74] R. Dennard, F. Gaensslen, H. Yu, V. Rideout, E. Bassous, A. LeBlanc, Design of ion-implanted MOSFET's with very small physical dimensions, IEEE Journal of Solid-State Circuits SC-9, 5 (October 1974): 256–268.

[Fisher81] J. Fisher, Trace Scheduling: A technique for global microcode compaction, IEEE Transactions on Computers C-30 (1981), 478-490.

[Fisher83] J. Fisher, Very Long Instruction Word Architectures and ELI-512, Proc. 10th Annual Int. Symp. on Computer Architecture, Computer Society Press, Washington, 1983, 140-150.

[Flynn72] M. Flynn, Some Computer Organizations and their Effectiveness, IEEE Trans. Comput. 21, 9 (1972), 948-960.

[Forsell94] M. Forsell, Are Multiport Memories Physically Feasible?, Computer Architecture News 22, 4 (September 1994), 47-54.

[Forsell02a] M. Forsell, Architectural differences of efficient sequential and parallel computers, Journal of Systems Architecture 47, 13 (July 2002), 1017-1041.

[Forsell02b] M. Forsell, A Scalable High-Performance Computing Solution for Network on Chips, IEEE Micro 22, 5 (September-October 2002), 46-55.

[Forsell10] M. Forsell, On the performance and cost of some PRAM models on CMP hardware, International Journal of Foundations of Computer Science 21, 3 (2010), 387-404.

[Forsell13] M. Forsell and V. Leppänen, An Extended PRAM-NUMA Model of Computation for TCF Programming, International Journal of Networking and Computing 3, 1 (2013), 98-115.

[Forsell16] M. Forsell, J. Roivainen and V. Leppänen, Outline of a Thick Control Flow Architecture, Proc. 5th Workshop on Parallel Programming Models Special Edition on Task Parallelism, October 26-28, 2016, Marina del Rey Marriott, Los Angeles, USA.

[Forsell18] M. Forsell, J. Roivainen and V. Leppänen, REPLICA MBTAC - Multithreaded Dual Mode Processor, Journal of Supercomputing 74, 5 (2018), 1911-1933.

[Forsell20] M. Forsell, J. Roivainen and J. Träff, Optimizing Memory Access in TCF Processors with Compute-Update Operations, In the Proceedings of the 22nd Workshop on Advances in Parallel and Distributed Computational Models (APDCM’20) in conjunction with the 33rd IEEE International Parallel and Distributed Processing Symposium (IPDPS’20), May 18-22, 2020, New Orleans, Louisiana, USA.

[Forsell22] M. Forsell, S. Nikula, J. Roivainen, V. Leppänen and J. L. Träff, Performance and Programmability Comparison of the Thick Control Flow Architecture and Current Multicore Processors, Journal of Supercomputing 78, 3 (2022), 3152-3183. Available at https://doi.org/10.1007/s11227-021-03985-0.

[Forsell23] M. Forsell, J. Roivainen, V. Leppänen and J. L. Träff, Preliminary Performance and Memory Access Scalability Study of Thick Control Flow Processors, In the Proceedings of 2023 IEEE Nordic Circuits and Systems Conference (IEEE NORCAS'23), October 31 - November 1, 2023, Aalborg, Denmark.

[Forsell25] M. Forsell, Flow Computing—A New Way to Boost the Performance of CPUs for Parallel Functionalities, Scalable Approaches to High Performance and High Productivity Computing (ScalPerf’25), September 14-19, 2025, Bertinoro, Italy.

[Fortune78] S. Fortune and J. Wyllie, Parallelism in Random Access Machines, Proc. 10th Annual ACM Symposium on Theory of Computing (STOC’78), San Diego, California, USA, May 1-3, 1978, 114-118.

[HiPEAC13] The HiPEAC Vision for Advanced Computing in Horizon 2020, HiPEAC Network of Excellence publication, 2013; http://www.hipeac.net/system/files/hp-roadmap-2013.pdf.

[Intel06] Research at Intel From a Few Cores to Many: A Tera-scale Computing Research Overview, White Paper, Intel, 2006.

[ITRS03] International Technology Roadmap for Semiconductors (ITRS), the 2003 edition, Semiconductor Industry Association (SIA); http://www.itrs.net.

[Jouppi18] N. P. Jouppi, C. Young, N. Patil, D. Patterson, A domain-specific architecture for deep neural networks, Communications of the ACM 61, 9 (2018), 50-59.

[Keller01] J. Keller, C. Keßler, J. Träff, Practical PRAM Programming, Wiley, New York, 2001.

[Kowalik85] J. Kowalik (editor), Parallel MIMD computation: the HEP supercomputer and its applications, MIT Press, Cambridge, 1985.

[Krill14] P. Krill, Stroustrup highlights next C++ goals: Parallelism, concurrency, InfoWorld, October 29, 2014.

[Lenoski92] D. Lenoski, J. Laudon, K. Gharachorloo, W. Weber, A. Gupta, J. Hennessy, M. Horowitz, and M. Lam, The Stanford Dash Multiprocessor, IEEE Computer 25, 3 (March 1992), 63-79.

[Leppänen96] V. Leppänen, Studies on the realization of PRAM, Dissertation 3, Turku Centre for Computer Science, University of Turku, Turku, 1996.

[Leppänen11] V. Leppänen, M. Forsell and J-M. Mäkelä, Thick Control Flows: Introduction and Prospects, In the Proceedings of the 2011 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA’11), July 18-21, 2011, Las Vegas, USA, 540-546.

[Mazke97] D. Matzke, Will Physical Scalability Sabotage Performance Gains?, Computer 30, 9 (September 1997), 37-42.

[Mudge01] T. Mudge, Power: A First-Class Architectural Design Constraint, Computer 34, 4 (April 2001), 52-58.

[Moore65] G. Moore, Cramming more components onto integrated circuits, Electronics 38, 8 (1965), 114-117.

[Moore96] S. Moore, Multithreaded Processor Design, Kluwer Academic Publishers, Boston, 1996.

[Patterson10] D. Patterson, The Trouble With Multicore, IEEE Spectrum 47, 7 (2010), 28–32.

[Ranade91] A. Ranade, How to Emulate Shared Memory, Journal of Computer and System Sciences 42 (1991), 307–326.

[Sohi01] G. Sohi, A. Roth, Speculative multithreaded processors, IEEE Computer 34, 4 (2001), 66–73.

[Swan77] R. Swan, S. Fuller and D. Siewiorek, Cm*—A Modular Multiprocessor, In the Proceedings of NCC, 645-655, 1977.

[Tendler02] J. M. Tendler, J. S. Dodson, J. S. Fields, Jr., H. Le and B. Sinharoy, POWER4 system microarchitecture, IBM Journal of Research and Development 46, 1 (2002), 5–26.

[Tomasulo67] R. Tomasulo, An efficient algorithm for exploiting multiple arithmetic units, IBM Journal of Research and Development 11, 1 (1967), 25-33.

[Tullsen95] D. Tullsen, S. Eggers, H. Levy, Simultaneous multithreading: maximizing on-chip parallelism, Proceedings of the 22nd Annual International Symposium on Computer Architecture, 1995, 533-544.

[Valiant90] L. G. Valiant, A Bridging Model for Parallel Computation, Communications of the ACM 33, 8 (1990), 103-111.

[Vishkin11] U. Vishkin, Using Simple Abstraction to Reinvent Computing for Parallelism, Communications of the ACM 54, 1 (January 2011), 75-85.

[Vishkin14] U. Vishkin, Is multicore hardware for general-purpose parallel processing broken?, Communications of the ACM 57, 4 (2014), 35-39.