Is General-Purpose Many-Core Parallelism Imminent?

Uzi Vishkin, University of Maryland, Baltimore

Tuesday, June 18th, 10:30am, IRMACS theatre, ASB 10900.

The challenge of reinventing mainstream general-purpose computing for parallelism came into focus in 2003, once processor clock
frequencies generally stopped improving. This challenge is yet to be met, particularly for applications in which the run-time
of a single computational task and the productivity of parallel programming are at issue. As mobile platforms are catching
up on performance, and the vendors' field is getting crowded, competition will hopefully drive vendors to meet the challenge.
I will argue that the explicit multi-threaded (XMT) on-chip platform, developed by my research team, provides the missing link
in the type of heterogeneous systems needed for meeting today's opportunities and constraints. I will also argue that XMT can
improve on vendors' many-cores by an order of magnitude in both ease of programming and speedups over the best serial solutions,
and will support both claims with experimental data. For ease of programming, anecdotal teaser data include: (i) teaching graduate
material at high schools, and (ii) a joint UIUC/UMD course in which no student was able to get speedups over serial on OpenMP running on
commercial SMP hardware, while their speedups on XMT were in the range 7X to 25X. For speedups, stress tests of XMT relative to
state-of-the-art CPUs and GPUs for irregular fine-grained problems show speedups of up to 43X; these results assume similar
silicon area and power, but use much simpler algorithms. To facilitate these advantages, XMT was set up as a clean-slate design
supporting the foremost theory of parallel algorithms.