Concurrency, Parallelism, and Their Implementations
Updated: May 29
Before I began writing this paper, I assumed that the distinction between concurrency and parallelism was well established, at least in the scientific community. Posts like this (https://www.quora.com/What-), however, have made me more cautious about drawing the distinction too confidently.
Regardless, I am going to elaborate on the differences between the terms ‘concurrent’ and ‘parallel’ as they are presented in a talk by Rob Pike (https://vimeo.com/49718712). Pike states that although the two are occasionally used interchangeably, concurrency is not necessarily parallelism. Concurrency, as he defines it, is the simultaneous execution of two or more processes over a period of time. Because computers can perform operations at such a rapid rate, it is easy to take for granted the importance of the phrase “period of time”. While an operating system may seem to a layman to be doing many things simultaneously, such as running a web browser, playing music, and tracking the movements of your mouse, the execution of each individual program may not be simultaneous on the micro- or nanosecond time frame. The definition of ‘concurrent’ therefore has to allow for the difference between operations that are merely perceived to be simultaneous and those that are in fact simultaneous on any time scale.

This is where parallelism comes into play. Parallelism allows no such flexibility: for something to be considered parallel, it must be executing in multiple different locations at the exact same time.
The implementation of concurrency in programming is another interesting topic. On a single-core machine, the illusion of parallelism can be achieved through multithreading. Multithreading exists on multi-core machines as well, but it is vital to concurrent operation on single-core machines. Threads are units of execution, each consisting of a program counter, a stack, and a set of registers. Because each thread has its own resources, a processor can alternate the execution of many threads to give the appearance of parallelism. There are two types of threads: user threads and kernel threads. User threads exist above the kernel and are managed without kernel support; these are the threads that application programmers use in their programs. Kernel threads are supported within the kernel of the OS itself. All modern operating systems support kernel-level threads, allowing the kernel to perform multiple tasks and/or to service multiple kernel system calls simultaneously.
Many languages now abstract away the need to think in terms of operating system threads. For example, C# uses the async/await keywords to hide the manipulation of the Task Parallel Library. Go, a programming language designed by Pike and others at Google, goes a step further and allows in-process execution threads, referred to as ‘goroutines’, to be multiplexed onto the much heavier operating system threads previously described. Therefore, even on single-core machines, designing programs with concurrency in mind may speed up their total execution.
The screen shot above of the Activity Monitor for macOS shows the number of operating system threads that each process is occupying. The total number of threads is shown to be 1334, while the number of processes is 282. By allowing a single process to occupy multiple threads, the operating system enhances that process's ability to multi-task. There is a limit to this optimization, however: the kernel will only allow a certain number of operating system threads to be created, and programs cannot exceed that limit. So, for example, a Linux process cannot create thousands of operating system threads, even if that process desperately wants to increase its speed through concurrency. The workaround is to use the in-process execution threads mentioned earlier, such as goroutines in Go. A process can spin up thousands of goroutines yet still only use a few dozen operating system threads. This is possible because the vast majority of the goroutines will be idle at any one time; they are multiplexed onto operating system threads only when they need to become active.
Elegant Software Solutions