
Slides: 04 - Threads

- Motivation
- Thread Creation/Manipulation/Termination
- Threading models

Traditional Process Model: One Thread per Process


- So far, each process has had only one “thread of control”
  - A thread is a “unit of execution” (hardware execution state: PC, general-purpose registers, stack, and SP)

- For major OSs such as Unix and Windows, this was the process model until the mid-1990s
  - This model is inefficient and difficult to use for many non-trivial tasks!

[Figure: a single-threaded process with one stack and SP, DATA (static data & heap), and one PC.]

Assume you need to compute the sum of N numbers on a parallel machine with 4 CPU cores. How would you implement this?
- Have the main application read the numbers into an array
- fork() 4 copies of itself (worker child processes), each computing the sum of N/4 numbers
- Each worker sends its partial sum to the main application (master)
- The master merges the results

[Figure: the main application forks 4 worker child processes (Worker1 to Worker4), each handling N/4 of the data.]

[Figure: a computer running a POSIX OS. The master process and Worker1 to Worker4 each have their own stack, SP, DATA and PC; each worker sends its partialSum to the master through a pipe in the kernel.]

04-Threads/ex1_pipe.c

It works (was the model of computation for a long time), BUT:

- Creating a new heavy-weight process using fork() is a very time-consuming operation. For each child process, fork() must create:
  - A new PCB with resources shared with the parent (files, sockets, etc.)
  - A new copy of the address space (code & data)
  - A new execution state (PC, registers, stack and stack pointer)
- Processes must explicitly share data via shared memory or message passing

What’s similar in these heavy-weight processes?
- they all share the same code and data (address space)
- they all share the same resources (files, sockets, etc.)
- they all share the same privileges

What’s different?
- each has its own hardware execution state: PC, registers, stack pointer, and stack

Key idea: separate the concept of a process (code, address space, etc.) from that of a “unit of execution” (hardware execution state: PC, registers, etc.)
- this “unit of execution” is usually called a thread, or a lightweight process

Each thread has a separate hardware execution state: PC, general-purpose registers, stack, and stack pointer

[Figure: one process with threads Main and T1 to T4 (each running doSum). Each thread has its own PC and stack; DATA, including partialSum, is shared by all of them.]

- Observe that threads share all static & dynamic data
- No need for explicit utilities to share data

Recall: A single-threaded process’s address space


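[Figure: the address space, from 0xFFFFFFFF (top) down to 0x00000000 (bottom)]
- stack (function parameters & local variables), with SP pointing into it
- heap (dynamically allocated memory)
- static data (data segment)
- code (text segment), with PC pointing into it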

A multi-threaded process’s address space


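[Figure: the address space, from 0xFFFFFFFF (top) down to 0x00000000 (bottom)]
- thread 1 stack, with SP (T1) pointing into it
- thread 2 stack, with SP (T2) pointing into it
- thread 3 stack, with SP (T3) pointing into it
- heap (dynamically allocated memory)
- static data (data segment)
- code (text segment), with PC (T1), PC (T2) and PC (T3) pointing into it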

- Most modern OSs (Mach, Chorus, Windows, Solaris, Linux, macOS) therefore support two entities:
  - the process, which defines the address space and general process resources (such as open files, sockets, etc.)
  - the thread, which defines a sequential unit of execution within a process

- A thread is bound to a single process
  - processes, however, can have multiple threads executing within them
  - all threads within a process share the same resources: code, data & heap

- A thread is the unit of scheduling
  - processes are just containers in which threads execute

POSIX Threads (Pthreads)

- Defined by IEEE in 1995 (POSIX.1c)
- Provides a standardized API (specification) for thread creation, management, and synchronization

| Function | Description |
| --- | --- |
| pthread_create | Creates a new thread |
| pthread_exit | Terminates the calling thread |
| pthread_join | Waits for a specific thread to terminate |
| pthread_detach | Marks a thread for automatic resource cleanup upon termination (no need to call pthread_join for cleanup) |
| pthread_self | Returns a handle to the calling thread |
| pthread_cancel | Requests the cancellation of another thread |

04-Threads/ex2.c

[Figure: threads Main and T1 (printMessage), each with its own PC and stack.]

04-Threads/ex3.c

[Figure: threads Main, T1 (func1), T2 (func1) and T3 (func2), each with its own PC and stack, alongside gv in the process data.]

04-Threads/ex4.c

Thread Termination/Multi-threaded Process Termination


- To terminate, a thread calls pthread_exit()
  - The thread becomes a zombie
  - It is only cleaned up after another thread calls pthread_join() on it
  - Alternatively, the thread can call pthread_detach(), which makes it a detached thread: when it terminates, all its resources are released automatically, without the need to call pthread_join() on it

- A multi-threaded process terminates under two conditions:
  - All threads of the process terminate by calling pthread_exit(); that is, the process terminates only after the last thread calls pthread_exit()
  - One of the threads calls exit(), which terminates the whole process (returning from main has the same effect)

- You can also call pthread_cancel() to request that a thread terminate
  - With the default (deferred) cancellation type, a thread only acts on a cancellation request when it reaches a cancellation point or explicitly checks for cancellation with pthread_testcancel()
  - Examples of standard cancellation points include: read, write and other blocking I/O operations, sleep, pthread_join, pthread_cond_wait
  - Note that pthread_mutex_lock is not a cancellation point

Windows offers analogous calls:
- handle = CreateThread(...): create a new thread
- WaitForSingleObject(handle, INFINITE): wait for thread termination

Why multi-threading even in a uniprocessor?


- Multi-threading is still very useful in a uniprocessor system, even though only one of the threads can be running on the CPU at any given time
- Handling concurrent events (e.g., web servers, web browsers, editors, etc.)
  - Web server: one thread to handle each incoming request
  - Web browser: one thread downloading a page, one thread refreshing the GUI
  - Microsoft Word: one thread reading user input, one thread refreshing the GUI, one thread saving the file to the disk in the background
  - Chat application: one thread reading data from the keyboard, another thread reading data from a network socket, one thread updating the GUI

- Thus, multi-threading greatly improves and simplifies program structure
  - This is especially true when you have to deal with multiple blocking I/O operations

| Concurrency | Parallelism |
| --- | --- |
| Ability of a system to handle multiple tasks or threads, often by interleaving their execution | Ability of the system to execute multiple threads or tasks simultaneously |
| Does NOT necessarily mean that threads are executed at the same time; it can involve a single-core CPU switching between threads quickly | Requires multiple execution units, such as multi-core CPUs, where each task/thread can run on a different core |
| Is about structure (dealing with many things at once) | Is about execution (doing many things at once) |

Concurrent execution of 4 threads T1, T2, T3 and T4 on a single-core system

Parallel execution of 4 threads T1, T2, T3 and T4 on a dual-core system

[Figure: the process/thread design space. Legend: address space, thread.]
- one process, one thread/process: MS/DOS
- many processes, one thread/process: older UNIXes
- one process, many threads/process: Java
- many processes, many threads/process: Mach, Chorus, Windows, Linux, macOS

Threading Models (Windows, Linux): One-to-One


- Each user-level thread maps to a kernel thread
- Creating a user-level thread creates a kernel thread
- Thread creation and management require system calls (clone to create)
- Advantage: when a thread blocks for I/O, the kernel can schedule another thread, leading to true concurrency
- Also: the kernel CAN schedule multiple threads in parallel

Threading Models: Many-to-One

- Many user-level threads are mapped to a single kernel thread
- A user-level library manages thread creation/management
- But: when a user-level thread blocks for I/O, all the other user-level threads get blocked with it
- Also: the kernel can NOT schedule multiple user-level threads in parallel
- Rarely used (Solaris Green Threads, GNU Portable Threads)

Threading Models: Many-to-Many

- Many user-level threads are mapped to many kernel threads
- The OS allocates a sufficient number of kernel threads to run many user-level threads concurrently and in parallel
- Example: Windows with the ThreadFiber package
- Difficult to implement and NOT very common

Representing Threads in the Kernel: Windows


Threads are implemented as a separate data structure (Thread Control Block, TCB), which is then linked to the Process Control Block (PCB)

Representing Threads in the Kernel: Linux

- The basic unit, the task (task_struct), actually represents a thread!
- The task structure has pointers to common process resources
- Thus, two threads in the same process point to the same resource structure instance

Does fork() duplicate only the calling thread or all threads?
- In Linux, the forked child process has only one thread: a copy of the thread that called fork()

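The clone system call gives finer control over process and thread creation, but each call creates only one new task: its flags select which resources (address space, open files, signal handlers, etc.) the new task shares with its caller, so it does not by itself duplicate every thread of the parent. It is also very low-level and requires careful handling of synchronization issues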

exec() works as expected: it replaces the running process image, all threads other than the caller are destroyed, and the new program starts with a single thread executing its “main” function