06th June 2024
Concurrency: Understanding Multithreading in C++
In today's computing world, optimizing performance and ensuring smooth user experiences are crucial. Concurrency and multithreading are powerful tools in a developer's arsenal to achieve these goals. In this blog, we will explore the concepts of concurrency and parallelism, delve into the basics of multithreading in C++, and learn how to effectively manage threads to avoid common pitfalls such as race conditions and deadlocks.
What is Concurrency?
Concurrency is essentially multitasking: multiple operations make progress during overlapping time periods, though not necessarily at the same instant. The order of these tasks can matter, as some tasks may depend on others and may even wait for shared resources.
For instance, when you're typing a document, your email client can check for new messages in the background, and your music player can play your favorite songs, all at the same time. The operating system manages these tasks, allowing them to make progress without any one task blocking the others indefinitely.
Concurrency vs. Parallelism
It's essential to differentiate concurrency from parallelism, as these terms are often mistakenly used interchangeably.
What is Parallelism?
Parallelism means that multiple tasks, or subtasks of one task, literally run at the same time on hardware with multiple computing resources, such as a multi-core processor.
Key Difference:
Execution:
- Concurrency: Tasks start, run, and complete in overlapping periods but not necessarily simultaneously.
- Parallelism: Tasks run simultaneously on multiple cores or processors.
Resources:
- Concurrency: Can occur on a single-core processor through time slicing.
- Parallelism: Requires multiple cores or processors.
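As a rough illustration of the resources point, the standard library can report how many threads the hardware may be able to run in parallel. This is only a minimal sketch: the helper name describe_hardware is made up for this post, and std::thread::hardware_concurrency() is just a hint that may return 0 when the value cannot be determined.

```cpp
#include <string>
#include <thread>

// Hypothetical helper: describes what the hardware hint suggests.
std::string describe_hardware() {
    // Only a hint; the standard allows 0 when the value is unknown.
    unsigned int n = std::thread::hardware_concurrency();
    if (n == 0) return "unknown: the implementation gave no hint";
    if (n == 1) return "1 hardware thread: concurrency through time slicing only";
    return std::to_string(n) + " hardware threads: parallel execution is possible";
}
```

Printing the returned string from main gives a quick idea of whether threads on your machine can truly run in parallel or only take turns.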
Achieving Concurrency Through Threads
Before we dive into the details, let's understand the basics. We'll start by looking at what a process is and then explore how threads fit into the picture to make programs run faster.
What is a Process?
A process is an instance of a program that's being executed. Each process has its own memory space, containing the program code, data, and runtime resources.
For example, imagine you have multiple instances of a web browser open on your computer. Each of these instances is a separate process, with its own memory allocation, running independently from the others.
What is a Thread?
A thread is the basic unit of CPU utilization: a single sequence of execution within a process, sometimes called a lightweight process. Threads share the memory space and resources of the process that spawned them.
The key differences between a thread and a process are:
- Process: Has its own memory space.
- Thread: Shares memory space with other threads within the same process.
If one process is blocked or crashes, other processes continue to work. Threads are more tightly coupled: a blocked thread does not automatically stop its siblings, but it can stall the entire process if the other threads are waiting for something it holds.
Why? Processes operate in separate memory spaces, so one process cannot directly interfere with another. Threads within a process share the same memory space and resources, so a thread that blocks while holding a shared resource such as a lock, or that crashes, can affect every other thread in the process. This tight coupling necessitates careful synchronization and resource management to avoid bottlenecks and deadlocks.
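To make the shared memory point concrete, here is a minimal sketch in which a second thread writes to a variable and the original thread reads the new value afterwards. It uses std::thread, which is introduced later in this post; the names shared_value, writer, and read_after_thread are invented for this illustration.

```cpp
#include <thread>

int shared_value = 0; // lives in the process's memory, visible to all of its threads

// Runs on the second thread and modifies the shared variable.
void writer() {
    shared_value = 42;
}

// Hypothetical helper: starts the writer thread, waits for it, then reads the value.
int read_after_thread() {
    shared_value = 0;
    std::thread t(writer);
    t.join();            // wait, so that the read below does not race with the write
    return shared_value; // the change made by the other thread is visible here
}
```

Two separate processes could not exchange a value this way; they would need an explicit mechanism such as a file, a pipe, or shared memory set up through the operating system.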
What is Multi-Threading?
Multithreading is a programming concept that allows multiple threads to execute within a single process concurrently. In simpler terms, it's like having multiple tasks running at the same time within a program. Each thread represents a separate flow of execution, allowing different parts of the program to perform tasks simultaneously.
Multithreading Example: Web Browser
- Process: When you open a web browser like Chrome or Firefox, it starts a new process. Modern browsers typically also run each tab in its own process. This means that if one tab crashes, it usually doesn't affect the other tabs, as they are running in their own isolated processes.
- Thread: Within each tab's process, there are multiple threads handling different tasks. For instance, one thread might be responsible for rendering the webpage, another for handling user input, and yet another for downloading resources like images or scripts.
- Multithreading: Multithreading comes into play when the browser needs to handle multiple tasks simultaneously within each tab. For example, while one thread is downloading an image, another thread can continue rendering the webpage, ensuring a smoother user experience.
Achieving Multithreading in C++ with the Standard Library
In this section, we'll explore how to utilize the standard library for threading in C++.
Creating Our First Thread
Let's dive in and create our first thread. But first, let's understand the syntax.
#include <iostream>
#include <thread>
void print() {
    std::cout << "I am from thread" << std::endl;
}
int main() {
    std::thread t(print);
    std::cout << "I am from the main Thread" << std::endl;
    return 0;
}
You've successfully created your first thread! However, did you encounter an error like the following?
I am from the main Thread
I am from thread
abort() has been called
Debug Error!
Understanding the Error
If you see an error like "abort() has been called", it is because the std::thread object t is destroyed at the end of main while the thread is still joinable, that is, before join() or detach() has been called on it. In that case the destructor of std::thread calls std::terminate(), which aborts the program. The main thread neither waited for the new thread nor explicitly let it run on its own.
Let's look at the two ways to resolve this.
Understanding Join and Detach
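When you examine the order of the print statements, you might notice that the main thread's message appears first, followed by the new thread's. This might seem counterintuitive, since the thread is created before main prints. Starting a thread takes time, however, and the two threads then run concurrently, so the order of their output is not guaranteed.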
Joining Threads
By calling join() on the thread object in the main function, we wait for the thread to finish its execution before allowing the main program to proceed. This ensures that the main program doesn't terminate prematurely, avoiding errors related to thread execution.
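Applied to the first example, this could look like the sketch below. The print function is the one from the original program; the wrapper run_joined is a hypothetical name used here so that the effect of join() can be checked.

```cpp
#include <iostream>
#include <thread>

void print() {
    std::cout << "I am from thread" << std::endl;
}

// Hypothetical wrapper around the earlier example, with join() added.
bool run_joined() {
    std::thread t(print);
    std::cout << "I am from the main Thread" << std::endl;
    t.join();             // block here until print() has finished
    return !t.joinable(); // after join(), the thread object is no longer joinable
}
```

Because the thread object is no longer joinable when it goes out of scope, its destructor has nothing to complain about and the abort() error disappears. The order of the two printed lines can still vary.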
Detach Threads
However, sometimes we may want a thread to run independently, without the main thread waiting for it to finish. This is where detach() comes in. Let's see how we can use it:
#include <iostream>
#include <thread>
void print() {
    std::cout << "I am from thread" << std::endl;
}
int main() {
    std::thread t(print);
    std::cout << "I am from the main Thread" << std::endl;
    t.detach(); // Detach the thread for independent execution
    // Note: A detached thread can no longer be joined. If main returns first,
    // the process exits and the thread's output may never appear.
    return 0;
}
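Once detached, the thread executes independently of the thread object that created it. Keep in mind that when main returns, the whole process exits, so a detached thread that has not finished by then is cut short and its output may never appear.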
Solving the Problem
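Whether to use join() or detach() depends on your requirements. Either call prevents the abort() error, because the thread object is no longer joinable when it is destroyed. Use join() when the main thread needs the thread's work to be finished, and detach() when the thread can run on its own.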
Creating Multiple Threads
Let's create multiple threads to see how they work together.
#include <iostream>
#include <thread>
#include <vector>
void task(int id) {
    std::cout << "I am task " << id << std::endl;
}
int main() {
    const int num_threads = 5; // Number of threads to create
    std::vector<std::thread> threads; // Vector to hold thread objects
    // Create threads dynamically using a loop
    for (int i = 0; i < num_threads; ++i) {
        threads.push_back(std::thread(task, i + 1));
    }
    // Join all the threads
    for (auto& thread : threads) {
        thread.join();
    }
    std::cout << "All tasks completed!" << std::endl;
    return 0;
}
Observing the Output
When you run the above code, you might observe output similar to the following:
I am task I am task I am task I am task I am task 1
4
5
2
3
All tasks completed!
Note: The order will differ between runs and between machines due to the nature of multithreading.
Why the Output Order Differs
The output order of the tasks may not match the order in which the threads were created (i.e., task 1, task 2, task 3, task 4, task 5). This discrepancy is due to the nature of multithreading. Here are some reasons why the output order may vary:
- Concurrency: When multiple threads are created, they run concurrently. This means that the operating system can switch between threads at any time. Therefore, the thread that prints first is not necessarily the one that was created first.
- Scheduling by the Operating System: The operating system's scheduler decides the order in which threads are executed. This scheduling is influenced by various factors, including the current load on the system, thread priorities, and specific scheduling algorithms used by the OS.
- Execution Time: The time taken to execute the task function for each thread might vary slightly, causing threads to complete their tasks at different times.
- Resource Contention: If multiple threads are trying to access the same resources (e.g., the console for output), there can be contention, leading to unpredictable order of execution.
Ensuring a Specific Order
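Synchronization tools such as the mutexes introduced next keep output from being garbled, but they do not decide which thread runs first. One simple way to get results in a fixed order, sketched here under the assumption that each task can store its message instead of printing it directly, is to give every thread its own slot in a vector and to print only after all threads have been joined. The helper name run_in_order is hypothetical.

```cpp
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: runs num_threads tasks and returns their messages in task order.
std::vector<std::string> run_in_order(int num_threads) {
    std::vector<std::string> results(num_threads); // one slot per task
    std::vector<std::thread> threads;
    for (int i = 0; i < num_threads; ++i) {
        // Each thread writes only to its own slot, so no locking is needed.
        threads.emplace_back([&results, i] {
            results[i] = "I am task " + std::to_string(i + 1);
        });
    }
    for (auto& t : threads) {
        t.join(); // after this loop every slot has been filled
    }
    return results;
}
```

Looping over the returned vector in main and printing each entry yields the messages in task order, regardless of which thread happened to finish first.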
Introduction to Mutex Guards
When working with multithreaded programs, one common issue is the need to control access to shared resources. If multiple threads try to modify the same resource simultaneously, it can lead to race conditions, where the output or behavior of the program becomes unpredictable. This is where mutexes (mutual exclusions) come in handy.
A mutex ensures that only one thread can access a particular resource at a time. In C++, the <mutex> header provides this functionality.
Critical Section
A critical section is a part of the code that accesses a shared resource that must not be concurrently accessed by more than one thread. Accessing shared resources without proper synchronization can lead to data races and undefined behavior.
Using std::mutex with lock() and unlock()
A std::mutex is a synchronization primitive provided by the C++ Standard Library to protect shared data from being simultaneously accessed by multiple threads.
mutex.lock()
The lock() member function of std::mutex locks the mutex. If the mutex is already locked by another thread, the calling thread will block (i.e., wait) until the mutex becomes available.
mutex.unlock()
The unlock() member function of std::mutex unlocks the mutex, making it available for other threads to lock.
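A minimal sketch of the two calls working together might look like this. The names total_mutex, total, add_to_total, and sum_with_threads are made up for the illustration and do not appear elsewhere in this post.

```cpp
#include <mutex>
#include <thread>
#include <vector>

std::mutex total_mutex; // hypothetical mutex protecting total
int total = 0;          // shared data

void add_to_total(int amount) {
    total_mutex.lock();   // wait until no other thread holds the mutex
    total += amount;      // critical section: one thread at a time
    total_mutex.unlock(); // release the mutex so other threads can proceed
}

// Hypothetical helper: thread i adds i to the total, for i = 1..num_threads.
int sum_with_threads(int num_threads) {
    total = 0;
    std::vector<std::thread> threads;
    for (int i = 1; i <= num_threads; ++i) {
        threads.emplace_back(add_to_total, i);
    }
    for (auto& t : threads) {
        t.join();
    }
    return total;
}
```

With four threads the helper returns 1 + 2 + 3 + 4 = 10 on every run, because the mutex lets only one thread at a time update total.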
Using std::lock_guard
While lock() and unlock() are straightforward, they require careful handling to ensure that unlock() is called even if an exception occurs, which can be error-prone. This is where std::lock_guard comes in.
std::lock_guard
std::lock_guard is a RAII (Resource Acquisition Is Initialization) mechanism that ensures a mutex is properly locked and unlocked. It locks the mutex when the lock_guard is created and automatically unlocks it when the lock_guard goes out of scope.
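The exception-safety point can be sketched as follows. The function risky_update throws while the guard holds the mutex, and the caller is still able to lock the same mutex afterwards, which would not be possible if the mutex had stayed locked. All names here (demo_mutex, risky_update, mutex_released_after_exception) are hypothetical.

```cpp
#include <mutex>
#include <stdexcept>

std::mutex demo_mutex; // hypothetical mutex used only in this sketch

void risky_update(bool fail) {
    std::lock_guard<std::mutex> guard(demo_mutex); // locks demo_mutex
    if (fail) {
        // The guard's destructor still runs while the exception propagates,
        // so demo_mutex is unlocked even though the function exits early.
        throw std::runtime_error("update failed");
    }
}

bool mutex_released_after_exception() {
    try {
        risky_update(true);
    } catch (const std::runtime_error&) {
        // The exception has been handled; the guard is already gone.
    }
    // Locking again only succeeds because the guard released the mutex.
    std::lock_guard<std::mutex> guard(demo_mutex);
    return true;
}
```

With manual lock() and unlock(), the same early exit would skip the unlock() call unless it were repeated on every path out of the function.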
Adding Mutex Guards to the Code
Let's modify the previous code to use a mutex to ensure that the console output is thread-safe.
Here's the modified code:
#include <iostream>
#include <thread>
#include <vector>
#include <mutex>
std::mutex cout_mutex;
void task(int id) {
    std::lock_guard<std::mutex> guard(cout_mutex);
    std::cout << "I am task " << id << std::endl;
}
int main() {
    const int num_threads = 5; // Number of threads to create
    std::vector<std::thread> threads; // Vector to hold thread objects
    // Create threads dynamically using a loop
    for (int i = 0; i < num_threads; ++i) {
        threads.push_back(std::thread(task, i + 1));
    }
    // Join all the threads
    for (auto& thread : threads) {
        thread.join();
    }
    std::cout << "All tasks completed!" << std::endl;
    return 0;
}
Explanation
Declaring a Mutex:
std::mutex cout_mutex;
We declare a std::mutex named cout_mutex that will be used to protect access to std::cout.
Using std::lock_guard:
std::lock_guard<std::mutex> guard(cout_mutex);
Inside the task function, we create a std::lock_guard object named guard, passing cout_mutex to its constructor. This ensures that the mutex is locked when the guard object is created and automatically unlocked when the guard object goes out of scope (i.e., at the end of the task function). This approach guarantees that std::cout operations are protected by the mutex, preventing multiple threads from writing to the console simultaneously.
Critical Section:
std::cout << "I am task " << id << std::endl;
The code within the critical section (the std::cout statement) is protected by the mutex, ensuring that only one thread can execute it at a time.
Observing the Output
With the mutex guard in place, the output will still show task IDs in potentially different orders because the threads are still running concurrently. However, the output for each task will be complete and not interleaved with others, thus avoiding mixed or garbled output:
I am task 1
I am task 3
I am task 2
I am task 5
I am task 4
All tasks completed!
Introducing a Data Race Condition
We'll modify our program to include a shared variable that multiple threads will access and modify, creating a data race condition. Let's add a simple counter that each thread increments.
Here's the modified code with a data race condition:
#include <iostream>
#include <thread>
#include <vector>
int counter = 0; // Shared counter variable
void task(int id) {
    for (int i = 0; i < 100; ++i) {
        ++counter; // Increment counter
    }
    std::cout << "Task " << id << " completed." << std::endl;
}
int main() {
    const int num_threads = 5; // Number of threads to create
    std::vector<std::thread> threads; // Vector to hold thread objects
    // Create threads dynamically using a loop
    for (int i = 0; i < num_threads; ++i) {
        threads.push_back(std::thread(task, i + 1));
    }
    // Join all the threads
    for (auto& thread : threads) {
        thread.join();
    }
    std::cout << "Final counter value: " << counter << std::endl;
    std::cout << "All tasks completed!" << std::endl;
    return 0;
}
Explanation of Data Race Condition
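In the above code, multiple threads are incrementing the shared counter variable without any synchronization. This leads to a data race condition, where multiple threads simultaneously read and write to counter, causing inconsistent and unpredictable results. The final value of counter may not be the expected 500 (5 threads, each performing 100 increments) because some increments might be lost due to the race condition. With only 100 increments per thread the program may still print 500 on many runs; raising the loop count makes lost updates much more likely to show up.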
Task 1: Solving the Data Race Condition with a Mutex
Hint: To solve this issue, we need to ensure that increments to counter are done atomically, which means only one thread can update counter at a time. We can achieve this by using a std::mutex to protect the critical section where counter is incremented.
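If you want to compare your attempt against something, one possible solution is sketched below. It is not the only way to do it: the mutex name counter_mutex and the helper run_counter_tasks are hypothetical, and the console output of the original task function is left out to keep the sketch short.

```cpp
#include <mutex>
#include <thread>
#include <vector>

int counter = 0;          // shared counter, as in the racy example
std::mutex counter_mutex; // hypothetical mutex protecting counter

void task() {
    for (int i = 0; i < 100; ++i) {
        // The guard locks the mutex here and unlocks it at the end of each iteration.
        std::lock_guard<std::mutex> guard(counter_mutex);
        ++counter; // only one thread at a time executes this line
    }
}

// Hypothetical helper: runs the tasks and returns the final counter value.
int run_counter_tasks(int num_threads) {
    counter = 0;
    std::vector<std::thread> threads;
    for (int i = 0; i < num_threads; ++i) {
        threads.emplace_back(task);
    }
    for (auto& t : threads) {
        t.join();
    }
    return counter;
}
```

With five threads the helper returns 500 on every run, since no increment can be lost while the mutex is held.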
Task 2: Creating a Deadlock
In this task, we will not use lock_guard and instead manually lock the mutex without unlocking it. Run the program and observe what happens.
Observation
Did your program hang? This is called a deadlock.
Deadlock Explained
A deadlock occurs when a thread is waiting for a resource (in this case, a mutex) that will never become available because it is held indefinitely by another thread (or even the same thread). This situation can cause the program to hang, as the waiting thread cannot proceed. To avoid this, we can use lock guards.
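A real deadlock is awkward to demonstrate because the program simply never finishes. As a stand-in, the sketch below uses std::timed_mutex, whose try_lock_for gives up after a timeout instead of waiting forever, to show that a second thread cannot obtain a mutex that is never released. This swaps in a different mutex type than the one used in the task, and the helper name second_thread_gets_mutex is hypothetical.

```cpp
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical demonstration: reports whether a second thread managed to get the mutex.
bool second_thread_gets_mutex() {
    std::timed_mutex m;
    bool acquired = false;

    m.lock(); // the calling thread takes the mutex and "forgets" to unlock it

    std::thread worker([&] {
        // A plain lock() here would wait forever; try_lock_for gives up after 100 ms.
        acquired = m.try_lock_for(std::chrono::milliseconds(100));
        if (acquired) {
            m.unlock();
        }
    });
    worker.join();

    m.unlock(); // release the mutex before it is destroyed
    return acquired;
}
```

The helper returns false: the worker never gets the mutex while the first thread holds it, which is exactly the situation in which a plain lock() call would hang.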
Why Multithreading Matters
Multithreading is crucial for modern software development due to its numerous benefits:
1. Enhanced Performance:
- Multithreading enables parallel execution of tasks on multi-core processors, maximizing CPU utilization and reducing task completion time.
- By distributing the workload among threads, programs can achieve higher throughput and responsiveness.
2. Improved Responsiveness:
- Applications can remain interactive and responsive to user input even while performing computationally intensive or time-consuming tasks in the background.
- This is particularly important in user interfaces, where responsiveness enhances the user experience.
3. Concurrent Operations:
- Multithreading facilitates the concurrent execution of tasks, allowing programs to handle multiple operations simultaneously.
- This is beneficial for I/O-bound tasks (e.g., reading/writing files, network communication) and server applications handling multiple clients concurrently.
4. Modularity and Simplified Logic:
- Multithreading promotes modular design by enabling the separation of different functionalities into separate threads.
- Threads can focus on specific tasks, leading to simpler and more manageable code.
Practical Examples
- Web Servers: Multithreading enables web servers to handle multiple client requests concurrently, improving responsiveness and throughput.
- Desktop Applications: Applications like word processors, web browsers, and media players use multithreading to perform background tasks without blocking the user interface.
- Real-time Systems: Multithreading helps real-time systems such as flight control software and medical devices process critical tasks on time.
Conclusion
Concurrency and multithreading are essential concepts for building efficient, responsive, and high-performing applications. By understanding and properly managing threads, developers can harness the full power of modern multi-core processors, delivering better user experiences and more robust software. However, it is crucial to handle synchronization and avoid common pitfalls like race conditions and deadlocks to ensure the correctness and reliability of multithreaded programs.