Posted by Bartosz Milewski under Programming
May 26, 2009
[27] Comments
Most languages were caught unaware by the multicore revolution. C++, predictably, developed a portable assembly language for multiprocessors (C++0x atomics). Out of sheer desperation some programmers turned to Erlang. Many still try to ignore the elephant in the room, while others try to shoot it with BB guns provided by their favorite languages.
I’ve looked at many languages and decided that none provided the right combination of performance and safety. We are still waiting for the next big one to break into the multicore territory.
Towards this goal, I’ll present a series of posts in which I’m going to develop a
threading model for that hypothetical language. The model is based on several papers that I reviewed in my previous blogs posts. My own contribution is putting the best of those ideas together into one package. I will use syntax similar to that of the D programming language, but C++ and Java programmers shouldn’t have problems following it.
Teaser #1
In one of my previous posts I described a concurrent data structure based on Haskell’s MVar. It’s essentially a one-element message queue. Let’s have a peek at the implementation that uses my proposed multithreaded scheme. Notice the total absence of special annotations–even the synchronized keyword is missing. The only unusual thing is the move operator, :=, introduced in my previous blog. It is needed in case we want to pass unique objects through MVar. You are probably asking yourself, How the heck is this supposed to work in a multithreaded environment? Read on!
class MVar<T> {
private:
T _msg;
bool _full;
public:
// put: asynchronous (non-blocking)
// Precondition: MVar must be empty
void put(T msg) {
assert (!_full);
_msg := msg; // move
_full = true;
notify();
}
// take: If empty, blocks until full.
// Removes the message and switches state to empty
T take() {
while (!_full)
wait();
_full = false;
return := _msg;
}
}
Let’s instantiate this template as a monitor–that is an object accessible from multiple threads. We do it by specifying the owner as “self” (the this owner is not explicitly listed as a template parameter, but it can be passed to the template during instantiation).
auto mVar = new MVar<owner::self, int>;
In a self-owned object all methods are automatically synchronized. The move operator acting on an int simply copies it.
That was easy! How about instantiating a self-owned MVar for passing unique objects? Piece of cake!
auto q = new MVar<owner::self, unique Foo>;
auto foo = new unique Foo;
q.put(:= foo); // move foo
assert (foo is null);
// another thread
unique Foo f2 = q.get(); // implicit move of rvalue
So far data races have been avoided by implicit synchronization in self-owned objects and by the use of move semantics for unique objects.
Of course you know how to break the MVar, right? Instantiate it with a regular, non-unique class objects and watch aliases spread through threads. Let’s try it:
auto mVar = new MVar<owner::self, Foo>;
auto myFoo = new Foo;
mVar.put(myFoo);
// now let me race through my local alias!
myFoo.method();
Surprise! This code will not compile because a self-owned object cannot have a thread-local member. Because of a richer type system, the compiler can easily detect such violations. I’ll explain the details of the ownership scheme in my future posts–for now let me assure you that, no matter how hard you try, you can’t create a data race in this system (unless you bypass it with explicit casts).
How to making concurrent programming safer?
I propose to do this by extending the type system. (You’ve already seen some examples above.) I realize that there is strong resistance to extending a language’s type system. The main problem is increased complexity. The programmers are reluctant to study and use complex annotation schemes. I will argue that, in the case of multithreading, the complexity argument is reversed. Shared-memory multithreading is hard! Figuring out how to avoid all the pitfalls of sharing is harder than learning a type system that prevents them.
If a programmer is not willing to learn or cannot understand the race-free type system, he or she should not be writing multithreaded programs.
How to limit complexity?
To limit the complexity of the type system, I came up with several simple principles. The most important one is that multithreaded extensions should not require modifications to single-threaded programs. That means coming up with reasonable defaults for all new annotations.
If necessary, the compiler should be able to derive some of the defaults from program analysis. Because modularity is very important, such analysis should not creep across compilation unit boundaries. Whole-program analysis is out of the question.
How to keep goals reasonable?
There are two major challenges in multithreaded programming:
Avoiding races
Preventing deadlocks
The first problem is approachable, whereas the second seems to be a pie in the sky. I will therefore concentrate on designing a race-free system that is based on existing academic research.
Such system would be incomplete without some support for lock-free programming. There are very good reasons for the language to enforce sequential consistency. For comparison, C++ committee, after a long struggle, decided to allow non-SC primitives (weak atomics). Java only enforces SC for volatile variables.
Teaser #2
Speaking of lock-free programming, here’s the implementation of the infamous Double-Checked Locking Pattern:
class Singleton<T>
{
private:
lockfree T _value;
public:
T get() lockfree
{
if (_value is null)
{
synchronized(this)
{
if (_value is null)
_value = new T;
}
return _value;
}
}
}
A lockfree variable is guaranteed to be always atomically accessed in a sequentially consistent way. A lockfree method is not synchronized, but it can only operate on lockfree variables (or use a synchronized section). Even though the use of lockfree doesn’t introduce low-level races, it may create high-level races when accessing more that one lockfree variable at a time. But we all know that lock free programming is only for desperadoes.
Lessons learned from previous work
There are many commonalities in various approaches to race freedom.
Ownership should be built into the type system. In OO languages each object must have an owner. In C-like languages each piece of data must have an associated lock.
There should be efficient mechanisms for passing data between threads
By value
By reference. (The object being passed must be a monitor.)
As immutable data, by const reference
By unique reference using move semantics
Most of those ideas are expressible through type qualifiers.
Higher-level models
Explicit synchronization is hard, no doubt about it. In my proposal, the type system makes it safe. There are however other models of concurrency that, although more restrictive, are often easier to use.
One such model is based on message passing. In my opinion, support for message passing should be library-based. The race-free type system will make it safe and flexible. It will, for instance, support passing messages not only by value but also by reference–without sacrificing safety. In traditional type systems there is no way to express the requirement that a message passed by reference must either be a monitor itself (have it’s own lock) or behave like a C++ unique_ptr (an object that leaves no aliases behind). This requirement should be expressible through the type system, allowing the compiler to check for its violations.
I’ve been paying a lot of attention to software transactional memory (STM). I believe that STM could be built into the language–again, with the support of the type system. It looks like the hardest problem with STM is how it should inter-operate with other types of multithreaded access (both locked and lock-free). The simplest solution is to enforce full isolation–no STM object can ever be accessed outside of a transaction. It’s not clear how practical such approach is, but it definitely simplifies things to the point that STM becomes orthogonal to other types of synchronization. And that means it will be possible to tackle it at a later time, after the race-free multithreading model is firmly established.
In the next installment, I will describe the ownership scheme that is the foundation of the race-free type system.
Please vote for this article on reddit.
Bibliography
C. Flanagan, M. Abadi, Object Types against Races. The seminal paper introducing ownership-based type systems.
See also my previous posts on Guava and GRFJ that discuss race free dialects of Java.
分享到:
相关推荐
《.NET多线程编程》是一本专注于C# .NET平台下多线程技术的专业书籍。多线程是现代软件开发中的重要概念,特别是在多核处理器普及的今天,利用多线程可以有效提升应用程序的性能和响应性。本书旨在帮助读者理解和...
`python CVE-2018-2628-MultiThreading.py` Usage: Place the 'ip:port' to be detected into the url.txt file in the same directory. Then run: `python CVE-2018-2628-MultiThreading.py` 运行环境: ...
1,Real-Time Embedded Multithreading Using ThreadX and MIPS 2,Real-Time_Embedded_Multithreading_Using_ThreadX 3,Real-Time Embedded ...5,(CMP) Real-Time Embedded Multithreading--Using ThreadX & ARM
- **第1章:并发世界的“Hello World”**:本章介绍了C++并发编程的基本概念,并通过一个简单的示例程序展示了如何创建和管理线程。这为读者提供了一个初步的认识,帮助理解并发编程的基本思想。 - **第2章:线程...
1 ■ Hello, world of concurrency in C++! 2 ■ Managing threads 3 ■ Sharing data between threads 4 ■ Synchronizing concurrent operations 5 ■ The C++ memory model and operations on atomic types 6 ...
1. **RTOS基础知识**:了解RTOS的基本概念,包括任务、调度器、信号量、互斥锁、事件标志组、消息队列等,这些是实现多线程并发的基础。 2. **ThreadX架构**:ThreadX的核心组件包括任务管理、内存管理、定时器服务...
1. `Application.java`: 这是Spring Boot的主入口类,通常会包含`@EnableAsync`注解,用于开启异步支持。这个注解会触发Spring扫描并配置异步方法。 2. `ThreadPoolConfig.java`: 这个类可能用来自定义`...
1. **基础概念**:介绍多线程的基本原理,包括线程的创建、同步和通信机制,以及为何在现代计算中使用多线程。 2. **C++标准库中的线程支持**:讲解std::thread的使用,如何启动、管理和终止线程,以及线程与主程序...
npm install react-native-multithreading npx pod-install 需要包括的react-native-reanimated版本。 您可以自己打补丁,也可以等到它发布后再发布。 :warning: 警告:这仍然只是概念证明-请勿在生产中使用该库...
该代码不需要selenium,直接使用requests大规模爬取指定商品的评论,并保存到csv中,效率高,同时使用多线程进一步提高效率。
1. 继承Thread类: 当你需要创建一个新的线程时,可以创建一个新的类来继承Thread类。在子类中重写Thread类的run()方法,将要执行的任务放入run()方法中。然后创建该子类的实例并调用start()方法启动新线程。例如:...
在L06-Multithreading - 1的课件中,主要讲解了两种创建和启动线程的方法:继承Thread类和实现Runnable接口。 首先,我们来看通过扩展Thread类来创建线程的方式。自定义一个类(如CustomThread)继承Thread,然后...
react-native-multithreading using使用JSI的React Native的快速简便的多线程处理。 安装npm install react-native-multithreading npx pod-i react-native-multithreading using使用JSI进行React Native的快速简便...
一个比较好的多线程例子 用于学习线程的号厘子
《实时嵌入式多线程应用ThreadX与MIPS架构》 在当今高度发达的科技社会,嵌入式系统无处不在,它们隐藏在我们日常使用的各种设备之中,如消费电子、汽车、政府设施、军事装备、通信工具以及医疗设备等。...
1. **类与对象**:C#是一种面向对象的语言,游戏中的角色、赛道、障碍物等都可以抽象为类,通过实例化这些类创建游戏中的对象。每个类都包含了其特有的属性(如位置、速度、形状)和方法(如移动、碰撞检测)。 2. ...
1. **Thread类**:直接使用Java的Thread类创建新线程,重写run()方法放置下载逻辑。但这并不推荐,因为它不提供线程管理功能,可能导致内存泄漏。 2. **Handler和Looper**:创建一个后台线程,利用Looper和Handler...
本压缩包“PYthon-multithreading-Test.rar”包含了有关Python多线程测试的源码,旨在帮助用户深入理解和实践Python的线程操作。 Python中的多线程是通过`threading`模块实现的,这个模块提供了基本的线程和同步...