3.1. Visibility
Visibility is subtle because the things that can go wrong are so counterintuitive. In a single-threaded environment, if you write a value to a variable and later read that variable with no intervening writes, you can expect to get the same value back. This seems only natural. It may be hard to accept at first, but when the reads and writes occur in different threads, this is simply not the case. In general, there is no guarantee that the reading thread will see a value written by another thread on a timely basis, or even at all. In order to ensure visibility of memory writes across threads, you must use synchronization.
NoVisibility in Listing 3.1 illustrates what can go wrong when threads share data without synchronization. Two threads, the main thread and the reader thread, access the shared variables ready and number. The main thread starts the reader thread and then sets number to 42 and ready to true. The reader thread spins until it sees ready is true, and then prints out number. While it may seem obvious that NoVisibility will print 42, it is in fact possible that it will print zero, or never terminate at all! Because it does not use adequate synchronization, there is no guarantee that the values of ready and number written by the main thread will be visible to the reader thread.
Listing 3.1. Sharing Variables without Synchronization. Don't Do this.
NoVisibility could loop forever because the value of ready might never become visible to the reader thread. Even more strangely, NoVisibility could print zero because the write to ready might be made visible to the reader thread before the write to number, a phenomenon known as reordering. There is no guarantee that operations in one thread will be performed in the order given by the program, as long as the reordering is not detectable from within that thread, even if the reordering is apparent to other threads.[1] When the main thread writes first to number and then to ready without synchronization, the reader thread could see those writes happen in the opposite order, or not at all.
[1] This may seem like a broken design, but it is meant to allow JVMs to take full advantage of the performance of modern multiprocessor hardware. For example, in the absence of synchronization, the Java Memory Model permits the compiler to reorder operations and cache values in registers, and permits CPUs to reorder operations and cache values in processor-specific caches. For more details, see Chapter 16.
In the absence of synchronization, the compiler, processor, and runtime can do some downright weird things to the order in which operations appear to execute. Attempts to reason about the order in which memory actions "must" happen in insufficiently synchronized multithreaded programs will almost certainly be incorrect.
NoVisibility is about as simple as a concurrent program can get (two threads and two shared variables), and yet it is still all too easy to come to the wrong conclusions about what it does or even whether it will terminate. Reasoning about insufficiently synchronized concurrent programs is prohibitively difficult.
This may all sound a little scary, and it should. Fortunately, there's an easy way to avoid these complex issues: always use the proper synchronization whenever data is shared across threads.
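As a minimal sketch of that advice, the variant of NoVisibility below routes every access to the two shared fields through methods synchronized on the same lock. The class name SafeVisibility and the helpers publish, isReady, and getNumber are hypothetical names chosen for illustration; they do not come from the book.

```java
// Hypothetical rewrite of NoVisibility: every read and write of the shared
// fields goes through a method synchronized on SafeVisibility.class.
final class SafeVisibility {
    private static boolean ready;   // guarded by SafeVisibility.class
    private static int number;      // guarded by SafeVisibility.class

    // Writer side: both fields are updated while holding the lock.
    static synchronized void publish(int value) {
        number = value;
        ready = true;
    }

    // Reader side: acquiring the same lock makes the writer's updates visible.
    static synchronized boolean isReady() {
        return ready;
    }

    static synchronized int getNumber() {
        return number;
    }

    public static void main(String[] args) {
        Thread reader = new Thread(() -> {
            while (!isReady()) {
                Thread.yield();
            }
            System.out.println(getNumber());   // prints 42
        });
        reader.start();
        publish(42);
    }
}
```

Because the reader can only observe ready as true after the writer has released the lock, it is guaranteed to see number as 42 as well.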
3.1.1. Stale Data
NoVisibility demonstrated one of the ways that insufficiently synchronized programs can cause surprising results: stale data. When the reader thread examines ready, it may see an out-of-date value. Unless synchronization is used every time a variable is accessed, it is possible to see a stale value for that variable. Worse, staleness is not all-or-nothing: a thread can see an up-to-date value of one variable but a stale value of another variable that was written first.
When food is stale, it is usually still edible, just less enjoyable. But stale data can be more dangerous. While an out-of-date hit counter in a web application might not be so bad,[2] stale values can cause serious safety or liveness failures. In NoVisibility, stale values could cause it to print the wrong value or prevent the program from terminating. Things can get even more complicated with stale values of object references, such as the link pointers in a linked list implementation. Stale data can cause serious and confusing failures such as unexpected exceptions, corrupted data structures, inaccurate computations, and infinite loops.
[2] Reading data without synchronization is analogous to using the READ_UNCOMMITTED isolation level in a database, where you are willing to trade accuracy for performance. However, in the case of unsynchronized reads, you are trading away a greater degree of accuracy, since the visible value for a shared variable can be arbitrarily stale.
MutableInteger in Listing 3.2 is not thread-safe because the value field is accessed from both get and set without synchronization. Among other hazards, it is susceptible to stale values: if one thread calls set, other threads calling get may or may not see that update.
We can make MutableInteger thread-safe by synchronizing the getter and setter, as shown in SynchronizedInteger in Listing 3.3. Synchronizing only the setter would not be sufficient: threads calling get would still be able to see stale values.
Listing 3.2. Non-thread-safe Mutable Integer Holder.
Listing 3.3. Thread-safe Mutable Integer Holder.
3.1.2. Nonatomic 64-bit Operations
When a thread reads a variable without synchronization, it may see a stale value, but at least it sees a value that was actually placed there by some thread rather than some random value. This safety guarantee is called out-of-thin-air safety.
Out-of-thin-air safety applies to all variables, with one exception: 64-bit numeric variables (double and long) that are not declared volatile (see Section 3.1.4). The Java Memory Model requires fetch and store operations to be atomic, but for nonvolatile long and double variables, the JVM is permitted to treat a 64-bit read or write as two separate 32-bit operations. If the reads and writes occur in different threads, it is therefore possible to read a nonvolatile long and get back the high 32 bits of one value and the low 32 bits of another.[3] Thus, even if you don't care about stale values, it is not safe to use shared mutable long and double variables in multithreaded programs unless they are declared volatile or guarded by a lock.
[3] When the Java Virtual Machine Specification was written, many widely used processor architectures could not efficiently provide atomic 64-bit arithmetic operations.
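The sketch below illustrates the volatile remedy under stated assumptions: a writer thread alternates a volatile long between 0 and -1 (all bits clear, all bits set) while the calling thread checks that it never observes a mixture of the two. The class TearFreeLong and its method countTornReads are hypothetical names used only for this illustration.

```java
// Hypothetical check that a volatile long is always read as a whole value.
final class TearFreeLong {
    // Declaring the field volatile makes its 64-bit reads and writes atomic.
    private static volatile long value = 0L;

    // Returns how many observed values were neither 0 nor -1 (a "torn" read).
    static int countTornReads(int iterations) {
        Thread writer = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                value = (i % 2 == 0) ? -1L : 0L;   // all bits set, then all bits clear
            }
        });
        writer.start();

        int torn = 0;
        for (int i = 0; i < iterations; i++) {
            long seen = value;                     // one volatile read into a local
            if (seen != 0L && seen != -1L) {
                torn++;
            }
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return torn;
    }

    public static void main(String[] args) {
        System.out.println("torn reads: " + countTornReads(1_000_000));
    }
}
```

If the volatile modifier were removed, the language would no longer rule out a torn read, although whether one actually shows up depends on the JVM and the hardware.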
3.1.3. Locking and Visibility
Intrinsic locking can be used to guarantee that one thread sees the effects of another in a predictable manner, as illustrated by Figure 3.1. When thread A executes a synchronized block, and subsequently thread B enters a synchronized block guarded by the same lock, the values of variables that were visible to A prior to releasing the lock are guaranteed to be visible to B upon acquiring the lock. In other words, everything A did in or prior to a synchronized block is visible to B when it executes a synchronized block guarded by the same lock. Without synchronization, there is no such guarantee.
Figure 3.1. Visibility Guarantees for Synchronization.
We can now give the other reason for the rule requiring all threads to synchronize on the same lock when accessing a shared mutable variable: to guarantee that values written by one thread are made visible to other threads. Otherwise, if a thread reads a variable without holding the appropriate lock, it might see a stale value.
Locking is not just about mutual exclusion; it is also about memory visibility. To ensure that all threads see the most up-to-date values of shared mutable variables, the reading and writing threads must synchronize on a common lock.
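A small sketch of the common-lock rule, assuming a hypothetical class GuardedPair that keeps two fields tied together by the invariant doubled == 2 * x. Both the writer and the reader take the same lock, so the reader never sees a stale or half-finished update. None of these names come from the book.

```java
// Hypothetical holder whose two fields are guarded by one common lock.
final class GuardedPair {
    private final Object lock = new Object();
    private int x;         // guarded by lock
    private int doubled;   // guarded by lock; invariant: doubled == 2 * x

    // Writer: updates both fields while holding the lock.
    void set(int newX) {
        synchronized (lock) {
            x = newX;
            doubled = 2 * newX;
        }
    }

    // Reader: takes the same lock before looking at either field.
    boolean invariantHolds() {
        synchronized (lock) {
            return doubled == 2 * x;
        }
    }

    // Runs a writer thread against a reading loop and counts broken invariants.
    static int countViolations(int iterations) {
        GuardedPair pair = new GuardedPair();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                pair.set(i);
            }
        });
        writer.start();

        int violations = 0;
        for (int i = 0; i < iterations; i++) {
            if (!pair.invariantHolds()) {
                violations++;
            }
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return violations;
    }

    public static void main(String[] args) {
        System.out.println("violations: " + countViolations(1_000_000));
    }
}
```

Dropping the synchronized block from invariantHolds alone would be enough to lose the guarantee, even though set would still be locked.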
3.1.4. Volatile Variables
The Java language also provides an alternative, weaker form of synchronization, volatile variables, to ensure that updates to a variable are propagated predictably to other threads. When a field is declared volatile, the compiler and runtime are put on notice that this variable is shared and that operations on it should not be reordered with other memory operations. Volatile variables are not cached in registers or in caches where they are hidden from other processors, so a read of a volatile variable always returns the most recent write by any thread.
A good way to think about volatile variables is to imagine that they behave roughly like the SynchronizedInteger class in Listing 3.3, replacing reads and writes of the volatile variable with calls to get and set.[4] Yet accessing a volatile variable performs no locking and so cannot cause the executing thread to block, making volatile variables a lighter-weight synchronization mechanism than synchronized.[5]
[4] This analogy is not exact; the memory visibility effects of SynchronizedInteger are actually slightly stronger than those of volatile variables. See Chapter 16.
[5] Volatile reads are only slightly more expensive than nonvolatile reads on most current processor architectures.
The visibility effects of volatile variables extend beyond the value of the volatile variable itself. When thread A writes to a volatile variable and subsequently thread B reads that same variable, the values of all variables that were visible to A prior to writing to the volatile variable become visible to B after reading the volatile variable. So from a memory visibility perspective, writing a volatile variable is like exiting a synchronized block and reading a volatile variable is like entering a synchronized block. However, we do not recommend relying too heavily on volatile variables for visibility; code that relies on volatile variables for visibility of arbitrary state is more fragile and harder to understand than code that uses locking.
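The extended visibility effect can be sketched as follows. The class VolatilePublish, its fields payload and published, and the method roundTrip are hypothetical names for illustration: payload is an ordinary field written before the volatile flag, and the reader only touches it after seeing the flag set.

```java
// Hypothetical illustration: a volatile write also publishes earlier plain writes.
final class VolatilePublish {
    private int payload;                  // ordinary, nonvolatile field
    private volatile boolean published;   // volatile flag

    // Writes the payload, raises the flag, and returns what the reader saw.
    static int roundTrip(int value) {
        VolatilePublish box = new VolatilePublish();
        int[] seen = new int[1];

        Thread reader = new Thread(() -> {
            while (!box.published) {      // volatile read
                Thread.yield();
            }
            seen[0] = box.payload;        // visible because it was written before the flag
        });
        reader.start();

        box.payload = value;              // plain write
        box.published = true;             // volatile write

        try {
            reader.join();                // join makes the reader's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42));   // prints 42
    }
}
```

This is exactly the kind of code the paragraph above warns about: it is correct, but its correctness depends on the order of two assignments, which is easy to break during later edits.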
Use volatile variables only when they simplify implementing and verifying your synchronization policy; avoid using volatile variables when verifying correctness would require subtle reasoning about visibility. Good uses of volatile variables include ensuring the visibility of their own state, that of the object they refer to, or indicating that an important lifecycle event (such as initialization or shutdown) has occurred.
Listing 3.4 illustrates a typical use of volatile variables: checking a status flag to determine when to exit a loop. In this example, our anthropomorphized thread is trying to get to sleep by the time-honored method of counting sheep. For this example to work, the asleep flag must be volatile. Otherwise, the thread might not notice when asleep has been set by another thread.[6] We could instead have used locking to ensure visibility of changes to asleep, but that would have made the code more cumbersome.
[6] Debugging tip: For server applications, be sure to always specify the -server JVM command line switch when invoking the JVM, even for development and testing. The server JVM performs more optimization than the client JVM, such as hoisting variables out of a loop that are not modified in the loop; code that might appear to work in the development environment (client JVM) can break in the deployment environment (server JVM). For example, had we "forgotten" to declare the variable asleep as volatile in Listing 3.4, the server JVM could hoist the test out of the loop (turning it into an infinite loop), but the client JVM would not. An infinite loop that shows up in development is far less costly than one that only shows up in production.
Listing 3.4. Counting Sheep.
volatile boolean asleep;
...
while (!asleep)
countSomeSheep();
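A self-contained version of the same idea might look like the sketch below. The wrapper class SheepCounter, the sheepCounted field, and the fallsAsleep method are assumptions added to make the fragment runnable; only the asleep flag and countSomeSheep come from the listing.

```java
// Hypothetical runnable wrapper around the counting-sheep loop.
final class SheepCounter {
    private volatile boolean asleep;   // status flag set by another thread
    private long sheepCounted;         // touched only by the counting thread

    private void countSomeSheep() {
        sheepCounted++;
    }

    // Starts the counting loop, sets the flag after a delay, and reports
    // whether the loop noticed the flag and exited.
    static boolean fallsAsleep(long delayMillis) {
        SheepCounter counter = new SheepCounter();
        Thread sleeper = new Thread(() -> {
            while (!counter.asleep) {
                counter.countSomeSheep();
            }
        });
        sleeper.start();
        try {
            Thread.sleep(delayMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        counter.asleep = true;         // volatile write seen by the loop
        try {
            sleeper.join(5_000);       // wait up to five seconds for the loop to end
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !sleeper.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("asleep: " + fallsAsleep(100));
    }
}
```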
Volatile variables are convenient, but they have limitations. The most common use for volatile variables is as a completion, interruption, or status flag, such as the asleep flag in Listing 3.4. Volatile variables can be used for other kinds of state information, but more care is required when attempting this. For example, the semantics of volatile are not strong enough to make the increment operation (count++) atomic, unless you can guarantee that the variable is written only from a single thread. (Atomic variables do provide atomic read-modify-write support and can often be used as "better volatile variables"; see Chapter 15.)
Locking can guarantee both visibility and atomicity; volatile variables can only guarantee visibility.
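To make the difference concrete, the sketch below increments a volatile int and a java.util.concurrent.atomic.AtomicInteger from several threads. The class CounterComparison and its run method are hypothetical names; the atomic total always matches the number of increments, while the volatile total may fall short because count++ is a separate read, add, and write.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical comparison of a volatile counter with an atomic one.
final class CounterComparison {
    private static volatile int volatileCount;
    private static final AtomicInteger atomicCount = new AtomicInteger();

    // Returns { volatile total, atomic total } after all threads finish.
    static int[] run(int threads, int incrementsPerThread) {
        volatileCount = 0;
        atomicCount.set(0);

        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    volatileCount++;                 // visible, but not atomic: updates can be lost
                    atomicCount.incrementAndGet();   // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return new int[] { volatileCount, atomicCount.get() };
    }

    public static void main(String[] args) {
        int[] totals = run(4, 100_000);
        System.out.println("volatile: " + totals[0] + ", atomic: " + totals[1]);
    }
}
```

How many updates the volatile counter loses, if any, varies from run to run, so no specific output is shown for it.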
You can use volatile variables only when all the following criteria are met:
• Writes to the variable do not depend on its current value, or you can ensure that only a single thread ever updates the value;
• The variable does not participate in invariants with other state variables; and
• Locking is not required for any other reason while the variable is being accessed.
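One case that appears to satisfy all three criteria is a progress value written by a single worker thread and read by others. The sketch below is an illustration under that assumption; ProgressGauge, percentDone, and runToCompletion are hypothetical names.

```java
// Hypothetical single-writer gauge: one thread writes, any thread may read.
final class ProgressGauge {
    // Only the worker thread ever writes this field, it is not tied to any
    // other state by an invariant, and no lock is needed to access it.
    private volatile int percentDone;

    int percentDone() {
        return percentDone;
    }

    // Runs a worker that reports progress from 1 to 100 and returns the final value.
    static int runToCompletion() {
        ProgressGauge gauge = new ProgressGauge();
        Thread worker = new Thread(() -> {
            for (int step = 1; step <= 100; step++) {
                gauge.percentDone = step;   // single writer, so no update can be lost
            }
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return gauge.percentDone();
    }

    public static void main(String[] args) {
        System.out.println(runToCompletion() + "% done");   // prints 100% done
    }
}
```

If a second thread also started writing percentDone based on its current value, the first criterion would no longer hold and a lock or an atomic variable would be the safer choice.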
Listing 3.1. Sharing Variables without Synchronization. Don't Do this.
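public class NoVisibility {
    private static boolean ready;
    private static int number;

    private static class ReaderThread extends Thread {
        public void run() {
            // Spins until ready is seen as true; without synchronization
            // the update may never become visible to this thread.
            while (!ready)
                Thread.yield();
            // May print 0 if the write to ready is seen before the write to number.
            System.out.println(number);
        }
    }

    public static void main(String[] args) {
        new ReaderThread().start();
        number = 42;
        ready = true;
    }
}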
Listing 3.2. Non-thread-safe Mutable Integer Holder.
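// Annotation from the book's jcip-annotations library.
import net.jcip.annotations.NotThreadSafe;

@NotThreadSafe
public class MutableInteger {
    private int value;

    // Unsynchronized read: may return a stale value.
    public int get() { return value; }

    // Unsynchronized write: other threads may or may not see it.
    public void set(int value) { this.value = value; }
}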
Listing 3.3. Thread-safe Mutable Integer Holder.
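// Annotations from the book's jcip-annotations library.
import net.jcip.annotations.GuardedBy;
import net.jcip.annotations.ThreadSafe;

@ThreadSafe
public class SynchronizedInteger {
    @GuardedBy("this") private int value;

    // Both accessors lock on this, so a get always sees the latest set.
    public synchronized int get() { return value; }

    public synchronized void set(int value) { this.value = value; }
}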
3.1.2. Nonatomic 64-bit Operations(非原子的64比特操作)
When a thread reads a variable without synchronization, it may see a stale value, but at least it sees a value that was actually placed there by some thread rather than some random value. This safety guarantee is called out-of-thin-air safety.
当线程在没有同步机制的情况下读取变量的时候,线程可能会看到过期数据,但是至少这个数据是曾经被后一个线程放上的,而至于是一个随机数据。这种安全保证被称为“out-of-thin-air”。
Out-of-thin-air safety applies to all variables, with one exception: 64-bit numeric variables (double and long) that are not declared volatile (see Section 3.1.4). The Java Memory Model requires fetch and store operations to be atomic, but for nonvolatile long and double variables, the JVM is permitted to treat a 64-bit read or write as two separate 32-bit operations. If the reads and writes occur in different threads, it is therefore possible to read a nonvolatile long and get back the high 32 bits of one value and the low 32 bits of another.[3] Thus, even if you don't care about stale values, it is not safe to use shared mutable long and double variables in multithreaded programs unless they are declared volatile or guarded by a lock.
[3] When the Java Virtual Machine Specification was written, many widely used processor architectures could not efficiently provide atomic 64-bit arithmetic operations.
“Out-of-thin-air”适用于所有的变量操作,但是还是有例外,这就是那些没有被设置成volatile的64比特的数字操作(double和long)。JVM要求读取和存储数据必须是原子的,但是对于非volatile的long型和double型,JVM允许当做两个单独的32比特进行读写操作。如果读写操作出现在不同的线程中的时候,这就有可能会出现读取到一个nonvolatile的长整型或者得到高位的32比特或者低位的32比特。这样即使你不去关心过期数据的问题,在多线程环境中使用可变的long和double变量也是不安全的。除非他们被声明为volatile的或者被锁守护。当Java虚拟机规范制定的时候,很多被广泛使用的处理器架构无法有效地提供原子的64比特的算术操作。
3.1.3. Locking and Visibility(锁和可见性)
Intrinsic locking can be used to guarantee that one thread sees the effects of another in a predictable manner, as illustrated by Figure 3.1. When thread A executes a synchronized block, and subsequently thread B enters a synchronized block guarded by the same lock, the values of variables that were visible to A prior to releasing the lock are guaranteed to be visible to B upon acquiring the lock. In other words, everything A did in or prior to a synchronized block is visible to B when it executes a synchronized block guarded by the same lock. Without synchronization, there is no such guarantee.
内在的锁机制可以用来保证一个线程使用一种可以预期的方式看到另外一个线程的结果,这种方式在图3.1中有所体现。当线程A执行完同步代码块之后,接着线程B进入了被同一把锁所保护的同步代码块,当B获得锁的时候,在释放锁前对线程A可见的数据都被赋予给线程B。也就是说,线程A在同步锁被授予线程B过程中和之前的所有事情当线程B执行该同步代码块的时候都是可见的。如果没有同步机制,就不会有这样的保证。
Figure 3.1. Visibility Guarantees for Synchronization.
We can now give the other reason for the rule requiring all threads to synchronize on the same lock when accessing a shared mutable variable to guarantee that values written by one thread are made visible to other threads. Otherwise, if a thread reads a variable without holding the appropriate lock, it might see a stale value.
当访问可变的共享变量时,所有线程使用同一把锁进行同步以保证其中一个线程对变量的更改对其他线程课件。现在我们有了另外一个使用这个规则的理由。否则,如果一个线程在没有使用恰当的锁的时候读取变量,他就将看到一个过期变量。
Locking is not just about mutual exclusion; it is also about memory visibility. To ensure that all threads see the most up-to-date values of shared mutable variables, the reading and writing threads must synchronize on a common lock.
锁机制并非不仅仅对于互斥操作有意义,它同样对于内存共享有意义。为了确保所有线程能够看到大多数共享可变变量的正确数值,读写线程都必须使用同一把锁进行同步。
3.1.4. Volatile Variables(volatile变量)
The Java language also provides an alternative, weaker form of synchronization, volatile variables, to ensure that updates to a variable are propagated predictably to other threads. When a field is declared volatile, the compiler and runtime are put on notice that this variable is shared and that operations on it should not be reordered with other memory operations. Volatile variables are not cached in registers or in caches where they are hidden from other processors, so a read of a volatile variable always returns the most recent write by any thread.
Java语言提供一种可选的,非标准的同步机制-volatile变量,来保证对某个变量的修改以可以预见的形式被其他线程获得。当一个变量被声明为volatile类型之后,编译器和运行时环境就会住注意到该变量是被共享的,这样这个变量之上的操作就不会与其他内存操作打乱时序。Volatile变量不会再寄存器中缓存也不会对其他处理器隐藏,因此读取一个volatile类型的变量将肯定会返回被线程修改后的最新值。
A good way to think about volatile variables is to imagine that they behave roughly like the SynchronizedInteger class in Listing 3.3, replacing reads and writes of the volatile variable with calls to get and set.[4] Yet accessing a volatile variable performs no locking and so cannot cause the executing thread to block, making volatile variables a lighter-weight synchronization mechanism than synchronized.[5]
[4] This analogy is not exact; the memory visibility effects of SynchronizedInteger are actually slightly stronger than those of volatile variables. See Chapter 16.
[5] Volatile reads are only slightly more expensive than nonvolatile reads on most current processor architectures.
当时volatile变量没有锁机制,因此不能将线程变成阻塞状态,这使得volatile变量成为同步机制的一种轻量级实现。
这种模拟其实并不精确,SynchronizedInteger类的内存可见效果实际上稍稍强于使用volatile变量,详情可以查看第十六章。
在大部分处理器架构下,Volatile方式的读取的时间效率只比非volatile形式的读取稍微第一点儿。
The visibility effects of volatile variables extend beyond the value of the volatile variable itself. When thread A writes to a volatile variable and subsequently thread B reads that same variable,
the values of all variables that were visible to A prior to writing to the volatile variable become visible to B after reading the volatile variable. So from a memory visibility perspective, writing a volatile variable is like exiting a synchronized block and reading a volatile variable is like entering a synchronized block. However, we do not recommend relying too heavily on volatile variables for visibility; code that relies on volatile variables for visibility of arbitrary state is more fragile and harder to understand than code that uses locking.
Volatile变量的可见性效果超过了volatile变量本身。当线程A写入一个volatile类型的变量,而线程B接着去读取同一个变量的时候,在修改volatile变量之前对A所有可见的所有变量值在线程B读取到volatile变量值后都变成可见的。依赖于volatile变量来获取任意状态可见比使用锁机制更加更脆弱,也更加难以理解。
Use volatile variables only when they simplify implementing and verifying your synchronization policy; avoid using volatile variables when veryfing correctness would require subtle reasoning about visibility. Good uses of volatile variables include ensuring the visibility of their own state, that of the object they refer to, or indicating that an important lifecycle event (such as initialization or shutdown) has occurred.
当代码逻辑的实现非常简单,或者验证你的同步策略的时候,才会用到volatile变量。如果代码的正确性验证需要考虑到可见性的时候,不要使用volatile变量。对于volatile变量的正确使用方式包括:确保volatile变量自身状态的可见性-也就是他们所在的对象的状态,或者指示一个重要的生命周期事件(例如初始化或者关闭)的发生。
Listing 3.4 illustrates a typical use of volatile variables: checking a status flag to determine when to exit a loop. In this example, our anthropomorphized thread is trying to get to sleep by the time-honored method of counting sheep. For this example to work, the asleep flag must be volatile. Otherwise, the thread might not notice when asleep has been set by another thread.[6] We could instead have used locking to ensure visibility of changes to asleep, but that would have made the code more cumbersome.
[6] Debugging tip: For server applications, be sure to always specify the -server JVM command line switch when invoking the JVM, even for development and testing. The server JVM performs more optimization than the client JVM, such as hoisting variables out of a loop that are not modified in the loop; code that might appear to work in the development environment (client JVM) can break in the deployment environment (server JVM). For example, had we "forgotten" to declare the variable asleep as volatile in Listing 3.4, the server JVM could hoist the test out of the loop (turning it into an infinite loop), but the client JVM would not. An infinite loop that shows up in development is far less costly than one that only shows up in production.
Listing 3.4. Counting Sheep.
volatile boolean asleep;
...
while (!asleep)
    countSomeSheep();
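Listing 3.4 is only a fragment. A self-contained version might look like the sketch below, in which the class SheepCounter and the methods tryToSleep and fallAsleep are hypothetical additions; only asleep and countSomeSheep come from the listing. One thread spins on the flag while another sets it.

```java
// Hypothetical wrapper around Listing 3.4; class and helper names are assumptions.
class SheepCounter {
    private volatile boolean asleep;  // written by one thread, read by another
    private long sheepCounted;        // touched only by the counting thread

    void countSomeSheep() {
        sheepCounted++;
    }

    // The loop from Listing 3.4: keep counting until the flag is set.
    void tryToSleep() {
        while (!asleep) {
            countSomeSheep();
        }
    }

    // Called from a different thread to end the loop.
    void fallAsleep() {
        asleep = true;
    }

    // Returns true if the counting thread noticed the flag and stopped.
    static boolean demo() {
        SheepCounter counter = new SheepCounter();
        Thread sleeper = new Thread(counter::tryToSleep);
        sleeper.setDaemon(true);  // do not keep the JVM alive if something goes wrong
        sleeper.start();
        counter.fallAsleep();
        try {
            sleeper.join(5000);   // wait up to five seconds for the loop to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !sleeper.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("sleeper stopped: " + demo());
    }
}
```

If asleep were a plain boolean, the language would not guarantee that the loop ever sees the write made by fallAsleep.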
Volatile variables are convenient, but they have limitations. The most common use for volatile variables is as a completion, interruption, or status flag, such as the asleep flag in Listing 3.4. Volatile variables can be used for other kinds of state information, but more care is required when attempting this. For example, the semantics of volatile are not strong enough to make the increment operation (count++) atomic, unless you can guarantee that the variable is written only from a single thread. (Atomic variables do provide atomic read-modify-write support and can often be used as "better volatile variables"; see Chapter 15.)
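The point about count++ can be made concrete with a small comparison. The sketch below is illustrative only: the class CounterComparison, its fields, and the thread and iteration counts are assumptions made for the example. Several threads increment a volatile int and an AtomicInteger side by side.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical comparison class; names and constants are illustrative assumptions.
class CounterComparison {
    static volatile int volatileCount;                             // visible to all threads, but ++ is not atomic
    static final AtomicInteger atomicCount = new AtomicInteger();  // atomic read-modify-write

    static final int THREADS = 4;
    static final int INCREMENTS = 100_000;

    // Starts THREADS workers that each bump both counters INCREMENTS times.
    static void runAll() {
        volatileCount = 0;
        atomicCount.set(0);
        Thread[] workers = new Thread[THREADS];
        for (int i = 0; i < THREADS; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < INCREMENTS; j++) {
                    volatileCount++;                // read, add, write: three steps that can interleave
                    atomicCount.incrementAndGet();  // a single atomic step
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        runAll();
        System.out.println("expected: " + (THREADS * INCREMENTS));
        System.out.println("atomic:   " + atomicCount.get());
        System.out.println("volatile: " + volatileCount);  // may be lower because of lost updates
    }
}
```

The atomic counter always ends at THREADS * INCREMENTS. The volatile counter can never exceed that total and will often fall short of it, although how many updates are lost varies from run to run.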
Locking can guarantee both visibility and atomicity; volatile variables can only guarantee visibility.
You can use volatile variables only when all the following criteria are met:
• Writes to the variable do not depend on its current value, or you can ensure that only a single thread ever updates the value;
• The variable does not participate in invariants with other state variables; and
• Locking is not required for any other reason while the variable is being accessed.
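As a rough illustration of the criteria, the sketch below contrasts a field that meets them with a pair of fields that do not. The class VolatileCriteria and all of its members are hypothetical names invented for this example. The first field is simply overwritten and is tied to no other state; the second pair shares the invariant lower <= upper, so the example falls back to locking.

```java
// Hypothetical example class; every name here is an illustrative assumption.
class VolatileCriteria {
    // Meets the criteria: each write replaces the old value without reading it,
    // no other field is related to it, and no lock is otherwise needed.
    private volatile double lastReading;

    void record(double reading) {
        lastReading = reading;
    }

    double latest() {
        return lastReading;
    }

    // Fails the second criterion: lower and upper take part in the invariant
    // lower <= upper, so the check-then-act in each setter is guarded by a lock.
    private int lower = 0;
    private int upper = 10;

    synchronized void setLower(int value) {
        if (value > upper) {
            throw new IllegalArgumentException("lower would exceed upper");
        }
        lower = value;
    }

    synchronized void setUpper(int value) {
        if (value < lower) {
            throw new IllegalArgumentException("upper would fall below lower");
        }
        upper = value;
    }

    synchronized boolean contains(int value) {
        return value >= lower && value <= upper;
    }
}
```

Making lower and upper volatile instead of synchronizing the setters would keep each write visible, but two threads could still pass their checks at the same time and leave the range in an invalid state.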