Thread Study Notes

2010-05-05, Wednesday, rainy

I. Thread Primitives

How a thread terminates

A single thread can exit in three ways, thereby stopping its flow of control without terminating the entire process:
1. The thread can simply return from its start routine; the return value is the thread's exit code.
2. The thread can call pthread_exit on itself: void pthread_exit(void *rval_ptr);
3. The thread can be canceled by another thread in the same process via int pthread_cancel(pthread_t tid); However, a thread can elect to ignore or otherwise control how it is canceled.

How other threads learn that a thread has terminated, and with what status

Use int pthread_join(pthread_t thread, void **rval_ptr);
The calling thread will block until the specified thread calls pthread_exit, returns from its start routine, or is canceled.
If we're not interested in a thread's return value, we can set rval_ptr to NULL. In this case, calling pthread_join allows us to wait for the specified thread, but does not retrieve the thread's termination status. 

Every thread has its own start routine; is there a corresponding cleanup routine for when it terminates?

In other words, another thread can use pthread_join to be notified when the thread it is waiting on terminates, and react accordingly. Can the terminating thread itself get notified, so it can do some cleanup of its own?
A thread can arrange for functions to be called when it exits, similar to the way that the atexit function (Section 7.3) can be used by a process to arrange that functions can be called when the process exits. The functions are known as thread cleanup handlers. More than one cleanup handler can be established for a thread. The handlers are recorded in a stack, which means that they are executed in the reverse order from that with which they were registered. 
void pthread_cleanup_push(void (*rtn)(void *),  void *arg);
void pthread_cleanup_pop(int execute);

Threads vs. Processes

By now, you should begin to see similarities between the thread functions and the process functions. Figure 11.6 summarizes the similar functions.

Process primitive   Thread primitive        Description
fork                pthread_create          create a new flow of control
exit                pthread_exit            exit from an existing flow of control
waitpid             pthread_join            get exit status from flow of control
atexit              pthread_cleanup_push    register function to be called at exit from flow of control
getpid              pthread_self            get ID for flow of control
abort               pthread_cancel          request abnormal termination of flow of control

II. Thread Synchronization

Why synchronization is needed

The key point is that a write is not an atomic operation. In modern computer architectures, a memory access takes multiple bus cycles.
If the modification is atomic, then there isn't a race. In the previous example, if the increment takes only one memory cycle, then no race exists. If our data always appears to be sequentially consistent, then we need no additional synchronization. Our operations are sequentially consistent when multiple threads can't observe inconsistencies in our data. In modern computer systems, memory accesses take multiple bus cycles, and multiprocessors generally interleave bus cycles among multiple processors, so we aren't guaranteed that our data is sequentially consistent. 

Besides the computer architecture, races can arise from the ways in which our programs use variables, creating places where it is possible to view inconsistencies. For example, we might increment a variable and then make a decision based on its value. The combination of the increment step and the decision-making step aren't atomic, so this opens a window where inconsistencies can arise. 

How to synchronize

1. Mutexes (mutual-exclusion locks)
We can protect our data and ensure access by only one thread at a time by using the pthreads mutual-exclusion interfaces. A mutex is basically a lock that we set (lock) before accessing a shared resource and release (unlock) when we're done. While it is set, any other thread that tries to set it will block until we release it. If more than one thread is blocked when we unlock the mutex, then all threads blocked on the lock will be made runnable, and the first one to run will be able to set the lock. The others will see that the mutex is still locked and go back to waiting for it to become available again. In this way, only one thread will proceed at a time. 
A mutex is normally bound to the one shared variable it protects, like a devoted guardian. So in C/C++ you will often see code like the following:
Using a mutex to protect a data structure
struct foo {
    int             f_count;
    pthread_mutex_t f_lock;
    /* ... more stuff here ... */
};
 
struct foo * foo_alloc(void) /* allocate the object */
{
    struct foo *fp;
    if ((fp = malloc(sizeof(struct foo))) != NULL) {
        fp->f_count = 1;
        if (pthread_mutex_init(&fp->f_lock, NULL) != 0) {
            free(fp);
            return(NULL);
        }
        /* ... continue initialization ... */
    }
    return(fp);
}

void foo_hold(struct foo *fp) /* add a reference to the object */
{
    pthread_mutex_lock(&fp->f_lock);
    fp->f_count++;
    pthread_mutex_unlock(&fp->f_lock);
}

void foo_rele(struct foo *fp) /* release a reference to the object */
{
    pthread_mutex_lock(&fp->f_lock);
    if (--fp->f_count == 0) { /* last reference */
        pthread_mutex_unlock(&fp->f_lock);
        pthread_mutex_destroy(&fp->f_lock);
        free(fp);
    } else {
        pthread_mutex_unlock(&fp->f_lock);
    }
}
Deadlock

Causes:
1. A thread locks itself out: a thread will deadlock itself if it tries to lock the same mutex twice.
2. Two threads each wait for a lock the other holds: when we use more than one mutex in our programs, a deadlock can occur if we allow one thread to hold a mutex and block while trying to lock a second mutex at the same time that another thread holding the second mutex tries to lock the first mutex. Neither thread can proceed, because each needs a resource that is held by the other, so we have a deadlock.

Deadlock avoidance
Deadlocks can be avoided by carefully controlling the order in which mutexes are locked. 

Sometimes, an application's architecture makes it difficult to apply a lock ordering. If enough locks and data structures are involved that the functions you have available can't be molded to fit a simple hierarchy, then you'll have to try some other approach. In this case, you might be able to release your locks and try again at a later time. You can use the pthread_mutex_trylock interface to avoid deadlocking in this case. If you are already holding locks and pthread_mutex_trylock is successful, then you can proceed. If it can't acquire the lock, however, you can release the locks you already hold, clean up, and try again later. 

Reader-Writer Locks (three-state locks)
Reader-writer locks are similar to mutexes, except that they allow for higher degrees of parallelism. With a mutex, the state is either locked or unlocked, and only one thread can lock it at a time. Three states are possible with a reader-writer lock: locked in read mode, locked in write mode, and unlocked. Only one thread at a time can hold a reader-writer lock in write mode, but multiple threads can hold a reader-writer lock in read mode at the same time.

When a reader-writer lock is write-locked, all threads attempting to lock it block until it is unlocked. When a reader-writer lock is read-locked, all threads attempting to lock it in read mode are given access, but any threads attempting to lock it in write mode block until all the threads have relinquished their read locks. Although implementations vary, reader-writer locks usually block additional readers if a lock is already held in read mode and a thread is blocked trying to acquire the lock in write mode. This prevents a constant stream of readers from starving waiting writers.

Reader-writer locks are especially suitable for workloads with far more reads than writes. Reader-writer locks are well suited for situations in which data structures are read more often than they are modified. When a reader-writer lock is held in write mode, the data structure it protects can be modified safely, since only one thread at a time can hold the lock in write mode. When the reader-writer lock is held in read mode, the data structure it protects can be read by multiple threads, as long as the threads first acquire the lock in read mode.

Reader-writer locks are also called shared-exclusive locks. When a reader-writer lock is read-locked, it is said to be locked in shared mode. When it is write-locked, it is said to be locked in exclusive mode.


Tip: today, the main techniques for coordinating concurrent programs are still locks and condition variables. In an object-oriented language, every object carries an implicit lock (an intrinsic lock), and locking is done with synchronized; Java provides the synchronized keyword to support the intrinsic lock. The synchronized keyword can be applied to a method, to an object (a synchronized block), or to a class, but the principle is the same as the classic lock/unlock pairing.
The following example shows an inner class that XOM uses while verifying namespace URIs to cache URIs that have already been verified.
Beautiful Code, Example 5-11: a cache of verified namespace URIs
private final static class URICache {
    private final static int LOAD = 6;
    private String[] cache = new String[LOAD];
    private int position = 0;

    synchronized boolean contains(String s) {
        for (int i = 0; i < LOAD; i++) {
            // Here I assume the namespace URI has been interned. That is
            // common, but not always the case. Nothing breaks if it has not
            // been interned. When the URI is not interned, equals() would be
            // faster than ==; when it is interned, equals() would be slower.
            if (s == cache[i]) {
                return true;
            }
        }
        return false;
    }

    synchronized void put(String s) {
        cache[position] = s;
        position++;
        if (position == LOAD)
            position = 0;
    }
}

Condition Variables (signaling between threads)

A mutex is fine to prevent simultaneous access to a shared variable, but we need something else to let us go to sleep waiting for some condition to occur. pthread_join() notifies the waiting thread only when the awaited thread terminates, but we need to be notified in other situations as well. Without condition variables, the only alternative is polling, but polling code is unrelated to the actual work and wastes CPU time.

We want a method for the main loop to go to sleep until one of its threads notifies it that something is ready. A condition variable, in conjunction with a mutex, provides this facility. The mutex provides mutual exclusion and the condition variable provides a signaling mechanism. 

Condition variables are another synchronization mechanism available to threads. Condition variables provide a place for threads to rendezvous. When used with mutexes, condition variables allow threads to wait in a race-free way for arbitrary conditions to occur. 

The condition itself is protected by a mutex. A thread must first lock the mutex to change the condition state. Other threads will not notice the change until they acquire the mutex, because the mutex must be locked to be able to evaluate the condition. 

Thread-Specific Data (thread-private data)

Thread-specific data, also known as thread-private data, is a mechanism for storing and finding data associated with a particular thread. The reason we call the data thread-specific, or thread-private, is that we'd like each thread to access its own separate copy of the data, without worrying about synchronizing access with other threads. 

Example: errno is thread-specific data.

So-called thread-private data is really just a private space created by convention. Recall that all threads in a process have access to the entire address space of the process. Other than using registers, there is no way for one thread to prevent another from accessing its data. This is true even for thread-specific data. Even though the underlying implementation doesn't prevent access, the functions provided to manage thread-specific data promote data separation among threads.

Before allocating thread-specific data, we need to create a key to associate with the data. The key will be used to gain access to the thread-specific data. We use pthread_key_create to create a key. 
#include <pthread.h>
int pthread_key_create(pthread_key_t *keyp,   void (*destructor)(void *));

Threads usually use malloc to allocate memory for their thread-specific data. The destructor function usually frees the memory that was allocated. If the thread exited without freeing the memory, then the memory would be lost: leaked by the process. 

A thread can allocate multiple keys for thread-specific data. Each key can have a destructor associated with it. There can be a different destructor function for each key, or they can all use the same function. Each operating system implementation can place a limit on the number of keys a process can allocate (recall PTHREAD_KEYS_MAX from Figure 12.1). 

We can break the association of a key with the thread-specific data values for all threads by calling pthread_key_delete. 
#include <pthread.h>
int pthread_key_delete(pthread_key_t key);

Once a key is created, we can associate thread-specific data with the key by calling pthread_setspecific. We can obtain the address of the thread-specific data with pthread_getspecific. 

Implementation: each thread associates its data, allocated with malloc on the heap, with a shared key, which keeps the threads' copies independent of one another; in effect it adds one level of indirection (access through the key). The thread ID itself is not used as an index because thread IDs are often too large. Since there is no guarantee that thread IDs are small, sequential integers, we can't simply allocate an array of per-thread data and use the thread ID as the index. Even if we could depend on small, sequential thread IDs, we'd like a little extra protection so that one thread can't mess with another's data.

Example
In Figure 12.11, we showed a hypothetical implementation of getenv. We came up with a new interface to provide the same functionality, but in a thread-safe way (Figure 12.12). But what would happen if we couldn't modify our application programs to use the new interface? In that case, we could use thread-specific data to maintain a per thread copy of the data buffer used to hold the return string. This is shown in Figure 12.13. 

We use pthread_once to ensure that only one key is created for the thread-specific data we will use. If pthread_getspecific returns a null pointer, we need to allocate the memory buffer and associate it with the key. Otherwise, we use the memory buffer returned by pthread_getspecific. For the destructor function, we use free to free the memory previously allocated by malloc. The destructor function will be called with the value of the thread-specific data only if the value is non-null. 

Note that although this version of getenv is thread-safe, it is not async-signal safe. Even if we made the mutex recursive, we could not make it reentrant with respect to signal handlers, because it calls malloc, which itself is not async-signal safe. 
Figure 12.13. A thread-safe, compatible version of getenv

#include <limits.h>
#include <string.h>
#include <pthread.h>
#include <stdlib.h>
static pthread_key_t key;
static pthread_once_t init_done = PTHREAD_ONCE_INIT;
pthread_mutex_t env_mutex = PTHREAD_MUTEX_INITIALIZER;
extern char **environ;
static void
thread_init(void)
{
    pthread_key_create(&key, free);
}
char *
getenv(const char *name)
{
    int     i, len;
    char    *envbuf;
    pthread_once(&init_done, thread_init);
    pthread_mutex_lock(&env_mutex);
    envbuf = (char *)pthread_getspecific(key);
    if (envbuf == NULL) {
        envbuf = malloc(ARG_MAX);
        if (envbuf == NULL) {
            pthread_mutex_unlock(&env_mutex);
            return(NULL);
        }
        pthread_setspecific(key, envbuf);
    }
    len = strlen(name);
    for (i = 0; environ[i] != NULL; i++) {
        if ((strncmp(name, environ[i], len) == 0) &&
          (environ[i][len] == '=')) {
            strcpy(envbuf, &environ[i][len+1]);
            pthread_mutex_unlock(&env_mutex);
            return(envbuf);
        }
    }
    pthread_mutex_unlock(&env_mutex);
    return(NULL);
}
