

POSIX Threads

This chapter describes the pthreads (POSIX threads) library. This library provides support functions for multithreaded programs: thread primitives, synchronization objects, and so forth. It also implements POSIX 1003.1b semaphores (not to be confused with System V semaphores).

The threads operations (`pthread_*') do not use errno. Instead they return an error code directly. The semaphore operations do use errno.
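As a minimal sketch of this convention, the fragment below provokes a failure on purpose and reports it by passing the returned code to strerror instead of consulting errno. The helper name demo_error_code is hypothetical and exists only for illustration.

```c
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: provoke a failure from a pthread_* call and
   return the error code that the call reports. */
int demo_error_code(void)
{
    pthread_attr_t attr;
    int rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    /* -1 is neither PTHREAD_CREATE_JOINABLE nor PTHREAD_CREATE_DETACHED,
       so this call is expected to fail. */
    rc = pthread_attr_setdetachstate(&attr, -1);
    if (rc != 0) {
        /* The code comes from the return value, not from errno. */
        fprintf(stderr, "pthread_attr_setdetachstate: %s\n", strerror(rc));
    }

    pthread_attr_destroy(&attr);
    return rc;
}
```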

Basic Thread Operations

These functions are the thread equivalents of fork, exit, and wait.

Function: int pthread_create (pthread_t * thread, pthread_attr_t * attr, void * (*start_routine)(void *), void * arg)
pthread_create creates a new thread of control that executes concurrently with the calling thread. The new thread calls the function start_routine, passing it arg as first argument. The new thread terminates either explicitly, by calling pthread_exit, or implicitly, by returning from the start_routine function. The latter case is equivalent to calling pthread_exit with the result returned by start_routine as exit code.

The attr argument specifies thread attributes to be applied to the new thread. See section Thread Attributes, for details. The attr argument can also be NULL, in which case default attributes are used: the created thread is joinable (not detached) and has an ordinary (not realtime) scheduling policy.

On success, the identifier of the newly created thread is stored in the location pointed to by the thread argument, and 0 is returned. On error, a non-zero error code is returned.

This function may return the following errors:

EAGAIN
Not enough system resources to create a process for the new thread, or more than PTHREAD_THREADS_MAX threads are already active.

Function: void pthread_exit (void *retval)
pthread_exit terminates the execution of the calling thread. All cleanup handlers (see section Cleanup Handlers) that have been set for the calling thread with pthread_cleanup_push are executed in reverse order (the most recently pushed handler is executed first). Finalization functions for thread-specific data are then called for all keys that have non-NULL values associated with them in the calling thread (see section Thread-Specific Data). Finally, execution of the calling thread is stopped.

The retval argument is the return value of the thread. It can be retrieved from another thread using pthread_join.

The pthread_exit function never returns.
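The following sketch illustrates both ways of terminating a thread: calling pthread_exit from a nested function, and returning from the start routine. The names check_input, worker, and run_worker are hypothetical. The exit values are string literals, which remain valid after the thread has terminated.

```c
#include <pthread.h>
#include <string.h>

/* Called from inside the thread: terminates the whole thread,
   not just this function, when the input is rejected. */
static void check_input(int n)
{
    if (n < 0)
        pthread_exit("negative input");   /* never returns */
}

/* Start routine: returning is equivalent to pthread_exit("ok"). */
static void *worker(void *arg)
{
    check_input(*(int *) arg);
    return "ok";
}

/* Hypothetical helper: run worker on n and return its exit value,
   or NULL if the thread could not be created or joined. */
const char *run_worker(int n)
{
    pthread_t th;
    void *retval = NULL;

    if (pthread_create(&th, NULL, worker, &n) != 0)
        return NULL;
    if (pthread_join(th, &retval) != 0)
        return NULL;
    return retval;
}
```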

Function: int pthread_cancel (pthread_t thread)

pthread_cancel sends a cancellation request to the thread denoted by the thread argument. If there is no such thread, pthread_cancel fails and returns ESRCH. Otherwise it returns 0. See section Cancellation, for details.

Function: int pthread_join (pthread_t th, void **thread_return)
pthread_join suspends the execution of the calling thread until the thread identified by th terminates, either by calling pthread_exit or by being canceled.

If thread_return is not NULL, the return value of th is stored in the location pointed to by thread_return. The return value of th is either the argument it gave to pthread_exit, or PTHREAD_CANCELED if th was canceled.

The joined thread th must be in the joinable state: it must not have been detached using pthread_detach or the PTHREAD_CREATE_DETACHED attribute to pthread_create.

When a joinable thread terminates, its memory resources (thread descriptor and stack) are not deallocated until another thread performs pthread_join on it. Therefore, pthread_join must be called once for each joinable thread created to avoid memory leaks.

At most one thread can wait for the termination of a given thread. Calling pthread_join on a thread th on which another thread is already waiting for termination returns an error.

pthread_join is a cancellation point. If a thread is canceled while suspended in pthread_join, the thread execution resumes immediately and the cancellation is executed without waiting for the th thread to terminate. If cancellation occurs during pthread_join, the th thread remains not joined.

On success, the return value of th is stored in the location pointed to by thread_return, and 0 is returned. On error, one of the following values is returned:

ESRCH
No thread could be found corresponding to that specified by th.
EINVAL
The th thread has been detached, or another thread is already waiting on termination of th.
EDEADLK
The th argument refers to the calling thread.
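A short sketch of the create/join cycle described above, assuming default attributes. The names double_it and run_in_thread are hypothetical; the integer is passed through the void * argument and result by way of intptr_t.

```c
#include <pthread.h>
#include <stdint.h>

/* Start routine: doubles the integer carried in arg and returns it
   as the thread's exit value. */
static void *double_it(void *arg)
{
    intptr_t n = (intptr_t) arg;
    return (void *) (n * 2);
}

/* Hypothetical helper: run double_it in a new joinable thread and
   return its result, or -1 if creation or joining fails. */
int run_in_thread(int input)
{
    pthread_t th;
    void *retval = NULL;

    if (pthread_create(&th, NULL, double_it, (void *) (intptr_t) input) != 0)
        return -1;

    /* Joining both waits for the thread and releases its resources. */
    if (pthread_join(th, &retval) != 0)
        return -1;

    return (int) (intptr_t) retval;
}
```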

Thread Attributes

Threads have a number of attributes that may be set at creation time. This is done by filling a thread attribute object attr of type pthread_attr_t, then passing it as second argument to pthread_create. Passing NULL is equivalent to passing a thread attribute object with all attributes set to their default values.

Attribute objects are consulted only when creating a new thread. The same attribute object can be used for creating several threads. Modifying an attribute object after a call to pthread_create does not change the attributes of the thread previously created.

Function: int pthread_attr_init (pthread_attr_t *attr)
pthread_attr_init initializes the thread attribute object attr and fills it with default values for the attributes. (The default values are listed below for each attribute.)

Each attribute attrname (see below for a list of all attributes) can be individually set using the function pthread_attr_setattrname and retrieved using the function pthread_attr_getattrname.

Function: int pthread_attr_destroy (pthread_attr_t *attr)
pthread_attr_destroy destroys the attribute object pointed to by attr releasing any resources associated with it. attr is left in an undefined state, and you must not use it again in a call to any pthreads function until it has been reinitialized.

Function: int pthread_attr_setattr (pthread_attr_t *obj, int value)
Set attribute attr to value in the attribute object pointed to by obj. See below for a list of possible attributes and the values they can take.

On success, these functions return 0. If value is not meaningful for the attr being modified, they will return the error code EINVAL. Some of the functions have other failure modes; see below.

Function: int pthread_attr_getattr (const pthread_attr_t *obj, int *value)
Store the current setting of attr in obj into the variable pointed to by value.

These functions always return 0.

The following thread attributes are supported:

`detachstate'
Choose whether the thread is created in the joinable state (value PTHREAD_CREATE_JOINABLE) or in the detached state (PTHREAD_CREATE_DETACHED). The default is PTHREAD_CREATE_JOINABLE. In the joinable state, another thread can synchronize on the thread termination and recover its termination code using pthread_join, but some of the thread resources are kept allocated after the thread terminates, and reclaimed only when another thread performs pthread_join on that thread. In the detached state, the thread resources are immediately freed when it terminates, but pthread_join cannot be used to synchronize on the thread termination. A thread created in the joinable state can later be put in the detached state using pthread_detach.
`schedpolicy'
Select the scheduling policy for the thread: one of SCHED_OTHER (regular, non-realtime scheduling), SCHED_RR (realtime, round-robin) or SCHED_FIFO (realtime, first-in first-out). The default is SCHED_OTHER. The realtime scheduling policies SCHED_RR and SCHED_FIFO are available only to processes with superuser privileges. pthread_attr_setschedpolicy will fail and return ENOTSUP if you try to set a realtime policy when you are unprivileged. The scheduling policy of a thread can be changed after creation with pthread_setschedparam.
`schedparam'
Change the scheduling parameter (the scheduling priority) for the thread. The default is 0. This attribute is not significant if the scheduling policy is SCHED_OTHER; it only matters for the realtime policies SCHED_RR and SCHED_FIFO. The scheduling priority of a thread can be changed after creation with pthread_setschedparam.
`inheritsched'
Choose whether the scheduling policy and scheduling parameter for the newly created thread are determined by the values of the schedpolicy and schedparam attributes (value PTHREAD_EXPLICIT_SCHED) or are inherited from the parent thread (value PTHREAD_INHERIT_SCHED). The default is PTHREAD_EXPLICIT_SCHED.
`scope'
Choose the scheduling contention scope for the created thread. The default is PTHREAD_SCOPE_SYSTEM, meaning that the threads contend for CPU time with all processes running on the machine. In particular, thread priorities are interpreted relative to the priorities of all other processes on the machine. The other possibility, PTHREAD_SCOPE_PROCESS, means that scheduling contention occurs only between the threads of the running process: thread priorities are interpreted relative to the priorities of the other threads of the process, regardless of the priorities of other processes. PTHREAD_SCOPE_PROCESS is not supported in LinuxThreads. If you try to set the scope to this value, pthread_attr_setscope will fail and return ENOTSUP.
`stackaddr'
Provide an address for an application managed stack. The size of the stack must be at least PTHREAD_STACK_MIN.
`stacksize'
Change the size of the stack created for the thread. The value defines the minimum stack size, in bytes. If the value exceeds the system's maximum stack size, or is smaller than PTHREAD_STACK_MIN, pthread_attr_setstacksize will fail and return EINVAL.
`stack'
Provide both the address and size of an application managed stack to use for the new thread. The base of the memory area is stackaddr with the size of the memory area, stacksize, measured in bytes. If the value of stacksize is less than PTHREAD_STACK_MIN, or greater than the system's maximum stack size, or if the value of stackaddr lacks the proper alignment, pthread_attr_setstack will fail and return EINVAL.
`guardsize'
Change the minimum size in bytes of the guard area for the thread's stack. The default size is a single page. If this value is set, it will be rounded up to the nearest page size. If the value is set to 0, a guard area will not be created for this thread. The space allocated for the guard area is used to catch stack overflow. Therefore, when allocating large structures on the stack, a larger guard area may be required to catch a stack overflow. If the caller is managing their own stacks (if the stackaddr attribute has been set), then the guardsize attribute is ignored. If the value exceeds the stacksize, pthread_attr_setguardsize will fail and return EINVAL.
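The usual life cycle of an attribute object can be sketched as follows: initialize it, set and read back a few attributes, create a thread with it, then destroy it. The helper name create_with_attributes and the 1 MiB stack size are illustrative assumptions, not recommended values.

```c
#include <pthread.h>
#include <stddef.h>

#define DEMO_STACK_SIZE (1024 * 1024)   /* assumed size, for illustration */

static void *noop(void *arg)
{
    return arg;
}

/* Hypothetical helper: returns 0 on success, a pthread error code if a
   call fails, or -1 if an attribute does not read back as it was set. */
int create_with_attributes(void)
{
    pthread_attr_t attr;
    pthread_t th;
    size_t stacksize = 0;
    int state = -1;
    int rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setstacksize(&attr, DEMO_STACK_SIZE);
    if (rc == 0)
        rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
    if (rc == 0)
        rc = pthread_attr_getstacksize(&attr, &stacksize);
    if (rc == 0)
        rc = pthread_attr_getdetachstate(&attr, &state);
    if (rc == 0 && (stacksize != DEMO_STACK_SIZE
                    || state != PTHREAD_CREATE_JOINABLE))
        rc = -1;
    if (rc == 0)
        rc = pthread_create(&th, &attr, noop, NULL);

    /* The attribute object is only consulted during pthread_create,
       so it can be destroyed while the thread is still running. */
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return rc;

    return pthread_join(th, NULL);
}
```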

Cancellation

Cancellation is the mechanism by which a thread can terminate the execution of another thread. More precisely, a thread can send a cancellation request to another thread. Depending on its settings, the target thread can then either ignore the request, honor it immediately, or defer it till it reaches a cancellation point. When threads are first created by pthread_create, they always defer cancellation requests.

When a thread eventually honors a cancellation request, it behaves as if pthread_exit(PTHREAD_CANCELED) was called. All cleanup handlers are executed in reverse order, finalization functions for thread-specific data are called, and finally the thread stops executing. If the canceled thread was joinable, the return value PTHREAD_CANCELED is provided to whichever thread calls pthread_join on it. See pthread_exit for more information.

Cancellation points are the points where the thread checks for pending cancellation requests and performs them. The POSIX threads functions pthread_join, pthread_cond_wait, pthread_cond_timedwait, pthread_testcancel, sem_wait, and sigwait are cancellation points. In addition, these system calls are cancellation points:

accept      open        sendmsg
close       pause       sendto
connect     read        system
fcntl       recv        tcdrain
fsync       recvfrom    wait
lseek       recvmsg     waitpid
msync       send        write
nanosleep

All library functions that call these functions (such as printf) are also cancellation points.

Function: int pthread_setcancelstate (int state, int *oldstate)
pthread_setcancelstate changes the cancellation state for the calling thread -- that is, whether cancellation requests are ignored or not. The state argument is the new cancellation state: either PTHREAD_CANCEL_ENABLE to enable cancellation, or PTHREAD_CANCEL_DISABLE to disable cancellation (cancellation requests are ignored).

If oldstate is not NULL, the previous cancellation state is stored in the location pointed to by oldstate, and can thus be restored later by another call to pthread_setcancelstate.

If the state argument is not PTHREAD_CANCEL_ENABLE or PTHREAD_CANCEL_DISABLE, pthread_setcancelstate fails and returns EINVAL. Otherwise it returns 0.

Function: int pthread_setcanceltype (int type, int *oldtype)
pthread_setcanceltype changes the type of responses to cancellation requests for the calling thread: asynchronous (immediate) or deferred. The type argument is the new cancellation type: either PTHREAD_CANCEL_ASYNCHRONOUS to cancel the calling thread as soon as the cancellation request is received, or PTHREAD_CANCEL_DEFERRED to keep the cancellation request pending until the next cancellation point. If oldtype is not NULL, the previous cancellation type is stored in the location pointed to by oldtype, and can thus be restored later by another call to pthread_setcanceltype.

If the type argument is not PTHREAD_CANCEL_DEFERRED or PTHREAD_CANCEL_ASYNCHRONOUS, pthread_setcanceltype fails and returns EINVAL. Otherwise it returns 0.

Function: void pthread_testcancel (void)
pthread_testcancel does nothing except testing for pending cancellation and executing it. Its purpose is to introduce explicit checks for cancellation in long sequences of code that do not call cancellation point functions otherwise.
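The sketch below shows deferred cancellation in a compute loop that contains no other cancellation points. The names spin and cancel_worker are hypothetical. The loop briefly disables cancellation around a step that must not be interrupted, then polls for pending requests with pthread_testcancel.

```c
#include <pthread.h>

/* Worker: loops until a cancellation request is honored. */
static void *spin(void *arg)
{
    int oldstate;

    (void) arg;
    for (;;) {
        /* Requests arriving here are held back, not lost. */
        pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate);
        /* ... step that must run to completion ... */
        pthread_setcancelstate(oldstate, NULL);

        /* Explicit cancellation point: act on any pending request. */
        pthread_testcancel();
    }
    return NULL;   /* not reached */
}

/* Hypothetical helper: returns 1 if the worker reported
   PTHREAD_CANCELED, 0 if it exited some other way, -1 on error. */
int cancel_worker(void)
{
    pthread_t th;
    void *retval = NULL;

    if (pthread_create(&th, NULL, spin, NULL) != 0)
        return -1;
    if (pthread_cancel(th) != 0)
        return -1;
    if (pthread_join(th, &retval) != 0)
        return -1;
    return retval == PTHREAD_CANCELED;
}
```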

Cleanup Handlers

Cleanup handlers are functions that get called when a thread terminates, either by calling pthread_exit or because of cancellation. Cleanup handlers are installed and removed following a stack-like discipline.

The purpose of cleanup handlers is to free the resources that a thread may hold at the time it terminates. In particular, if a thread exits or is canceled while it owns a locked mutex, the mutex will remain locked forever and prevent other threads from executing normally. The best way to avoid this is, just before locking the mutex, to install a cleanup handler whose effect is to unlock the mutex. Cleanup handlers can be used similarly to free blocks allocated with malloc or close file descriptors on thread termination.

Here is how to lock a mutex mut in such a way that it will be unlocked if the thread is canceled while mut is locked:

pthread_cleanup_push(pthread_mutex_unlock, (void *) &mut);
pthread_mutex_lock(&mut);
/* do some work */
pthread_mutex_unlock(&mut);
pthread_cleanup_pop(0);

Equivalently, the last two lines can be replaced by

pthread_cleanup_pop(1);

Notice that the code above is safe only in deferred cancellation mode (see pthread_setcanceltype). In asynchronous cancellation mode, a cancellation can occur between pthread_cleanup_push and pthread_mutex_lock, or between pthread_mutex_unlock and pthread_cleanup_pop, resulting in both cases in the thread trying to unlock a mutex not locked by the current thread. This is the main reason why asynchronous cancellation is difficult to use.

If the code above must also work in asynchronous cancellation mode, then it must switch to deferred mode for locking and unlocking the mutex:

pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, &oldtype);
pthread_cleanup_push(pthread_mutex_unlock, (void *) &mut);
pthread_mutex_lock(&mut);
/* do some work */
pthread_cleanup_pop(1);
pthread_setcanceltype(oldtype, NULL);

The code above can be rewritten in a more compact and efficient way, using the non-portable functions pthread_cleanup_push_defer_np and pthread_cleanup_pop_restore_np:

pthread_cleanup_push_defer_np(pthread_mutex_unlock, (void *) &mut);
pthread_mutex_lock(&mut);
/* do some work */
pthread_cleanup_pop_restore_np(1);

Function: void pthread_cleanup_push (void (*routine) (void *), void *arg)

pthread_cleanup_push installs the routine function with argument arg as a cleanup handler. From this point on to the matching pthread_cleanup_pop, the function routine will be called with arguments arg when the thread terminates, either through pthread_exit or by cancellation. If several cleanup handlers are active at that point, they are called in LIFO order: the most recently installed handler is called first.

Function: void pthread_cleanup_pop (int execute)
pthread_cleanup_pop removes the most recently installed cleanup handler. If the execute argument is not 0, it also executes the handler, by calling the routine function with arguments arg. If the execute argument is 0, the handler is only removed but not executed.

Matching pairs of pthread_cleanup_push and pthread_cleanup_pop must occur in the same function, at the same level of block nesting. Actually, pthread_cleanup_push and pthread_cleanup_pop are macros, and the expansion of pthread_cleanup_push introduces an open brace { with the matching closing brace } being introduced by the expansion of the matching pthread_cleanup_pop.

Function: void pthread_cleanup_push_defer_np (void (*routine) (void *), void *arg)
pthread_cleanup_push_defer_np is a non-portable extension that combines pthread_cleanup_push and pthread_setcanceltype. It pushes a cleanup handler just as pthread_cleanup_push does, but also saves the current cancellation type and sets it to deferred cancellation. This ensures that the cleanup mechanism is effective even if the thread was initially in asynchronous cancellation mode.

Function: void pthread_cleanup_pop_restore_np (int execute)
pthread_cleanup_pop_restore_np pops a cleanup handler introduced by pthread_cleanup_push_defer_np, and restores the cancellation type to its value at the time pthread_cleanup_push_defer_np was called.

pthread_cleanup_push_defer_np and pthread_cleanup_pop_restore_np must occur in matching pairs, at the same level of block nesting.

The sequence

pthread_cleanup_push_defer_np(routine, arg);
...
pthread_cleanup_pop_restore_np(execute);

is functionally equivalent to (but more compact and efficient than)

{
  int oldtype;
  pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, &oldtype);
  pthread_cleanup_push(routine, arg);
  ...
  pthread_cleanup_pop(execute);
  pthread_setcanceltype(oldtype, NULL);
}
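A complete, portable variant of the mutex idiom from this section can be sketched as below, assuming deferred cancellation (the default). Since pthread_mutex_unlock does not have exactly the void (*)(void *) type expected of a cleanup routine, this sketch goes through a small wrapper, here called unlock_mutex; that name, worker, and cancel_releases_mutex are all hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

static pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER;

/* Wrapper with the signature that pthread_cleanup_push expects. */
static void unlock_mutex(void *arg)
{
    pthread_mutex_unlock((pthread_mutex_t *) arg);
}

/* Worker: holds mut and sleeps in a loop until it is canceled. */
static void *worker(void *arg)
{
    struct timespec pause = { 0, 10 * 1000 * 1000 };   /* 10 ms */

    (void) arg;
    pthread_cleanup_push(unlock_mutex, &mut);
    pthread_mutex_lock(&mut);            /* not a cancellation point */
    for (;;)
        nanosleep(&pause, NULL);         /* cancellation point */
    pthread_cleanup_pop(1);              /* not reached; closes the block */
    return NULL;
}

/* Hypothetical helper: returns 1 if mut is free again after the worker
   was canceled, 0 if it was left locked, -1 on error. */
int cancel_releases_mutex(void)
{
    pthread_t th;
    void *retval = NULL;

    if (pthread_create(&th, NULL, worker, NULL) != 0)
        return -1;
    if (pthread_cancel(th) != 0)
        return -1;
    if (pthread_join(th, &retval) != 0 || retval != PTHREAD_CANCELED)
        return -1;

    if (pthread_mutex_trylock(&mut) != 0)
        return 0;                        /* handler did not run */
    pthread_mutex_unlock(&mut);
    return 1;
}
```

Because the worker only reaches a cancellation point after it has locked the mutex, the handler always finds the mutex owned by the canceled thread.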

Mutexes

A mutex is a MUTual EXclusion device, and is useful for protecting shared data structures from concurrent modifications, and implementing critical sections and monitors.

A mutex has two possible states: unlocked (not owned by any thread), and locked (owned by one thread). A mutex can never be owned by two different threads simultaneously. A thread attempting to lock a mutex that is already locked by another thread is suspended until the owning thread unlocks the mutex first.

None of the mutex functions is a cancellation point, not even pthread_mutex_lock, in spite of the fact that it can suspend a thread for arbitrary durations. This way, the status of mutexes at cancellation points is predictable, allowing cancellation handlers to unlock precisely those mutexes that need to be unlocked before the thread stops executing. Consequently, threads using deferred cancellation should never hold a mutex for extended periods of time.

It is not safe to call mutex functions from a signal handler. In particular, calling pthread_mutex_lock or pthread_mutex_unlock from a signal handler may deadlock the calling thread.

Function: int pthread_mutex_init (pthread_mutex_t *mutex, const pthread_mutexattr_t *mutexattr)

pthread_mutex_init initializes the mutex object pointed to by mutex according to the mutex attributes specified in mutexattr. If mutexattr is NULL, default attributes are used instead.

The LinuxThreads implementation supports only one mutex attribute, the mutex type, which is either "fast", "recursive", "timed", or "error checking". The type of a mutex determines whether it can be locked again by a thread that already owns it. The default type is "timed".

Variables of type pthread_mutex_t can also be initialized statically, using the constants PTHREAD_MUTEX_INITIALIZER (for timed mutexes), PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP (for recursive mutexes), PTHREAD_ADAPTIVE_MUTEX_INITIALIZER_NP (for fast mutexes), and PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP (for error checking mutexes).

pthread_mutex_init always returns 0.

Function: int pthread_mutex_lock (pthread_mutex_t *mutex)
pthread_mutex_lock locks the given mutex. If the mutex is currently unlocked, it becomes locked and owned by the calling thread, and pthread_mutex_lock returns immediately. If the mutex is already locked by another thread, pthread_mutex_lock suspends the calling thread until the mutex is unlocked.

If the mutex is already locked by the calling thread, the behavior of pthread_mutex_lock depends on the type of the mutex. If the mutex is of the "fast" type, the calling thread is suspended. It will remain suspended forever, because no other thread can unlock the mutex. If the mutex is of the "error checking" type, pthread_mutex_lock returns immediately with the error code EDEADLK. If the mutex is of the "recursive" type, pthread_mutex_lock succeeds and returns immediately, recording the number of times the calling thread has locked the mutex. An equal number of pthread_mutex_unlock operations must be performed before the mutex returns to the unlocked state.

Function: int pthread_mutex_trylock (pthread_mutex_t *mutex)
pthread_mutex_trylock behaves identically to pthread_mutex_lock, except that it does not block the calling thread if the mutex is already locked by another thread (or by the calling thread in the case of a "fast" mutex). Instead, pthread_mutex_trylock returns immediately with the error code EBUSY.

Function: int pthread_mutex_timedlock (pthread_mutex_t *mutex, const struct timespec *abstime)
pthread_mutex_timedlock is similar to pthread_mutex_lock, but instead of blocking for an indefinite time if the mutex is locked by another thread, it returns when the time specified in abstime is reached.

This function can only be used on standard ("timed") and "error checking" mutexes. It behaves just like pthread_mutex_lock for all other types.

If the mutex is successfully locked, the function returns zero. If the time specified in abstime is reached without the mutex being locked, ETIMEDOUT is returned.

This function was introduced in the POSIX.1d revision of the POSIX standard.
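To illustrate the two non-blocking variants, the sketch below has one thread hold a default mutex while a second thread probes it with pthread_mutex_trylock and then with pthread_mutex_timedlock. The names probe and probe_held_mutex and the 50 ms timeout are assumptions chosen for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static int try_rc, timed_rc;   /* read by the creator only after joining */

/* Runs while another thread holds m. */
static void *probe(void *arg)
{
    struct timespec deadline;

    (void) arg;
    try_rc = pthread_mutex_trylock(&m);
    if (try_rc == 0)
        pthread_mutex_unlock(&m);

    /* abstime is absolute: current time plus 50 ms, normalized. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_nsec += 50L * 1000 * 1000;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }
    timed_rc = pthread_mutex_timedlock(&m, &deadline);
    if (timed_rc == 0)
        pthread_mutex_unlock(&m);
    return NULL;
}

/* Hypothetical helper: returns 1 if the probes failed with EBUSY and
   ETIMEDOUT respectively, 0 otherwise, -1 if no thread was created. */
int probe_held_mutex(void)
{
    pthread_t th;

    pthread_mutex_lock(&m);
    if (pthread_create(&th, NULL, probe, NULL) != 0) {
        pthread_mutex_unlock(&m);
        return -1;
    }
    pthread_join(th, NULL);
    pthread_mutex_unlock(&m);
    return try_rc == EBUSY && timed_rc == ETIMEDOUT;
}
```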

Function: int pthread_mutex_unlock (pthread_mutex_t *mutex)
pthread_mutex_unlock unlocks the given mutex. The mutex is assumed to be locked and owned by the calling thread on entrance to pthread_mutex_unlock. If the mutex is of the "fast" type, pthread_mutex_unlock always returns it to the unlocked state. If it is of the "recursive" type, it decrements the locking count of the mutex (number of pthread_mutex_lock operations performed on it by the calling thread), and only when this count reaches zero is the mutex actually unlocked.

On "error checking" mutexes, pthread_mutex_unlock actually checks at run-time that the mutex is locked on entrance, and that it was locked by the same thread that is now calling pthread_mutex_unlock. If these conditions are not met, pthread_mutex_unlock returns EPERM, and the mutex remains unchanged. "Fast" and "recursive" mutexes perform no such checks, thus allowing a locked mutex to be unlocked by a thread other than its owner. This is non-portable behavior and must not be relied upon.

Function: int pthread_mutex_destroy (pthread_mutex_t *mutex)
pthread_mutex_destroy destroys a mutex object, freeing the resources it might hold. The mutex must be unlocked on entrance. In the LinuxThreads implementation, no resources are associated with mutex objects, thus pthread_mutex_destroy actually does nothing except checking that the mutex is unlocked.

If the mutex is locked by some thread, pthread_mutex_destroy returns EBUSY. Otherwise it returns 0.

If any of the above functions (except pthread_mutex_init) is applied to an uninitialized mutex, it simply returns EINVAL and does nothing.

A shared global variable x can be protected by a mutex as follows:

int x;
pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER;

All accesses and modifications to x should be bracketed by calls to pthread_mutex_lock and pthread_mutex_unlock as follows:

pthread_mutex_lock(&mut);
/* operate on x */
pthread_mutex_unlock(&mut);
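Put together as a complete program fragment, the pattern might look like the sketch below, in which several threads increment x under the protection of mut. The thread and iteration counts, and the names increment and count_with_mutex, are arbitrary choices for the example.

```c
#include <pthread.h>

#define NTHREADS 4
#define NITER    100000

static long x;
static pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER;

/* Each thread adds NITER to x, one locked increment at a time. */
static void *increment(void *arg)
{
    int i;

    (void) arg;
    for (i = 0; i < NITER; i++) {
        pthread_mutex_lock(&mut);
        x++;                         /* operate on x */
        pthread_mutex_unlock(&mut);
    }
    return NULL;
}

/* Hypothetical helper: runs the threads and returns the final x. */
long count_with_mutex(void)
{
    pthread_t th[NTHREADS];
    int started = 0;
    int i;

    x = 0;
    for (i = 0; i < NTHREADS; i++) {
        if (pthread_create(&th[i], NULL, increment, NULL) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(th[i], NULL);

    return x;
}
```

Without the lock and unlock calls around x++, concurrent increments could be lost and the final value could be smaller than expected.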

Mutex attributes can be specified at mutex creation time, by passing a mutex attribute object as second argument to pthread_mutex_init. Passing NULL is equivalent to passing a mutex attribute object with all attributes set to their default values.

Function: int pthread_mutexattr_init (pthread_mutexattr_t *attr)
pthread_mutexattr_init initializes the mutex attribute object attr and fills it with default values for the attributes.

This function always returns 0.

Function: int pthread_mutexattr_destroy (pthread_mutexattr_t *attr)
pthread_mutexattr_destroy destroys a mutex attribute object, which must not be reused until it is reinitialized. pthread_mutexattr_destroy does nothing in the LinuxThreads implementation.

This function always returns 0.

LinuxThreads supports only one mutex attribute: the mutex type, which is either PTHREAD_MUTEX_ADAPTIVE_NP for "fast" mutexes, PTHREAD_MUTEX_RECURSIVE_NP for "recursive" mutexes, PTHREAD_MUTEX_TIMED_NP for "timed" mutexes, or PTHREAD_MUTEX_ERRORCHECK_NP for "error checking" mutexes. As the NP suffix indicates, this is a non-portable extension to the POSIX standard and should not be employed in portable programs.

The mutex type determines what happens if a thread attempts to lock a mutex it already owns with pthread_mutex_lock. If the mutex is of the "fast" type, pthread_mutex_lock simply suspends the calling thread forever. If the mutex is of the "error checking" type, pthread_mutex_lock returns immediately with the error code EDEADLK. If the mutex is of the "recursive" type, the call to pthread_mutex_lock returns immediately with a success return code. The number of times the thread owning the mutex has locked it is recorded in the mutex. The owning thread must call pthread_mutex_unlock the same number of times before the mutex returns to the unlocked state.

The default mutex type is "timed", that is, PTHREAD_MUTEX_TIMED_NP.

Function: int pthread_mutexattr_settype (pthread_mutexattr_t *attr, int type)
pthread_mutexattr_settype sets the mutex type attribute in attr to the value specified by type.

If type is not PTHREAD_MUTEX_ADAPTIVE_NP, PTHREAD_MUTEX_RECURSIVE_NP, PTHREAD_MUTEX_TIMED_NP, or PTHREAD_MUTEX_ERRORCHECK_NP, this function will return EINVAL and leave attr unchanged.

The standard Unix98 identifiers PTHREAD_MUTEX_DEFAULT, PTHREAD_MUTEX_NORMAL, PTHREAD_MUTEX_RECURSIVE, and PTHREAD_MUTEX_ERRORCHECK are also permitted.

Function: int pthread_mutexattr_gettype (const pthread_mutexattr_t *attr, int *type)
pthread_mutexattr_gettype retrieves the current value of the mutex type attribute in attr and stores it in the location pointed to by type.

This function always returns 0.
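The effect of the mutex type on relocking can be sketched with the standard identifiers mentioned above. The helper relock is hypothetical: it creates a mutex of the requested type, locks it twice from the same thread, and returns the result of the second lock. It should not be called with a "fast" or default type, since the second lock could then block forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: returns the result of the second
   pthread_mutex_lock by the owning thread, or -1 on setup failure. */
int relock(int type)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    int rc;

    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    rc = pthread_mutexattr_settype(&attr, type);
    if (rc == 0)
        rc = pthread_mutex_init(&m, &attr);
    pthread_mutexattr_destroy(&attr);
    if (rc != 0)
        return -1;

    pthread_mutex_lock(&m);
    rc = pthread_mutex_lock(&m);     /* second lock by the owner */
    if (rc == 0)
        pthread_mutex_unlock(&m);    /* recursive: undo the second lock */
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return rc;
}
```

With PTHREAD_MUTEX_ERRORCHECK the second lock is expected to fail with EDEADLK; with PTHREAD_MUTEX_RECURSIVE it is expected to succeed, which is why the sketch then unlocks twice.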

Condition Variables

A condition (short for "condition variable") is a synchronization device that allows threads to suspend execution until some predicate on shared data is satisfied. The basic operations on conditions are: signal the condition (when the predicate becomes true), and wait for the condition, suspending the thread execution until another thread signals the condition.

A condition variable must always be associated with a mutex, to avoid the race condition where a thread prepares to wait on a condition variable and another thread signals the condition just before the first thread actually waits on it.

Function: int pthread_cond_init (pthread_cond_t *cond, pthread_condattr_t *cond_attr)

pthread_cond_init initializes the condition variable cond, using the condition attributes specified in cond_attr, or default attributes if cond_attr is NULL. The LinuxThreads implementation supports no attributes for conditions, hence the cond_attr parameter is actually ignored.

Variables of type pthread_cond_t can also be initialized statically, using the constant PTHREAD_COND_INITIALIZER.

This function always returns 0.

Function: int pthread_cond_signal (pthread_cond_t *cond)
pthread_cond_signal restarts one of the threads that are waiting on the condition variable cond. If no threads are waiting on cond, nothing happens. If several threads are waiting on cond, exactly one is restarted, but it is not specified which.

This function always returns 0.

Function: int pthread_cond_broadcast (pthread_cond_t *cond)
pthread_cond_broadcast restarts all the threads that are waiting on the condition variable cond. Nothing happens if no threads are waiting on cond.

This function always returns 0.

Function: int pthread_cond_wait (pthread_cond_t *cond, pthread_mutex_t *mutex)
pthread_cond_wait atomically unlocks the mutex (as per pthread_mutex_unlock) and waits for the condition variable cond to be signaled. The thread execution is suspended and does not consume any CPU time until the condition variable is signaled. The mutex must be locked by the calling thread on entrance to pthread_cond_wait. Before returning to the calling thread, pthread_cond_wait re-acquires mutex (as per pthread_mutex_lock).

Unlocking the mutex and suspending on the condition variable is done atomically. Thus, if all threads always acquire the mutex before signaling the condition, this guarantees that the condition cannot be signaled (and thus ignored) between the time a thread locks the mutex and the time it waits on the condition variable.

This function always returns 0.

Function: int pthread_cond_timedwait (pthread_cond_t *cond, pthread_mutex_t *mutex, const struct timespec *abstime)
pthread_cond_timedwait atomically unlocks mutex and waits on cond, as pthread_cond_wait does, but it also bounds the duration of the wait. If cond has not been signaled before time abstime, the mutex mutex is re-acquired and pthread_cond_timedwait returns the error code ETIMEDOUT. The wait can also be interrupted by a signal; in that case pthread_cond_timedwait returns EINTR.

The abstime parameter specifies an absolute time, with the same origin as time and gettimeofday: an abstime of 0 corresponds to 00:00:00 GMT, January 1, 1970.

Function: int pthread_cond_destroy (pthread_cond_t *cond)
pthread_cond_destroy destroys the condition variable cond, freeing the resources it might hold. If any threads are waiting on the condition variable, pthread_cond_destroy leaves cond untouched and returns EBUSY. Otherwise it returns 0, and cond must not be used again until it is reinitialized.

In the LinuxThreads implementation, no resources are associated with condition variables, so pthread_cond_destroy actually does nothing.

pthread_cond_wait and pthread_cond_timedwait are cancellation points. If a thread is canceled while suspended in one of these functions, the thread immediately resumes execution, relocks the mutex specified by mutex, and finally executes the cancellation. Consequently, cleanup handlers are assured that mutex is locked when they are called.

It is not safe to call the condition variable functions from a signal handler. In particular, calling pthread_cond_signal or pthread_cond_broadcast from a signal handler may deadlock the calling thread.

Consider two shared variables x and y, protected by the mutex mut, and a condition variable cond that is to be signaled whenever x becomes greater than y.

int x,y;
pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

Waiting until x is greater than y is performed as follows:

pthread_mutex_lock(&mut);
while (x <= y) {
        pthread_cond_wait(&cond, &mut);
}
/* operate on x and y */
pthread_mutex_unlock(&mut);

Modifications on x and y that may cause x to become greater than y should signal the condition if needed:

pthread_mutex_lock(&mut);
/* modify x and y */
if (x > y) pthread_cond_broadcast(&cond);
pthread_mutex_unlock(&mut);

If it can be proved that at most one waiting thread needs to be woken up (for instance, if there are only two threads communicating through x and y), pthread_cond_signal can be used as a slightly more efficient alternative to pthread_cond_broadcast. When in doubt, use pthread_cond_broadcast.
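
The fragments above can be assembled into a small two-thread program. This is only a sketch: the variables x, y, mut and cond follow the text, while waiter and run_waiter_demo are hypothetical names. With exactly one waiting thread, pthread_cond_signal is sufficient here.

```c
#include <pthread.h>

static int x, y;
static pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

/* Wait until x is greater than y, then record the difference. */
static void *waiter(void *arg)
{
    int *difference = arg;

    pthread_mutex_lock(&mut);
    while (x <= y)
        pthread_cond_wait(&cond, &mut);
    *difference = x - y;
    pthread_mutex_unlock(&mut);
    return NULL;
}

/* Returns the difference observed by the waiter, or -1 on error. */
int run_waiter_demo(void)
{
    pthread_t th;
    int difference = 0;

    if (pthread_create(&th, NULL, waiter, &difference) != 0)
        return -1;

    pthread_mutex_lock(&mut);
    x = y + 3;                      /* makes x greater than y */
    pthread_cond_signal(&cond);     /* only one thread can be waiting */
    pthread_mutex_unlock(&mut);

    pthread_join(th, NULL);
    return difference;
}
```

The sketch works whether the waiter starts before or after the signal is sent, because the waiter tests the predicate under the mutex before it ever blocks.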

To wait for x to become greater than y with a timeout of 5 seconds, do:

struct timeval now;
struct timespec timeout;
int retcode;

pthread_mutex_lock(&mut);
gettimeofday(&now, NULL);
timeout.tv_sec = now.tv_sec + 5;
timeout.tv_nsec = now.tv_usec * 1000;
retcode = 0;
while (x <= y && retcode != ETIMEDOUT) {
        retcode = pthread_cond_timedwait(&cond, &mut, &timeout);
}
if (retcode == ETIMEDOUT) {
        /* timeout occurred */
} else {
        /* operate on x and y */
}
pthread_mutex_unlock(&mut);

Condition attributes can be specified at condition creation time, by passing a condition attribute object as second argument to pthread_cond_init. Passing NULL is equivalent to passing a condition attribute object with all attributes set to their default values.

The LinuxThreads implementation supports no attributes for conditions. The functions on condition attributes are included only for compliance with the POSIX standard.

Function: int pthread_condattr_init (pthread_condattr_t *attr)
Function: int pthread_condattr_destroy (pthread_condattr_t *attr)
pthread_condattr_init initializes the condition attribute object attr and fills it with default values for the attributes. pthread_condattr_destroy destroys the condition attribute object attr.

Both functions do nothing in the LinuxThreads implementation.

pthread_condattr_init and pthread_condattr_destroy always return 0.

POSIX Semaphores

Semaphores are counters for resources shared between threads. The basic operations on semaphores are: increment the counter atomically, and wait until the counter is nonzero and decrement it atomically.

Semaphores have a maximum value past which they cannot be incremented. The macro SEM_VALUE_MAX is defined to be this maximum value. In the GNU C library, SEM_VALUE_MAX is equal to INT_MAX (see section Range of an Integer Type), but it may be much smaller on other systems.

The pthreads library implements POSIX 1003.1b semaphores. These should not be confused with System V semaphores (ipc, semctl and semop).

All the semaphore functions and macros are defined in `semaphore.h'.

Function: int sem_init (sem_t *sem, int pshared, unsigned int value)
sem_init initializes the semaphore object pointed to by sem. The count associated with the semaphore is set initially to value. The pshared argument indicates whether the semaphore is local to the current process (pshared is zero) or is to be shared between several processes (pshared is not zero).

On success sem_init returns 0. On failure it returns -1 and sets errno to one of the following values:

EINVAL
value exceeds the maximal counter value SEM_VALUE_MAX
ENOSYS
pshared is not zero. LinuxThreads currently does not support process-shared semaphores. (This will eventually change.)

Function: int sem_destroy (sem_t * sem)
sem_destroy destroys a semaphore object, freeing the resources it might hold. If any threads are waiting on the semaphore when sem_destroy is called, it fails and sets errno to EBUSY.

In the LinuxThreads implementation, no resources are associated with semaphore objects, thus sem_destroy actually does nothing except check that no thread is waiting on the semaphore. This will change when process-shared semaphores are implemented.

Function: int sem_wait (sem_t * sem)
sem_wait suspends the calling thread until the semaphore pointed to by sem has non-zero count. It then atomically decreases the semaphore count.

sem_wait is a cancellation point. It always returns 0.

Function: int sem_trywait (sem_t * sem)
sem_trywait is a non-blocking variant of sem_wait. If the semaphore pointed to by sem has non-zero count, the count is atomically decreased and sem_trywait immediately returns 0. If the semaphore count is zero, sem_trywait immediately returns -1 and sets errno to EAGAIN.

Function: int sem_post (sem_t * sem)
sem_post atomically increases the count of the semaphore pointed to by sem. This function never blocks.

On processors supporting atomic compare-and-swap (Intel 486, Pentium and later, Alpha, PowerPC, MIPS II, Motorola 68k, Ultrasparc), the sem_post function can safely be called from signal handlers. This is the only thread synchronization function provided by POSIX threads that is async-signal safe. On the Intel 386 and earlier Sparc chips, the current LinuxThreads implementation of sem_post is not async-signal safe, because the hardware does not support the required atomic operations.

sem_post always succeeds and returns 0, unless the semaphore count would exceed SEM_VALUE_MAX after being incremented. In that case sem_post returns -1 and sets errno to EINVAL. The semaphore count is left unchanged.

Function: int sem_getvalue (sem_t * sem, int * sval)
sem_getvalue stores in the location pointed to by sval the current count of the semaphore sem. It always returns 0.
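
The semaphore functions described above can be exercised together in a short single-threaded sketch. The function name sem_demo and the semaphore name slots are hypothetical. The count starts at 2, is taken down to 0, and is then raised back to 1.

```c
#include <errno.h>
#include <semaphore.h>

/* Returns the final semaphore count, or a negative value on error. */
int sem_demo(void)
{
    sem_t slots;
    int value = -1;

    if (sem_init(&slots, 0, 2) == -1)   /* process-local, count 2 */
        return -1;

    sem_wait(&slots);                   /* count 2 -> 1 */
    sem_wait(&slots);                   /* count 1 -> 0 */

    /* The count is zero, so a non-blocking attempt must fail. */
    if (sem_trywait(&slots) != -1 || errno != EAGAIN) {
        sem_destroy(&slots);
        return -2;
    }

    sem_post(&slots);                   /* count 0 -> 1 */
    sem_getvalue(&slots, &value);
    sem_destroy(&slots);
    return value;
}
```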

Thread-Specific Data

Programs often need global or static variables that have different values in different threads. Since threads share one memory space, this cannot be achieved with regular variables. Thread-specific data is the POSIX threads answer to this need.

Each thread possesses a private memory block, the thread-specific data area, or TSD area for short. This area is indexed by TSD keys. The TSD area associates values of type void * to TSD keys. TSD keys are common to all threads, but the value associated with a given TSD key can be different in each thread.

For concreteness, the TSD areas can be viewed as arrays of void * pointers, TSD keys as integer indices into these arrays, and the value of a TSD key as the value of the corresponding array element in the calling thread.

When a thread is created, its TSD area initially associates NULL with all keys.

Function: int pthread_key_create (pthread_key_t *key, void (*destr_function) (void *))
pthread_key_create allocates a new TSD key. The key is stored in the location pointed to by key. There is a limit of PTHREAD_KEYS_MAX on the number of keys allocated at a given time. The value initially associated with the returned key is NULL in all currently executing threads.

The destr_function argument, if not NULL, specifies a destructor function associated with the key. When a thread terminates via pthread_exit or by cancellation, destr_function is called on the value associated with the key in that thread. The destr_function is not called if a key is deleted with pthread_key_delete or a value is changed with pthread_setspecific. The order in which destructor functions are called at thread termination time is unspecified.

Before the destructor function is called, the NULL value is associated with the key in the current thread. A destructor function might, however, re-associate non-NULL values to that key or some other key. To deal with this, if after all the destructors have been called for all non-NULL values, there are still some non-NULL values with associated destructors, then the process is repeated. The LinuxThreads implementation stops the process after PTHREAD_DESTRUCTOR_ITERATIONS iterations, even if some non-NULL values with associated destructors remain. Other implementations may loop indefinitely.

pthread_key_create returns 0 unless PTHREAD_KEYS_MAX keys have already been allocated, in which case it fails and returns EAGAIN.

Function: int pthread_key_delete (pthread_key_t key)
pthread_key_delete deallocates a TSD key. It does not check whether non-NULL values are associated with that key in the currently executing threads, nor call the destructor function associated with the key.

If there is no such key key, it returns EINVAL. Otherwise it returns 0.

Function: int pthread_setspecific (pthread_key_t key, const void *pointer)
pthread_setspecific changes the value associated with key in the calling thread, storing the given pointer instead.

If there is no such key key, it returns EINVAL. Otherwise it returns 0.

Function: void * pthread_getspecific (pthread_key_t key)
pthread_getspecific returns the value currently associated with key in the calling thread.

If there is no such key key, it returns NULL.

The following code fragment allocates a thread-specific array of 100 characters, with automatic reclamation at thread exit:

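#include <pthread.h>
#include <stdlib.h>

/* Forward declarations of the helpers defined below */
static void buffer_key_alloc(void);
static void buffer_destroy(void * buf);

/* Key for the thread-specific buffer */
static pthread_key_t buffer_key;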

/* Once-only initialisation of the key */
static pthread_once_t buffer_key_once = PTHREAD_ONCE_INIT;

/* Allocate the thread-specific buffer */
void buffer_alloc(void)
{
  pthread_once(&buffer_key_once, buffer_key_alloc);
  pthread_setspecific(buffer_key, malloc(100));
}

/* Return the thread-specific buffer */
char * get_buffer(void)
{
  return (char *) pthread_getspecific(buffer_key);
}

/* Allocate the key */
static void buffer_key_alloc(void)
{
  pthread_key_create(&buffer_key, buffer_destroy);
}

/* Free the thread-specific buffer */
static void buffer_destroy(void * buf)
{
  free(buf);
}
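
To show that each thread really sees its own value under a shared key, here is a self-contained sketch in the same spirit. All names (slot_key, slot_destroy, worker, tsd_demo and so on) are hypothetical. It assumes, as is the case in common implementations, that the destructors of a thread have finished by the time pthread_join returns for that thread.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t slot_key;
static pthread_once_t slot_key_once = PTHREAD_ONCE_INIT;
static pthread_mutex_t count_mut = PTHREAD_MUTEX_INITIALIZER;
static int destructor_calls;    /* protected by count_mut */

/* Destructor: free the per-thread value and count the call. */
static void slot_destroy(void *value)
{
    free(value);
    pthread_mutex_lock(&count_mut);
    destructor_calls++;
    pthread_mutex_unlock(&count_mut);
}

static void slot_key_alloc(void)
{
    pthread_key_create(&slot_key, slot_destroy);
}

/* Store a private value derived from *arg, then read it back. */
static void *worker(void *arg)
{
    int *in_out = arg;
    int *mine;

    pthread_once(&slot_key_once, slot_key_alloc);
    mine = malloc(sizeof *mine);
    if (mine == NULL)
        return NULL;
    *mine = *in_out * 10;
    pthread_setspecific(slot_key, mine);
    *in_out = *(int *) pthread_getspecific(slot_key);
    return NULL;
}

/* Returns the number of destructor calls, or -1 on error. */
int tsd_demo(void)
{
    pthread_t t1, t2;
    int v1 = 1, v2 = 2;

    if (pthread_create(&t1, NULL, worker, &v1) != 0)
        return -1;
    if (pthread_create(&t2, NULL, worker, &v2) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    /* Each thread read back its own value; this thread never set one. */
    if (v1 != 10 || v2 != 20 || pthread_getspecific(slot_key) != NULL)
        return -1;
    return destructor_calls;
}
```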

Threads and Signal Handling

Function: int pthread_sigmask (int how, const sigset_t *newmask, sigset_t *oldmask)
pthread_sigmask changes the signal mask for the calling thread as described by the how and newmask arguments. If oldmask is not NULL, the previous signal mask is stored in the location pointed to by oldmask.

The meaning of the how and newmask arguments is the same as for sigprocmask. If how is SIG_SETMASK, the signal mask is set to newmask. If how is SIG_BLOCK, the signals specified in newmask are added to the current signal mask. If how is SIG_UNBLOCK, the signals specified in newmask are removed from the current signal mask.

Recall that signal masks are set on a per-thread basis, but signal actions and signal handlers, as set with sigaction, are shared between all threads.

The pthread_sigmask function returns 0 on success, and one of the following error codes on error:

EINVAL
how is not one of SIG_SETMASK, SIG_BLOCK, or SIG_UNBLOCK
EFAULT
newmask or oldmask point to invalid addresses

Function: int pthread_kill (pthread_t thread, int signo)
pthread_kill sends signal number signo to the thread thread. The signal is delivered and handled as described in section Signal Handling.

pthread_kill returns 0 on success, and one of the following error codes on error:

EINVAL
signo is not a valid signal number
ESRCH
The thread thread does not exist (e.g. it has already terminated)

Function: int sigwait (const sigset_t *set, int *sig)
sigwait suspends the calling thread until one of the signals in set is delivered to the calling thread. It then stores the number of the signal received in the location pointed to by sig and returns. The signals in set must be blocked and not ignored on entrance to sigwait. If the delivered signal has a signal handler function attached, that function is not called.

sigwait is a cancellation point. It always returns 0.

For sigwait to work reliably, the signals being waited for must be blocked in all threads, not only in the calling thread, since otherwise the POSIX semantics for signal delivery do not guarantee that it's the thread doing the sigwait that will receive the signal. The best way to achieve this is to block those signals before any threads are created, and never unblock them in the program other than by calling sigwait.

Signal handling in LinuxThreads departs significantly from the POSIX standard. According to the standard, "asynchronous" (external) signals are addressed to the whole process (the collection of all threads), which then delivers them to one particular thread. The thread that actually receives the signal is any thread that does not currently block the signal.

In LinuxThreads, each thread is actually a kernel process with its own PID, so external signals are always directed to one particular thread. If, for instance, another thread is blocked in sigwait on that signal, it will not be restarted.

The LinuxThreads implementation of sigwait installs dummy signal handlers for the signals in set for the duration of the wait. Since signal handlers are shared between all threads, other threads must not attach their own signal handlers to these signals, or alternatively they should all block these signals (which is recommended anyway).

Threads and Fork

It's not intuitively obvious what should happen when a multi-threaded POSIX process calls fork. Not only are the semantics tricky, but you may need to write code that does the right thing at fork time even if that code doesn't use the fork function. Moreover, you need to be aware of interaction between fork and some library features like pthread_once and stdio streams.

When fork is called by one of the threads of a process, it creates a new process which is a copy of the calling process. Effectively, in addition to copying certain system objects, the function takes a snapshot of the memory areas of the parent process, and creates identical areas in the child. To make matters more complicated, with threads it's possible for two or more threads to concurrently call fork to create two or more child processes.

The child process has a copy of the address space of the parent, but it does not inherit any of its threads. Execution of the child process is carried out by a new thread which returns from the fork function with a return value of zero; it is the only thread in the child process. Because threads are not inherited across fork, issues arise. At the time of the call to fork, threads in the parent process other than the one calling fork may have been executing critical regions of code. As a result, the child process may get a copy of objects that are not in a well-defined state. This potential problem affects all components of the program.

Any program component which will continue being used in a child process must correctly handle its state during fork. For this purpose, the POSIX interface provides the special function pthread_atfork for installing pointers to handler functions which are called from within fork.

Function: int pthread_atfork (void (*prepare)(void), void (*parent)(void), void (*child)(void))

pthread_atfork registers handler functions to be called just before and just after a new process is created with fork. The prepare handler will be called from the parent process, just before the new process is created. The parent handler will be called from the parent process, just before fork returns. The child handler will be called from the child process, just before fork returns.

pthread_atfork returns 0 on success and a non-zero error code on error.

One or more of the three handlers prepare, parent and child can be given as NULL, meaning that no handler needs to be called at the corresponding point.

pthread_atfork can be called several times to install several sets of handlers. At fork time, the prepare handlers are called in LIFO order (last added with pthread_atfork, first called before fork), while the parent and child handlers are called in FIFO order (first added, first called).
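
The calling order can be observed with a small sketch that registers two sets of handlers and logs one character per call. All names here (order, log_char, prepare_a, atfork_order and so on) are hypothetical. No child handlers are installed, and the child exits immediately.

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static char order[8];   /* zero-initialized, so always NUL-terminated */
static size_t used;

static void log_char(char c)
{
    if (used < sizeof order - 1)
        order[used++] = c;
}

static void prepare_a(void) { log_char('a'); }
static void parent_a(void)  { log_char('A'); }
static void prepare_b(void) { log_char('b'); }
static void parent_b(void)  { log_char('B'); }

/* Registers set A, then set B, forks once, and returns the log
   recorded in the parent process. */
const char *atfork_order(void)
{
    pid_t pid;

    pthread_atfork(prepare_a, parent_a, NULL);
    pthread_atfork(prepare_b, parent_b, NULL);

    pid = fork();
    if (pid == 0)
        _exit(0);               /* child: nothing to do */
    if (pid > 0)
        waitpid(pid, NULL, 0);
    return order;
}
```

With the ordering described above, the parent's log should read "baAB": prepare handlers in LIFO order, followed by parent handlers in FIFO order. The function is meant to be called once, since each call registers the handlers again.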

If there is insufficient memory available to register the handlers, pthread_atfork fails and returns ENOMEM. Otherwise it returns 0.

The functions fork and pthread_atfork must not be regarded as reentrant from the context of the handlers. That is to say, if a pthread_atfork handler invoked from within fork calls pthread_atfork or fork, the behavior is undefined.

Registering a triplet of handlers is an atomic operation with respect to fork. If new handlers are registered at about the same time as a fork occurs, either all three handlers will be called, or none of them will be called.

The handlers are inherited by the child process, and there is no way to remove them, short of using exec to load a new process image.

To understand the purpose of pthread_atfork, recall that fork duplicates the whole memory space, including mutexes in their current locking state, but only the calling thread: other threads are not running in the child process. The mutexes are not usable after the fork and must be initialized with pthread_mutex_init in the child process. This is a limitation of the current implementation and might or might not be present in future versions.

To handle this safely, install handlers with pthread_atfork as follows: have the prepare handler lock the mutexes (in locking order), and the parent handler unlock the mutexes. The child handler should reset the mutexes using pthread_mutex_init, as well as any other synchronization objects such as condition variables.

Locking the global mutexes before the fork ensures that all other threads are locked out of the critical regions of code protected by those mutexes. Thus when fork takes a snapshot of the parent's address space, that snapshot will copy valid, stable data. Resetting the synchronization objects in the child process will ensure they are properly cleansed of any artifacts from the threading subsystem of the parent process. For example, a mutex may inherit a wait queue of threads waiting for the lock; this wait queue makes no sense in the child process. Initializing the mutex takes care of this.
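
For a single global mutex, the scheme might be sketched as below. The names table_mut, before_fork, after_fork_parent, after_fork_child, install_handlers and fork_with_handlers are hypothetical. The handlers are registered through pthread_once so that repeated calls do not install them twice, which would make the prepare handler lock the mutex twice.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t table_mut = PTHREAD_MUTEX_INITIALIZER;
static pthread_once_t handlers_once = PTHREAD_ONCE_INIT;

static void before_fork(void)       { pthread_mutex_lock(&table_mut); }
static void after_fork_parent(void) { pthread_mutex_unlock(&table_mut); }
static void after_fork_child(void)  { pthread_mutex_init(&table_mut, NULL); }

static void install_handlers(void)
{
    pthread_atfork(before_fork, after_fork_parent, after_fork_child);
}

/* Forks once. Returns 0 if the child could lock and unlock the mutex,
   a positive value if it could not, and -1 on error. */
int fork_with_handlers(void)
{
    pid_t pid;
    int status;

    pthread_once(&handlers_once, install_handlers);

    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: the mutex was reinitialized by after_fork_child. */
        int ok = pthread_mutex_lock(&table_mut) == 0
                 && pthread_mutex_unlock(&table_mut) == 0;
        _exit(ok ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```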

Streams and Fork

The GNU standard I/O library has an internal mutex which guards the internal linked list of all standard C FILE objects. This mutex is properly taken care of during fork so that the child receives an intact copy of the list. This allows the fopen function, and related stream-creating functions, to work correctly in the child process, since these functions need to insert into the list.

However, the individual stream locks are not completely taken care of. Thus unless the multithreaded application takes special precautions in its use of fork, the child process might not be able to safely use the streams that it inherited from the parent. In general, for any given open stream in the parent that is to be used by the child process, the application must ensure that the stream is not in use by another thread when fork is called. Otherwise an inconsistent copy of the stream object may be produced. An easy way to ensure this is to use flockfile to lock the stream prior to calling fork and then unlock it with funlockfile inside the parent process, provided that the parent's threads properly honor these locks. Nothing special needs to be done in the child process, since the library internally resets all stream locks.
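
The flockfile precaution might look like the following sketch, which relies on the GNU behavior described above. The function name fork_with_locked_stream is hypothetical, and a temporary file created with tmpfile stands in for whatever stream the child is meant to use. The stream is flushed before the fork so that buffered data is not written twice.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 if the line written by the child was read back, else -1. */
int fork_with_locked_stream(void)
{
    char line[16];
    pid_t pid;
    int status;
    FILE *fp = tmpfile();

    if (fp == NULL)
        return -1;

    flockfile(fp);      /* no other thread can be in the middle of using fp */
    fflush(fp);         /* do not duplicate buffered output in the child */
    pid = fork();
    if (pid == 0) {
        /* Child: uses the inherited stream without extra steps. */
        fputs("child\n", fp);
        fflush(fp);
        _exit(0);
    }
    funlockfile(fp);    /* parent releases its own lock */

    if (pid == -1 || waitpid(pid, &status, 0) == -1) {
        fclose(fp);
        return -1;
    }

    /* Parent and child share the file, so the line is visible here. */
    rewind(fp);
    if (fgets(line, sizeof line, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return strcmp(line, "child\n") == 0 ? 0 : -1;
}
```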

Note that the stream locks are not shared between the parent and child. For example, even if you ensure that, say, the stream stdout is properly treated and can be safely used in the child, the stream locks do not provide an exclusion mechanism between the parent and child. If both processes write to stdout, strangely interleaved output may result regardless of the explicit use of flockfile or implicit locks.

Also note that these provisions are a GNU extension; other systems might not provide any way for streams to be used in the child of a multithreaded process. POSIX requires that such a child process confine itself to calling only async-signal-safe functions, which excludes much of the library, including standard I/O.

Miscellaneous Thread Functions

Function: pthread_t pthread_self (void)
pthread_self returns the thread identifier for the calling thread.

Function: int pthread_equal (pthread_t thread1, pthread_t thread2)
pthread_equal determines if two thread identifiers refer to the same thread.

A non-zero value is returned if thread1 and thread2 refer to the same thread. Otherwise, 0 is returned.
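
A short sketch of both functions follows: a thread compares its own identifier with that of its creator. The names id_check, compare_ids and thread_id_demo are hypothetical.

```c
#include <pthread.h>

struct id_check {
    pthread_t creator;          /* identifier of the creating thread */
    int same_as_creator;        /* filled in by the new thread */
};

static void *compare_ids(void *arg)
{
    struct id_check *chk = arg;

    chk->same_as_creator =
        pthread_equal(chk->creator, pthread_self()) != 0;
    return NULL;
}

/* Returns 1 if the creator is equal to itself and different from the
   thread it created, 0 otherwise, and -1 on error. */
int thread_id_demo(void)
{
    struct id_check chk;
    pthread_t th;

    chk.creator = pthread_self();
    chk.same_as_creator = -1;
    if (pthread_create(&th, NULL, compare_ids, &chk) != 0)
        return -1;
    pthread_join(th, NULL);

    return pthread_equal(chk.creator, pthread_self()) != 0
           && chk.same_as_creator == 0;
}
```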

Function: int pthread_detach (pthread_t th)
pthread_detach puts the thread th in the detached state. This guarantees that the memory resources consumed by th will be freed immediately when th terminates. However, this prevents other threads from synchronizing on the termination of th using pthread_join.

A thread can be created initially in the detached state, using the detachstate attribute to pthread_create. In contrast, pthread_detach applies to threads created in the joinable state, and which need to be put in the detached state later.

After pthread_detach completes, subsequent attempts to perform pthread_join on th will fail. If another thread is already joining the thread th at the time pthread_detach is called, pthread_detach does nothing and leaves th in the joinable state.

On success, 0 is returned. On error, one of the following codes is returned:

ESRCH
No thread could be found corresponding to that specified by th
EINVAL
The thread th is already in the detached state
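
Since a detached thread cannot be joined, some other mechanism is needed when the creator wants to know that the work is done. The sketch below uses a semaphore for that purpose. The names finished, background_task and run_detached are hypothetical, and the retry on EINTR is a precaution for implementations where sem_wait can be interrupted.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

static sem_t finished;

static void *background_task(void *arg)
{
    (void) arg;
    /* ... the actual work would go here ... */
    sem_post(&finished);        /* report completion */
    return NULL;
}

/* Returns the result of pthread_detach, or -1 on setup failure. */
int run_detached(void)
{
    pthread_t th;
    int rc;

    if (sem_init(&finished, 0, 0) == -1)
        return -1;
    if (pthread_create(&th, NULL, background_task, NULL) != 0)
        return -1;

    rc = pthread_detach(th);    /* th was created joinable */

    /* Wait for completion through the semaphore, not pthread_join. */
    while (sem_wait(&finished) == -1) {
        if (errno != EINTR)
            return -1;
    }
    return rc;
}
```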

Function: void pthread_kill_other_threads_np (void)
pthread_kill_other_threads_np is a non-portable LinuxThreads extension. It causes all threads in the program to terminate immediately, except the calling thread which proceeds normally. It is intended to be called just before a thread calls one of the exec functions, e.g. execve.

Termination of the other threads is not performed through pthread_cancel and completely bypasses the cancellation mechanism. Hence, the current settings for cancellation state and cancellation type are ignored, and the cleanup handlers are not executed in the terminated threads.

According to POSIX 1003.1c, a successful exec* in one of the threads should automatically terminate all other threads in the program. This behavior is not yet implemented in LinuxThreads. Calling pthread_kill_other_threads_np before exec* achieves much of the same behavior, except that if exec* ultimately fails, then all other threads are already killed.

Function: int pthread_once (pthread_once_t *once_control, void (*init_routine) (void))

The purpose of pthread_once is to ensure that a piece of initialization code is executed at most once. The once_control argument points to a static or extern variable statically initialized to PTHREAD_ONCE_INIT.

The first time pthread_once is called with a given once_control argument, it calls init_routine with no argument and changes the value of the once_control variable to record that initialization has been performed. Subsequent calls to pthread_once with the same once_control argument do nothing.

If a thread is cancelled while executing init_routine the state of the once_control variable is reset so that a future call to pthread_once will call the routine again.

If the process forks while one or more threads are executing pthread_once initialization routines, the states of their respective once_control variables will appear to be reset in the child process so that if the child calls pthread_once, the routines will be executed.

pthread_once always returns 0.
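
The once-only guarantee can be checked with a sketch in which several threads race to perform the same initialization. The names init_once, init_calls, count_init, caller and once_demo are hypothetical.

```c
#include <pthread.h>

#define NUM_CALLERS 4

static pthread_once_t init_once = PTHREAD_ONCE_INIT;
static int init_calls;          /* only modified inside count_init */

static void count_init(void)
{
    init_calls++;
}

static void *caller(void *arg)
{
    (void) arg;
    pthread_once(&init_once, count_init);
    return NULL;
}

/* Returns how many times the initialization routine ran, or -1 on error. */
int once_demo(void)
{
    pthread_t th[NUM_CALLERS];
    int i, created = 0;

    for (i = 0; i < NUM_CALLERS; i++) {
        if (pthread_create(&th[i], NULL, caller, NULL) != 0)
            break;
        created++;
    }
    for (i = 0; i < created; i++)
        pthread_join(th[i], NULL);

    return created == NUM_CALLERS ? init_calls : -1;
}
```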

Function: int pthread_setschedparam (pthread_t target_thread, int policy, const struct sched_param *param)

pthread_setschedparam sets the scheduling parameters for the thread target_thread as indicated by policy and param. policy can be either SCHED_OTHER (regular, non-realtime scheduling), SCHED_RR (realtime, round-robin) or SCHED_FIFO (realtime, first-in first-out). param specifies the scheduling priority for the two realtime policies. See sched_setscheduler for more information on scheduling policies.

The realtime scheduling policies SCHED_RR and SCHED_FIFO are available only to processes with superuser privileges.

On success, pthread_setschedparam returns 0. On error it returns one of the following codes:

EINVAL
policy is not one of SCHED_OTHER, SCHED_RR, SCHED_FIFO, or the priority value specified by param is not valid for the specified policy
EPERM
Realtime scheduling was requested but the calling process does not have sufficient privileges.
ESRCH
The target_thread is invalid or has already terminated
EFAULT
param points outside the process memory space

Function: int pthread_getschedparam (pthread_t target_thread, int *policy, struct sched_param *param)

pthread_getschedparam retrieves the scheduling policy and scheduling parameters for the thread target_thread and stores them in the locations pointed to by policy and param, respectively.

pthread_getschedparam returns 0 on success, or one of the following error codes on failure:

ESRCH
The target_thread is invalid or has already terminated.
EFAULT
policy or param point outside the process memory space.
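
The two scheduling functions can be wrapped for the calling thread as in the following sketch. The names get_own_schedule and set_own_schedule are hypothetical helpers, not library functions.

```c
#include <pthread.h>
#include <sched.h>
#include <string.h>

/* Query the policy and priority of the calling thread.
   Returns 0 on success or an error code. */
int get_own_schedule(int *policy, int *priority)
{
    struct sched_param param;
    int rc = pthread_getschedparam(pthread_self(), policy, &param);

    if (rc == 0)
        *priority = param.sched_priority;
    return rc;
}

/* Request a policy and priority for the calling thread.
   Returns 0 on success or an error code such as EPERM or EINVAL. */
int set_own_schedule(int policy, int priority)
{
    struct sched_param param;

    memset(&param, 0, sizeof param);
    param.sched_priority = priority;
    return pthread_setschedparam(pthread_self(), policy, &param);
}
```

Following the description above, a call such as set_own_schedule(SCHED_RR, 1) would be expected to fail with EPERM in a process without superuser privileges.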

Function: int pthread_setconcurrency (int level)
pthread_setconcurrency is unused in LinuxThreads due to the lack of a mapping of user threads to kernel threads. It exists for source compatibility. It does store the value level so that it can be returned by a subsequent call to pthread_getconcurrency. It takes no other action however.

Function: int pthread_getconcurrency (void)
pthread_getconcurrency is unused in LinuxThreads due to the lack of a mapping of user threads to kernel threads. It exists for source compatibility. However, it will return the value that was set by the last call to pthread_setconcurrency.

