OpenMP 3.1 API C/C++ Syntax Quick Reference Card
© 2011 OpenMP ARB OMP0711C
OpenMP Application Program Interface (API) is a portable, scalable
model that gives shared-memory parallel programmers a simple
and flexible interface for developing parallel applications for
platforms ranging from the desktop to the supercomputer.
OpenMP supports multi-platform shared-memory parallel
programming in C/C++ and Fortran on all architectures, including
Unix platforms and Windows NT platforms.
A separate OpenMP reference card for Fortran is also available.
[n.n.n] refers to sections in the OpenMP API Specification available at www.openmp.org.
Directives
An OpenMP executable directive applies to the
succeeding structured block or OpenMP construct.
A structured-block is a single statement or a compound
statement with a single entry at the top and a single exit
at the bottom.
Parallel [2.4]
The parallel construct forms a team of threads and starts
parallel execution.
#pragma omp parallel [clause[ [, ]clause] ...]
structured-block
clause:
if(scalar-expression)
num_threads(integer-expression)
default(shared | none)
private(list)
firstprivate(list)
shared(list)
copyin(list)
reduction(operator: list)
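As an illustration of the parallel construct with the num_threads and reduction clauses, the following minimal sketch counts how many threads join the team. The function name count_team_members is hypothetical. The code assumes OpenMP support is enabled in the compiler; without it the pragma is ignored and a single thread runs the block.

```c
/* Hypothetical helper: returns the number of threads that executed the region. */
int count_team_members(int requested)
{
    int members = 0;

    /* Each thread gets a private copy of members initialized to 0;
       the copies are summed into the original at the end of the region. */
    #pragma omp parallel num_threads(requested) reduction(+: members)
    {
        members += 1;
    }
    return members;
}
```

A call such as count_team_members(4) returns at most 4; the runtime may provide fewer threads than requested.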
Loop [2.5.1]
The loop construct specifies that the iterations of
loops will be distributed among and executed by the
encountering team of threads.
#pragma omp for [clause[ [, ]clause] ...]
for-loops
clause:
private(list)
firstprivate(list)
lastprivate(list)
reduction(operator: list)
schedule(kind[, chunk_size])
collapse(n)
ordered
nowait
kind:
• static: Iterations are divided into chunks of size
chunk_size. Chunks are assigned to threads in the
team in round-robin fashion in order of thread
number.
• dynamic: Each thread executes a chunk of iterations
then requests another chunk until no chunks remain
to be distributed.
• guided: Each thread executes a chunk of iterations
then requests another chunk until no chunks remain
to be assigned. The chunk sizes start large and shrink
to the indicated chunk_size as chunks are scheduled.
• auto: The decision regarding scheduling is delegated
to the compiler and/or runtime system.
• runtime: The schedule and chunk size are taken from
the run-sched-var ICV.
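The loop construct and its schedule clause can be sketched as follows. The function name sum_of_squares and the chunk size of 4 are assumptions chosen for illustration, not recommendations.

```c
/* Hypothetical helper: sums i*i for i in [0, n). */
long sum_of_squares(int n)
{
    long sum = 0;
    int i;

    #pragma omp parallel
    {
        /* dynamic schedule: a thread that finishes a chunk of 4 iterations
           requests the next chunk. The loop variable i is private. */
        #pragma omp for schedule(dynamic, 4) reduction(+: sum)
        for (i = 0; i < n; i++) {
            sum += (long)i * i;
        }
    }
    return sum;
}
```

Replacing schedule(dynamic, 4) with schedule(runtime) would defer the choice to the run-sched-var ICV.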
Sections [2.5.2]
The sections construct contains a set of structured blocks
that are to be distributed among and executed by the
encountering team of threads.
#pragma omp sections [clause[ [, ]clause] ...]
{
[#pragma omp section]
structured-block
[#pragma omp section
structured-block]
...
}
clause:
private(list)
firstprivate(list)
lastprivate(list)
reduction(operator: list)
nowait
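A minimal sketch of the sections construct, assuming two independent pieces of work that may run on different threads. The function name min_and_max is hypothetical.

```c
/* Hypothetical helper: finds the smallest and largest element of a[0..n-1], n >= 1. */
void min_and_max(const int *a, int n, int *min_out, int *max_out)
{
    int lo = a[0];
    int hi = a[0];

    #pragma omp parallel
    {
        #pragma omp sections
        {
            #pragma omp section
            {
                /* Only this section writes lo. */
                int i;
                for (i = 1; i < n; i++)
                    if (a[i] < lo) lo = a[i];
            }
            #pragma omp section
            {
                /* Only this section writes hi. */
                int i;
                for (i = 1; i < n; i++)
                    if (a[i] > hi) hi = a[i];
            }
        }
    }
    *min_out = lo;
    *max_out = hi;
}
```

Each section writes a different shared variable, so no further synchronization is needed inside the construct.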
Single [2.5.3]
The single construct specifies that the associated
structured block is executed by only one of the threads
in the team (not necessarily the master thread), in the
context of its implicit task.
#pragma omp single [clause[ [, ]clause] ...]
structured-block
clause:
private(list)
firstprivate(list)
copyprivate(list)
nowait
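The single construct with copyprivate can be sketched as a broadcast: one thread produces a value and the clause copies it to the private variables of the other threads. The function name broadcast_reaches_all and the value 42 are illustrative assumptions.

```c
/* Hypothetical helper: returns 1 if every thread in the team received the value. */
int broadcast_reaches_all(void)
{
    int team = 0;
    int seen = 0;

    #pragma omp parallel reduction(+: team, seen)
    {
        int value = 0;  /* declared inside the region, so private to each thread */

        /* One thread assigns value; copyprivate copies it to the other
           threads' private copies before they leave the construct. */
        #pragma omp single copyprivate(value)
        {
            value = 42;
        }

        team += 1;
        seen += (value == 42);
    }
    return team == seen;
}
```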
Parallel Loop [2.6.1]
The parallel loop construct is a shortcut for specifying
a parallel construct containing one or more associated
loops and no other statements.
#pragma omp parallel for [clause[ [, ]clause] ...]
for-loop
clause:
Any of the clauses accepted by the parallel or for
directives, except the nowait clause, with identical
meanings and restrictions.
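Because the combined construct accepts the clauses of the for directive, collapse can be used on it directly. The sketch below assumes a small fixed-size table; the names ROWS, COLS, and fill_products are hypothetical.

```c
#define ROWS 4
#define COLS 6

/* Hypothetical helper: fills table[i][j] with i * j. */
void fill_products(int table[ROWS][COLS])
{
    int i, j;

    /* collapse(2) merges both loops into one iteration space of
       ROWS * COLS iterations; i and j are both private. */
    #pragma omp parallel for collapse(2) schedule(static)
    for (i = 0; i < ROWS; i++) {
        for (j = 0; j < COLS; j++) {
            table[i][j] = i * j;
        }
    }
}
```

The collapsed loops must be perfectly nested, with no statements between the two for headers.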
Parallel Sections [2.6.2]
The parallel sections construct is a shortcut for specifying
a parallel construct containing one sections construct and
no other statements.
#pragma omp parallel sections [clause[ [, ]clause] ...]
{
[#pragma omp section]
structured-block
[#pragma omp section
structured-block]
...
}
clause:
Any of the clauses accepted by the parallel or sections
directives, except the nowait clause, with identical
meanings and restrictions.
Task [2.7.1]
The task construct defines an explicit task. The data
environment of the task is created according to the
data-sharing attribute clauses on the task construct
and any defaults that apply.
#pragma omp task [clause[ [, ]clause] ...]
structured-block
clause:
if(scalar-expression)
final(scalar-expression)
untied
default(shared | none)
mergeable
private(list)
firstprivate(list)
shared(list)
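A common pattern, sketched here under the assumption that the work items are independent, is to have one thread create the tasks while the whole team executes them. The function name double_all is hypothetical, and one task per array element is chosen only to keep the example short.

```c
/* Hypothetical helper: doubles every element of a[0..n-1], one task per element. */
void double_all(int *a, int n)
{
    #pragma omp parallel
    {
        /* One thread creates the tasks; any thread in the team may run them. */
        #pragma omp single
        {
            int i;
            for (i = 0; i < n; i++) {
                /* firstprivate(i) captures the current index for this task. */
                #pragma omp task firstprivate(i) shared(a)
                {
                    a[i] *= 2;
                }
            }
        }
        /* The implicit barrier at the end of single guarantees that
           all tasks created above have completed. */
    }
}
```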
Taskyield [2.7.2]
The taskyield construct specifies that the current task can
be suspended in favor of execution of a different task.
#pragma omp taskyield
Master [2.8.1]
The master construct specifies a structured block that
is executed by the master thread of the team. There is
no implied barrier either on entry to, or exit from, the
master construct.
#pragma omp master
structured-block
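Since master implies no barrier, code that depends on the master thread's result usually needs explicit synchronization. A minimal sketch, with the hypothetical name master_publishes_limit and an arbitrary value of 100:

```c
/* Hypothetical helper: returns 1 if every thread observed the master's write. */
int master_publishes_limit(void)
{
    int limit = 0;   /* shared by all threads in the region */
    int team = 0;
    int seen = 0;

    #pragma omp parallel reduction(+: team, seen)
    {
        #pragma omp master
        {
            limit = 100;
        }
        /* master has no implied barrier, so wait explicitly
           before the other threads read limit. */
        #pragma omp barrier

        team += 1;
        seen += (limit == 100);
    }
    return team == seen;
}
```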
Critical [2.8.2]
The critical construct restricts execution of the associated
structured block to a single thread at a time.
#pragma omp critical [(name)]
structured-block
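The following sketch uses a named critical construct to protect a compare-and-update on a shared variable. The function name largest and the region name update_best are assumptions made for this example.

```c
/* Hypothetical helper: returns the largest element of a[0..n-1], n >= 1. */
int largest(const int *a, int n)
{
    int best = a[0];
    int i;

    #pragma omp parallel for
    for (i = 1; i < n; i++) {
        /* Only one thread at a time may compare against and
           update the shared variable best. */
        #pragma omp critical (update_best)
        {
            if (a[i] > best)
                best = a[i];
        }
    }
    return best;
}
```

Every iteration enters the critical region here, which serializes the loop; the sketch shows the syntax rather than an efficient way to find a maximum.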
Barrier [2.8.3]
The barrier construct specifies an explicit barrier at the
point at which the construct appears.
#pragma omp barrier
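An explicit barrier is typically needed when a later phase reads data that other threads wrote in an earlier phase. In this sketch the function name neighbour_sums is hypothetical, and nowait is used on the first loop only so that the explicit barrier is the one doing the work.

```c
/* Hypothetical helper: sets a[i] = i, then b[i] = a[i] + a[(i + 1) % n]. */
void neighbour_sums(int *a, int *b, int n)
{
    #pragma omp parallel
    {
        int i;

        /* nowait removes the implicit barrier at the end of this loop. */
        #pragma omp for nowait
        for (i = 0; i < n; i++)
            a[i] = i;

        /* Every element of a must be written before any thread
           reads a neighbouring element below. */
        #pragma omp barrier

        #pragma omp for
        for (i = 0; i < n; i++)
            b[i] = a[i] + a[(i + 1) % n];
    }
}
```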
Taskwait [2.8.4]
The taskwait construct specifies a wait on the completion
of child tasks of the current task.
#pragma omp taskwait
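The recursive Fibonacci computation below is a frequently used sketch of task and taskwait together. The function names fib_task and fib are hypothetical, and the algorithm is deliberately naive.

```c
/* Hypothetical helper: naive recursive Fibonacci, called from inside a parallel region. */
static long fib_task(int n)
{
    long x, y;

    if (n < 2)
        return n;

    /* x and y must be shared so the child tasks write the parent's variables. */
    #pragma omp task shared(x)
    x = fib_task(n - 1);

    #pragma omp task shared(y)
    y = fib_task(n - 2);

    /* Wait for both child tasks before reading x and y. */
    #pragma omp taskwait
    return x + y;
}

/* Hypothetical wrapper: starts the team and lets one thread create the root call. */
long fib(int n)
{
    long result = 0;

    #pragma omp parallel
    {
        #pragma omp single
        result = fib_task(n);
    }
    return result;
}
```

Without the taskwait, the parent could return x + y before either child task has written its result.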
Atomic [2.8.5]
The atomic construct ensures that a specific storage
location is updated atomically, rather than exposing it to
the possibility of multiple, simultaneous writing threads.
#pragma omp atomic [read | write | update | capture]
expression-stmt
#pragma omp atomic capture
structured-block
where expression-stmt may be one of the following forms:
if clause is...            expression-stmt:
read                       v = x;
write                      x = expr;
update or is not present   x++; x--; ++x; --x; x binop= expr; x = x binop expr;
capture                    v = x++; v = x--; v = ++x; v = --x; v = x binop= expr;
and structured-block may be one of the following forms:
{v = x; x binop= expr;} {x binop= expr; v = x;}
{v = x; x = x binop expr;} {x = x binop expr; v = x;}
{v = x; x++;} {v = x; ++x;} {++x; v = x;} {x++; v = x;}
{v = x; x--;} {v = x; --x;} {--x; v = x;} {x--; v = x;}
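Two of the forms above, update (x++) and capture (v = x++), can be sketched as follows. The function names count_even and assign_tickets are hypothetical.

```c
/* Hypothetical helper: counts the even values in a[0..n-1]. */
int count_even(const int *a, int n)
{
    int count = 0;
    int i;

    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        if (a[i] % 2 == 0) {
            /* update form: x++ */
            #pragma omp atomic update
            count++;
        }
    }
    return count;
}

/* Hypothetical helper: gives each iteration a unique ticket in [0, n). */
void assign_tickets(int *ticket, int n)
{
    int next = 0;
    int i;

    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        int mine;
        /* capture form: v = x++ reads the old value and increments atomically. */
        #pragma omp atomic capture
        mine = next++;
        ticket[i] = mine;
    }
}
```

The order in which tickets are handed out is not deterministic; only their uniqueness is guaranteed.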
Flush [2.8.6]
The flush construct executes the OpenMP flush
operation, which makes a thread’s temporary view of
memory consistent with memory, and enforces an order
on the memory operations of the variables.
#pragma omp flush [(list)]
Ordered [2.8.7]
The ordered construct specifies a structured block in a
loop region that will be executed in the order of the loop
iterations. This sequentializes and orders the code within
an ordered region while allowing code outside the region
to run in parallel.
#pragma omp ordered
structured-block
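A minimal sketch of an ordered region inside a parallel loop, assuming the computation before the region is the part worth parallelizing. The function name ordered_squares is hypothetical.

```c
/* Hypothetical helper: appends the squares of 0..n-1 to out[] in iteration
   order and returns how many values were appended. */
int ordered_squares(int *out, int n)
{
    int count = 0;
    int i;

    /* The ordered clause on the loop is required for the ordered region below. */
    #pragma omp parallel for ordered schedule(dynamic)
    for (i = 0; i < n; i++) {
        int sq = i * i;          /* may run in parallel */

        #pragma omp ordered
        {
            out[count] = sq;     /* runs in loop-iteration order */
            count++;
        }
    }
    return count;
}
```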
Threadprivate [2.9.2]
The threadprivate directive specifies that variables are
replicated, with each thread having its own copy.
#pragma omp threadprivate(list)
list:
A comma-separated list of file-scope, namespace-
scope, or static block-scope variables that do not have
incomplete types.
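The sketch below applies threadprivate to a file-scope counter so that each thread updates its own copy without synchronization. The names calls_on_this_thread, record_call, and total_calls are hypothetical.

```c
/* File-scope counter; each thread gets its own copy. */
static int calls_on_this_thread = 0;
#pragma omp threadprivate(calls_on_this_thread)

/* Hypothetical helper: records one call on the calling thread. */
static void record_call(void)
{
    calls_on_this_thread++;   /* no synchronization needed: the copy is per-thread */
}

/* Hypothetical helper: performs n calls and returns the total across threads. */
int total_calls(int n)
{
    int total = 0;
    int i;

    #pragma omp parallel reduction(+: total)
    {
        calls_on_this_thread = 0;

        #pragma omp for
        for (i = 0; i < n; i++)
            record_call();

        total += calls_on_this_thread;
    }
    return total;
}
```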
Most common form of the for loop:
for(var = lb; var relational-op b; var += incr)
Simple Parallel Loop Example
The following example demonstrates how to
parallelize a simple loop using the parallel loop
construct.
void simple(int n, float *a, float *b)
{
    int i;
#pragma omp parallel for
    for (i=1; i<n; i++) /* i is private by default */
        b[i] = (a[i] + a[i-1]) / 2.0;
}