Updated to TBB 4.4 update 4
wjakob committed May 12, 2016
1 parent 4fa22d5 commit 92a3dcd
Showing 1,374 changed files with 52,091 additions and 31,690 deletions.
299 changes: 297 additions & 2 deletions CHANGES
@@ -1,6 +1,300 @@
------------------------------------------------------------------------
The list of most significant changes made over time in
Intel(R) Threading Building Blocks (Intel(R) TBB).

Intel TBB 4.4 Update 4
TBB_INTERFACE_VERSION == 9004

Changes (w.r.t. Intel TBB 4.4 Update 3):

- Removed a few cases of excessive user data copying in the flow graph.
- Improved robustness of concurrent_bounded_queue::abort() in case of
simultaneous push and pop operations.

Preview Features:

- Added tbb::flow::async_msg, a special message type to support
communications between the flow graph and external asynchronous
activities.
- async_node modified to support use with C++03 compilers.

Bugs fixed:

- Fixed a bug in dynamic memory allocation replacement for Windows* OS.
- Fixed excessive memory consumption on Linux* OS caused by enabling
zero-copy realloc.
- Fixed performance regression on Intel(R) Xeon Phi(tm) coprocessor with
auto_partitioner.

------------------------------------------------------------------------
Intel TBB 4.4 Update 3
TBB_INTERFACE_VERSION == 9003

Changes (w.r.t. Intel TBB 4.4 Update 2):

- Modified parallel_sort to not require a default constructor for values
and to use iter_swap() for value swapping.
- Added support for creating or initializing a task_arena instance that
is connected to the arena currently used by the thread.
- graph/binpack example modified to use multifunction_node.
- For performance analysis, use Intel(R) VTune(TM) Amplifier XE 2015
and higher; older versions are no longer supported.
- Improved support for compilation with disabled RTTI, by omitting its use
in auxiliary code, such as assertions. However some functionality,
particularly the flow graph, does not work if RTTI is disabled.
- The tachyon example for Android* can be built using Android Studio 1.5
and higher with experimental Gradle plugin 0.4.0.
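
The parallel_sort item above relaxes two requirements on the value type. The sketch below is not TBB's implementation; it is a small serial sort, written only to show that an algorithm which moves elements exclusively through std::iter_swap() never needs to default-construct a value. The names Ticket, swap_only_sort and sorted_ids are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// A value type with no default constructor (hypothetical example type).
struct Ticket {
    int id;
    explicit Ticket(int i) : id(i) {}
};

// Serial insertion sort that rearranges elements only via std::iter_swap,
// so no temporary value is ever default-constructed.
template <typename RandomIt, typename Compare>
void swap_only_sort(RandomIt first, RandomIt last, Compare comp) {
    if (first == last) return;
    for (RandomIt i = std::next(first); i != last; ++i)
        for (RandomIt j = i; j != first && comp(*j, *std::prev(j)); --j)
            std::iter_swap(j, std::prev(j));
}

// Helper for inspection: sort a copy of the tickets and return their ids.
inline std::vector<int> sorted_ids(std::vector<Ticket> tickets) {
    swap_only_sort(tickets.begin(), tickets.end(),
                   [](const Ticket& a, const Ticket& b) { return a.id < b.id; });
    std::vector<int> ids;
    for (const Ticket& t : tickets) ids.push_back(t.id);
    return ids;
}
```

Because std::iter_swap() finds swap() through argument-dependent lookup, a type that supplies its own swap() is exchanged with that overload rather than through a temporary copy.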

Preview Features:

- Added class opencl_subbuffer that allows using OpenCL* sub-buffer
objects with opencl_node.
- Class global_control supports the value of 1 for
max_allowed_parallelism.

Bugs fixed:

- Fixed a race causing "TBB Warning: setaffinity syscall failed" message.
- Fixed a compilation issue on OS X* with Intel(R) C++ Compiler 15.0.
- Fixed a bug in queuing_rw_mutex::downgrade() that could temporarily
block new readers.
- Fixed speculative_spin_rw_mutex to stop using the lazy subscription
technique due to its known flaws.
- Fixed memory leaks in the tool support code.

------------------------------------------------------------------------
Intel TBB 4.4 Update 2
TBB_INTERFACE_VERSION == 9002

Changes (w.r.t. Intel TBB 4.4 Update 1):

- Improved interoperability with Intel(R) OpenMP RTL (libiomp) on Linux:
OpenMP affinity settings do not affect the default number of threads
used in the task scheduler. Intel(R) C++ Compiler 16.0 Update 1
or later is required.
- Added a new flow graph example with different implementations of the
Cholesky Factorization algorithm.

Preview Features:

- Added template class opencl_node to the flow graph API. It allows a
flow graph to offload computations to OpenCL* devices.
- Extended join_node to use type-specified message keys. It simplifies
the API of the node by obtaining message keys via functions
associated with the message type (instead of node ports).
- Added static_partitioner that minimizes overhead of parallel_for and
parallel_reduce for well-balanced workloads.
- Improved template class async_node in the flow graph API to support
user settable concurrency limits.

Bugs fixed:

- Fixed a possible crash in the GUI layer for library examples on Linux.

------------------------------------------------------------------------
Intel TBB 4.4 Update 1
TBB_INTERFACE_VERSION == 9001

Changes (w.r.t. Intel TBB 4.4):

- Added support for Microsoft* Visual Studio* 2015.
- Intel TBB no longer performs dynamic replacement of memory allocation
functions for Microsoft Visual Studio 2005 and earlier versions.
- For GCC 4.7 and higher, the intrinsics-based platform isolation layer
uses __atomic_* built-ins instead of the legacy __sync_* ones.
This change is inspired by a contribution from Mathieu Malaterre.
- Improvements in task_arena:
Several application threads may join a task_arena and execute tasks
simultaneously. The amount of concurrency reserved for application
threads at task_arena construction can be set to any value between
0 and the arena concurrency limit.
- The fractal example was modified to demonstrate class task_arena
and moved to examples/task_arena/fractal.
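
For readers unfamiliar with the two built-in families named in the GCC 4.7 item above, here is a minimal sketch of the difference. It is not TBB's platform isolation layer; the wrapper names fetch_add_legacy, fetch_add_modern and cas_modern are hypothetical, and only the __sync_* and __atomic_* built-ins themselves are real GCC/Clang features.

```cpp
#include <cassert>

// Legacy built-in: no memory-order argument; GCC treats it as a full
// barrier in most cases. Returns the value held before the addition.
inline long fetch_add_legacy(long* target, long delta) {
    assert(target != nullptr);
    return __sync_fetch_and_add(target, delta);
}

// Newer built-in (GCC 4.7 and higher): the caller states the memory order.
// Returns the value held before the addition.
inline long fetch_add_modern(long* target, long delta) {
    assert(target != nullptr);
    return __atomic_fetch_add(target, delta, __ATOMIC_SEQ_CST);
}

// Strong compare-and-swap: returns true if *target equalled `expected`
// and was replaced by `desired`.
inline bool cas_modern(long* target, long expected, long desired) {
    assert(target != nullptr);
    return __atomic_compare_exchange_n(target, &expected, desired,
                                       /*weak=*/false,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
```

The explicit memory-order parameter is the practical difference: code that needs only acquire or release semantics can ask for exactly that with __atomic_*, which the __sync_* family cannot express.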

Bugs fixed:

- Fixed a deadlock during destruction of task_scheduler_init objects
when one of destructors is set to wait for worker threads.
- Added a workaround for a possible crash on OS X* when dynamic memory
allocator replacement (libtbbmalloc_proxy) is used and memory is
released during application startup.
- Usage of mutable functors with task_group::run_and_wait() and
task_arena::enqueue() is disabled. An attempt to pass a functor
whose operator()() is not const will produce compilation errors.
- Makefiles and environment scripts now properly recognize GCC 5.0 and
higher.

Open-source contributions integrated:

- Improved performance of parallel_for_each for inputs allowing random
access, by Raf Schietekat.

------------------------------------------------------------------------
Intel TBB 4.4
TBB_INTERFACE_VERSION == 9000

Changes (w.r.t. Intel TBB 4.3 Update 6):

- The following features are now fully supported:
tbb::flow::composite_node;
additional policies of tbb::flow::graph_node::reset().
- Platform abstraction layer for Windows* OS updated to use compiler
intrinsics for most atomic operations.
- The tbb/compat/thread header updated to automatically include
C++11 <thread> where available.
- Fixes and refactoring in the task scheduler and class task_arena.
- Added key_matching policy to tbb::flow::join_node, which removes
the restriction on the type that can be compared-against.
- For tag_matching join_node, tag_value is redefined to be 64 bits
wide on all architectures.
- Expanded the documentation for the flow graph with details about
node semantics and behavior.
- Added dynamic replacement of C11 standard function aligned_alloc()
under Linux* OS.
- Added C++11 move constructors and assignment operators to
tbb::enumerable_thread_specific container.
- Added hashing support for tbb::tbb_thread::id.
- On OS X*, binaries that depend on libstdc++ are not provided anymore.
In the makefiles, libc++ is now used by default; for building with
libstdc++, specify stdlib=libstdc++ in the make command line.

Preview Features:

- Added a new example, graph/fgbzip2, that shows usage of
tbb::flow::async_node.
- Modification to the low-level API for memory pools:
added a function for finding a memory pool by an object allocated
from that pool.
- tbb::memory_pool now does not request memory till the first allocation
from the pool.

Changes affecting backward compatibility:

- Internal layout of flow graph nodes has changed; recompilation is
recommended for all binaries that use the flow graph.
- Resetting a tbb::flow::source_node will immediately activate it,
unless it was created in inactive state.

Bugs fixed:

- Failure at creation of a memory pool will not cause process
termination anymore.

Open-source contributions integrated:

- Supported building TBB with Clang on AArch64 with use of built-in
intrinsics by David A.

------------------------------------------------------------------------
Intel TBB 4.3 Update 6
TBB_INTERFACE_VERSION == 8006

Changes (w.r.t. Intel TBB 4.3 Update 5):

- Supported zero-copy realloc for objects >1MB under Linux* via
mremap system call.
- C++11 move-aware insert and emplace methods have been added to
concurrent_hash_map container.
- install_name is set to @rpath/<library name> on OS X*.
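
The zero-copy realloc item above relies on the Linux mremap system call. The sketch below shows only that underlying mechanism, under the assumption of a Linux system with glibc; it is not the TBB allocator's code, and the helper names map_block and resize_block are hypothetical.

```cpp
#include <sys/mman.h>  // mmap, mremap, munmap (mremap is Linux-specific)
#include <cassert>
#include <cstddef>

// Create an anonymous, private, zero-filled mapping.
// Returns nullptr on failure.
inline void* map_block(std::size_t size) {
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}

// Resize an existing mapping. With MREMAP_MAYMOVE the kernel may relocate
// the mapping by updating page tables, so the contents are preserved
// without a byte-by-byte copy in user space.
// Returns the new address, or nullptr on failure.
inline void* resize_block(void* old_addr, std::size_t old_size,
                          std::size_t new_size) {
    assert(old_addr != nullptr);
    void* p = mremap(old_addr, old_size, new_size, MREMAP_MAYMOVE);
    return p == MAP_FAILED ? nullptr : p;
}
```

A caller must treat the returned pointer as the only valid one afterwards, exactly as with realloc(), because the mapping may have moved.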

Preview Features:

- Added template class async_node to the flow graph API. It allows a
flow graph to communicate with an external activity managed by
the user or another runtime.
- Improved speed of flow::graph::reset() clearing graph edges.
rf_extract flag has been renamed rf_clear_edges.
- extract() method of graph nodes now takes no arguments.

Bugs fixed:

- concurrent_unordered_{set,map} behaves correctly for degenerate
hashes.
- Fixed a race condition in the memory allocator that may lead to
excessive memory consumption under high multithreading load.

------------------------------------------------------------------------
Intel TBB 4.3 Update 5
TBB_INTERFACE_VERSION == 8005

Changes (w.r.t. Intel TBB 4.3 Update 4):

- Added add_ref_count() method of class tbb::task.

Preview Features:

- Added class global_control for application-wide control of allowed
parallelism and thread stack size.
- memory_pool_allocator now throws the std::bad_alloc exception on
allocation failure.
- Exceptions thrown by memory pool constructors changed from
std::bad_alloc to std::invalid_argument and std::runtime_error.

Bugs fixed:

- scalable_allocator now throws the std::bad_alloc exception on
allocation failure.
- Fixed a race condition in the memory allocator that may lead to
excessive memory consumption under high multithreading load.
- A new scheduler created right after destruction of the previous one
might be unable to modify the number of worker threads.

Open-source contributions integrated:

- (Added but not enabled) push_front() method of class tbb::task_list
by Raf Schietekat.

------------------------------------------------------------------------
Intel TBB 4.3 Update 4
TBB_INTERFACE_VERSION == 8004

Changes (w.r.t. Intel TBB 4.3 Update 3):

- Added a C++11 variadic constructor for enumerable_thread_specific.
The arguments from this constructor are used to construct
thread-local values.
- Improved exception safety for enumerable_thread_specific.
- Added documentation for tbb::flow::tagged_msg class and
tbb::flow::output_port function.
- Fixed build errors for systems that do not support dynamic linking.
- C++11 move-aware insert and emplace methods have been added to
concurrent unordered containers.

Preview Features:

- Interface-breaking change: typedefs changed for node predecessor and
successor lists, affecting copy_predecessors and copy_successors
methods.
- Added template class composite_node to the flow graph API. It packages
a subgraph to represent it as a first-class flow graph node.
- make_edge and remove_edge now accept multiport nodes as arguments,
automatically using the node port with index 0 for an edge.

Open-source contributions integrated:

- Draft code for enumerable_thread_specific constructor with multiple
arguments (see above) by Adrien Guinet.
- Fix for GCC invocation on IBM* Blue Gene*
by Jeff Hammond and Raf Schietekat.
- Extended testing with smart pointers for Clang & libc++
by Raf Schietekat.

------------------------------------------------------------------------
Intel TBB 4.3 Update 3
TBB_INTERFACE_VERSION == 8003

Changes (w.r.t. Intel TBB 4.3 Update 2):

- Move constructor and assignment operator were added to unique_lock.

Preview Features:

- Time overhead for memory pool destruction was reduced.

Open-source contributions integrated:

- Build error fix for iOS* by Raf Schietekat.

------------------------------------------------------------------------
Intel TBB 4.3 Update 2
TBB_INTERFACE_VERSION == 8002
@@ -40,6 +334,7 @@ Changes (w.r.t. Intel TBB 4.3):
- Different kind of solutions for each TBB example were merged.

Preview Features:

- Task priorities are re-enabled in preview binaries.

Bugs fixed:
@@ -72,7 +367,7 @@ Changes (w.r.t. Intel TBB 4.2 Update 5):
- C++11 move constructors and assignment operators have been added to
concurrent_vector, concurrent_hash_map, concurrent_priority_queue,
concurrent_unordered_{set,multiset,map,multimap}.
- - C++11 move aware emplace/push/pop methods have been added to
+ - C++11 move-aware emplace/push/pop methods have been added to
concurrent_vector, concurrent_queue, concurrent_bounded_queue,
concurrent_priority_queue.
- Methods to insert a C++11 initializer list have been added:
@@ -640,7 +935,7 @@ Bugs fixed:
- concurrent_queue counter wraparound bug was fixed, which occurred when
the number of push and pop operations exceeded ~>4 billion on IA32.
- fixed races in the TBB scheduler that could put workers asleep too
- early, especially in presense of affinitized tasks.
+ early, especially in presence of affinitized tasks.

------------------------------------------------------------------------
Intel TBB 4.0 Update 1 commercial-aligned release
4 changes: 2 additions & 2 deletions Makefile
@@ -1,4 +1,4 @@
- # Copyright 2005-2014 Intel Corporation. All Rights Reserved.
+ # Copyright 2005-2016 Intel Corporation. All Rights Reserved.
#
# This file is part of Threading Building Blocks. Threading Building Blocks is free software;
# you can redistribute it and/or modify it under the terms of the GNU General Public License
@@ -29,7 +29,7 @@ default: tbb tbbmalloc $(if $(use_proxy),tbbproxy)
all: tbb tbbmalloc tbbproxy test examples

tbb: mkdir
- #$(MAKE) -C "$(work_dir)_debug" -r -f $(tbb_root)/build/Makefile.tbb cfg=debug
+ $(MAKE) -C "$(work_dir)_debug" -r -f $(tbb_root)/build/Makefile.tbb cfg=debug
$(MAKE) -C "$(work_dir)_release" -r -f $(tbb_root)/build/Makefile.tbb cfg=release

tbbmalloc: mkdir
2 changes: 1 addition & 1 deletion build/AIX.gcc.inc
@@ -1,4 +1,4 @@
- # Copyright 2005-2014 Intel Corporation. All Rights Reserved.
+ # Copyright 2005-2016 Intel Corporation. All Rights Reserved.
#
# This file is part of Threading Building Blocks. Threading Building Blocks is free software;
# you can redistribute it and/or modify it under the terms of the GNU General Public License
2 changes: 1 addition & 1 deletion build/AIX.inc
@@ -1,4 +1,4 @@
- # Copyright 2005-2014 Intel Corporation. All Rights Reserved.
+ # Copyright 2005-2016 Intel Corporation. All Rights Reserved.
#
# This file is part of Threading Building Blocks. Threading Building Blocks is free software;
# you can redistribute it and/or modify it under the terms of the GNU General Public License