Quantcast
Channel: Active questions tagged gcc - Stack Overflow
Viewing all articles
Browse latest Browse all 22031

Compare and swap C++0x

$
0
0

From the C++0x proposal on C++ Atomic Types and Operations:

29.1 Order and Consistency [atomics.order]

Add a new sub-clause with the following paragraphs.

The enumeration memory_order specifies the detailed regular (non-atomic) memory synchronization order as defined in [the new section added by N2334 or its adopted successor] and may provide for operation ordering. Its enumerated values and their meanings are as follows.

  • memory_order_relaxed

The operation does not order memory.

  • memory_order_release

Performs a release operation on the affected memory locations, thus making regular memory writes visible to other threads through the atomic variable to which it is applied.

  • memory_order_acquire

Performs an acquire operation on the affected memory locations, thus making regular memory writes in other threads released through the atomic variable to which it is applied, visible to the current thread.

  • memory_order_acq_rel

The operation has both acquire and release semantics.

  • memory_order_seq_cst

The operation has both acquire and release semantics, and in addition, has sequentially-consistent operation ordering.

Lower in the proposal:

bool A::compare_swap( C& expected, C desired,
        memory_order success, memory_order failure ) volatile

where one can specify memory order for the CAS.


My understanding is that “memory_order_acq_rel” will only necessarily synchronize those memory locations which are needed for the operation, while other memory locations may remain unsynchronized (it will not behave as a memory fence).

Now, my question is - if I choose “memory_order_acq_rel” and apply compare_swap to integral types, for instance, integers, how is this typically translated into machine code on modern consumer processors such as a multicore Intel i7? What about the other commonly used architectures (x64, SPARC, ppc, arm)?

In particular (assuming a concrete compiler, say gcc):

  1. How to compare-and-swap an integer location with the above operation?
  2. What instruction sequence will such a code produce?
  3. Is the operation lock-free on i7?
  4. Will such an operation run a full cache coherence protocol, synchronizing caches of different processor cores as if it were a memory fence on i7? Or will it just synchronize the memory locations needed by this operation?
  5. Related to previous question - is there any performance advantage to using acq_rel semantics on i7? What about the other architectures?

Thanks for all the answers.


Viewing all articles
Browse latest Browse all 22031

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>