[cpp-threads] A question about N2153
Chris Thomasson
cristom at comcast.net
Sat Jan 20 08:22:16 GMT 2007
> On Wed, Jan 17, 2007 at 09:00:02PM -0800, Chris Thomasson wrote:
>> ----- Original Message -----
>> From: "Chris Thomasson" <cristom at comcast.net>
>> To: "C++ threads standardisation" <cpp-threads at decadentplace.org.uk>;
>> <paulmck at linux.vnet.ibm.com>
>> Sent: Wednesday, January 17, 2007 4:09 PM
>> Subject: Re: [cpp-threads] A question about N2153
[...]
>> >>>load_depends == *#LoadDepends | #LoadLoad
>> >
>> >>Ummm... On all CPUs other than Alpha, you don't need -any- fencing,
[...]
>> >>ordering on data dependencies.
>> >I know... Of course load_depends would be a NOP on everything except
>> >Alpha.
>> Okay... Let me just sum up how I would like the new and improved version
>> of C++, or whatever...
>> To do RCU, well, you can do the barriers like this:
>> <pseudo c++ code>
[...]
> So "n:(#StoreStore)->next = gs:(#Naked).front" is the same as
> "n->next = gs.front; smp_wmb()"?
Yes. That is:
n->next = gs.front;
membar #StoreStore;
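For comparison, here is a rough sketch of that same writer step in std::atomic terms. The `node` and `stack` types and the `push` function are made up for illustration, and the fence is an assumption on my part: portable C++ has no bare #StoreStore, so the sketch uses a release fence, which is #LoadStore | #StoreStore and therefore stronger than strictly needed here.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical types, for illustration only.
struct node {
    int value;
    node* next;
};

struct stack {
    std::atomic<node*> front{nullptr};
    std::mutex writer_lock;  // serialises writers; readers never take it
};

// Writer side: link the new node, fence, then publish it.
void push(stack& gs, node* n) {
    assert(n != nullptr);
    std::lock_guard<std::mutex> guard(gs.writer_lock);
    n->next = gs.front.load(std::memory_order_relaxed);   // "naked" load
    std::atomic_thread_fence(std::memory_order_release);  // covers #StoreStore (plus #LoadStore)
    gs.front.store(n, std::memory_order_relaxed);         // publish the node
}
```

The point is only the ordering: the store to n->next has to be visible before the store that makes n reachable through gs.front.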
> In the Linux kernel, we use rcu_assign_pointer(), which is a cpp macro
> defined in terms of the architecture-dependent smp_wmb(). So, if I
> understand the above code, in the Linux kernel, one would have the
> following for the last two assignments:
>
> n->next = gs.front;
> rcu_assign_pointer(gs.front, n);
Yes, well, as long as 'rcu_assign_pointer(...)' executes a #StoreStore
barrier BEFORE it touches gs.front... Does rcu_assign_pointer execute only a
#StoreStore? It does not add a #LoadStore? Well, then IMHO, perhaps you
should have two variants:
rcu_assign_pointer_mb_storestore(...)
And:
rcu_assign_pointer_mb_loadstore_storestore(...)
?
> We used to use explicit memory barriers, but found that the above was
> much easier for people to get right.
;^). Well, I have to admit that I also like to wrap up the membar in the
actual load or store function... Well, take a look at the last 10 or so
function names/implementations in this code:
http://appcore.home.comcast.net/appcore/src/cpu/i686/ac_i686_gcc_asm.html
x86 has coarse membar granularity, so I can postfix the function names with
'fence' or 'naked'... Not too functional; however, I abstract that into a
much more granular API near the bottom of the following include file for
the above assembler code:
http://appcore.home.comcast.net/appcore/include/cpu/i686/ac_i686_h.html
'fence'
'acquire'
'release'
'depends'
'naked'
So, for my AppCore API, to do an RCU reader, well, you do this:
* please note that rcu_read_lock/unlock are not needed in a user-space RCU
implementation... Pre-emption can be addressed several ways... anyway...
void reader_thread(...) {
  node *n = ac_mb_loadptr_depends(&gs.front);
  while (n) {
    node *nx = ac_mb_loadptr_depends(&n->next);
    n->const_function(...);
    n = nx;
  }
}
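A self-contained sketch of that reader loop in std::atomic terms follows. Everything here is an assumption for illustration: `loadptr_depends` is a hypothetical stand-in and not the AppCore function, `node` is a made-up type, and `reader_sum` replaces the elided const_function call so the traversal has a result to check. I map 'depends' onto memory_order_consume; current compilers generally treat consume as acquire, so this may emit more ordering than a true dependency-only load would.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical node type; `next` is atomic so readers can pick the ordering.
struct node {
    int value;
    std::atomic<node*> next;
};

// Stand-in for a 'depends' load: ordered only with respect to accesses
// that depend on the returned pointer.
template <typename T>
T* loadptr_depends(const std::atomic<T*>* p) {
    assert(p != nullptr);
    return p->load(std::memory_order_consume);
}

// Reader side: walk the list without taking any lock.
// Returns the sum of the node values.
int reader_sum(const std::atomic<node*>& front) {
    int sum = 0;
    node* n = loadptr_depends(&front);
    while (n) {
        node* nx = loadptr_depends(&n->next);
        sum += n->value;  // read-only use of the node
        n = nx;
    }
    return sum;
}
```

Note that the sketch says nothing about when a removed node may be freed; that is the grace-period half of RCU and is out of scope here.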
> But see below.
[...]
> In the Linux kernel, one would do something like the following:
[...]
> The rcu_dereference() macro is defined in terms of the
> architecture-dependent smp_read_barrier_depends() primitive.
> Again, we used to use explicit memory barriers, but found that the
> above was much easier for people to get right -- and much easier
> to build tools to check for correct usage (see Josh Triplett's
> RCU additions to Linux's "sparse" checker).
>
> So, am I advocating hiding memory barriers completely? No way!!!
:^)
> People building things like RCU infrastructure and many other things
> need explicit memory barriers in order to get their job done. However,
> if such people are wise, they will define a clean API that does not
> expose explicit memory barriers to their users.
That's what I did with AppCore:
http://appcore.home.comcast.net/
http://appcore.home.comcast.net/ac_src_index.html
Far from perfect, but at least it does abstract the barriers away 'fairly'
well...
>> so, the reader-side has exactly 0 memory barriers on every current system
>> out there except the alpha.
>
> Very good!
Very good indeed! ;^)
>> Also, it's weak enough to express just a normal #StoreStore inside the
>> writer's critical section that is guarded by the stack object's
>> associated mutex... I would kind of like it if C++ would copy from the
>> SPARC model... Just my humble opinion of course...
>
> I must confess ignorance of your history, but if you like SPARC, you
> like SPARC.
Yeah. I am biased toward the SPARC... Well, its membar instruction is so
versatile you can realize highly granular memory barrier operations with
it... That's a plus in my book... Oh well...
> The Linux kernel follows DEC Alpha, but adds smp_rmb(),
> smp_read_barrier_depends(), and so on.
So, can code that makes use of such primitives on Linux be considered
fairly portable, or what? IMHO, I would fully expect the APIs in question to
be classified as so-called 'systems-level', i.e., subject to possible
modification? I must admit that when I am on Linux, I don't make direct use
of what I consider to be system-level APIs... So, raw access to futexes,
atomic_xxx, and rcu_xxx APIs is something I avoid... Instead, I define a
target architecture, create the supporting assembly language for my AppCore
library, and make use of my own APIs for, let's say, lock-free
programming... It eases my problem with paranoia... You know: I use a
system-level API, then, crap, a service pack changes something... Now my apps
are rendered useless on the 'new' stuff...
:O