C++ false sharing effect



Recently I ran into a performance bug caused by false sharing in a C++ product.

Say you have 2 processors, and each processor has its own cache memory.

Imagine there are 2 threads, one running on each processor. Thread 1 keeps writing to an array a[10], and thread 2 only reads from another array b[10].

Generally we expect that when a thread reads a set of addresses very frequently, that data stays in the cache of the processor executing it. But here the two arrays fall in one CACHE LINE. Caches are kept coherent one whole cache line at a time, so whenever thread 1 writes to a[10], the copy of that line in the other processor's cache is invalidated. Thread 2 then has to fetch the line again before it can read b[10], even though nothing in b changed.
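To make the "same cache line" idea concrete, here is a minimal sketch that checks whether two addresses land in the same line. The 64-byte line size, the `Unpadded` struct and the helper names are my assumptions for illustration, not the layout from the actual product.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines (common on x86-64, but not universal).
constexpr std::size_t kCacheLineSize = 64;

// Returns true if both addresses fall inside the same cache line.
inline bool same_cache_line(const void* p, const void* q) {
    auto a = reinterpret_cast<std::uintptr_t>(p);
    auto b = reinterpret_cast<std::uintptr_t>(q);
    return a / kCacheLineSize == b / kCacheLineSize;
}

// Hypothetical layout: two small int arrays placed back to back.
struct Unpadded {
    int a[10];  // written by thread 1
    int b[10];  // read by thread 2
};

// With 4-byte ints and the struct starting on a line boundary,
// a[9] sits at byte 36 and b[0] at byte 40: both in the first line.
inline void check_unpadded_layout() {
    alignas(kCacheLineSize) Unpadded u{};
    assert(same_cache_line(&u.a[9], &u.b[0]));
}
```

Calling `check_unpadded_layout()` passes on a platform with 4-byte `int`, which shows that the end of `a` and the start of `b` share a line in this layout.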

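This scenario is called false sharing. My solution to the problem was to add enough padding to the arrays so that the two arrays don't fall in one CACHE LINE. I added padding of 54 elements to each array, so 64 elements in total. 64 * 4 bytes = 256 bytes (assuming 4-byte elements), which is larger than a cache line (typically 64 bytes). The 10 elements that thread 1 writes in a[64] and the 10 elements that thread 2 reads in b[64] are now at least 256 bytes apart, so they can't share a cache line, and b stays in thread 2's cache.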

Now:

thread 1 operates on
a[64] (first 10 elements used, the rest is padding)

thread 2 operates on
b[64] (first 10 elements used, the rest is padding)
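The padded layout could look roughly like the sketch below. This is not the product code: the `Padded` struct, the `run_padded` function, the `int` element type and the 64-byte line size are all assumptions for illustration. I also added `alignas` so each array starts on its own line boundary, which padding alone does not guarantee.

```cpp
#include <cstddef>
#include <thread>

// Assumption: 64-byte cache lines.
constexpr std::size_t kCacheLineSize = 64;

// Hypothetical padded layout: each array has 64 ints (256 bytes with
// 4-byte ints) and starts on a cache-line boundary.
struct Padded {
    alignas(kCacheLineSize) int a[64];  // thread 1 writes a[0..9]
    alignas(kCacheLineSize) int b[64];  // thread 2 reads  b[0..9]
};

static_assert(offsetof(Padded, b) - offsetof(Padded, a) >= kCacheLineSize,
              "a and b must start at least one cache line apart");

// Starts one writer on a and one reader on b, and returns the reader's sum.
// This only illustrates the layout; it is not a benchmark.
inline long run_padded(int iterations) {
    Padded data{};
    for (int i = 0; i < 10; ++i) data.b[i] = i;  // b is read-only after this

    long sum = 0;
    std::thread writer([&] {
        for (int n = 0; n < iterations; ++n)
            for (int i = 0; i < 10; ++i) data.a[i] = n;
    });
    std::thread reader([&] {
        long local = 0;
        for (int n = 0; n < iterations; ++n)
            for (int i = 0; i < 10; ++i) local += data.b[i];
        sum = local;  // read by the caller only after join()
    });
    writer.join();
    reader.join();
    return sum;
}
```

On toolchains that support it, C++17's `std::hardware_destructive_interference_size` from `<new>` can replace the hard-coded 64.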

Impact: the performance gain for me was 3x with this solution. Wasting a few bytes for a 3x performance gain is always convincing :)
