[PATCH] staging: zcache: fix serialization bug in zv stats

Dan Magenheimer dan.magenheimer at oracle.com
Fri Dec 30 17:31:17 UTC 2011


> From: Seth Jennings [mailto:sjenning at linux.vnet.ibm.com]
> Subject: Re: [PATCH] staging: zcache: fix serialization bug in zv stats
> 
> On 12/30/2011 11:02 AM, Dan Magenheimer wrote:
> >> From: Seth Jennings [mailto:sjenning at linux.vnet.ibm.com]
> >> Sent: Friday, December 30, 2011 9:42 AM
> >> To: Greg Kroah-Hartman
> >> Cc: Seth Jennings; Dan Magenheimer; Brian King; devel at driverdev.osuosl.org; linux-
> >> kernel at vger.kernel.org
> >> Subject: [PATCH] staging: zcache: fix serialization bug in zv stats
> >>
> >> In a multithreaded workload, the zv_curr_dist_counts
> >> and zv_cumul_dist_counts statistics are being corrupted
> >> because the increments and decrements in zv_create
> >> and zv_free are not atomic.
> >>
> >> This patch converts these statistics and their corresponding
> >> increments/decrements/reads to atomic operations.
> >>
> >> Based on v3.2-rc7
> >>
> >> Signed-off-by: Seth Jennings <sjenning at linux.vnet.ibm.com>
> >
> > I'm inclined to nack this change, at least unless inside an #ifdef DEBUG,
> > as these counts are interesting to a developer but not useful to a normal
> > end user, whereas the incremental cost for atomic_inc and atomic_dec are
> > non-trivial.  I don't think any off-by-one in these counters could
> > result in a bug and, before promotion from staging, they probably
> > should just go away.  (They are fun to "watch -d" though ;-)
> 
> In my test, it hammers on particular chunk size and the counter is off
> by hundreds :-/
> 
> I too was worried about performance impact, however, my tests showed
> no degradation.  That's probably because there are bigger bottlenecks
> elsewhere.
> 
> Perhaps we can commit this for now, so that the code is correct, and
> revisit this when we try to replace zbud with zsmalloc.  I'm sure
> we'll have to rethink the statistics at that time.
> 
> The only other option, IMO, is the remove the chunk stats altogether
> until we can find a solution that is both fast and correct.
> 
> I think that continuing with incorrect stats, regardless of the degree
> to which they are incorrect, isn't really a viable option.

OK, well I guess as long as this is addressed before promotion
from staging, and until then the heaviest users will be developers
so I agree you are correct that accurate stats are a good thing.
So, consider my nack resolved and:

Acked-by: Dan Magenheimer <dan.magenheimer at oracle.com>



More information about the devel mailing list