[PATCH v2 1/2] dmaengine: avalon: Intel Avalon-MM DMA Interface for PCIe

Alexander Gordeev a.gordeev.box at gmail.com
Tue Oct 15 11:24:50 UTC 2019


On Thu, Oct 10, 2019 at 02:30:34PM +0300, Dan Carpenter wrote:
> On Thu, Oct 10, 2019 at 10:51:45AM +0200, Alexander Gordeev wrote:
> > On Wed, Oct 09, 2019 at 09:53:23PM +0300, Dan Carpenter wrote:
> > > > > > +	u32 *rd_flags = hw->dma_desc_table_rd.cpu_addr->flags;
> > > > > > +	u32 *wr_flags = hw->dma_desc_table_wr.cpu_addr->flags;
> > > > > > +	struct avalon_dma_desc *desc;
> > > > > > +	struct virt_dma_desc *vdesc;
> > > > > > +	bool rd_done;
> > > > > > +	bool wr_done;
> > > > > > +
> > > > > > +	spin_lock(lock);

[*]

> > > > > > +
> > > > > > +	rd_done = (hw->h2d_last_id < 0);
> > > > > > +	wr_done = (hw->d2h_last_id < 0);
> > > > > > +
> > > > > > +	if (rd_done && wr_done) {
> > > > > > +		spin_unlock(lock);
> > > > > > +		return IRQ_NONE;
> > > > > > +	}
> > > > > > +
> > > > > > +	do {
> > > > > > +		if (!rd_done && rd_flags[hw->h2d_last_id])
> > > > > > +			rd_done = true;
> > > > > > +
> > > > > > +		if (!wr_done && wr_flags[hw->d2h_last_id])
> > > > > > +			wr_done = true;
> > > > > > +	} while (!rd_done || !wr_done);
> > > > > 
> > > > > > This loop is very strange.  It feels like the last_id indexes need
> > > > > > to be atomic or protected from racing somehow so we don't do an out of
> > > > > > bounds read.
> > 
> > [...]
> > 
> > > You're missing my point.  When we set
> > > hw->d2h_last_id = 1;
> > [1]
> > > ...
> > > hw->d2h_last_id = 2;
> > [2]
> > 
> > > There is a tiny moment where ->d2h_last_id is transitioning from 1 to 2
> > > where its value is unknown.  We're in a busy loop here so we have a
> > > decent chance of hitting that 1/1,000,000th of a second.  If we happen to
> > > hit it at exactly the right time then we're reading from a random
> > > address and it will cause an oops.
> > > 
> > > We have to use atomic_t types or something to handle race conditions.
> > 
> > Err.. I am still missing the point :( In your example I do see a chance
> > for a reader to read out 1 at point in time [2] - because of an SMP race.
> > But what could it be other than 1 or 2?
> > 
> 
> The 1 to 2 transition was a poorly chosen example, but a -1 to 1
> transition is better.  The CPU could write a byte at a time.  So maybe
> it only wrote the two highest bytes so now it's 0xffff.  It's not -1 and
> it's not 1 and it's not a valid index.
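
For illustration, the intermediate value described above can be reproduced
in plain user-space C. This is only a sketch of the arithmetic, assuming a
32-bit index; torn_store() is a hypothetical helper and not part of the
driver. It makes no claim about whether a given compiler or CPU would
actually split the store this way.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: model a store that has been split into bytes,
 * observed after only the `high_bytes` most significant bytes of
 * `new_val` have replaced those of `old_val`.
 */
static int32_t torn_store(int32_t old_val, int32_t new_val,
			  unsigned int high_bytes)
{
	uint32_t o = (uint32_t)old_val;
	uint32_t n = (uint32_t)new_val;
	uint32_t mask;

	if (high_bytes == 0)
		return old_val;		/* nothing written yet */
	if (high_bytes >= 4)
		return new_val;		/* store fully completed */

	/* Bits covered by the bytes that have already been written */
	mask = ~(UINT32_MAX >> (8 * high_bytes));

	return (int32_t)((n & mask) | (o & ~mask));
}
```

With old_val = -1, new_val = 1 and high_bytes = 2, the helper returns
0xffff, which is neither of the two intended values.
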
> 
> > Anyways, all code paths dealing with h2d_last_id and d2h_last_id indexes
> > are protected with a spinlock.
> 
> You have to protect both the writer and the reader.  (That's why this
> bug is so easy to spot).  https://lwn.net/Articles/793253/

I struggle to see how the spinlock I use (see [*] above) does not
protect the reader.
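
To make the locking argument concrete, here is a minimal user-space sketch
of a reader and a writer that both take the same lock. It is an analogue
only, not the driver code: a pthread mutex stands in for the spinlock, and
struct hw_state, NR_DESC, demo_hw, set_last_id() and read_done() are
hypothetical names. Only h2d_last_id and rd_flags mirror the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_DESC 4	/* assumed table size, for illustration only */

/* Hypothetical stand-in for the driver state discussed above */
struct hw_state {
	pthread_mutex_t lock;		/* plays the role of the spinlock */
	int h2d_last_id;		/* -1 means no transfer in flight */
	uint32_t rd_flags[NR_DESC];
};

static struct hw_state demo_hw = {
	.lock = PTHREAD_MUTEX_INITIALIZER,
	.h2d_last_id = -1,
};

/* Writer side: the index only changes while the lock is held */
static void set_last_id(struct hw_state *hw, int id)
{
	pthread_mutex_lock(&hw->lock);
	hw->h2d_last_id = id;
	pthread_mutex_unlock(&hw->lock);
}

/* Reader side: same lock, so it sees either -1 or a complete index */
static bool read_done(struct hw_state *hw)
{
	bool done = true;

	pthread_mutex_lock(&hw->lock);
	if (hw->h2d_last_id >= 0)
		done = hw->rd_flags[hw->h2d_last_id] != 0;
	pthread_mutex_unlock(&hw->lock);

	return done;
}
```

As long as every update goes through the locked writer, the reader cannot
observe the index in the middle of an update.
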

I am going to post an updated version shortly; hopefully it will make more
sense.

> regards,
> dan carpenter
> 

