[PATCH v4] Move DWC2 driver out of staging

Andre Heider a.heider at gmail.com
Tue Feb 4 18:39:44 UTC 2014


On Mon, Feb 03, 2014 at 08:51:48PM +0000, Paul Zimmerman wrote:
> > From: Paul Zimmerman
> > Sent: Monday, February 03, 2014 9:36 AM
> > 
> >> From: Stephen Warren [mailto:swarren at wwwdotorg.org]
> >> Sent: Saturday, February 01, 2014 7:44 PM
> >> 
> >> On 02/01/2014 03:00 AM, Andre Heider wrote:
> >>> On Fri, Jan 31, 2014 at 11:48:37PM -0700, Stephen Warren wrote:
> >>>> On 01/31/2014 11:12 AM, Andre Heider wrote:
> >>>>> On Mon, Jan 13, 2014 at 01:50:09PM -0800, Paul Zimmerman wrote:
> >>>>>> The DWC2 driver should now be in good enough shape to move out of
> >>>>>> staging. I have stress tested it overnight on RPI running mass
> >>>>>> storage and Ethernet transfers in parallel, and for several days
> >>>>>> on our proprietary PCI-based platform.
> >>>> ...
> >>>>> this looks just fine, but for whatever reason it breaks sdhci on my rpi.
> >>>>> With today's Linus' master the dwc2 controller seems to initialize fine,
> >>>>> but I get this upon boot:
> >>>>>
> >>>>> [    1.783316] sdhci-bcm2835 20300000.sdhci: sdhci_pltfm_init failed -12
> >>>>> [    1.794820] sdhci-bcm2835: probe of 20300000.sdhci failed with error -12
> >> ...
> >>>> This is due to the following code:
> >> ...
> >>>> What ends up happening, simply due to memory allocation order, is that
> >>>> the memory writes inside usb_settoggle() end up setting the SDHCI struct
> >>>> platform_device's num_resources to 0, so that its call to
> >>>> platform_get_resource() fails.
> >>>> 
> >>>> With the DWC2 move patch reverted, some other random piece of memory is
> >>>> being corrupted, which just happens not to cause any visible problem.
> >>>> Likely it's some other struct platform_device that's already had its
> >>>> resources read by the time DWC2 probes and corrupts them.
> >>>> 
> >>>> (Yes, this was hard to find!)
> >>> 
> >>> Nice work, but how did you pinpoint this? Am I missing some option/tool
> >>> or did I just not stare for long enough?
> >> 
> >> Well, there was a clear place where an issue was present; the resource
> >> lookup in sdhci_pltfm_init() was failing, so I put a bunch of printfs
> >> into that function to dump out the data platform_get_resource() used.
> >> This clearly pointed at num_resources==0 being the problem. Next, I
> >> dumped the same data from the code in drivers/of that sets it up, and it
> >> was OK there, so I knew it was getting over-written somewhere. I then
> >> basically added hundreds of calls to the same data dumping function
> >> throughout kernel functions like really_probe() to track down the
> >> location of the problem. Luckily, the behaviour was stable, so I wasn't
> >> chasing a race/timing condition. Eventually I narrowed the window to the
> >> few lines of code I mentioned in _dwc2_hcd_endpoint_reset(). It would
> >> have been much harder if it was e.g. the USB HW DMAing to memory that
> >> caused the corruption, so I was lucky:-)
> > 
> > Nice work Stephen, thanks. I will try to come up with a patch to fix this
> > ASAP, along the lines of what Alan suggested.
> 
> Stephen, Andre,
> 
> Can you test the attached patch, please? It works for me on the Synopsys
> PCIe-based FPGA board. Unfortunately my RPI board is currently broken,
> so I am unable to test it there to verify it actually fixes the problem
> you are seeing.
> 
> The dwc2 driver doesn't use the usb_device toggle bits anywhere else,
> so the quickest fix is to just remove the problematic code from
> _dwc2_hcd_endpoint_reset().
> 
> If you give me your tested-bys, I will submit this as a proper patch
> to Greg.

LGTM, sdhci works again and there are no glaring USB issues with LAN,
HID, or mass storage:

Tested-by: Andre Heider <a.heider at gmail.com>

Thanks Paul,
Andre
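
For anyone reading this thread later: the narrowing approach Stephen
describes above (dump the same data at many points until the window
closes on the culprit) can be sketched in plain userspace C. This is
only an illustration under stated assumptions, not kernel code and not
what Stephen actually ran. struct fake_pdev, struct watch, watch_init(),
checkpoint(), harmless_step(), buggy_step() and run_demo() are all
hypothetical names, and the stray write is simulated through a
deliberately wrong pointer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the structure whose field gets clobbered. */
struct fake_pdev {
	unsigned int num_resources;
};

/* Hypothetical watch state for one monitored field. */
struct watch {
	const struct fake_pdev *target;	/* object being monitored */
	unsigned int expected;		/* value recorded by watch_init() */
	const char *last_good;		/* last checkpoint that saw 'expected' */
	const char *first_bad;		/* first checkpoint that saw a change */
};

static void watch_init(struct watch *w, const struct fake_pdev *target)
{
	w->target = target;
	w->expected = target->num_resources;
	w->last_good = "watch_init";
	w->first_bad = NULL;
}

/*
 * Sprinkle calls to this between suspect steps. It returns 1 once a
 * change has been seen and keeps only the first report, so the pair
 * (last_good, first_bad) brackets the code that did the damage.
 */
static int checkpoint(struct watch *w, const char *where)
{
	assert(w != NULL && where != NULL);

	if (w->first_bad != NULL)
		return 1;

	if (w->target->num_resources != w->expected) {
		w->first_bad = where;
		fprintf(stderr,
			"num_resources changed from %u to %u between \"%s\" and \"%s\"\n",
			w->expected, w->target->num_resources,
			w->last_good, where);
		return 1;
	}

	w->last_good = where;
	return 0;
}

/* A step that leaves the watched object alone. */
static void harmless_step(void)
{
}

/* Simulated bug: a write through a pointer that should point elsewhere. */
static void buggy_step(unsigned int *wrong_ptr)
{
	*wrong_ptr = 0;
}

/* Hypothetical probe sequence with checkpoints between the steps. */
static void run_demo(struct watch *w, struct fake_pdev *pdev)
{
	checkpoint(w, "before probe");
	harmless_step();
	checkpoint(w, "after harmless_step");
	buggy_step(&pdev->num_resources);
	checkpoint(w, "after buggy_step");
	checkpoint(w, "after probe");
}
```

Calling watch_init() on a struct fake_pdev and then run_demo() leaves
last_good and first_bad naming the two checkpoints on either side of
buggy_step(). As Stephen notes, this only works when the behaviour is
stable; a race or a DMA write from hardware would not be bracketed this
neatly.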

