[PATCH v3 09/13] device core: Introduce multiple dma pfn offsets

Fri Jun 5 17:27:16 UTC 2020

Hi Christoph,
a question arouse, is there a real value to dealing with PFNs (as opposed to
real addresses) in the core DMA code/structures? I see that in some cases it
eases interacting with mm, but the overwhelming usage of say,
dev->dma_pfn_offset, involves shifting it.

Hi Jim,
On Thu, 2020-06-04 at 14:01 -0400, Jim Quinlan wrote:
> Hi Nicolas,

[...]

> > I understand the need for dev to be around, devm_*() is key. But also it's
> > important to keep the functions on purpose. And if of_dma_get_range() starts
> > setting ranges it calls, for the very least, for a function rename. Although
> > I'd rather split the parsing and setting of ranges as mentioned earlier.
> > That
> > said, I get that's a more drastic move.
> 
> I agree with you.  I could do this from device.c:
> 
>         of_dma_get_num_ranges(..., &num_ranges); /* new function */
>         r = devm_kcalloc(dev, num_ranges + 1, sizeof(*r), GFP_KERNEL);
>         of_dma_get_range(np, &dma_addr, &paddr, &size, r, num_ranges);
> 
> The problem here is that there could be four ranges, all with
> offset=0.  My current code would optimize this case out but the above
> would have us holding useless memory and looping through the four
> ranges on every dma <=> phys conversion only to add 0.

Point taken. Ultimately it's setting the device's dma ranges in
of_dma_get_range() that was really bothering me, so if we have to pass the
device pointer for allocations, be it.

> > Talking about drastic moves. How about getting rid of the concept of
> > dma_pfn_offset for drivers altogether. Let them provide
> > dma_pfn_offset_regions
> > (even when there is only one). I feel it's conceptually nicer, as you'd be
> > dealing only in one currency, so to speak, and you'd centralize the bus DMA
> > ranges setter function which is always easier to maintain.
> Do you agree that we have to somehow hang this info on the struct
> device structure?  Because in the dma2phys() and phys2dma() all you
> have is the dev parameter.  I don't see how this  can be done w/o
> involving dev.

Sorry I didn't make myself clear here. What bothers me is having two functions
setting the same device parameter trough different means, I'd be happy to get
rid of attach_uniform_dma_pfn_offset(), and always use the same function to set
a device's bus dma regions. Something the likes of this comes to mind:

dma_attach_pfn_offset_region(struct device *dev, struct dma_pfn_offset_regions *r)

We could maybe use some helper macros for the linear case. But that's the gist
of it.

Also, it goes hand in hand with the comment below. Why having a special case
for non sparse DMA offsets in struct dma_pfn_offset_regions? The way I see it,
in this case, code simplicity is more interesting than a small optimization.

> > I'd go as far as not creating a special case for uniform offsets. Let just
> > set
> > cpu_end and dma_end to -1 so we always get a match. It's slightly more
> > compute
> > heavy, but I don't think it's worth the optimization.
> Well, there are two subcases here.  One where we do know the bounds
> and one where we do not.  I suppose for the latter I could have the
> drivers calling it with begin=0 and end=~(dma_addr_t)0.  Let me give
> this some thought...
> 
> > Just my two cents :)
> 
> Worth much more than $0.02 IMO :-)

BTW, would you consider renaming the DMA offset struct to something simpler
like, struct bus_dma_region? It complements 'dev->bus_dma_limit' better IMO.

> BTW, I tried putting the "if (dev->dma_pfn_offset_map)" clause inside
> the inline functions but the problem is that it slows the fastpath;
> consider the following code from dma-direct.h
> 
>         if (dev->dma_pfn_offset_map) {
>                 unsigned long dma_pfn_offset =
dma_pfn_offset_from_phys_addr(dev, paddr);
> 
>                 dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
>         }
>         return dev_addr;
> 
> becomes
> 
>         unsigned long dma_pfn_offset = dma_pfn_offset_from_phys_addr(dev,
paddr);
> 
>         dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
>         return dev_addr;
> 
> So those configurations that  have no dma_pfn_offsets are doing an
> unnecessary shift and add.

Fair enough. Still not a huge difference, but I see the value being the most
common case.

Regards,
Nicolas

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: This is a digitally signed message part
URL: <http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/attachments/20200605/1c2ab8e1/attachment.asc>