[PATCH RFC 1/2] drivers/base: export lock_device_hotplug/unlock_device_hotplug

David Hildenbrand david at redhat.com
Fri Aug 17 11:56:35 UTC 2018


On 17.08.2018 13:28, Heiko Carstens wrote:
> On Fri, Aug 17, 2018 at 01:04:58PM +0200, David Hildenbrand wrote:
>>>> If there are no objections, I'll go into that direction. But I'll wait
>>>> for more comments regarding the general concept first.
>>>
>>> It is the middle of the merge window, and maintainers are really busy
>>> right now.  I doubt you will get many review comments just yet...
>>>
>>
>> This has been broken since 2015, so I guess it can wait a bit :)
> 
> I hope you figured out what needs to be locked why. Your patch description
> seems to be "only" about locking order ;)

Well I hope so, too ... but there is a reason for the RFC mark ;) There
is definitely a lot of magic in the current code. And that's why it is
also not that obvious that locking is wrong.

Avoiding/fixing the locking order problem was the motivation for the
original patch that dropped mem_hotplug_lock on one path, so I focused
on that in my description.

> 
> I tried to figure out and document that partially with 55adc1d05dca ("mm:
> add private lock to serialize memory hotplug operations"), and that wasn't
> easy to figure out. I was especially concerned about sprinkling

I hadn't seen that so far, as it was reworked by 3f906ba23689
("mm/memory-hotplug: switch locking to a percpu rwsem"). Thanks for the
pointer. There is a long history to all this.

> lock/unlock_device_hotplug() calls, which has the potential to make it the
> next BKL thing.

Well, the thing with memory hotplug and device_hotplug_lock is that

a) ACPI already holds it while adding/removing memory via add_memory()
b) we hold it during online/offline of memory (via sysfs calls to
   device_online()/device_offline())

So it is already pretty much involved in all memory hotplug/unplug
activities on x86 (except paravirt). And as far as I understand, there
are good reasons to hold the lock in core.c and ACPI (as mentioned by
Rafael).

The exceptions are add_memory() called on s390x, Hyper-V, Xen and ppc
(including manual probing), and device_online()/device_offline() called
from within the kernel.

Holding device_hotplug_lock when adding/removing memory from the system
doesn't sound too wrong (especially as devices are created/removed). By
documenting and following the rules in the patch description, we might
at least get locking right.
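
To illustrate the rule (callers take device_hotplug_lock before
adding/removing memory, and mem_hotplug_lock nests inside it), here is a
minimal userspace sketch. It is only a model and not the kernel code:
pthread mutexes stand in for both locks (the real mem_hotplug_lock is a
percpu rwsem), and model_add_memory(), model_hotadd() and
device_hotplug_held are hypothetical names made up for this example.

```c
/* Userspace model only; pthread mutexes stand in for the kernel locks. */
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t device_hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t mem_hotplug_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical flag, good enough for a single-threaded illustration. */
static bool device_hotplug_held;

static void lock_device_hotplug(void)
{
	pthread_mutex_lock(&device_hotplug_lock);
	device_hotplug_held = true;
}

static void unlock_device_hotplug(void)
{
	device_hotplug_held = false;
	pthread_mutex_unlock(&device_hotplug_lock);
}

/*
 * Hypothetical stand-in for add_memory(): refuses to run unless the
 * caller already holds device_hotplug_lock, then takes the inner lock.
 */
static int model_add_memory(int *nr_blocks)
{
	if (!device_hotplug_held)
		return -1;	/* caller violated the locking rule */

	pthread_mutex_lock(&mem_hotplug_lock);	/* inner lock */
	(*nr_blocks)++;				/* "create" a memory block */
	pthread_mutex_unlock(&mem_hotplug_lock);
	return 0;
}

/* A caller that follows the rule: outer lock first, inner lock second. */
static int model_hotadd(int *nr_blocks)
{
	int rc;

	lock_device_hotplug();
	rc = model_add_memory(nr_blocks);
	unlock_device_hotplug();
	return rc;
}
```

With that ordering enforced in one place, every path takes the two
locks in the same order, which is what avoids the inversion.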


I am very open to other suggestions (but as noted by Greg, many
maintainers might be busy by now).

E.g., when adding the memory block devices, we know that there won't be
a driver to attach to (as there are no drivers for the "memory"
subsystem), so the bus_probe_device() function that takes the
device_lock() could pretty much be avoided for that case. But burying
such special cases down in core driver code definitely won't make
locking related to memory hotplug easier.

Thanks for having a look!

-- 

Thanks,

David / dhildenb

