[PATCH 2/2] Drivers: hv: vmbus: offload the handling of channels to two workqueues

KY Srinivasan kys at microsoft.com
Tue Nov 27 05:22:20 UTC 2018



> -----Original Message-----
> From: Greg KH <gregkh at linuxfoundation.org>
> Sent: Monday, November 26, 2018 11:35 AM
> To: KY Srinivasan <kys at microsoft.com>
> Cc: linux-kernel at vger.kernel.org; devel at linuxdriverproject.org;
> olaf at aepfle.de; apw at canonical.com; jasowang at redhat.com; Stephen
> Hemminger <sthemmin at microsoft.com>; Michael Kelley
> <mikelley at microsoft.com>; vkuznets <vkuznets at redhat.com>; Haiyang
> Zhang <haiyangz at microsoft.com>; stable at vger.kernel.org
> Subject: Re: [PATCH 2/2] Drivers: hv: vmbus: offload the handling of channels
> to two workqueues
> 
> On Mon, Nov 26, 2018 at 02:29:57AM +0000, kys at linuxonhyperv.com wrote:
> > From: Dexuan Cui <decui at microsoft.com>
> >
> > vmbus_process_offer() mustn't call channel->sc_creation_callback()
> > directly for sub-channels, because sc_creation_callback() ->
> > vmbus_open() may never get the host's response to the
> > OPEN_CHANNEL message (the host may rescind a channel at any time,
> > e.g. in the case of hot removing a NIC), and vmbus_onoffer_rescind()
> > may not wake up the vmbus_open() as it's blocked due to a non-zero
> > vmbus_connection.offer_in_progress, and finally we have a deadlock.
> >
> > The above is also true for primary channels, if the related device
> > drivers use sync probing mode by default.
> >
> > And, usually the handling of primary channels and sub-channels can
> > depend on each other, so we should offload them to different
> > workqueues to avoid possible deadlock, e.g. in sync-probing mode,
> > NIC1's netvsc_subchan_work() can race with NIC2's netvsc_probe() ->
> > rtnl_lock(), and causes deadlock: the former gets the rtnl_lock
> > and waits for all the sub-channels to appear, but the latter
> > can't get the rtnl_lock and this blocks the handling of sub-channels.
> >
> > The patch can fix the multiple-NIC deadlock described above for
> > v3.x kernels (e.g. RHEL 7.x) which don't support async-probing
> > of devices, and v4.4, v4.9, v4.14 and v4.18 which support async-probing
> > but don't enable async-probing for Hyper-V drivers (yet).
> >
> > The patch can also fix the hang issue in sub-channel's handling described
> > above for all versions of kernels, including v4.19 and v4.20-rc3.
> >
> > So the patch should be applied to all the existing kernels.
> >
> > Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug")
> > Cc: stable at vger.kernel.org
> > Cc: Stephen Hemminger <sthemmin at microsoft.com>
> > Cc: K. Y. Srinivasan <kys at microsoft.com>
> > Cc: Haiyang Zhang <haiyangz at microsoft.com>
> > Signed-off-by: Dexuan Cui <decui at microsoft.com>
> > Signed-off-by: K. Y. Srinivasan <kys at microsoft.com>
> > ---
> >  drivers/hv/channel_mgmt.c | 188 +++++++++++++++++++++++++-------------
> >  drivers/hv/connection.c   |  24 ++++-
> >  drivers/hv/hyperv_vmbus.h |   7 ++
> >  include/linux/hyperv.h    |   7 ++
> >  4 files changed, 161 insertions(+), 65 deletions(-)
> 
> As Sasha pointed out, this patch does not even apply :(

Sorry about that. These patches applied cleanly on my tree (misc-next).
This series is to be applied on top of the patch
0001-Drivers-hv-vmbus-Remove-the-useless-API-vmbus_get_ou.patch.
While that patch has been committed to the char-misc-testing branch, it is
not in the misc-linus branch, and that is why this series does not apply there.
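
As an aside on the quoted patch description: the reason for using two
workqueues can be illustrated with a small user-space analogy. This is only
a sketch in Python, not the kernel code. It assumes each workqueue runs one
work item at a time, and the names run_offers, handle_primary_channel and
handle_sub_channel are made up for illustration. A timeout stands in for
what would be a real hang.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_offers(shared_queue: bool, timeout: float = 0.5) -> bool:
    """Return True if the primary handler sees its sub-channel appear.

    Hypothetical model: each executor has a single worker, standing in
    for a workqueue that processes one work item at a time.
    """
    primary_wq = ThreadPoolExecutor(max_workers=1)
    # Either reuse the same queue for sub-channels or give them their own.
    sub_wq = primary_wq if shared_queue else ThreadPoolExecutor(max_workers=1)
    sub_channel_ready = threading.Event()

    def handle_sub_channel() -> None:
        # Sub-channel handling completes and signals the waiter.
        sub_channel_ready.set()

    def handle_primary_channel() -> bool:
        # The primary handler queues the sub-channel work and then waits
        # for it. On a shared single-worker queue that work cannot start
        # until this function returns, so the wait can only time out.
        sub_wq.submit(handle_sub_channel)
        return sub_channel_ready.wait(timeout)

    result = primary_wq.submit(handle_primary_channel).result()

    # Drain and stop the queues before returning.
    primary_wq.shutdown()
    if sub_wq is not primary_wq:
        sub_wq.shutdown()
    return result


if __name__ == "__main__":
    print("two queues:", run_offers(shared_queue=False))
    print("one queue: ", run_offers(shared_queue=True))
```

With one shared queue the primary handler times out while the sub-channel
work sits behind it; with two queues the sub-channel work runs and the wait
succeeds.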

Regards,

K. Y


