[PATCH net] hv_netvsc: Fix napi reschedule while receive completion is busy

Haiyang Zhang haiyangz at microsoft.com
Mon Jul 9 18:33:10 UTC 2018



> -----Original Message-----
> From: Stephen Hemminger <stephen at networkplumber.org>
> Sent: Monday, July 9, 2018 2:15 PM
> To: Haiyang Zhang <haiyangz at linuxonhyperv.com>
> Cc: Haiyang Zhang <haiyangz at microsoft.com>; davem at davemloft.net;
> netdev at vger.kernel.org; olaf at aepfle.de; Stephen Hemminger
> <sthemmin at microsoft.com>; linux-kernel at vger.kernel.org;
> devel at linuxdriverproject.org; vkuznets at redhat.com
> Subject: Re: [PATCH net] hv_netvsc: Fix napi reschedule while receive
> completion is busy
> 
> On Mon,  9 Jul 2018 16:43:19 +0000
> Haiyang Zhang <haiyangz at linuxonhyperv.com> wrote:
> 
> > From: Haiyang Zhang <haiyangz at microsoft.com>
> >
> > If out ring is full temporarily and receive completion cannot go out,
> > we may still need to reschedule napi if other conditions are met.
> > Otherwise the napi poll might be stopped forever, and cause network
> > disconnect.
> >
> > Fixes: 7426b1a51803 ("netvsc: optimize receive completions")
> > Signed-off-by: Haiyang Zhang <haiyangz at microsoft.com>
> > ---
> >  drivers/net/hyperv/netvsc.c | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
> > index 8e9d0ee1572b..caaf5054f446 100644
> > --- a/drivers/net/hyperv/netvsc.c
> > +++ b/drivers/net/hyperv/netvsc.c
> > @@ -1285,14 +1285,14 @@ int netvsc_poll(struct napi_struct *napi, int
> budget)
> >  		nvchan->desc = hv_pkt_iter_next(channel, nvchan->desc);
> >  	}
> >
> > -	/* If send of pending receive completions suceeded
> > -	 *   and did not exhaust NAPI budget this time
> > +	send_recv_completions(ndev, net_device, nvchan);
> > +
> > +	/* If it did not exhaust NAPI budget this time
> >  	 *   and not doing busy poll
> >  	 * then re-enable host interrupts
> >  	 *     and reschedule if ring is not empty.
> >  	 */
> > -	if (send_recv_completions(ndev, net_device, nvchan) == 0 &&
> > -	    work_done < budget &&
> > +	if (work_done < budget &&
> >  	    napi_complete_done(napi, work_done) &&
> >  	    hv_end_read(&channel->inbound) &&
> >  	    napi_schedule_prep(napi)) {
> 
> This patch doesn't look right. I think the existing code works as written.
> 
> If send_receive_completions is unable to send because ring is full then
> vmbus_sendpacket will return -EBUSY which gets returns from
> send_receive_completions.  Because the return is non-zero, the driver will not
> call napi_complete_done.
> Since napi_complete_done was not called, NAPI will reschedule the napi poll
> routine.

With the existing code, we found in test, the rx_comp_busy counter increased,
one of the in-ring mask is 1, but guest is not reading it... With this patch, the 
pending receive completion will stay in the buffer (no loss), and be sent next time. 
It solves the disconnection problem when high number of connections.

If not calling napi_complete_done(), upper layer should guarantee napi_schedule,
then seems the upper NAPI code may have a bug -- the auto scheduling did not
happen in this case. I will check it further.

Thanks,
- Haiyang



More information about the devel mailing list