[PATCH] hv: hv_fcopy: drop the obsolete message on transfer failure

KY Srinivasan kys at microsoft.com
Thu Nov 20 17:58:22 UTC 2014



> -----Original Message-----
> From: Dexuan Cui
> Sent: Wednesday, November 19, 2014 11:48 PM
> To: KY Srinivasan; gregkh at linuxfoundation.org;
> linux-kernel at vger.kernel.org; driverdev-devel at linuxdriverproject.org;
> olaf at aepfle.de; apw at canonical.com; jasowang at redhat.com
> Cc: Haiyang Zhang
> Subject: RE: [PATCH] hv: hv_fcopy: drop the obsolete message on transfer
> failure
> 
> > -----Original Message-----
> > From: KY Srinivasan
> > Sent: Thursday, November 20, 2014 6:59 AM
> > > diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c
> > > index 23b2ce2..177122a 100644
> > > --- a/drivers/hv/hv_fcopy.c
> > > +++ b/drivers/hv/hv_fcopy.c
> > > @@ -86,6 +86,15 @@ static void fcopy_work_func(struct work_struct *dummy)
> > >  	 * process the pending transaction.
> > >  	 */
> > >  	fcopy_respond_to_host(HV_E_FAIL);
> > > +
> > > +	/* In the case the user-space daemon crashes, hangs or is killed, we
> > > +	 * need to down the semaphore, otherwise, after the daemon starts next
> > > +	 * time, the obsolete data in fcopy_transaction.message or
> > > +	 * fcopy_transaction.fcopy_msg will be used immediately.
> > > +	 */
> > > +	if (down_trylock(&fcopy_transaction.read_sema))
> > > +		pr_debug("FCP: failed to acquire the semaphore\n");
> > > +
> > >  }
> >
> > When the daemon is killed, we currently reset the state in the release
> > function. Why can't we clean up the semaphore state (initialize) here as
> > well?
> >
> > K. Y
> 
> Hi KY,
> 1) The down_trylock() here is necessary: the daemon can fail to respond
> within 5 seconds for many reasons, e.g., the VM's CPU and I/O are too busy.
> In this case, the daemon may start running again later (NOTE: in this
> example, the daemon is not killed), but from the host user's point of view,
> the PowerShell copy-vmfile command has already failed, so we have to 'down'
> the semaphore here anyway; otherwise, the daemon can get obsolete data.
> 
> 2) If we add a line
> sema_init(&fcopy_transaction.read_sema, 0);
> in fcopy_release(), it seems OK at a glance, but we have to handle the race
> condition: the above down_trylock() and the sema_init() can, in theory, run
> simultaneously on different virtual CPUs. It's tricky to address this.
> 
> 3) So I think we can reuse the same semaphore and avoid a
> re-initialization that is actually unnecessary. :-)

Agreed; you may want to get rid of the pr_debug() call though.

Thanks,

K. Y
> 
> Thanks,
> -- Dexuan
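
Point 1) above can be illustrated with a small user-space model. This is only
a sketch, not the driver code: the toy_* names are hypothetical, the semaphore
is a plain single-threaded counter that copies the kernel's down_trylock()
return convention (0 if acquired, 1 if not), and the daemon's read is modeled
as non-blocking for simplicity. It shows why a timed-out request has to be
drained from the semaphore before a late daemon reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for a kernel counting semaphore (single-threaded model). */
struct toy_sema {
	int count;
};

static void toy_up(struct toy_sema *s)
{
	s->count++;
}

/* Same return convention as down_trylock(): 0 if acquired, 1 otherwise. */
static int toy_down_trylock(struct toy_sema *s)
{
	if (s->count > 0) {
		s->count--;
		return 0;
	}
	return 1;
}

/* Hypothetical, simplified analogue of fcopy_transaction. */
struct toy_transaction {
	struct toy_sema read_sema;
	char message[32];
};

/* Host posts a request: store it and signal the reader. */
static void toy_post_request(struct toy_transaction *t, const char *msg)
{
	assert(msg != NULL);
	snprintf(t->message, sizeof(t->message), "%s", msg);
	toy_up(&t->read_sema);
}

/* Timeout path: the transfer has failed; optionally drain the semaphore. */
static void toy_timeout(struct toy_transaction *t, bool drain)
{
	if (drain)
		(void)toy_down_trylock(&t->read_sema);
}

/* Daemon read: succeeds only if a request is still signalled as pending. */
static bool toy_daemon_read(struct toy_transaction *t, char *out, size_t len)
{
	if (toy_down_trylock(&t->read_sema))
		return false;
	snprintf(out, len, "%s", t->message);
	return true;
}

/*
 * Scenario from point 1): a request is posted, the daemon is too slow, the
 * timeout fires, and only then does the daemon read. Returns true if the
 * late daemon received the obsolete message.
 */
static bool toy_late_daemon_sees_stale(bool drain)
{
	struct toy_transaction t = { .read_sema = { 0 } };
	char buf[32];

	toy_post_request(&t, "old request");
	toy_timeout(&t, drain);
	return toy_daemon_read(&t, buf, sizeof(buf));
}
```

Calling toy_late_daemon_sees_stale(false) models the code before the patch
(the late daemon still gets the old request), and
toy_late_daemon_sees_stale(true) models the timeout path with the added
down_trylock(). The race in point 2) is not modeled, since the sketch is
single-threaded.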


