[PATCH 1/2] staging: greybus: loopback: fix gb_pm_runtime_get_sync error handling

Mon Jan 9 12:08:33 UTC 2017

On January 9, 2017 11:57:36 AM GMT+00:00, Johan Hovold <johan at kernel.org> wrote:
>On Mon, Jan 09, 2017 at 11:29:50AM +0000, Bryan O'Donoghue wrote:
>> On January 9, 2017 11:19:09 AM GMT+00:00, Johan Hovold
><johan at kernel.org> wrote:
>> >On Sun, Jan 08, 2017 at 03:27:18PM +0000, Bryan O'Donoghue wrote:
>> >> commit e854ff58ed70 ("greybus: loopback: add runtime pm support")
>> >> introduces pm runtime support to the loopback code. If a
>> >> gb_pm_runtime_get_sync() fails, currently we immediately return
>from
>> >the
>> >> loopback thread.
>> >> 
>> >> Alexandre reports that later on, subsequent to the afore mentioned
>> >failure,
>> >> doing an rmmod will trigger a kthread_stop() which will will
>generate
>> >a
>> >> fault. The fault results from dereferencing gb->task in
>> >> gb_loopback_disconnect().
>> >> 
>> >> One way to fix that is to have the loopback thread do_exit()
>instead
>> >of
>> >> simply returning on gb_pm_runtime_get_sync() failure - however
>this
>> >will
>> >> leave dangling sysfs entries with no loopback thread to take
>action
>> >based
>> >> on the sysfs inputs.
>> >> 
>> >> This patch fixes the fault by ignoring the
>gb_pm_runtime_get_sync()
>> >result
>> >> code, this will allow only gb_loopback_disconnect() to terminate
>the
>> >> loopback thread. Fix validated in gbsim.
>> >
>> >No, you must check the return for resume errors, and not attempt any
>> >further I/O in case of failure.
>> 
>> The greybus operations themselves do report an error subsequent to
>the
>> user-space tool.
>
>Then you must already have a mechanism for reporting errors. Just set a
>flag and don't execute any further operations after resume failure.
> 
>> >Kill the thread, or somehow report a test failure and wait for user
>> >space to retry.
>> 
>> Killing the thread would be messy, however yes I take your general
>> point.
>
>Setting a flag and doing a clean exit should do right? Considering a
>resume-failure as a fatal error is reasonable.
>
>> It should be possible to use the gb_pm_runtime_get_sync as a
>> determinant and cease further operations.
>
>But cancelling the current test and allow user-space to restart it
>would
>be more well-behaved of course. Not sure it's worth it given the state
>of things though.

Sorry yes, I'm not saying anything different. 

>
>Johan

---
bod