[PATCH v2 2/2] staging: greybus: loopback_test: Fix race preventing test completion

Axel Haslam ahaslam at baylibre.com
Tue Jan 3 10:17:47 UTC 2017


On Tue, Jan 3, 2017 at 10:33 AM, Bryan O'Donoghue
<pure.logic at nexus-software.ie> wrote:
> On 02/01/17 17:27, Axel Haslam wrote:
>> Hi Bryan,
>>
>> On Mon, Jan 2, 2017 at 3:32 PM, Johan Hovold <johan at kernel.org> wrote:
>>> Adding Axel on CC.
>>>
>>> On Thu, Dec 22, 2016 at 12:37:29AM +0000, Bryan O'Donoghue wrote:
>>>> commit 9250c0ee2626 ("greybus: Loopback_test: use poll instead of
>>>> inotify") changes the flow of determining when to break out of a loop
>>>> polling for loopback test completion.
>>>>
>>>> The clause is_complete() which determines if all tests are complete - as
>>>> used is subject to a race condition where one of the tests has completed
>>>> but at least one other test has not. On real hardware this typically
>>>> doesn't present itself; however, in gbsim - which is a lot slower due in part
>>>> to a panoply of printouts - we see that running a loopback test to more
>>>> than one Interface in gbsim will fail in all instances printing out
>>>> "Iteration count did not finish".
>>>
>>
>> I'm not sure why you might be getting this error. I think the while(1) loop
>> should have exited when each interface has sent its event, and all the
>> tests are finished.
>
> Alex,
>
> Feliz año nuevo (Google translate skillz at work)

Google got it right!!

>
> What's happening is we break the loop when the number_of_events ==
> number of fd indices captured here @ t->poll_count
>

Is this wrong? This means we will break from the loop once all interfaces
have sent the event (i.e. they are finished).

> open_poll_files() {
>     /* Set the poll count equal to the number of handles to track */
>     t->poll_count = fds_idx;
> }
>
>
> wait_for_complete() {
>
>     while(1) {
>         for (i = 0; i < t->poll_count; i++) {
>             if(happy)
>                 number_of_events++;
>         }
>         if (number_of_events == t->poll_count)
>             break;
>     }
>
>     if (!is_complete(t)) {
>         fprintf(stderr, "life stinks\n");
>         return -BROKEN;
>     }
> }
>
> is_complete() - then wants all iteration_counts to be equal to maximum
> or the test fails.
>

The test should fail if the iteration count does not equal the max, right?

As I see it, a successful test means:
1- each interface should send an event upon completion.
2- the iteration count should equal iteration_max on each of the interfaces.

What am I missing?
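
To make those two conditions concrete, here is a rough sketch. The struct
and function names (dev_state, test_passed) are made up for illustration
and are not the real loopback_test.c code:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-interface state; not the real loopback_test.c layout. */
struct dev_state {
    bool event_seen;              /* condition 1: completion event received */
    unsigned int iteration_count; /* iterations reported by the interface */
};

/* Returns true only if every interface satisfies both conditions. */
static bool test_passed(const struct dev_state *devs, size_t n,
                        unsigned int iteration_max)
{
    for (size_t i = 0; i < n; i++) {
        if (!devs[i].event_seen)
            return false; /* condition 1 not met */
        if (devs[i].iteration_count != iteration_max)
            return false; /* condition 2 not met */
    }
    return true;
}
```

With two interfaces and iteration_max = 10, a state of {event, 10} and
{event, 7} would be reported as a failure because of condition 2.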

> OTOH if the loop doesn't break until all of the tests are complete we
> never hit that problem. The responsibility should be on kernel-space to
> ensure all tests complete anyway IMO.
>

The user app can bail out early too, if a timeout for the poll is given or in
case of a signal interrupt.
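
For reference, a sketch of how a single poll(2) call distinguishes those
cases. wait_once and the enum are hypothetical names for this example, not
what the tool uses today:

```c
#include <errno.h>
#include <poll.h>

/* Hypothetical result codes, used by this sketch only. */
enum wait_result { WAIT_EVENT, WAIT_TIMEOUT, WAIT_INTERRUPTED, WAIT_ERROR };

/* Wait once on nfds descriptors; a negative timeout_ms blocks indefinitely. */
static enum wait_result wait_once(struct pollfd *fds, nfds_t nfds,
                                  int timeout_ms)
{
    int ret = poll(fds, nfds, timeout_ms);

    if (ret > 0)
        return WAIT_EVENT;       /* at least one descriptor has revents set */
    if (ret == 0)
        return WAIT_TIMEOUT;     /* timeout expired, nothing was ready */
    if (errno == EINTR)
        return WAIT_INTERRUPTED; /* a signal arrived before any event */
    return WAIT_ERROR;           /* any other poll() failure */
}
```

In both the WAIT_TIMEOUT and WAIT_INTERRUPTED cases the caller leaves the
loop before every interface has reported, so the iteration counts may
legitimately be short of the maximum.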

> ---
> bod

