[dm-devel] dm-writeboost testing

Fri Oct 4 15:56:42 UTC 2013

On Fri, 4 Oct 2013, Akira Hayakawa wrote:

> Mikulas,
> 
> Thanks for pointing that out.
> 
> > The problem is that you are using workqueues the wrong way. You submit a 
> > work item to a workqueue and the work item is active until the device is 
> > unloaded.
> > 
> > If you submit a work item to a workqueue, it is required that the work 
> > item finishes in finite time. Otherwise, it may stall other tasks. 
> > The deadlock when I terminate Xserver is caused by this - the nvidia 
> > driver tries to flush system workqueue and it waits for all work items to 
> > terminate - but your work items don't terminate.
> > 
> > If you need a thread that runs for a long time, you should use 
> > kthread_create, not workqueues (see this 
> > http://people.redhat.com/~mpatocka/patches/kernel/dm-crypt-paralelizace/old-3/dm-crypt-encryption-threads.patch 
> > or this 
> > http://people.redhat.com/~mpatocka/patches/kernel/dm-crypt-paralelizace/old-3/dm-crypt-offload-writes-to-thread.patch 
> > as an example of how to use kthreads).
> 
> But I see no reason why you recommend
> using a kthread for a looping job
> instead of putting a looping work item
> into a single-threaded non-system workqueue.
>
> For me, they both seem to be working.

As I said, the system locks up when something tries to flush the system 
workqueue. This happens, for example, when terminating the X server with 
the nvidia binary driver, but it may happen in other parts of the kernel 
too. The fact that it works in your setup doesn't mean that it is correct.
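
To make the kthread suggestion concrete, here is a minimal sketch of a 
long-running loop in a dedicated kernel thread. It is only an 
illustration: the names example_daemon_fn, example_task and the 
one-second interval are hypothetical and are not taken from 
dm-writeboost or from the patches linked above. The thread is not a 
work item, so flushing a workqueue never has to wait for it.

```c
#include <linux/module.h>
#include <linux/init.h>
#include <linux/kthread.h>
#include <linux/sched.h>
#include <linux/jiffies.h>
#include <linux/err.h>

/* Hypothetical handle for the long-running thread. */
static struct task_struct *example_task;

/* Loop until kthread_stop() is called on this thread. */
static int example_daemon_fn(void *data)
{
	while (!kthread_should_stop()) {
		/* Periodic background work would go here (assumption). */

		/* Sleep for about one second; kthread_stop() wakes us early. */
		schedule_timeout_interruptible(HZ);
	}
	return 0;
}

static int __init example_init(void)
{
	/* Create the thread and start it running. */
	example_task = kthread_run(example_daemon_fn, NULL, "example_daemon");
	if (IS_ERR(example_task))
		return PTR_ERR(example_task);
	return 0;
}

static void __exit example_exit(void)
{
	/* Ask the thread to exit and wait until it has returned. */
	kthread_stop(example_task);
}

module_init(example_init);
module_exit(example_exit);
MODULE_LICENSE("GPL");
```

In a device-mapper target the same pattern would presumably move into 
the constructor and destructor: start the thread when the device is 
created and call kthread_stop() when it is unloaded.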

> Is it documented that
> a looping job should not be put into
> any type of workqueue?

It has been a general assumption since workqueues were created. Maybe 
it's not documented.

> You are only mentioning that
> putting a looping work item in system_wq
> is the wrong way, since
> the nvidia driver flushes that workqueue.
> 
> Akira

Mikulas