possible deadlock in shmem_fallocate (4)

Hillf Danton hdanton at sina.com
Tue Jul 14 14:08:59 UTC 2020


On Tue, 14 Jul 2020 10:26:29 +0200 Michal Hocko wrote:
> On Tue 14-07-20 13:32:05, Hillf Danton wrote:
> > 
> > On Mon, 13 Jul 2020 20:41:11 -0700 Eric Biggers wrote:
> > > On Tue, Jul 14, 2020 at 11:32:52AM +0800, Hillf Danton wrote:
> > > > 
> > > > Add FALLOC_FL_NOBLOCK and, on the shmem side, only trylock the inode
> > > > when the new flag is set. The upside is that the current gfp is kept
> > > > as is, whether or not we are in the khugepaged context.
> > > > 
> > > > --- a/include/uapi/linux/falloc.h
> > > > +++ b/include/uapi/linux/falloc.h
> > > > @@ -77,4 +77,6 @@
> > > >   */
> > > >  #define FALLOC_FL_UNSHARE_RANGE		0x40
> > > >  
> > > > +#define FALLOC_FL_NOBLOCK		0x80
> > > > +
> > > 
> > > You can't add a new UAPI flag to fix a kernel-internal problem like this.
> > 
> > Sounds fair, see below.
> > 
> > What the report indicates is a missing PF_MEMALLOC_NOFS. It is
> > checked on the ashmem side and set as an exception before going
> > into the filesystem. On the shmem side, no more than a best effort
> > is made for the intended exception.
> > 
> > --- a/drivers/staging/android/ashmem.c
> > +++ b/drivers/staging/android/ashmem.c
> > @@ -437,6 +437,7 @@ static unsigned long
> >  ashmem_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> >  {
> >  	unsigned long freed = 0;
> > +	bool nofs;
> >  
> >  	/* We might recurse into filesystem code, so bail out if necessary */
> >  	if (!(sc->gfp_mask & __GFP_FS))
> > @@ -445,6 +446,11 @@ ashmem_shrink_scan(struct shrinker *shri
> >  	if (!mutex_trylock(&ashmem_mutex))
> >  		return -1;
> >  
> > +	/* enter filesystem with caution: nonblock on locking */
> > +	nofs = current->flags & PF_MEMALLOC_NOFS;
> > +	if (!nofs)
> > +		current->flags |= PF_MEMALLOC_NOFS;
> > +
> >  	while (!list_empty(&ashmem_lru_list)) {
> >  		struct ashmem_range *range =
> >  			list_first_entry(&ashmem_lru_list, typeof(*range), lru);
> 
> I do not think this is an appropriate fix. First of all is this a real
> deadlock or a lockdep false positive? Is it possible that ashmem just

The warning matters, and we can do something to silence it.

> needs to properly annotate its shmem inodes? Or is it possible that
> the internal backing shmem file is visible to the userspace so the write
> path would be possible?
> 
> If this is a real problem then the proper fix would be to set the internal
> shmem mapping's gfp_mask to drop __GFP_FS.

Thanks for the tip, see below.

Can you expand a bit on how it helps direct reclaimers like khugepaged
in the syzbot report wrt the deadlock? TBH I have a difficult time
following it, even after staring at the chart below for quite a while.

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(fs_reclaim);
                               lock(&sb->s_type->i_mutex_key#15);
                               lock(fs_reclaim);

  lock(&sb->s_type->i_mutex_key#15);


--- a/drivers/staging/android/ashmem.c
+++ b/drivers/staging/android/ashmem.c
@@ -381,6 +381,7 @@ static int ashmem_mmap(struct file *file
 	if (!asma->file) {
 		char *name = ASHMEM_NAME_DEF;
 		struct file *vmfile;
+		gfp_t gfp;
 
 		if (asma->name[ASHMEM_NAME_PREFIX_LEN] != '\0')
 			name = asma->name;
@@ -392,6 +393,10 @@ static int ashmem_mmap(struct file *file
 			goto out;
 		}
 		vmfile->f_mode |= FMODE_LSEEK;
+		gfp = mapping_gfp_mask(vmfile->f_mapping);
+		if (gfp & __GFP_FS)
+			mapping_set_gfp_mask(vmfile->f_mapping,
+						gfp & ~__GFP_FS);
 		asma->file = vmfile;
 	}
 	get_file(asma->file);
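
For illustration only, here is a userspace model of the mask handling in
the diff above, combined with the __GFP_FS bail-out that
ashmem_shrink_scan() already has. Every model_* and MODEL_* name is
hypothetical, and the bit values are placeholders rather than values taken
from kernel headers. It is a sketch of the idea, not kernel code: once the
FS bit is cleared from the mapping's mask, a reclaim pass entered with that
mask stops at the shrinker's check before it can reach the inode lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder flag bits for this model only (assumption, not kernel values). */
typedef unsigned int model_gfp_t;
#define MODEL_GFP_IO 0x40u
#define MODEL_GFP_FS 0x80u

/* Hypothetical stand-in for an address_space carrying an allocation mask. */
struct model_mapping {
	model_gfp_t gfp_mask;
};

static model_gfp_t model_mapping_gfp_mask(const struct model_mapping *m)
{
	assert(m != NULL);
	return m->gfp_mask;
}

/* Same shape as the ashmem_mmap() hunk: clear the FS bit if it is set. */
static void model_mapping_drop_fs(struct model_mapping *m)
{
	model_gfp_t gfp = model_mapping_gfp_mask(m);

	if (gfp & MODEL_GFP_FS)
		m->gfp_mask = gfp & ~MODEL_GFP_FS;
}

/*
 * Same shape as the early check in ashmem_shrink_scan(): the shrinker
 * only goes on to filesystem code when the reclaim mask allows FS.
 */
static bool model_shrinker_may_enter_fs(model_gfp_t reclaim_mask)
{
	return (reclaim_mask & MODEL_GFP_FS) != 0;
}
```

In this model, a mapping that starts with MODEL_GFP_IO | MODEL_GFP_FS lets
the shrinker proceed; after model_mapping_drop_fs() the same check returns
false and only the IO bit remains set.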


