possible deadlock in shmem_fallocate (4)
Hillf Danton
hdanton at sina.com
Tue Jul 14 14:08:59 UTC 2020
On Tue, 14 Jul 2020 10:26:29 +0200 Michal Hocko wrote:
> On Tue 14-07-20 13:32:05, Hillf Danton wrote:
> >
> > On Mon, 13 Jul 2020 20:41:11 -0700 Eric Biggers wrote:
> > > On Tue, Jul 14, 2020 at 11:32:52AM +0800, Hillf Danton wrote:
> > > >
> > > > Add FALLOC_FL_NOBLOCK and on the shmem side try to lock inode upon the
> > > > new flag. And the overall upside is to keep the current gfp either in
> > > > the khugepaged context or not.
> > > >
> > > > --- a/include/uapi/linux/falloc.h
> > > > +++ b/include/uapi/linux/falloc.h
> > > > @@ -77,4 +77,6 @@
> > > > */
> > > > #define FALLOC_FL_UNSHARE_RANGE 0x40
> > > >
> > > > +#define FALLOC_FL_NOBLOCK 0x80
> > > > +
> > >
> > > You can't add a new UAPI flag to fix a kernel-internal problem like this.
> >
> > Sounds fair, see below.
> >
> > What the report indicates is a missing PF_MEMALLOC_NOFS and it's
> > checked on the ashmem side and added as an exception before going
> > to filesystem. On shmem side, no more than a best effort is paid
> > on the intended exception.
> >
> > --- a/drivers/staging/android/ashmem.c
> > +++ b/drivers/staging/android/ashmem.c
> > @@ -437,6 +437,7 @@ static unsigned long
> > ashmem_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> > {
> > unsigned long freed = 0;
> > + bool nofs;
> >
> > /* We might recurse into filesystem code, so bail out if necessary */
> > if (!(sc->gfp_mask & __GFP_FS))
> > @@ -445,6 +446,11 @@ ashmem_shrink_scan(struct shrinker *shri
> > if (!mutex_trylock(&ashmem_mutex))
> > return -1;
> >
> > + /* enter filesystem with caution: nonblock on locking */
> > + nofs = current->flags & PF_MEMALLOC_NOFS;
> > + if (!nofs)
> > + current->flags |= PF_MEMALLOC_NOFS;
> > +
> > while (!list_empty(&ashmem_lru_list)) {
> > struct ashmem_range *range =
> > list_first_entry(&ashmem_lru_list, typeof(*range), lru);
>
> I do not think this is an appropriate fix. First of all, is this a real
> deadlock or a lockdep false positive? Is it possible that ashmem just
The warning matters and we can do something to quiesce it.
> needs to properly annotate its shmem inodes? Or is it possible that
> the internal backing shmem file is visible to the userspace so the write
> path would be possible?
>
> If this is a real problem then the proper fix would be to set internal
> shmem mapping's gfp_mask to drop __GFP_FS.
Thanks for the tip, see below.

Can you expand a bit on how it helps direct reclaimers like khugepaged
in the syzbot report wrt the deadlock? TBH I have a difficult time
following it after staring at the chart below for quite a while.

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(fs_reclaim);
                               lock(&sb->s_type->i_mutex_key#15);
                               lock(fs_reclaim);
  lock(&sb->s_type->i_mutex_key#15);

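To make the chart easier to follow, here is a minimal userspace sketch of the kind of lock-order bookkeeping that produces such a report. It is not lockdep's implementation; the names demo_lock, demo_reset, demo_reachable and demo_acquire are hypothetical. It records "B taken while holding A" edges and flags the acquisition that closes a cycle, which is what the two columns above describe.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, named after the two locks in the chart. */
enum demo_lock { DEMO_FS_RECLAIM, DEMO_I_MUTEX, DEMO_NR_LOCKS };

/* dep[a][b] is true once some context took b while holding a. */
static bool dep[DEMO_NR_LOCKS][DEMO_NR_LOCKS];

/* Forget all recorded orderings. */
void demo_reset(void)
{
	memset(dep, 0, sizeof(dep));
}

/* Depth-first search: is there a recorded chain from -> ... -> to? */
bool demo_reachable(enum demo_lock from, enum demo_lock to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < DEMO_NR_LOCKS; next++)
		if (dep[from][next] && !seen[next] &&
		    demo_reachable((enum demo_lock)next, to, seen))
			return true;
	return false;
}

/*
 * Record "acquired while holding held". Returns true when the new edge
 * closes a cycle, i.e. the inverse order was already recorded.
 */
bool demo_acquire(enum demo_lock held, enum demo_lock acquired)
{
	bool seen[DEMO_NR_LOCKS] = { false };
	bool cycle = demo_reachable(acquired, held, seen);

	dep[held][acquired] = true;
	return cycle;
}
```

Recording i_mutex -> fs_reclaim first and fs_reclaim -> i_mutex second makes the second call return true. Whether the two orders can actually race on the same inode is a separate question that this sketch does not answer.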
--- a/drivers/staging/android/ashmem.c
+++ b/drivers/staging/android/ashmem.c
@@ -381,6 +381,7 @@ static int ashmem_mmap(struct file *file
 	if (!asma->file) {
 		char *name = ASHMEM_NAME_DEF;
 		struct file *vmfile;
+		gfp_t gfp;
 
 		if (asma->name[ASHMEM_NAME_PREFIX_LEN] != '\0')
 			name = asma->name;
@@ -392,6 +393,10 @@ static int ashmem_mmap(struct file *file
 			goto out;
 		}
 		vmfile->f_mode |= FMODE_LSEEK;
+		gfp = mapping_gfp_mask(vmfile->f_mapping);
+		if (gfp & __GFP_FS)
+			mapping_set_gfp_mask(vmfile->f_mapping,
+						gfp & ~__GFP_FS);
 		asma->file = vmfile;
 	}
 	get_file(asma->file);
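
For readers who want to poke at the mask handling outside the kernel, here is a minimal userspace sketch of the bit manipulation in the hunk above. The flag values, demo_gfp_t, struct demo_mapping and both helpers are hypothetical stand-ins, not the kernel's definitions. It shows only that clearing the FS bit makes a check shaped like the one at the top of ashmem_shrink_scan() bail out; it does not model reclaim itself.

```c
#include <stdbool.h>

typedef unsigned int demo_gfp_t;

/* Hypothetical bit values; the kernel defines the real ones itself. */
#define DEMO_GFP_IO	0x1u
#define DEMO_GFP_FS	0x2u
#define DEMO_GFP_KERNEL	(DEMO_GFP_IO | DEMO_GFP_FS)

/* Stand-in for the part of an address_space that carries the gfp mask. */
struct demo_mapping {
	demo_gfp_t gfp_mask;
};

/* Same shape as the hunk above: clear the FS bit only if it is set. */
void demo_drop_fs(struct demo_mapping *mapping)
{
	demo_gfp_t gfp = mapping->gfp_mask;

	if (gfp & DEMO_GFP_FS)
		mapping->gfp_mask = gfp & ~DEMO_GFP_FS;
}

/* Shaped like the shrinker's early check: no FS bit means bail out. */
bool demo_shrinker_may_run(demo_gfp_t reclaim_gfp)
{
	return (reclaim_gfp & DEMO_GFP_FS) != 0;
}
```

Starting from DEMO_GFP_KERNEL, one call to demo_drop_fs() leaves only DEMO_GFP_IO set, and demo_shrinker_may_run() then returns false for that mask. A second call changes nothing.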