[PATCHv5 4/8] zswap: add to mm/

Dan Magenheimer dan.magenheimer at oracle.com
Mon Feb 18 23:17:16 UTC 2013


> From: Seth Jennings [mailto:sjenning at linux.vnet.ibm.com]
> Subject: Re: [PATCHv5 4/8] zswap: add to mm/
> 
> On 02/18/2013 03:59 PM, Dan Magenheimer wrote:
> >>>>> please document in Documentation/kernel-parameters.txt.
> >>>>
> >>>> Will do.
> >>>
> >>> Is that a good idea?  Konrad's frontswap/cleancache patches
> >>> to fix frontswap/cleancache initialization so that backends
> >>> can be built/loaded as modules may be merged for 3.9.
> >>> AFAIK, module parameters are not included in kernel-parameters.txt.
> >>
> >> This is true.  However, the frontswap/cleancache init stuff isn't the
> >> only reason zswap is built-in only.  The writeback code depends on
> >> non-exported kernel symbols:
> >>
> >> swapcache_free
> >> __swap_writepage
> >> __add_to_swap_cache
> >> swapcache_prepare
> >> swapper_space
> >> end_swap_bio_write
> >>
> >> I know a fix is as trivial as exporting them, but I didn't want to
> >> take on that debate right now.
> >
> > Hmmm... I wonder if exporting these might not be the best solution
> > as it (unnecessarily?) exposes some swap subsystem internals.
> > I wonder if a small change to read_swap_cache_async might
> > be more acceptable.
> 
> Yes, I'm not saying that I'm for exporting them; just that that would
> be an easy and probably improper fix.
> 
> As I recall, the only thing I really needed to change in my adaptation
> of read_swap_cache_async(), zswap_get_swap_cache_page() in zswap, was
> the assumption built in that it is swapping in a page on behalf of a
> userspace program with the vma argument and alloc_page_vma().  Maybe
> if we change it to just use alloc_page when vma is NULL, that could
> work.  In a non-NUMA kernel alloc_page_vma() equals alloc_page() so I
> wouldn't expect weird things doing that.

The zcache version (zcache_get_swap_cache_page, in linux-next) expects
new_page to be pre-allocated and passed in.  This could be
done easily with something like the patch below.  But both the
zswap and zcache versions require three distinct return values
and slightly different actions before returning "success", so
some minor surgery will be needed there as well.
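
To make the "three distinct return values" point concrete, here is a
rough userspace sketch (not kernel code, and not part of the patch below).
The enum names, struct mock_page, the toy cache array, and
get_swap_cache_page() are all hypothetical stand-ins. The real zswap and
zcache return codes and their follow-up actions may well differ.

```c
#include <stddef.h>

/* Hypothetical outcome codes; the real zswap/zcache values may differ. */
enum cache_result {
	CACHE_NEW,	/* caller's page was inserted; caller must fill it */
	CACHE_EXIST,	/* entry already cached; caller's page was not used */
	CACHE_FAIL	/* nothing cached and no usable page was supplied */
};

/* Toy stand-in for struct page. */
struct mock_page {
	int id;
};

#define NR_SLOTS 4

/* Toy stand-in for the swap cache, indexed by a fake swap offset. */
static struct mock_page *cache[NR_SLOTS];

/*
 * Look up 'slot'.  If it is empty, insert the caller's pre-allocated
 * page instead of allocating one here.  The page that ends up in the
 * cache (if any) is handed back through 'retpage'.
 */
static enum cache_result get_swap_cache_page(unsigned int slot,
					     struct mock_page *new_page,
					     struct mock_page **retpage)
{
	*retpage = NULL;

	if (slot >= NR_SLOTS)
		return CACHE_FAIL;

	if (cache[slot]) {
		/* Someone got there first; new_page stays with the caller. */
		*retpage = cache[slot];
		return CACHE_EXIST;
	}

	if (!new_page)
		return CACHE_FAIL;

	cache[slot] = new_page;
	*retpage = new_page;
	return CACHE_NEW;
}
```

In this sketch the caller keeps ownership of new_page unless CACHE_NEW
comes back, so it has to free or reuse the page itself on the other two
outcomes. A plain "page or NULL" return cannot express that distinction.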

With a more generic read_swap_cache_async, I think the only
remaining swap subsystem change might be the modified
__swap_writepage (and possibly the end_swap_bio_write change,
though that seems to be mostly just to modify a counter, so it
may not really be needed).

Oh, and then of course read_swap_cache_async() would need to be
exported.

Dan

diff --git a/mm/swap_state.c b/mm/swap_state.c
index 0cb36fb..c0e2509 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -279,9 +279,10 @@ struct page * lookup_swap_cache(swp_entry_t entry)
  * the swap entry is no longer in use.
  */
 struct page *read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
-			struct vm_area_struct *vma, unsigned long addr)
+			struct vm_area_struct *vma, unsigned long addr,
+			struct page *new_page)
 {
-	struct page *found_page, *new_page = NULL;
+	struct page *found_page;
 	int err;
 
 	do {
@@ -389,7 +390,7 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask,
 	for (offset = start_offset; offset <= end_offset ; offset++) {
 		/* Ok, do the async read-ahead now */
 		page = read_swap_cache_async(swp_entry(swp_type(entry), offset),
-						gfp_mask, vma, addr);
+						gfp_mask, vma, addr, NULL);
 		if (!page)
 			continue;
 		page_cache_release(page);
@@ -397,5 +398,5 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask,
 	blk_finish_plug(&plug);
 
 	lru_add_drain();	/* Push any new pages onto the LRU now */
-	return read_swap_cache_async(entry, gfp_mask, vma, addr);
+	return read_swap_cache_async(entry, gfp_mask, vma, addr, NULL);
 }


