[PATCH 1/1] mm: Export split_page().

KY Srinivasan kys at microsoft.com
Mon Mar 4 18:10:56 UTC 2013



> -----Original Message-----
> From: Dave Hansen [mailto:dave at linux.vnet.ibm.com]
> Sent: Monday, March 04, 2013 12:06 PM
> To: KY Srinivasan
> Cc: Greg KH; linux-kernel at vger.kernel.org; devel at linuxdriverproject.org;
> olaf at aepfle.de; apw at canonical.com; andi at firstfloor.org; akpm at linux-
> foundation.org; linux-mm at kvack.org
> Subject: Re: [PATCH 1/1] mm: Export split_page().
> 
> On 03/03/2013 06:36 PM, KY Srinivasan wrote:
> >> I guess the most obvious question about exporting this symbol is, "Why
> >> doesn't any of the other hypervisor balloon drivers need this?  What is
> >> so special about hyper-v?"
> >
> > The balloon protocol that Hyper-V has specified is designed around the ability
> to
> > move 2M pages. While the protocol can handle 4k allocations, it is going to be
> very chatty
> > with 4K allocations.
> 
> What does "very chatty" mean?  Do you think that there will be a
> noticeable performance difference ballooning 2M pages vs 4k?

The balloon protocol that the Hyper-V host specifies allows you to describe page
ranges as start_pfn:num_pfn pairs. With 2M pages, the number of messages that need
to be exchanged is significantly smaller than with 4K page allocations.
> 
> > Furthermore, the Memory Balancer on the host is also designed to work
> > best with memory moving around in 2M chunks. While I have not seen the
> code on the Windows
> > host that does this memory balancing, looking at how Windows guests behave
> in this environment,
> > (relative to Linux) I have to assume that the 2M allocations that Windows
> guests do are a big part of
> > the difference we see.
> 
> You've been talking about differences.  Could you elaborate on what the
> differences in behavior are that you are trying to rectify here?

Comparing how smoothly memory is balanced on Windows guests under changing guest load
with what I see on Linux guests, Linux takes longer to reach steady state during a
balancing operation. I will experiment with 2M allocations and report whether that addresses the issue.

> 
> >> Or can those other drivers also need/use it as well, and they were just
> >> too chicken to be asking for the export?  :)
> >
> > The 2M balloon allocations would make sense if the host is designed
> accordingly.
> 
> How does the guest decide which size pages to allocate?  It seems like a
> relatively bad idea to be inflating the balloon with 2M pages from the
> guest in the case where the guest is under memory pressure _and_
> fragmented.

I want to start with 2M allocations and, if they fail, fall back to lower-order allocations.
As I said, the host can support 4K allocations, and that is the final fallback position
(which is what I have currently implemented). If guest memory is fragmented, we will
obviously end up with lower-order allocations.

Regards,

K. Y 
