[PATCH 1/2] Drivers: hv: hv_balloon: report offline pages as being used
Vitaly Kuznetsov
vkuznets at redhat.com
Wed Feb 25 16:55:48 UTC 2015
KY Srinivasan <kys at microsoft.com> writes:
>> -----Original Message-----
>> From: Vitaly Kuznetsov [mailto:vkuznets at redhat.com]
>> Sent: Thursday, February 19, 2015 8:27 AM
>> To: KY Srinivasan; devel at linuxdriverproject.org
>> Cc: Haiyang Zhang; linux-kernel at vger.kernel.org; Dexuan Cui
>> Subject: [PATCH 1/2] Drivers: hv: hv_balloon: report offline pages as being used
>>
>> When hot-added memory pages are not brought online, or when some memory
>> blocks are sent offline, the subsequent ballooning process kills the guest
>> with the OOM killer. This happens because we report these pages as neither
>> used nor free, and apparently the host algorithm considers them unused.
>> Keep track of all online/offline operations and report all currently
>> offline pages as being used so the host won't try to balloon them out.
>>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets at redhat.com>
>> ---
>> drivers/hv/hv_balloon.c | 33 ++++++++++++++++++++++++---------
>> 1 file changed, 24 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
>> index a095b70..e4b4454 100644
>> --- a/drivers/hv/hv_balloon.c
>> +++ b/drivers/hv/hv_balloon.c
>> @@ -503,6 +503,8 @@ struct hv_dynmem_device {
>> * Number of pages we have currently ballooned out.
>> */
>> unsigned int num_pages_ballooned;
>> + unsigned int num_pages_onlined;
>> + unsigned int num_pages_added;
>>
>> /*
>> * State to manage the ballooning (up) operation.
>> @@ -556,12 +558,15 @@ static void post_status(struct hv_dynmem_device *dm);
>> static int hv_memory_notifier(struct notifier_block *nb, unsigned long val,
>> void *v)
>> {
>> + struct memory_notify *mem = (struct memory_notify *)v;
>> +
>> switch (val) {
>> case MEM_GOING_ONLINE:
>> mutex_lock(&dm_device.ha_region_mutex);
>> break;
>>
>> case MEM_ONLINE:
>> + dm_device.num_pages_onlined += mem->nr_pages;
>> case MEM_CANCEL_ONLINE:
>
> Why are we not adjusting num_pages_onlined when we cancel the online
> operation?
Because we haven't increased the counter yet.

To my understanding, events come in the following order:
1) MEM_GOING_ONLINE - we just take the lock
2) MEM_ONLINE - we increase num_pages_onlined by nr_pages and drop the lock
   or
   MEM_CANCEL_ONLINE - we just drop the lock (the memory never was online,
   so num_pages_onlined wasn't increased)
3) MEM_GOING_OFFLINE - we do nothing
4) MEM_OFFLINE - we decrease num_pages_onlined by nr_pages
   or
   MEM_CANCEL_OFFLINE - we do nothing (the memory is still online, so there
   is no need to adjust num_pages_onlined)
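
To make the ordering above concrete, here is a minimal userspace sketch of the accounting. It is not the driver code: the enum, struct balloon_model and model_notify() are hypothetical stand-ins for illustration, and the "locked" flag only models whether ha_region_mutex would be held.

```c
/* Hypothetical stand-ins for the memory hotplug notifier events. */
enum mem_event {
	MEM_GOING_ONLINE,
	MEM_ONLINE,
	MEM_CANCEL_ONLINE,
	MEM_GOING_OFFLINE,
	MEM_OFFLINE,
	MEM_CANCEL_OFFLINE,
};

/* Simplified model of the state touched by the notifier. */
struct balloon_model {
	unsigned long num_pages_onlined;
	int locked;	/* 1 while the (modelled) region mutex is held */
};

static void model_notify(struct balloon_model *dm, enum mem_event ev,
			 unsigned long nr_pages)
{
	switch (ev) {
	case MEM_GOING_ONLINE:
		dm->locked = 1;			/* just take the lock */
		break;
	case MEM_ONLINE:
		dm->num_pages_onlined += nr_pages;
		/* fall through: both outcomes drop the lock */
	case MEM_CANCEL_ONLINE:
		dm->locked = 0;			/* counter untouched on cancel */
		break;
	case MEM_OFFLINE:
		dm->num_pages_onlined -= nr_pages;
		break;
	case MEM_GOING_OFFLINE:
	case MEM_CANCEL_OFFLINE:
		break;				/* memory is still online */
	}
}
```

Feeding this model MEM_GOING_ONLINE followed by MEM_CANCEL_ONLINE leaves num_pages_onlined unchanged, which is why no adjustment is needed on cancel.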
>
> K. Y
--
Vitaly