[PATCH] vsock: only load vmci transport on VMware hypervisor by default

Jorgen S. Hansen jhansen at vmware.com
Thu Aug 17 15:16:58 UTC 2017


> On Aug 17, 2017, at 3:55 PM, Stefan Hajnoczi <stefanha at redhat.com> wrote:
> 
> On Thu, Aug 17, 2017 at 08:00:29AM +0000, Dexuan Cui wrote:
>> 
>> Without the patch, vmw_vsock_vmci_transport.ko can automatically load
>> when an application creates an AF_VSOCK socket.
>> 
>> This is the expected good behavior on VMware hypervisor, but as we
>> are going to add hv_sock.ko (i.e. Hyper-V transport for AF_VSOCK), we
>> should make sure vmw_vsock_vmci_transport.ko can't load on Hyper-V,
>> otherwise there is a -EBUSY conflict when both vmw_vsock_vmci_transport.ko
>> and hv_sock.ko try to call vsock_core_init() on Hyper-V.
>> 
>> On the other hand, hv_sock.ko can only load on Hyper-V, because it
>> depends on hv_vmbus.ko, which detects Hyper-V in hv_acpi_init().
>> 
>> KVM's vsock_virtio_transport doesn't have the issue because it doesn't
>> define MODULE_ALIAS_NETPROTO(PF_VSOCK).
> 
> Thanks for sending this patch, vmci's MODULE_ALIAS_NETPROTO(PF_VSOCK) is
> a problem for vhost_vsock.ko (the virtio host driver) too.  A host
> userspace program can create a AF_VSOCK socket before vhost_vsock is
> loaded.  The vmci transport will be unconditionally loaded and that's
> not the right behavior.
> 
> Putting aside nested virtualization, I want to load the transport (vmci,
> Hyper-V, vsock) for which there is paravirtualized hardware present
> inside the guest.

Good points. Completely agree that this is the desired behavior for a guest.


> It's a little tricker on the host side (doesn't matter for Hyper-V and
> probably also doesn't for VMware) because the host-side driver is a
> software device with no hardware backing it.  In KVM we assume the
> vhost_vsock.ko kernel module will be loaded sufficiently early.

Since the vmci driver is currently tied to PF_VSOCK it hasn’t been a problem, but on the host side the VMCI driver has no hardware backing it either, so when we move to a more appropriate solution, this will be an issue for VMCI as well. I’ll check our shipped products, but they most likely assume that if an upstreamed vmci module is present, it will be loaded automatically.

> 
> Things get trickier with nested virtualization because the VM might want
> to talk to its host but also to its nested VMs.  The simple way of
> fixing this would be to allow two transports loaded simultaneously and
> route traffic destined to CID 2 to the host transport and all other
> traffic to the guest transport.

This is close to the routing the VMCI driver does in a nested environment, but that is with the assumption that there is only one type of transport. Having two different transports would require that we delay resolving the transport type until the socket endpoint has been bound to an address. Things get trickier if listening sockets use VMADDR_CID_ANY - if only one transport is present, this would allow the socket to accept connections from both guests and outer host, but with multiple transports that won’t work, since we can’t associate a socket with a transport until the socket is bound.
 
> 
> Perhaps we should discuss these cases a bit more to figure out how to
> avoid conflicts over MODULE_ALIAS_NETPROTO(PF_VSOCK).

Agreed.

> 
>> 
>> The patch also adds a module parameter "skip_hypervisor_check" for
>> vmw_vsock_vmci_transport.ko.
>> 
>> Signed-off-by: Dexuan Cui <decui at microsoft.com>
>> Cc: Alok Kataria <akataria at vmware.com>
>> Cc: Andy King <acking at vmware.com>
>> Cc: Adit Ranadive <aditr at vmware.com>
>> Cc: George Zhang <georgezhang at vmware.com>
>> Cc: Jorgen Hansen <jhansen at vmware.com>
>> Cc: K. Y. Srinivasan <kys at microsoft.com>
>> Cc: Haiyang Zhang <haiyangz at microsoft.com>
>> Cc: Stephen Hemminger <sthemmin at microsoft.com>
>> ---
>> net/vmw_vsock/Kconfig          |  2 +-
>> net/vmw_vsock/vmci_transport.c | 11 +++++++++++
>> 2 files changed, 12 insertions(+), 1 deletion(-)
>> 
>> diff --git a/net/vmw_vsock/Kconfig b/net/vmw_vsock/Kconfig
>> index a24369d..3f52929 100644
>> --- a/net/vmw_vsock/Kconfig
>> +++ b/net/vmw_vsock/Kconfig
>> @@ -17,7 +17,7 @@ config VSOCKETS
>> 
>> config VMWARE_VMCI_VSOCKETS
>> 	tristate "VMware VMCI transport for Virtual Sockets"
>> -	depends on VSOCKETS && VMWARE_VMCI
>> +	depends on VSOCKETS && VMWARE_VMCI && HYPERVISOR_GUEST
>> 	help
>> 	  This module implements a VMCI transport for Virtual Sockets.
>> 
>> diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
>> index 10ae782..c068873 100644
>> --- a/net/vmw_vsock/vmci_transport.c
>> +++ b/net/vmw_vsock/vmci_transport.c
>> @@ -16,6 +16,7 @@
>> #include <linux/types.h>
>> #include <linux/bitops.h>
>> #include <linux/cred.h>
>> +#include <linux/hypervisor.h>
>> #include <linux/init.h>
>> #include <linux/io.h>
>> #include <linux/kernel.h>
>> @@ -73,6 +74,10 @@ struct vmci_transport_recv_pkt_info {
>> 	struct vmci_transport_packet pkt;
>> };
>> 
>> +static bool skip_hypervisor_check;
>> +module_param(skip_hypervisor_check, bool, 0444);
>> +MODULE_PARM_DESC(hot_add, "If set, attempt to load on non-VMware platforms");
>> +
>> static LIST_HEAD(vmci_transport_cleanup_list);
>> static DEFINE_SPINLOCK(vmci_transport_cleanup_lock);
>> static DECLARE_WORK(vmci_transport_cleanup_work, vmci_transport_cleanup);
>> @@ -2085,6 +2090,12 @@ static int __init vmci_transport_init(void)
>> {
>> 	int err;
>> 
>> +	/* Check if we are running on VMware's hypervisor and bail out
>> +	 * if we are not.
>> +	 */
>> +	if (!skip_hypervisor_check && x86_hyper != &x86_hyper_vmware)
>> +		return -ENODEV;
>> +

I’ma bit concerned with this. On the host-side, we still want to be able to use the VMCI transport, so to allow that, the above should also allow loading the transport when x86_hyper == NULL. However, this may still cause a conflict with virtio host side support, so it looks like we need to find a better overall way to make the protocols co-exist.

>> 	/* Create the datagram handle that we will use to send and receive all
>> 	 * VSocket control messages for this context.
>> 	 */
>> -- 
>> 2.7.4



More information about the devel mailing list