Monitoring network system calls from outside VM

W. Michael Petullo mike at flyn.org
Tue Dec 6 12:35:46 EST 2016


I am working on a system which monitors the system calls serviced by
an operating system running inside a VM. All of the monitoring software
runs outside of the VM, and I wish to avoid modifying or installing
software inside of the VM. Imagine an external monitor observing, for
each call, the PID, the system call, and its parameters on entry, and
the PID and return value on exit.

We have a prototype running which uses Xen introspection.

One difficulty is mapping packets back to the process which generated
them. After observing a series of system calls, the monitor might want
to deny a process's ability to send network traffic. A problem arises
because the parameters to and return values from socket() and connect()
or sendto() do not reveal the ephemeral port chosen by the operating
system. Thus the monitor does not know which process created which packet.
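
To make the gap concrete, here is a minimal sketch, assuming a Linux
host and Python's standard socket module. It runs as an ordinary process
and is not part of the monitor. The helper name and the destination
127.0.0.1:9 are made up for illustration. The call to connect() takes
only the destination and returns nothing useful; the source port appears
only when the process itself asks with getsockname(), a call which an
ordinary client rarely makes.

```python
import socket


def ephemeral_port_after_connect(dest=("127.0.0.1", 9)):
    """Return the local port before and after connect() on a UDP socket.

    Hypothetical helper for illustration. The destination is arbitrary;
    connect() on a UDP socket sends no packets, it only fixes the peer
    and makes the kernel pick a local (ephemeral) port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        # Unbound socket: on Linux the local port is reported as 0.
        before = s.getsockname()[1]
        # The arguments name only the destination; nothing here or in
        # the return value (None) mentions the source port.
        s.connect(dest)
        # Only an explicit getsockname() reveals what the kernel chose.
        after = s.getsockname()[1]
    return before, after


before, after = ephemeral_port_after_connect()
print("local port before connect():", before)
print("local port after connect(): ", after)
```

A monitor which sees only the arguments and return values of socket()
and connect() is in the position of this code before the final
getsockname(): it knows the destination but not the source port.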

I had thought about watching packets. The first packet which the VM
sends to the destination given to connect() or sendto() would likely
be from the process which made that call. However, I am not sure the
events would always occur in the right order. I think something like
the following might be possible, even assuming one core:

	Process A connect(1.1.1.1:80)
	Process B connect(1.1.1.1:80)
	Kernel sends B's SYN (start of three-way handshake)
	Kernel sends A's SYN (start of three-way handshake)
	Process B connect returns
	Process A connect returns

If this happened, then the monitor would incorrectly associate B's source
port with A and vice versa (because the connects and handshakes are out
of order).
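
The failure can be reproduced on paper with a small simulation. This is
only a sketch of the naive heuristic described above, not the prototype:
the event tuples, the function name, and the source ports 40001 and
40002 are all invented for illustration. The heuristic pairs each
outgoing SYN with the oldest still-unmatched connect() to the same
destination.

```python
from collections import deque


def attribute_first_packet(events):
    """Pair each SYN with the oldest pending connect() to that destination.

    events is a list of hypothetical tuples:
      ("connect", pid, dest)      a connect() system call was observed
      ("syn", src_port, dest)     an outgoing SYN was observed
    Returns a dict mapping source port to the guessed PID.
    """
    pending = {}      # dest -> queue of PIDs still waiting for a packet
    attribution = {}  # src_port -> guessed PID
    for event in events:
        if event[0] == "connect":
            _, pid, dest = event
            pending.setdefault(dest, deque()).append(pid)
        elif event[0] == "syn":
            _, src_port, dest = event
            queue = pending.get(dest)
            if queue:
                attribution[src_port] = queue.popleft()
    return attribution


DEST = ("1.1.1.1", 80)
# Ground truth which the monitor cannot see (ports are made up).
TRUE_OWNER = {40001: "A", 40002: "B"}

# The ordering from the scenario: A calls connect() first,
# but B's SYN leaves the VM first.
events = [
    ("connect", "A", DEST),
    ("connect", "B", DEST),
    ("syn", 40002, DEST),  # B's packet
    ("syn", 40001, DEST),  # A's packet
]

guessed = attribute_first_packet(events)
for port in sorted(guessed):
    print(port, "guessed:", guessed[port], "actual:", TRUE_OWNER[port])
```

With this ordering the heuristic assigns port 40002 to A and port 40001
to B, which is exactly the swap described above. Whether a real kernel
can produce this ordering is the open question.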

Could the events work their way through the kernel in the way described
above, assuming the OS is running on a single core? Or is the kernel
written in such a way that it would preserve the expected ordering?
With interrupts and scheduling points, I fear it is the former.

Does anyone have an idea of how the monitor could associate source
ports or packets with processes? The monitor can easily use VM
introspection to do things like map PIDs to process names, but walking
the kernel's data structures to solve our packet problem is much harder.
More importantly, such introspection is fragile because the data
structures change across versions of Linux.

Thank you,

-- 
Mike

:wq



More information about the Kernelnewbies mailing list