Fwd: How to change page permission from inside the kernel?

Fri Jul 6 19:31:45 EDT 2018

> What happens after you've been up for 3 weeks and you're running out of
> usable pages?
That can't happen. It was my mistake to leave out some details: this is
only for protecting kernel pages, that is, pages that hold code or
static data that is created once and assumed to be there forever, such
as the kernel code section and any static data. It is not meant to be
used on user-space process memory (from the guest's perspective),
because processes start and end.
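
To make the "created once, there forever" idea concrete, here is a rough
sketch of how a guest could walk a static region page by page and ask
for each frame to be protected. Everything here is hypothetical: the
callback stands in for whatever the ROE hypercall ends up looking like,
and the 4K page size is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumes 4K pages */

/* Hypothetical stand-in for the ROE hypercall; not a real KVM API. */
typedef int (*roe_protect_fn)(uint64_t gfn, void *ctx);

/*
 * Ask for every page frame overlapping [start, end) to be protected.
 * Returns the number of pages handled, or -1 on the first failure.
 */
static long roe_protect_range(uint64_t start, uint64_t end,
                              roe_protect_fn protect, void *ctx)
{
    if (end <= start)
        return 0;

    uint64_t first = start >> PAGE_SHIFT;
    uint64_t last = (end - 1) >> PAGE_SHIFT;
    long count = 0;

    for (uint64_t gfn = first; gfn <= last; gfn++) {
        if (protect(gfn, ctx) != 0)
            return -1;
        count++;
    }
    return count;
}

/* Illustrative callback: only counts how many frames were requested. */
static int roe_count_only(uint64_t gfn, void *ctx)
{
    (void)gfn;
    (*(unsigned long *)ctx)++;
    return 0;
}
```

In a real guest the range would be something like the kernel text
section, and the call would be made once at boot.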

> How does this interact with ballooning?
I haven't thought about this _yet_. Ballooning is complex; I can start
simple and expand later to support all of the current KVM features.
Also, I am not even sure whether ballooning is related at all, because
if kernel pages can be part of the balloon then I would expect that
when they are copied they are copied with the right permissions, and
this is done completely independently of gva and gpa.

> What ways can malware in the guest use this to DoS or otherwise break
> the kernel? (Setting the page that contains the 'struct proc' for PID 1 would
> be amusing, and I'm sure there's plenty of amusement with race conditions to
> make other kernel threads fail when they encounter a page they were expecting
> to be R/W)

I should state that hypercalls are made only from kernel mode (I am not
sure about KVM, but it is the case in Xen, and if KVM allows user-mode
hypercalls we can easily check that this specific one is issued from
kernel mode).
In this case, worrying about whether kernel code can DoS the kernel via
ROE is relatively meaningless, because there are already many other
ways to break the system once one has kernel-mode access.
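
The kernel-mode check mentioned above could look roughly like this. It
is a simplified model, not actual KVM or Xen code: the vcpu structure,
the hypercall number and the function name are all made up for
illustration, and the only real thing is the idea of rejecting the call
when the current privilege level is not 0.

```c
#include <errno.h>

/* Hypothetical, stripped-down view of a virtual CPU. */
struct vcpu_model {
    int cpl; /* x86 current privilege level, 0 = kernel mode */
};

#define HC_ROE 100 /* hypothetical hypercall number */

/*
 * Returns 0 if the ROE hypercall may proceed, -EPERM if it was issued
 * from outside kernel mode, -ENOSYS for any other hypercall number.
 */
static int handle_hypercall(const struct vcpu_model *vcpu, unsigned long nr)
{
    if (nr != HC_ROE)
        return -ENOSYS;
    if (vcpu->cpl != 0)
        return -EPERM; /* user mode may not change ROE state */
    return 0;
}
```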

However, if you mean that the guest kernel can break the host kernel
this way, that is a different story. It will not be possible, because
the KVM MMU is designed so that a guest cannot access memory that the
host never allocated to it.

So we are replacing a way to hide a rootkit in a kernel with a way to
crash the kernel (given that there are many ways to do both). I think
in most cases it would be better if the guest OS crashed (thus
indicating that something is wrong) rather than have a rootkit working
silently.

> Well actually, what I was looking for was a description of the rootkit that you're
> defending against - where it lives, what kernel data it attacks, and so on....
What about a real example?
I just found this one via Google:
https://github.com/nurupo/rootkit/blob/master/rootkit.c
While I agree that the rootkit will make its way into the kernel, it
_will not_ be able to achieve all of its functionality. Anything that
uses the "_asm_hook_patch" function should fail because the system call
table is protected. I am not sure whether ROE will prevent other parts
of this rootkit from working, but here is one :).
And I am pretty sure that there is other static data inside the kernel
that can be abused by rootkits and should be protected.

> If you're unable to explain the threat you're trying to guard against,
> you're going to be unable to guard against it properly.
Well, I kind of had one, but you never asked for it. We assume that the
attacker is running in kernel mode and that his goal is to alter static
data or the code section in the kernel, for whatever reason, without a
reboot. If a reboot is allowed, it can be trivially done by replacing
the whole kernel with a new one that includes any malicious code and
then faulting the running kernel, thus forcing a reboot or
even.
Also, attacks that involve dynamic data (except for the page table) are
out of scope, including attacks that involve any block device.
We plan to handle memory that is accessed either via the CPU or via
DMA, as well as privileged registers that are set once and never
modified again.
We haven't agreed on how an unauthorized write to ROE-protected memory
would be handled, but in the worst case it can lead to crashing the
guest OS. Crashing the host OS is considered a serious breach and it
shouldn't be possible. Although this isn't really part of the threat
model, it is kind of a motto that a crash is better than a rootkit,
because once it happens we will know it is there and the incident can
be thoroughly inspected.

> Hint:  Your protection is trivially bypassed by simply *replacing* the
> protected page.  The attacker snarfs up a R/W 4K page, copies the
> protected page to the new page, injects whatever malware bits they
> find amusing, and change the virtual memory pointer tables so this
> new page frame is referenced by the virtual address of the old page.
> If the hypervisor even notices the swap, all it will probably do is make
> the new page R/O - but that matters not because it's already compromised.

I think I did explain how we handled this:
>>- Screwing the page table itself which is not static data. but parts
>>of the details I am missing is that hypervisor will keep track of all
>>guest page number, guest frame number of all ROE protected pages and
>>make sure that it is never the wrong mapping when updating the TLB.

> You missed the point - your protection can be bypassed without manipulating
> a ROE page.
Changing the virtual memory pointer table is OK, but again these memory
mappings will never make it to the TLB. They will be caught by the KVM
MMU, because it will explicitly check both gva and gpa when resolving a
gva. So perhaps this will put the guest kernel in some inconsistent
state (again, crashing the guest is considered much, much better than a
rootkit running in silence).
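
As a rough illustration of the gva/gpa check described above, something
along these lines is what I have in mind. This is only a sketch and not
real KVM MMU code: the table layout, the names and the verdict values
are all hypothetical, and a linear scan is used just to keep it short.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One ROE-protected page: guest virtual page number -> guest frame number. */
struct roe_entry {
    uint64_t gvfn;
    uint64_t gfn;
};

enum roe_verdict {
    ROE_ALLOW,
    ROE_DENY_WRITE, /* write to a protected frame, through any alias */
    ROE_DENY_REMAP  /* protected virtual page now points at another frame */
};

/*
 * Called (hypothetically) while resolving a guest mapping, before it is
 * allowed to reach the TLB.
 */
static enum roe_verdict roe_check(const struct roe_entry *tbl, size_t n,
                                  uint64_t gvfn, uint64_t gfn, bool write)
{
    for (size_t i = 0; i < n; i++) {
        bool same_virt = tbl[i].gvfn == gvfn;
        bool same_phys = tbl[i].gfn == gfn;

        if (same_virt && !same_phys)
            return ROE_DENY_REMAP; /* the page-swap attack from the hint */
        if (same_phys && write)
            return ROE_DENY_WRITE;
    }
    return ROE_ALLOW;
}
```

With this kind of check, copying a protected page to a fresh R/W frame
and repointing the page table entry ends in ROE_DENY_REMAP instead of a
working mapping.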