Inexplicable PROT_EXEC flag set on mmap callback

Kenneth Adam Miller kennethadammiller at gmail.com
Sat Jan 16 14:15:01 EST 2016


Astonishing. I changed my non-C based binary to remove PROT_READ, and I
found that the mmap test completed successfully! Now I just have to figure
out how to edit the binary headers so that the kernel no longer sets
READ_IMPLIES_EXEC (presumably by clearing the executable flag on GNU_STACK),
and then test it.
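
A minimal sketch of that header edit, assuming a little-endian ELF64
binary like the one in the readelf output quoted below. The function names
clear_exec_stack and patch_file are hypothetical; the offsets and constants
come from the ELF64 layout. It only clears the PF_X bit on the PT_GNU_STACK
program header and touches nothing else. Work on a copy of the binary.

```python
import struct

PT_GNU_STACK = 0x6474E551  # program header type that carries the stack permissions
PF_X = 0x1                 # "executable" bit in p_flags


def clear_exec_stack(image: bytearray) -> bool:
    """Clear PF_X on PT_GNU_STACK in a little-endian ELF64 image, in place.

    Returns True if a flag was cleared, False if nothing needed changing.
    """
    # e_ident: magic, then EI_CLASS == 2 (64-bit), EI_DATA == 1 (little-endian)
    if image[:4] != b"\x7fELF" or image[4] != 2 or image[5] != 1:
        raise ValueError("not a little-endian ELF64 file")

    (e_phoff,) = struct.unpack_from("<Q", image, 0x20)            # program header table offset
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 0x36)  # entry size, entry count

    changed = False
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        # In ELF64, p_type and p_flags are the first two 32-bit fields.
        p_type, p_flags = struct.unpack_from("<II", image, base)
        if p_type == PT_GNU_STACK and p_flags & PF_X:
            struct.pack_into("<I", image, base + 4, p_flags & ~PF_X)
            changed = True
    return changed


def patch_file(path: str) -> bool:
    """Hypothetical helper: rewrite the file at `path` if its stack is marked executable."""
    with open(path, "rb") as f:
        image = bytearray(f.read())
    changed = clear_exec_stack(image)
    if changed:
        with open(path, "wb") as f:
            f.write(image)
    return changed
```

After patching, readelf -l should show RW instead of RWE for GNU_STACK.
If the toolchain allows it, relinking with the linker's -z noexecstack
option would avoid patching the binary by hand.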

On Sat, Jan 16, 2016 at 1:33 PM, Kenneth Adam Miller <
kennethadammiller at gmail.com> wrote:

> The particular non-C binary that I'm using is rust with musl support, so
> that I can statically compile the binary in order to eliminate all library
> dependencies and then run it on a buildroot based linux.
>
> On Sat, Jan 16, 2016 at 1:32 PM, Kenneth Adam Miller <
> kennethadammiller at gmail.com> wrote:
>
>> Wait, are you assuming that I'm using the latest kernel? Because I'm
>> using 3.14.56...
>>
>> On Sat, Jan 16, 2016 at 1:31 PM, Mike Krinkin <krinkin.m.u at gmail.com>
>> wrote:
>>
>>> On Sat, Jan 16, 2016 at 01:16:42PM -0500, Kenneth Adam Miller wrote:
>>> > Ok, so you think that the format of the binary would influence the kernel
>>> > to change the permissions on the user's behalf? There's not much prose
>>> > explanation here, and I don't understand why the kernel would do something
>>> > like this.
>>>
>>> That personality flag was introduced here with quite a detailed explanation
>>> (which I don't understand, though):
>>> http://lwn.net/Articles/94068/
>>>
>>> > I just wanted to use a static binary to eliminate library
>>> > dependency issues between my host machine and the target machine. I had no
>>> > idea that settings like this would carry over to my task at hand.
>>>
>>> I compiled a simple hello world with the -static flag, and GNU_STACK in the
>>> binary has no executable flag set, so static linking probably has nothing to
>>> do with this.
>>>
>>> >
>>> > On Sat, Jan 16, 2016 at 1:08 PM, Mike Krinkin <krinkin.m.u at gmail.com> wrote:
>>> >
>>> > > On Sat, Jan 16, 2016 at 12:45:17PM -0500, Kenneth Adam Miller wrote:
>>> > > > I got the strace output of my non-C binary (I filtered the noise out of
>>> > > > the output for you):
>>> > > >
>>> > > > mmap(NULL, 8192, PROT_READ | PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
>>> > > >
>>> > > > I also have readelf -l output:
>>> > > >
>>> > > > Elf file type is EXEC (Executable file)
>>> > > > Entry point 0x401311
>>> > > > There are 7 program headers, starting at offset 64
>>> > > >
>>> > > > Program Headers:
>>> > > >   Type           Offset             VirtAddr           PhysAddr
>>> > > >                  FileSiz            MemSiz              Flags  Align
>>> > > >   LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
>>> > > >                  0x00000000000db604 0x00000000000db604  R E    1000
>>> > > >   LOAD           0x00000000000dc1c0 0x00000000004dd1c0 0x00000000004dd1c0
>>> > > >                  0x0000000000006220 0x00000000000091dc  RW     1000
>>> > > >   NOTE           0x00000000000001c8 0x00000000004001c8 0x00000000004001c8
>>> > > >                  0x0000000000000024 0x0000000000000024  R      4
>>> > > >   GNU_EH_FRAME   0x00000000000d5680 0x00000000004d5680 0x00000000004d5680
>>> > > >                  0x0000000000005f84 0x0000000000005f84  R      4
>>> > > >   GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
>>> > > >                  0x0000000000000000 0x0000000000000000  RWE    0
>>> > >
>>> > > Well, probably this is a bit more relevant:
>>> > > http://lxr.free-electrons.com/source/mm/mmap.c#L1281
>>> > >
>>> > > As far as I can see, the kernel sets the READ_IMPLIES_EXEC flag here:
>>> > > http://lxr.free-electrons.com/source/fs/binfmt_elf.c#L844
>>> > >
>>> > > if executable_stack != EXSTACK_DISABLE_X, and executable_stack is
>>> > > initialized here:
>>> > > http://lxr.free-electrons.com/source/fs/binfmt_elf.c#L781
>>> > >
>>> > > if GNU_STACK has the executable flag set (and I suppose that RWE means
>>> > > that, in your case, GNU_STACK indeed has the executable flag set).
>>> > >
>>> > > That may be the reason; I'm not sure, though. Maybe this can help:
>>> > > http://man7.org/linux/man-pages/man2/personality.2.html
>>> > >
>>> > >
>>> > > >   TLS            0x00000000000dc1c0 0x00000000004dd1c0 0x00000000004dd1c0
>>> > > >                  0x0000000000000100 0x0000000000000100  R      10
>>> > > >   GNU_RELRO      0x00000000000dc1c0 0x00000000004dd1c0 0x00000000004dd1c0
>>> > > >                  0x0000000000005e40 0x0000000000005e40  RW     20
>>> > > >
>>> > > >  Section to Segment mapping:
>>> > > >   Segment Sections...
>>> > > >    00     .note.gnu.build-id .init .text .fini .gcc_except_table .rodata .debug_gdb_scripts .eh_frame .eh_frame_hdr
>>> > > >    01     .tdata .data.rel.ro.local .data.rel.ro .init_array .got .got.plt .data .bss
>>> > > >    02     .note.gnu.build-id
>>> > > >    03     .eh_frame_hdr
>>> > > >    04
>>> > > >    05     .tdata
>>> > > >    06     .tdata .data.rel.ro.local .data.rel.ro .init_array .got .got.plt
>>> > > >
>>> > > > Some notes:
>>> > > >
>>> > > > As a test, I changed the non-C binary's target device file to /dev/zero,
>>> > > > and then I could see that the non-C mmap attempt would succeed just fine.
>>> > > >
>>> > > > After further verification and debugging based on guidance from another
>>> > > > forum, I am convinced that the vm_flags change must be occurring somewhere
>>> > > > in kernel land after control flow has left user land. Now I need to figure
>>> > > > out how to use a kernel debugger or kprobes to walk through the execution
>>> > > > of the mmap callback delegation and see where the flags parameter is being
>>> > > > changed.
>>> > > >
>>> > > > I was pointed to this:
>>> > > > http://lxr.free-electrons.com/source/mm/mmap.c#L1312
>>> > > >
>>> > > > But why would my vm_flags be changed by the kernel? And what can I do to
>>> > > > get this to stop? Why is the kernel changing the vm_flags for a non-C
>>> > > > binary using my device file, but not for either a C binary using my device
>>> > > > file or any type of binary that's not using my device file?
>>> > > >
>>> > > > On Thu, Jan 14, 2016 at 12:28 PM, Kenneth Adam Miller <
>>> > > > kennethadammiller at gmail.com> wrote:
>>> > > >
>>> > > > >
>>> > > > >
>>> > > > > On Thu, Jan 14, 2016 at 12:00 PM, Mike Krinkin <krinkin.m.u at gmail.com>
>>> > > > > wrote:
>>> > > > >
>>> > > > >> Hi, i have a couple of questions to clarify, if you don't mind
>>> > > > >>
>>> > > > >> On Thu, Jan 14, 2016 at 11:04:28AM -0500, Kenneth Adam Miller wrote:
>>> > > > >> > I have a custom driver and userland program pair that I'm using for a
>>> > > > >> > very special use case at my workplace where we are mapping specific
>>> > > > >> > physical address ranges into userland memory with a mmap callback.
>>> > > > >> > Everything works together well with a C userland program that calls
>>> > > > >> > into our driver's ioctl and mmap definitions, but for our case we are
>>> > > > >> > using an alternative systems language just for the userland program.
>>> > > > >>
>>> > > > >> So you have a userland app written in C, and another not written in C?
>>> > > > >> The former works well while the latter doesn't, am I right?
>>> > > > >>
>>> > > > >
>>> > > > > Yes, the former works insofar as mmap completes successfully. I've
>>> > > > > verified that the parameters are identical in the non-C program. The
>>> > > > > issue with just using the C-only program is that the actual
>>> > > > > implementation of interest is in the non-C program, and that's because
>>> > > > > that language facilitates other features that are *required* on our end.
>>> > > > >
>>> > > > >
>>> > > > >>
>>> > > > >> > That mmap call is failing (properly, as we want) out from the driver's
>>> > > > >> > mmap implementation because the vm_flags have the VM_EXEC flag set. We
>>> > > > >> > do not want users to be able to map the memory range as executable, so
>>> > > > >> > the driver should check for this, as it does. The issue is that
>>> > > > >> > somewhere between where mmap is called and when the parameters are
>>> > > > >> > given to the driver, the vma->vm_flags are being set to 255. I've
>>> > > > >> > manually checked the values being given to the mmap call in our non-C
>>> > > > >> > binary, and they are *equivalent* in value to those of the C program.
>>> > > > >>
>>> > > > >> By "manually" do you mean strace? Could you show strace output for
>>> > > > >> both apps? And also could you show readelf -l output for both binaries?
>>> > > > >>
>>> > > > >
>>> > > > > By manually, I mean with a print call just before the mmap call in each
>>> > > > > of the programs. Right now, I'm working on getting strace output, but I
>>> > > > > have to run that in qemu. To be able to run it in qemu in order to
>>> > > > > isolate the driver and everything else from my host, I have to build
>>> > > > > with buildroot. So I'll email that when I get it, but it'll be a while.
>>> > > > >
>>> > > > >
>>> > > > >>
>>> > > > >> >
>>> > > > >> > My question is, is there anything that can cause the vma->vm_flags to
>>> > > > >> > be changed in the trip between when the userland program calls mmap
>>> > > > >> > and when control is delivered to the mmap callback?
>>> > > > >>
>>> > > > >> > _______________________________________________
>>> > > > >> > Kernelnewbies mailing list
>>> > > > >> > Kernelnewbies at kernelnewbies.org
>>> > > > >> > http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
>>> > > > >>
>>> > > > >>
>>> > > > >
>>> > >
>>>
>>
>>
>
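
The chain described in the replies quoted above (GNU_STACK marked
executable, then READ_IMPLIES_EXEC in the personality, then PROT_EXEC added
to any PROT_READ mapping) can be modelled in a few lines. This is a
user-space sketch for reasoning only, not kernel code: the function names
are hypothetical, the constants are the usual Linux values for the PROT_*,
PF_* and low VM_* bits, and the kernel's extra check for noexec mounts is
left out.

```python
# Constants as defined in the Linux headers (mman.h, elf.h, mm.h).
PROT_READ, PROT_WRITE, PROT_EXEC = 0x1, 0x2, 0x4
PF_X = 0x1  # executable bit of a program header's p_flags

VM_NAMES = [
    (0x01, "VM_READ"), (0x02, "VM_WRITE"), (0x04, "VM_EXEC"), (0x08, "VM_SHARED"),
    (0x10, "VM_MAYREAD"), (0x20, "VM_MAYWRITE"), (0x40, "VM_MAYEXEC"), (0x80, "VM_MAYSHARE"),
]


def read_implies_exec(gnu_stack_flags: int) -> bool:
    """Model of the ELF loader rule discussed in the thread: an executable
    GNU_STACK header makes the process personality gain READ_IMPLIES_EXEC."""
    return bool(gnu_stack_flags & PF_X)


def effective_prot(prot: int, rie: bool) -> int:
    """Model of the mmap path: with READ_IMPLIES_EXEC set, any request that
    includes PROT_READ is silently widened with PROT_EXEC."""
    if (prot & PROT_READ) and rie:
        prot |= PROT_EXEC
    return prot


def decode_vm_flags(value: int) -> list:
    """Name the low eight vm_flags bits that are set in `value`."""
    return [name for bit, name in VM_NAMES if value & bit]


rie = read_implies_exec(0x7)  # GNU_STACK shown as RWE in the readelf output
print(effective_prot(PROT_READ | PROT_WRITE, rie) & PROT_EXEC != 0)
print(effective_prot(PROT_WRITE, rie) & PROT_EXEC != 0)
print(decode_vm_flags(255))
```

Under this model the PROT_READ | PROT_WRITE request picks up PROT_EXEC,
while a request without PROT_READ does not, which is consistent with the
mmap test passing once PROT_READ was removed. Decoding the reported
vm_flags value of 255 shows all eight low bits set, VM_EXEC among them.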