<div dir="ltr">Astonishing. I changed my non-C based binary to remove PROT_READ, and I found that the mmap test completed successfully! Now I just have to figure out how to edit the binary headers to remove the READ_IMPLIES_EXEC option and then test it.</div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Jan 16, 2016 at 1:33 PM, Kenneth Adam Miller <span dir="ltr"><<a href="mailto:kennethadammiller@gmail.com" target="_blank">kennethadammiller@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">The particular non-C binary that I'm using is rust with musl support, so that I can statically compile the binary in order to eliminate all library dependencies and then run it on a buildroot based linux.</div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Jan 16, 2016 at 1:32 PM, Kenneth Adam Miller <span dir="ltr"><<a href="mailto:kennethadammiller@gmail.com" target="_blank">kennethadammiller@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Wait, are you assuming that I'm using the latest kernel? Because I'm using 3.14.56...</div><div><div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Jan 16, 2016 at 1:31 PM, Mike Krinkin <span dir="ltr"><<a href="mailto:krinkin.m.u@gmail.com" target="_blank">krinkin.m.u@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>On Sat, Jan 16, 2016 at 01:16:42PM -0500, Kenneth Adam Miller wrote:<br>
> Ok, so you think that the format of the binary would influence the kernel<br>
> to change the permissions on the user's behalf? There's not much prose<br>
> explanation here, and I don't understand why the kernel would do something<br>
> like this.<br>
<br>
</span>That personality flag was introduced here, with quite a detailed explanation<br>
(which I don't understand, though):<br>
<a href="http://lwn.net/Articles/94068/" rel="noreferrer" target="_blank">http://lwn.net/Articles/94068/</a><br>
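<br>
For reference, a minimal sketch of clearing the flag from inside the process with personality(2), before the mmap call is made. It assumes the kernel checks the persona on each mmap call (which is how I read the mm/mmap.c code linked in this thread) and that the program can call into libc; the helper names are made up for illustration.<br>

```c
#include <sys/personality.h>

/* Pure helper (hypothetical name): return the persona value with the
 * READ_IMPLIES_EXEC bit cleared and all other bits left alone. */
unsigned long without_read_implies_exec(unsigned long persona)
{
    return persona & ~(unsigned long)READ_IMPLIES_EXEC;
}

/* Clear READ_IMPLIES_EXEC for the calling process.
 * Returns 0 on success, -1 if personality(2) fails. */
int clear_read_implies_exec(void)
{
    /* 0xffffffff asks the kernel to report the current persona
     * without changing it. */
    int current = personality(0xffffffff);
    if (current == -1)
        return -1;

    if (personality(without_read_implies_exec((unsigned long)current)) == -1)
        return -1;

    return 0;
}
```

Calling clear_read_implies_exec() once at startup should be enough for later mmap calls in the same process, but the ELF loader may set the flag again on the next exec if GNU_STACK is still marked executable, so this is a workaround rather than a fix for the binary itself.<br>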
<span><br>
> I just wanted to use a static binary to eliminate library<br>
> dependency issues between my host machine and the target machine. I had no<br>
> idea that settings like this would carry over to my task at hand.<br>
<br>
</span>I compiled a simple hello world with the -static flag, and GNU_STACK in the binary<br>
has no executable flag set, so static linking probably has nothing to do with this.<br>
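<br>
The same check that readelf -l makes visible can be done programmatically. Below is a rough sketch that looks for the PT_GNU_STACK program header and reports whether PF_X is set. It assumes a 64-bit ELF file with the same byte order as the host, and the function names and return codes are invented for this example.<br>

```c
#include <elf.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Scan an array of program headers for PT_GNU_STACK.
 * Returns 1 if it is present with PF_X set, 0 if present without PF_X,
 * and -1 if there is no PT_GNU_STACK header at all. */
int gnu_stack_executable(const Elf64_Phdr *phdrs, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (phdrs[i].p_type == PT_GNU_STACK)
            return (phdrs[i].p_flags & PF_X) ? 1 : 0;
    }
    return -1;
}

/* Load the program headers of a 64-bit ELF file and run the check above.
 * Returns the same values, or -2 on an I/O or format error. */
int gnu_stack_executable_in_file(const char *path)
{
    Elf64_Ehdr eh;
    Elf64_Phdr *phdrs = NULL;
    int result = -2;

    FILE *f = fopen(path, "rb");
    if (!f)
        return -2;

    if (fread(&eh, sizeof eh, 1, f) != 1)
        goto out;
    /* Only plain 64-bit ELF files are handled in this sketch. */
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64)
        goto out;
    if (eh.e_phentsize != sizeof(Elf64_Phdr) || eh.e_phnum == 0)
        goto out;

    phdrs = malloc((size_t)eh.e_phnum * sizeof *phdrs);
    if (!phdrs)
        goto out;
    if (fseek(f, (long)eh.e_phoff, SEEK_SET) != 0)
        goto out;
    if (fread(phdrs, sizeof *phdrs, eh.e_phnum, f) != eh.e_phnum)
        goto out;

    result = gnu_stack_executable(phdrs, eh.e_phnum);

out:
    free(phdrs);
    fclose(f);
    return result;
}
```

If this reports 1 for a binary, relinking with the linker's -z noexecstack option is the usual way to get a GNU_STACK header without the executable flag.<br>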
<div><div><br>
><br>
> On Sat, Jan 16, 2016 at 1:08 PM, Mike Krinkin <<a href="mailto:krinkin.m.u@gmail.com" target="_blank">krinkin.m.u@gmail.com</a>> wrote:<br>
><br>
> > On Sat, Jan 16, 2016 at 12:45:17PM -0500, Kenneth Adam Miller wrote:<br>
> > > I got the strace output of my non-C binary (I filtered the noise out of<br>
> > the<br>
> > > output for you):<br>
> > ><br>
> > > mmap(NULL, 8192, PROT_READ | PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,<br>
> > 0)<br>
> > ><br>
> > > I also have readelf -l output:<br>
> > ><br>
> > > Elf file type is EXEC (Executable file)<br>
> > > Entry point 0x401311<br>
> > > There are 7 program headers, starting at offset 64<br>
> > ><br>
> > > Program Headers:<br>
> > > Type Offset VirtAddr PhysAddr<br>
> > > FileSiz MemSiz Flags Align<br>
> > > LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000<br>
> > > 0x00000000000db604 0x00000000000db604 R E 1000<br>
> > > LOAD 0x00000000000dc1c0 0x00000000004dd1c0 0x00000000004dd1c0<br>
> > > 0x0000000000006220 0x00000000000091dc RW 1000<br>
> > > NOTE 0x00000000000001c8 0x00000000004001c8 0x00000000004001c8<br>
> > > 0x0000000000000024 0x0000000000000024 R 4<br>
> > > GNU_EH_FRAME 0x00000000000d5680 0x00000000004d5680 0x00000000004d5680<br>
> > > 0x0000000000005f84 0x0000000000005f84 R 4<br>
> > > GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000<br>
> > > 0x0000000000000000 0x0000000000000000 RWE 0<br>
> ><br>
> > Well, probably this is a bit more relevant:<br>
> > <a href="http://lxr.free-electrons.com/source/mm/mmap.c#L1281" rel="noreferrer" target="_blank">http://lxr.free-electrons.com/source/mm/mmap.c#L1281</a><br>
> ><br>
> > As far as I can see, the kernel sets the READ_IMPLIES_EXEC flag here:<br>
> > <a href="http://lxr.free-electrons.com/source/fs/binfmt_elf.c#L844" rel="noreferrer" target="_blank">http://lxr.free-electrons.com/source/fs/binfmt_elf.c#L844</a><br>
> ><br>
> > if executable_stack != EXSTACK_DISABLE_X, and executable_stack is initialized<br>
> > here:<br>
> > <a href="http://lxr.free-electrons.com/source/fs/binfmt_elf.c#L781" rel="noreferrer" target="_blank">http://lxr.free-electrons.com/source/fs/binfmt_elf.c#L781</a><br>
> ><br>
> > if GNU_STACK has an executable flag set (and I suppose that RWE means<br>
> > that in your case GNU_STACK indeed has the executable flag set).<br>
> ><br>
> > It may be the reason, I'm not sure though. Maybe this can help:<br>
> > <a href="http://man7.org/linux/man-pages/man2/personality.2.html" rel="noreferrer" target="_blank">http://man7.org/linux/man-pages/man2/personality.2.html</a><br>
> ><br>
> ><br>
> > > TLS 0x00000000000dc1c0 0x00000000004dd1c0 0x00000000004dd1c0<br>
> > > 0x0000000000000100 0x0000000000000100 R 10<br>
> > > GNU_RELRO 0x00000000000dc1c0 0x00000000004dd1c0 0x00000000004dd1c0<br>
> > > 0x0000000000005e40 0x0000000000005e40 RW 20<br>
> > ><br>
> > > Section to Segment mapping:<br>
> > > Segment Sections...<br>
> > > 00 .note.gnu.build-id .init .text .fini .gcc_except_table .rodata<br>
> > > .debug_gdb_scripts .eh_frame .eh_frame_hdr<br>
> > > 01 .tdata .data.rel.ro.local .<a href="http://data.rel.ro" rel="noreferrer" target="_blank">data.rel.ro</a> .init_array .got<br>
> > .got.plt<br>
> > > .data .bss<br>
> > > 02 .note.gnu.build-id<br>
> > > 03 .eh_frame_hdr<br>
> > > 04<br>
> > > 05 .tdata<br>
> > > 06 .tdata .data.rel.ro.local .<a href="http://data.rel.ro" rel="noreferrer" target="_blank">data.rel.ro</a> .init_array .got<br>
> > .got.plt<br>
> > ><br>
> > > Some notes:<br>
> > ><br>
> > > As a test, I changed the non-C binary's target device file to /dev/zero,<br>
> > > and then I could see that the non-C mmap attempt would succeed just fine.<br>
> > ><br>
> > > After further verification and debugging based on guidance from another<br>
> > > forum, I am convinced that the vm_flags change must be occurring<br>
> > somewhere<br>
> > > in kernel land after control flow has left user land. Now I need to<br>
> > figure<br>
> > > out how to use a kernel debugger or kprobes to walk through the execution<br>
> > > of mmap callback delegation and see where the flags parameter is being<br>
> > > changed.<br>
> > ><br>
> > > I was pointed to this:<br>
> > > <a href="http://lxr.free-electrons.com/source/mm/mmap.c#L1312" rel="noreferrer" target="_blank">http://lxr.free-electrons.com/source/mm/mmap.c#L1312</a><br>
> > ><br>
> > > But why would my vm_flags be changed by the kernel? And what can I do to<br>
> > > get this to stop? Why is the kernel changing the vm_flags for a non-C<br>
> > > binary using my device file, but not for either a C binary using my<br>
> > device<br>
> > > file or any type of binary that's not using my device file?<br>
> > ><br>
> > > On Thu, Jan 14, 2016 at 12:28 PM, Kenneth Adam Miller <<br>
> > > <a href="mailto:kennethadammiller@gmail.com" target="_blank">kennethadammiller@gmail.com</a>> wrote:<br>
> > ><br>
> > > ><br>
> > > ><br>
> > > > On Thu, Jan 14, 2016 at 12:00 PM, Mike Krinkin <<a href="mailto:krinkin.m.u@gmail.com" target="_blank">krinkin.m.u@gmail.com</a>><br>
> > > > wrote:<br>
> > > ><br>
> > > >> Hi, I have a couple of questions to clarify, if you don't mind.<br>
> > > >><br>
> > > >> On Thu, Jan 14, 2016 at 11:04:28AM -0500, Kenneth Adam Miller wrote:<br>
> > > >> > I have a custom driver and userland program pair that I'm using for a<br>
> > > >> very<br>
> > > >> > special use case at my workplace where we are mapping specific<br>
> > physical<br>
> > > >> > address ranges into userland memory with a mmap callback. Everything<br>
> > > >> works<br>
> > > >> > together well with a C userland program that calls into our driver's<br>
> > > >> ioctl<br>
> > > >> > and mmap definitions, but for our case we are using an alternative<br>
> > > >> systems<br>
> > > >> > language just for the userland program.<br>
> > > >><br>
> > > >> So you have a userland app written in C, and another not written in C?<br>
> > > >> The former works well while the latter doesn't, am I right?<br>
> > > >><br>
> > > ><br>
> > > > Yes, the former works insofar as mmap completes successfully. I've<br>
> > > > verified that the<br>
> > > > parameters are identical in the non-C program. The issue of just using<br>
> > the<br>
> > > > C only program<br>
> > > > is that the actual implementation of interest is in the non-C program,<br>
> > and<br>
> > > > that's because<br>
> > > > that language facilitates other features that are *required* on our<br>
> > end.<br>
> > > ><br>
> > > ><br>
> > > >><br>
> > > >> > That mmap call is failing (properly<br>
> > > >> > as we want) out from the driver's mmap implementation due to the<br>
> > fact<br>
> > > >> that<br>
> > > >> > the vm_flags have the VM_EXEC flag set. We do not want users to be<br>
> > able<br>
> > > >> to<br>
> > > >> > map the memory range as executable, so the driver should check for<br>
> > this<br>
> > > >> as<br>
> > > >> > it does. The issue is in the fact that somewhere between where mmap<br>
> > is<br>
> > > >> > called and when the parameters are given to the driver, the<br>
> > > >> vma->vm_flags<br>
> > > >> > are being set to 255. I've manually checked the values being given<br>
> > to<br>
> > > >> the<br>
> > > >> > mmap call in our non-C binary, and they are *equivalent* in value to<br>
> > > >> that<br>
> > > >> > of the C program.<br>
> > > >><br>
> > > >> By "manually" do you mean strace? Could you show strace output for<br>
> > > >> both apps? And also could you show readelf -l output for both<br>
> > binaries?<br>
> > > >><br>
> > > ><br>
> > > > By manually, I mean with a print call just before the mmap call in<br>
> > each of<br>
> > > > the<br>
> > > > programs. Right now, I'm working on getting a strace output, but I<br>
> > have to<br>
> > > > run that in qemu.<br>
> > > > To be able to run it in qemu in order to isolate the driver and all<br>
> > from<br>
> > > > my host, I have to build<br>
> > > > with buildroot. So I'll email that when I get it, but it'll be a while.<br>
> > > ><br>
> > > ><br>
> > > >><br>
> > > >> ><br>
> > > >> > My question is, is there anything that can cause the vma->vm_flags<br>
> > to be<br>
> > > >> > changed in the trip between when the user land program calls mmap<br>
> > and<br>
> > > >> when<br>
> > > >> > control is delivered to the mmap callback?<br>
> > > >><br>
> > > >> > _______________________________________________<br>
> > > >> > Kernelnewbies mailing list<br>
> > > >> > <a href="mailto:Kernelnewbies@kernelnewbies.org" target="_blank">Kernelnewbies@kernelnewbies.org</a><br>
> > > >> > <a href="http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies" rel="noreferrer" target="_blank">http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies</a><br>
> > > >><br>
> > > >><br>
> > > ><br>
> ><br>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div>