[RFC]confusion about syscall
Peter Teoh
htmldeveloper at gmail.com
Sun Jul 15 11:24:45 EDT 2012
Just sharing my analysis; correct me if I am wrong:
On Sun, Jul 15, 2012 at 8:36 PM, 王哲 <wangzhe5004 at gmail.com> wrote:
>
>
> 2012/7/15 Peter Teoh <htmldeveloper at gmail.com>
>
>> Hi Mulyadi and WangZhe,
>>
>> Nice to write to you again....:-).
>>
>> On Sun, Jul 15, 2012 at 1:49 PM, Mulyadi Santosa <
>> mulyadi.santosa at gmail.com> wrote:
>>
>>> Hi...
>>>
>>> On Sun, Jul 15, 2012 at 9:28 AM, 王哲 <wangzhe5004 at gmail.com> wrote:
>>> > and the second program:
>>> >
>>> > #include <stdio.h>
>>> > #include <unistd.h>
>>> >
>>> > int main(void)
>>> > {
>>> > unsigned long value = 0;
>>> > value = getpid();
>>> > return 0;
>>> > }
>>> >
>>> > and disassembling it:( objdump -d a.out)
>>> > ...
>>> > 08048300 <getpid at plt>:
>>> > 8048300: ff 25 00 a0 04 08 jmp *0x804a000
>>> > 8048306: 68 00 00 00 00 push $0x0
>>> > 804830b: e9 e0 ff ff ff jmp 80482f0 <_init+0x3c>
>>>
>>> Looks like jumping into vsyscall page to me...
>>>
>>>
>> After I start the process and attach to it with gdb -p <pid>:
>>
>> (gdb) disassemble main
>> Dump of assembler code for function main:
>> 0x0000000000400564 <+0>: push %rbp
>> 0x0000000000400565 <+1>: mov %rsp,%rbp
>> 0x0000000000400568 <+4>: sub $0x10,%rsp
>> 0x000000000040056c <+8>: movq $0x0,-0x8(%rbp)
>> 0x0000000000400574 <+16>: mov $0x0,%eax
>> 0x0000000000400579 <+21>: callq 0x400460 <getpid at plt>
>> 0x000000000040057e <+26>: cltq
>> 0x0000000000400580 <+28>: mov %rax,-0x8(%rbp)
>> 0x0000000000400584 <+32>: movabs $0x9184e72a000,%rdi
>> 0x000000000040058e <+42>: mov $0x0,%eax
>> 0x0000000000400593 <+47>: callq 0x400470 <sleep at plt>
>> 0x0000000000400598 <+52>: mov $0x0,%eax
>> 0x000000000040059d <+57>: leaveq
>> 0x000000000040059e <+58>: retq
>> End of assembler dump.
>> (gdb) disassemble getpid
>> Dump of assembler code for function getpid:
>> 0x00007f19ae558530 <+0>: mov %fs:0x2d4,%edx
>> 0x00007f19ae558538 <+8>: cmp $0x0,%edx
>> 0x00007f19ae55853b <+11>: jle 0x7f19ae558540 <getpid+16>
>> 0x00007f19ae55853d <+13>: mov %edx,%eax
>> 0x00007f19ae55853f <+15>: retq
>> 0x00007f19ae558540 <+16>: jne 0x7f19ae558554 <getpid+36>
>> 0x00007f19ae558542 <+18>: mov %fs:0x2d0,%eax
>> 0x00007f19ae55854a <+26>: test %eax,%eax
>> 0x00007f19ae55854c <+28>: nopl 0x0(%rax)
>> 0x00007f19ae558550 <+32>: je 0x7f19ae558554 <getpid+36>
>> 0x00007f19ae558552 <+34>: repz retq
>> 0x00007f19ae558554 <+36>: mov $0x27,%eax
>> 0x00007f19ae558559 <+41>: syscall
>> 0x00007f19ae55855b <+43>: test %edx,%edx
>> 0x00007f19ae55855d <+45>: jne 0x7f19ae558552 <getpid+34>
>> 0x00007f19ae55855f <+47>: mov %eax,%fs:0x2d0
>> 0x00007f19ae558567 <+55>: retq
>>
>>
> Hi Peter,
> Question 1: why does your system show
> "0x00007f19ae558554 <+36>: mov $0x27,%eax"?
> The getpid syscall number is 0x14.
>
Yes, you are right, for a 32-bit kernel:
In arch/x86/kernel:
grep getpid *.S
syscall_table_32.S: .long sys_getpid /* 20 */
But my Linux kernel is 64-bit, where getpid is syscall number 39 (0x27).
> Question 2: I used gdb to disassemble getpid just like you, and got this
> result:
>
>
> (gdb) disassemble getpid
> Dump of assembler code for function getpid:
> 0xb7771a40 <+0>: mov %gs:0x6c,%edx
> 0xb7771a47 <+7>: cmp $0x0,%edx
> 0xb7771a4a <+10>: jle 0xb7771a50 <getpid+16>
> 0xb7771a4c <+12>: mov %edx,%eax
> 0xb7771a4e <+14>: repz ret
> 0xb7771a50 <+16>: jne 0xb7771a62 <getpid+34>
> 0xb7771a52 <+18>: mov %gs:0x68,%eax
> 0xb7771a58 <+24>: test %eax,%eax
> 0xb7771a5a <+26>: lea 0x0(%esi),%esi
> 0xb7771a60 <+32>: jne 0xb7771a4e <getpid+14>
> 0xb7771a62 <+34>: mov $0x14,%eax
> 0xb7771a67 <+39>: call *%gs:0x10
>
>
See the comment for gs in entry_32.S:
/*
* User gs save/restore
*
* %gs is used for userland TLS and kernel only uses it for stack
* canary which is required to be at %gs:20 by gcc. Read the comment
* at the top of stackprotector.h for more info.
*
* Local labels 98 and 99 are used.
*/
#ifdef CONFIG_X86_32_LAZY_GS
And here is the comment from stackprotector.h, which I do not completely
understand yet:
/*
* GCC stack protector support.
*
* Stack protector works by putting predefined pattern at the start of
* the stack frame and verifying that it hasn't been overwritten when
* returning from the function. The pattern is called stack canary
* and unfortunately gcc requires it to be at a fixed offset from %gs.
* On x86_64, the offset is 40 bytes and on x86_32 20 bytes. x86_64
* and x86_32 use segment registers differently and thus handles this
* requirement differently.
*
* On x86_64, %gs is shared by percpu area and stack canary. All
* percpu symbols are zero based and %gs points to the base of percpu
* area. The first occupant of the percpu area is always
* irq_stack_union which contains stack_canary at offset 40. Userland
* %gs is always saved and restored on kernel entry and exit using
* swapgs, so stack protector doesn't add any complexity there.
*
* On x86_32, it's slightly more complicated. As in x86_64, %gs is
* used for userland TLS. Unfortunately, some processors are much
* slower at loading segment registers with different value when
* entering and leaving the kernel, so the kernel uses %fs for percpu
* area and manages %gs lazily so that %gs is switched only when
* necessary, usually during task switch.
*
* As gcc requires the stack canary at %gs:20, %gs can't be managed
* lazily if stack protector is enabled, so the kernel saves and
* restores userland %gs on kernel entry and exit. This behavior is
* controlled by CONFIG_X86_32_LAZY_GS and accessors are defined in
* system.h to hide the details.
*/
Yes, the gs register is used for userspace TLS, so its base is per-thread. For
more info:
http://www.akkadia.org/drepper/tls.pdf
http://www.ibm.com/developerworks/linux/library/l-user-space-apps/index.html
http://stackoverflow.com/questions/6021273/how-to-allocate-thread-local-storage
(and lots of relevant links besides it).
> Can you explain the meaning of "call *%gs:0x10"?
>
> Thanks!
>
>
>
>
>> And to check the address space:
>>
>> (gdb) info sharedlibrary
>> From To Syms Read Shared Object Library
>> 0x00007f19ae4cb8c0 0x00007f19ae5dec60 Yes (*) /lib/libc.so.6
>> 0x00007f19ae830af0 0x00007f19ae849704 Yes (*)
>> /lib64/ld-linux-x86-64.so.2
>> (*): Shared library is missing debugging information.
>>
>>
>> and if you want:
>>
>> cat /proc/2282/maps
>>
>> 7f19ae82a000-7f19ae82b000 rw-p 0017d000 08:05 9922
>> /lib/libc-2.11.1.so
>> 7f19ae830000-7f19ae850000 r-xp 00000000 08:05 8824
>> /lib/ld-2.11.1.so
>> 7ffff2031000-7ffff2052000 rw-p 00000000 00:00 0
>> [stack]
>> 7ffff21af000-7ffff21b0000 r-xp 00000000 00:00 0
>> [vdso]
>> ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0
>> [vsyscall]
>>
>> Note also that static analysis tools like "objdump -d" are best avoided
>> if you want to understand dynamic addresses. From the above, we can
>> conclude that the "syscall" instruction (the 64-bit fast system call
>> instruction, originally from AMD; "sysenter" is Intel's separate 32-bit
>> counterpart, not another spelling of the same instruction) is used for the
>> transition to the kernel, as embedded inside libc.so.6.
>>
>>
>>> --
>>> regards,
>>>
>>> Mulyadi Santosa
>>> Freelance Linux trainer and consultant
>>>
>>> blog: the-hydra.blogspot.com
>>> training: mulyaditraining.blogspot.com
>>>
>>> _______________________________________________
>>> Kernelnewbies mailing list
>>> Kernelnewbies at kernelnewbies.org
>>> http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
>>>
>>
>>
>>
>> --
>> Regards,
>> Peter Teoh
>>
>
>
--
Regards,
Peter Teoh