I think this is useful:<div><br></div><div><a href="http://stackoverflow.com/questions/9355097/looking-for-system-calls-implementation-on-linux-kernel">http://stackoverflow.com/questions/9355097/looking-for-system-calls-implementation-on-linux-kernel</a><br>
<br><div class="gmail_quote">On Sun, Jul 15, 2012 at 11:24 PM, Peter Teoh <span dir="ltr"><<a href="mailto:htmldeveloper@gmail.com" target="_blank">htmldeveloper@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Just sharing my analysis; correct me if I am wrong:<br><br><div class="gmail_quote"><div><div class="h5">On Sun, Jul 15, 2012 at 8:36 PM, 王哲 <span dir="ltr"><<a href="mailto:wangzhe5004@gmail.com" target="_blank">wangzhe5004@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br><br><div class="gmail_quote"><div><div>2012/7/15 Peter Teoh <span dir="ltr"><<a href="mailto:htmldeveloper@gmail.com" target="_blank">htmldeveloper@gmail.com</a>></span><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi Mulyadi and WangZhe,<div><br></div><div>Nice to write to you again....:-).<br><br><div class="gmail_quote"><div>On Sun, Jul 15, 2012 at 1:49 PM, Mulyadi Santosa <span dir="ltr"><<a href="mailto:mulyadi.santosa@gmail.com" target="_blank">mulyadi.santosa@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi...<br>
<div><br>
On Sun, Jul 15, 2012 at 9:28 AM, 王哲 <<a href="mailto:wangzhe5004@gmail.com" target="_blank">wangzhe5004@gmail.com</a>> wrote:<br>
> and the second program:<br>
><br>
> #include <stdio.h><br>
> #include <unistd.h><br>
><br>
> int main(void)<br>
> {<br>
> unsigned long value = 0;<br>
> value = getpid();<br>
> return 0;<br>
> }<br>
><br>
> and disassembling it (objdump -d a.out):<br>
> ...<br>
> 08048300 <getpid@plt>:<br>
> 8048300: ff 25 00 a0 04 08 jmp *0x804a000<br>
> 8048306: 68 00 00 00 00 push $0x0<br>
> 804830b: e9 e0 ff ff ff jmp 80482f0 <_init+0x3c><br>
<br>
</div>That looks like a jump into the vsyscall page to me...<br>
<span><font color="#888888"><br></font></span></blockquote><div><br></div></div><div>After starting the process and attaching to it with gdb -p <pid>:</div><div><br></div><div><div>(gdb) disassemble main </div><div>Dump of assembler code for function main:</div>
<div> 0x0000000000400564 <+0>:<span style="white-space:pre-wrap">        </span>push %rbp</div><div> 0x0000000000400565 <+1>:<span style="white-space:pre-wrap">        </span>mov %rsp,%rbp</div>
<div> 0x0000000000400568 <+4>:<span style="white-space:pre-wrap">        </span>sub $0x10,%rsp</div><div> 0x000000000040056c <+8>:<span style="white-space:pre-wrap">        </span>movq $0x0,-0x8(%rbp)</div>
<div> 0x0000000000400574 <+16>:<span style="white-space:pre-wrap">        </span>mov $0x0,%eax</div><div> 0x0000000000400579 <+21>:<span style="white-space:pre-wrap">        </span>callq 0x400460 <getpid@plt></div>
<div> 0x000000000040057e <+26>:<span style="white-space:pre-wrap">        </span>cltq </div><div> 0x0000000000400580 <+28>:<span style="white-space:pre-wrap">        </span>mov %rax,-0x8(%rbp)</div>
<div> 0x0000000000400584 <+32>:<span style="white-space:pre-wrap">        </span>movabs $0x9184e72a000,%rdi</div><div> 0x000000000040058e <+42>:<span style="white-space:pre-wrap">        </span>mov $0x0,%eax</div>
<div> 0x0000000000400593 <+47>:<span style="white-space:pre-wrap">        </span>callq 0x400470 <sleep@plt></div><div> 0x0000000000400598 <+52>:<span style="white-space:pre-wrap">        </span>mov $0x0,%eax</div>
<div> 0x000000000040059d <+57>:<span style="white-space:pre-wrap">        </span>leaveq </div><div> 0x000000000040059e <+58>:<span style="white-space:pre-wrap">        </span>retq </div>
<div>End of assembler dump.</div><div>(gdb) disassemble getpid</div><div>Dump of assembler code for function getpid:</div><div> 0x00007f19ae558530 <+0>:<span style="white-space:pre-wrap">        </span>mov %fs:0x2d4,%edx</div>
<div> 0x00007f19ae558538 <+8>:<span style="white-space:pre-wrap">        </span>cmp $0x0,%edx</div><div> 0x00007f19ae55853b <+11>:<span style="white-space:pre-wrap">        </span>jle 0x7f19ae558540 <getpid+16></div>
<div> 0x00007f19ae55853d <+13>:<span style="white-space:pre-wrap">        </span>mov %edx,%eax</div><div> 0x00007f19ae55853f <+15>:<span style="white-space:pre-wrap">        </span>retq </div>
<div> 0x00007f19ae558540 <+16>:<span style="white-space:pre-wrap">        </span>jne 0x7f19ae558554 <getpid+36></div><div> 0x00007f19ae558542 <+18>:<span style="white-space:pre-wrap">        </span>mov %fs:0x2d0,%eax</div>
<div> 0x00007f19ae55854a <+26>:<span style="white-space:pre-wrap">        </span>test %eax,%eax</div><div> 0x00007f19ae55854c <+28>:<span style="white-space:pre-wrap">        </span>nopl 0x0(%rax)</div>
<div> 0x00007f19ae558550 <+32>:<span style="white-space:pre-wrap">        </span>je 0x7f19ae558554 <getpid+36></div><div> 0x00007f19ae558552 <+34>:<span style="white-space:pre-wrap">        </span>repz retq </div>
<div> 0x00007f19ae558554 <+36>:<span style="white-space:pre-wrap">        </span>mov $0x27,%eax</div><div> 0x00007f19ae558559 <+41>:<span style="white-space:pre-wrap">        </span>syscall </div>
<div> 0x00007f19ae55855b <+43>:<span style="white-space:pre-wrap">        </span>test %edx,%edx</div></div><div><div> 0x7f19ae55855d <getpid+45>:<span style="white-space:pre-wrap">        </span>jne 0x7f19ae558552 <getpid+34></div>
<div> 0x7f19ae55855f <getpid+47>:<span style="white-space:pre-wrap">        </span>mov %eax,%fs:0x2d0</div><div> 0x7f19ae558567 <getpid+55>:<span style="white-space:pre-wrap">        </span>retq </div>
<div><br></div></div></div></div></blockquote></div></div><div><br> Hi Peter,<br> Question 1: why does your system show "0x00007f19ae558554 <+36>:<span style="white-space:pre-wrap">        </span>mov $0x27,%eax"?<br>
The getpid syscall number is 0x14.<br>
<br></div></div></blockquote></div></div><div>Yes, you are right, for a 32-bit kernel:</div><div><br></div><div>In arch/x86/kernel></div><div>grep getpid *.S</div><div>syscall_table_32.S:<span style="white-space:pre-wrap">        </span>.long sys_getpid<span style="white-space:pre-wrap">        </span>/* 20 */</div>
<div><br></div><div>But my Linux kernel is 64-bit, where getpid is syscall number 39 (0x27), hence the mov $0x27,%eax in my disassembly. </div><div class="im"><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_quote">
<div> Question 2: I used gdb to disassemble getpid just as you did, and this is the result:<div>
<br><br> (gdb) disassemble getpid<br> Dump of assembler code for function getpid:<br></div> 0xb7771a40 <+0>: mov %gs:0x6c,%edx<br>
0xb7771a47 <+7>: cmp $0x0,%edx<br> 0xb7771a4a <+10>: jle 0xb7771a50 <getpid+16><br> 0xb7771a4c <+12>: mov %edx,%eax<br> 0xb7771a4e <+14>: repz ret <br> 0xb7771a50 <+16>: jne 0xb7771a62 <getpid+34><br>
0xb7771a52 <+18>: mov %gs:0x68,%eax<br> 0xb7771a58 <+24>: test %eax,%eax<br> 0xb7771a5a <+26>: lea 0x0(%esi),%esi<br> 0xb7771a60 <+32>: jne 0xb7771a4e <getpid+14><br>
0xb7771a62 <+34>: mov $0x14,%eax<br> 0xb7771a67 <+39>: call *%gs:0x10<br><br></div></div></blockquote><div><br></div><div><br></div></div><div>See the comment for gs in entry_32.S:</div><div><br>
</div>
<div>/*</div><div> * User gs save/restore</div><div> *</div><div> * %gs is used for userland TLS and kernel only uses it for stack</div><div> * canary which is required to be at %gs:20 by gcc. Read the comment</div><div>
* at the top of stackprotector.h for more info.</div><div> *</div><div> * Local labels 98 and 99 are used.</div><div> */</div><div>#ifdef CONFIG_X86_32_LAZY_GS</div><div> </div><div>And here is the comment at the top of stackprotector.h, which I do not fully understand yet:</div>
<div><br></div><div><div>/*</div><div> * GCC stack protector support.</div><div> *</div><div> * Stack protector works by putting predefined pattern at the start of</div><div> * the stack frame and verifying that it hasn't been overwritten when</div>
<div> * returning from the function. The pattern is called stack canary</div><div> * and unfortunately gcc requires it to be at a fixed offset from %gs.</div><div> * On x86_64, the offset is 40 bytes and on x86_32 20 bytes. x86_64</div>
<div> * and x86_32 use segment registers differently and thus handles this</div><div> * requirement differently.</div><div> *</div><div> * On x86_64, %gs is shared by percpu area and stack canary. All</div><div> * percpu symbols are zero based and %gs points to the base of percpu</div>
<div> * area. The first occupant of the percpu area is always</div><div> * irq_stack_union which contains stack_canary at offset 40. Userland</div><div> * %gs is always saved and restored on kernel entry and exit using</div>
<div> * swapgs, so stack protector doesn't add any complexity there.</div><div> *</div><div> * On x86_32, it's slightly more complicated. As in x86_64, %gs is</div><div> * used for userland TLS. Unfortunately, some processors are much</div>
<div> * slower at loading segment registers with different value when</div><div> * entering and leaving the kernel, so the kernel uses %fs for percpu</div><div> * area and manages %gs lazily so that %gs is switched only when</div>
<div> * necessary, usually during task switch.</div><div> *</div><div> * As gcc requires the stack canary at %gs:20, %gs can't be managed</div><div> * lazily if stack protector is enabled, so the kernel saves and</div>
<div> * restores userland %gs on kernel entry and exit. This behavior is</div><div><div>* controlled by CONFIG_X86_32_LAZY_GS and accessors are defined in</div><div> * system.h to hide the details.</div><div> */</div></div>
</div><div><br></div><div><div>Yes, the gs register is used for userspace TLS and is therefore set up per thread. For more info:</div></div><div><br></div><div><a href="http://www.akkadia.org/drepper/tls.pdf" target="_blank">http://www.akkadia.org/drepper/tls.pdf</a></div>
<div><br></div><div><a href="http://www.ibm.com/developerworks/linux/library/l-user-space-apps/index.html" target="_blank">http://www.ibm.com/developerworks/linux/library/l-user-space-apps/index.html</a></div><div><br></div>
<div><a href="http://stackoverflow.com/questions/6021273/how-to-allocate-thread-local-storage" target="_blank">http://stackoverflow.com/questions/6021273/how-to-allocate-thread-local-storage</a></div>
<div><br></div><div>(and the many related links listed alongside that question).</div><div><div class="h5"><div><br></div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="gmail_quote">
<div> can you explain the meaning of "call *%gs:0x10"?<br> <br> Thanks! <br><br><br> <br></div><div><div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div><div class="gmail_quote"><div><div></div></div><div>And to check the address space:</div><div><br></div><div><div>(gdb) info sharedlibrary </div><div>From To Syms Read Shared Object Library</div>
<div>0x00007f19ae4cb8c0 0x00007f19ae5dec60 Yes (*) /lib/libc.so.6</div>
<div>0x00007f19ae830af0 0x00007f19ae849704 Yes (*) /lib64/ld-linux-x86-64.so.2</div><div>(*): Shared library is missing debugging information.</div></div><div><br></div><div><br></div><div>And, if you want, the memory map of the process:</div><div>
<br>
</div><div><div>cat /proc/2282/maps </div><div><br></div><div>7f19ae82a000-7f19ae82b000 rw-p 0017d000 08:05 9922 /lib/libc-2.11.1.so</div><div>7f19ae830000-7f19ae850000 r-xp 00000000 08:05 8824 /lib/ld-2.11.1.so</div>
<div>7ffff2031000-7ffff2052000 rw-p 00000000 00:00 0 [stack]</div>
<div>7ffff21af000-7ffff21b0000 r-xp 00000000 00:00 0 [vdso]</div><div>ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]</div></div><div><br></div><div>Note also that static disassembly such as "objdump -d" is of limited use if you want to understand dynamically resolved addresses. From the above we can conclude that the transition to the kernel is made by the "syscall" instruction, as shown in the gdb disassembly above, and that it is embedded inside libc.so.6. ("syscall" and "sysenter" are two different instructions rather than two syntaxes: "syscall" was introduced by AMD and is the one used in 64-bit mode, while "sysenter" was introduced by Intel and is typically used by 32-bit code.)</div>
<div>
</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><span><font color="#888888">
--<br>
regards,<br>
<br>
Mulyadi Santosa<br>
Freelance Linux trainer and consultant<br>
<br>
blog: <a href="http://the-hydra.blogspot.com" target="_blank">the-hydra.blogspot.com</a><br>
training: <a href="http://mulyaditraining.blogspot.com" target="_blank">mulyaditraining.blogspot.com</a><br>
</font></span></div><div><div><br><div>
_______________________________________________<br>
Kernelnewbies mailing list<br>
<a href="mailto:Kernelnewbies@kernelnewbies.org" target="_blank">Kernelnewbies@kernelnewbies.org</a><br>
<a href="http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies" target="_blank">http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies</a><br>
</div></div></div></blockquote></div><span><font color="#888888"><br><br clear="all"><div><br></div>-- <br>Regards,<br>Peter Teoh<br>
</font></span></div>
</blockquote></div></div></div><br>
</blockquote></div></div></div><span class="HOEnZb"><font color="#888888"><br>
</font></span></blockquote></div><br><br clear="all"><div><br></div>-- <br>Regards,<br>Peter Teoh<br>
</div>