<div dir="ltr">Thank you. I used this function in both kernel and user space, and the result is the same.<div><div>void *memcpy(void *dest, const void *src, size_t n)</div><div>{</div><div><span style="white-space:pre">        </span>long d0, d1, d2;</div><div><span style="white-space:pre">        </span>asm volatile(</div><div><span style="white-space:pre">                </span>"rep ; movsq\n\t"</div><div><span style="white-space:pre">                </span>"movq %4,%%rcx\n\t"</div><div><span style="white-space:pre">                </span>"rep ; movsb\n\t"</div><div><span style="white-space:pre">                </span>: "=&c" (d0), "=&D" (d1), "=&S" (d2)</div><div><span style="white-space:pre">                </span>: "0" (n >> 3), "g" (n & 7), "1" (dest), "2" (src)</div><div><span style="white-space:pre">                </span>: "memory");</div><div><br></div><div><span style="white-space:pre">        </span>return dest;</div><div>}</div></div><div>The kernel is indeed faster than user space.</div></div><div class="gmail_extra"><br><div class="gmail_quote">2018-07-10 14:22 GMT+08:00 Greg KH <span dir="ltr"><<a href="mailto:greg@kroah.com" target="_blank">greg@kroah.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Tue, Jul 10, 2018 at 12:50:21PM +0800, bing zhu wrote:<br>

> I agree! I just think the problem is still there: memcpy is indeed faster in<br>

> the kernel than in user space. I've tried both ways.<br>

<br>

</span>Make sure you are actually using the same code for memcpy in both<br>

places.  Do not rely on your libc or the kernel library for such a<br>

thing, otherwise you are not comparing the same code exactly.<br>

<span class=""><br>

> The scheduler might be to blame.<br>

<br>

</span>Lots of things "might be to blame", but first off, try to work out<br>

exactly what you are trying to test, and why, and work on that.<br>

<br>

good luck!<br>

<br>

greg k-h<br>

</blockquote></div><br></div>