Userspace pages in UC mode
Sabela Ramos Garea
sabelaraga at gmail.com
Mon Sep 14 09:37:34 EDT 2015
Hi Pranay,
2015-09-12 3:12 GMT+02:00 Pranay Srivastava <pranjas at gmail.com>:
> Hi Sabela,
>
> On Fri, Sep 11, 2015 at 8:29 PM, Sabela Ramos Garea
> <sabelaraga at gmail.com> wrote:
>> Sorry, a small mistake while copy-pasting and cleaning up. The pages
>> and vma structs should look like this:
>>
>> struct page *pages --> struct page *pages[MAX_PAGES];
>> struct vm_area_struct *vma --> struct vm_area_struct *vma[MAX_PAGES];
>>
>> Where MAX_PAGES is defined to 5.
>>
>> Sabela.
>>
>> 2015-09-11 16:07 GMT+02:00 Sabela Ramos Garea <sabelaraga at gmail.com>:
>>> Dear all,
>>>
>>> For research purposes I need some userspace memory pages to be in
>>> uncacheable mode. I am using two different Intel architectures (Sandy
>>> Bridge and Haswell) and two different kernels (2.6.32-358 and
>>> 3.19.0-28).
>>>
>>> The non-temporal stores from Intel assembly are not a valid solution
>>> so I am programming a kernel module that gets a set of pages from user
>>> space reserved with posix_memalign (get_user_pages) and then sets them
>>> as uncacheable (I have tried set_pages_uc and set_pages_array_uc).
>>> When I use one page, the access times are not very consistent, and
>>> with more than one page the module crashes (on both architectures
>>> and both kernels).
>>>
>>> I wonder if I am using the correct approach or if I have to use kernel
>>> space pages in order to work with uncacheable memory. Or if I have to
>>> remap the memory. Just in case it makes it clearer, I am attaching the
>>> relevant lines of a kernel module function that should set the pages
>>> as uncacheable. (This function is the .write of a misc device; count
>>> is treated as the number of pages).
>>>
>>> Best and Thanks,
>>>
>>> Sabela.
>>>
>>> struct page *pages; //defined outside in order to be able to set them
>>> to WB in the release function.
>>> int numpages;
>>>
>>> static ssize_t setup_memory(struct file *filp, const char __user *buf,
>>> size_t count, loff_t * ppos)
>>> {
>>> int res;
>>> struct vm_area_struct *vmas;
>>>
> shouldn't this be rounded up?
>>> numpages = count/4096;
>>>
For the current tests I am assuming that count is a multiple of 4096
and that the user *buf is page-aligned. Anyway, isn't it safer to just
round down, so that I don't touch addresses outside the range of pages
that have to be set as uncached?
>>> down_read(&current->mm->mmap_sem);
>>> res = get_user_pages(current, current->mm,
>>>                      (unsigned long) buf,
>>>                      numpages, /* number of pages */
>>>                      1,        /* write: we want to write into them */
>>>                      1,        /* force */
>>>                      &pages,
>>>                      &vmas);
>>> up_read(&current->mm->mmap_sem);
>>>
>>> numpages=res;
>>>
>>> if (res > 0) {
>>> set_pages_uc(pages, numpages); /* Uncached */
>
> what about high-mem pages. set_memory_uc does __pa, so perhaps that's
> the reason for your kernel oops?
>
I have used kmap to map the user pages into kernel space as follows:

if (res > 0) {
        for (i = 0; i < res; i++) {
                kaddress = kmap(pages[i]);
                /* one page at a time: the userspace addresses
                 * don't have to be contiguous */
                set_memory_uc((unsigned long) kaddress, 1);
        }
        /* set_pages_array_uc(pages, count); */ /* Uncached */
        printk("Write: %d pages set as uncacheable\n", numpages);
}
But the user-space test code that tries to measure cached vs. uncached
accesses reports lower latency for the uncached pages. Accesses are
performed and measured like this:
CL_1 = (int *) buffer;
CL_2 = (int *) (buffer + CACHELINE);
//flush caches
//get timestamp
for (j = 0; j < 10; j++) {
        CL_2 = (int *) (buffer + CACHELINE);
        for (i = 1; i < naccesses; i++) {
                *CL_1 = *CL_2 + i;
                *CL_2 = *CL_1 + i;
                CL_2 = (int *) ((char *) CL_2 + CACHELINE);
        }
}
//get timestamp
//get timestamp
I've tried to do it within the kernel space but the results are similar.
Thanks,
Sabela.
More information about the Kernelnewbies mailing list