MADV_ZERO
Robert Nagy
ronag89 at gmail.com
Tue Sep 1 14:34:41 EDT 2015
I recently switched over to Linux after running into an issue that seems unsolvable on Windows, in the hope of finding a solution here.
Basically what I need to achieve is IPC persisted to a huge file. I’m doing this currently by memory-mapping a huge backing file and always sequentially writing to the file in a circular fashion 24/7.
However, this has major throughput issues: overwriting a page always causes a page fault that reads the old contents back from disk, even when the entire page is about to be overwritten, which totally thrashes disk performance.
Basically I would need a flag for madvise (e.g. MADV_ZERO) with functionality similar to FALLOC_FL_ZERO_RANGE, so that I would get much faster zero-fill page faults instead. The closest I’ve come is to use MADV_REMOVE before overwriting the range; however, that is suboptimal, as from my understanding it will fragment the backing file and potentially degrade performance over time.
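For reference, the MADV_REMOVE workaround looks roughly like the sketch below. The helper name mm_remove_then_write is made up for illustration, and the sketch assumes dst points into a writable MAP_SHARED mapping whose filesystem supports hole punching; otherwise madvise() fails and the error is passed back.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (the name is illustrative only): punch a hole under a
 * page-aligned range of a MAP_SHARED mapping, then overwrite the range.
 * Returns 0 on success or a negative errno value.
 */
int mm_remove_then_write(void *dst, const void *src, size_t length)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    uintptr_t offset_mask = (uintptr_t)page - 1;

    /* madvise() needs a page-aligned start; keep the length aligned too. */
    if (((uintptr_t)dst & offset_mask) || (length & offset_mask))
        return -EINVAL;

    /* Free the pages and their backing store; later reads see zeros. */
    if (madvise(dst, length, MADV_REMOVE) != 0)
        return -errno;

    /* The faults taken here should be zero-fill rather than disk reads. */
    memcpy(dst, src, length);
    return 0;
}
```

This gives the cheap faults I am after, but every call deallocates the blocks under the range, which is where my fragmentation worry comes from.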
What I’d like to be able to do is something like:
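/* MADV_ZERO is the proposed flag; it does not exist in current kernels. */
int mm_fast_write(void *dst, const void *src, size_t length)
{
    /* PAGE_MASK as in the kernel: ~(PAGE_SIZE - 1). */
    if ((uintptr_t)dst & ~PAGE_MASK)
        return -EINVAL;
    if ((uintptr_t)src & ~PAGE_MASK)
        return -EINVAL;
    if (length & ~PAGE_MASK)
        return -EINVAL;

    madvise(dst, length, MADV_ZERO);
    memcpy(dst, src, length);
    madvise(dst, length, MADV_DONTNEED); // Might not do anything without msync?
    return 0;
}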
Is it possible to implement or emulate something like MADV_ZERO in user mode? Or should I look into modifying the kernel? I believe it could be implemented based on madvise_remove() by simply replacing the FALLOC_FL_PUNCH_HOLE flag with FALLOC_FL_ZERO_RANGE; is that a reasonable approach?
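One user-mode experiment along these lines would be to call fallocate() with FALLOC_FL_ZERO_RANGE directly on the backing file descriptor before overwriting the range through the mapping. The sketch below is only that, a sketch: file_zero_range is a hypothetical helper name, it assumes the file descriptor of the mapped file is still open, and it assumes a filesystem that implements FALLOC_FL_ZERO_RANGE (for example ext4 with extents, or XFS); elsewhere the call fails, typically with EOPNOTSUPP.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (the name is illustrative only): zero a page-aligned
 * byte range of the backing file while keeping its blocks allocated.
 * Returns 0 on success or a negative errno value.
 */
int file_zero_range(int fd, off_t offset, off_t length)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    /* Only whole pages are of interest for the page-fault use case. */
    if (offset < 0 || length <= 0 || offset % page || length % page)
        return -EINVAL;

    /* Unlike FALLOC_FL_PUNCH_HOLE, this keeps the range allocated. */
    if (fallocate(fd, FALLOC_FL_ZERO_RANGE, offset, length) != 0)
        return -errno;
    return 0;
}
```

In the circular-buffer case this would be called on the file range that is about to be overwritten. Whether the page faults that follow are actually cheaper than with plain overwrites is something I have not verified; it presumably depends on the filesystem and would need measuring.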
More information about the Kernelnewbies
mailing list