Linux do_coredump() and SMP systems
Sudharsan Vijayaraghavan
sudvijayr at gmail.com
Wed Feb 18 01:14:32 EST 2015
We are building a prototype, and so many changes have gone into our kernel
that we are finding it difficult to upgrade to the latest version immediately.
However, I went through the code once again, and the kernel does indeed handle this:
down_write(&mm->mmap_sem) in coredump_wait() makes sure the second
coredump is stopped, and coredump_wait() returns a negative core_waiters for it.
I will debug further. Thanks for confirming that the kernel handles this scenario.
> On Tue, Feb 17, 2015 at 8:42 PM, Greg KH <greg at kroah.com> wrote:
>> On Tue, Feb 17, 2015 at 07:11:55PM +0530, Sudharsan Vijayaraghavan wrote:
>>> Hi All,
>>>
>>> We are running a 3.8 kernel.
>>
>> That's pretty old and obsolete, why are you stuck with that version?
>>
>>> I have a unique scenario in which we hit several issues in do_coredump().
>>> We have an SMP system with thousands of cores, and one pthread is tied to
>>> each core. The main process containing these pthreads runs on the first
>>> core.
>>>
>>> Here is issue #1:
>>> When one of the threads dumps core, we enter do_coredump(). Another
>>> thread in the same process, running on a different core, can also
>>> dump core (before the SIGKILL resulting from the first core dump
>>> has been delivered to it).
>>> This allows do_coredump() to be entered more than once.
>>> Once two threads have entered do_coredump(), one can kill the other with
>>> SIGKILL, and the result is completely unpredictable. There is no guarantee
>>> that we will have two core files generated in the end.
>>>
>>> The Linux kernel does not seem to handle this at all.
>>> Adding a spin lock within do_coredump() would solve the problem of
>>> multiple entries into do_coredump().
>>>
>>> I want to know whether the Linux kernel really does not handle the above
>>> case, or am I missing something?
>>
>> Odd, we should handle this just fine, try emailing the developers
>> responsible for this code and cc: the linux-kernel mailing list so they
>> can work it out.
>>
>> thanks,
>>
>> greg k-h