text segment corruption - probable suspects?

Greg Freemyer greg.freemyer at gmail.com
Fri Jan 24 09:20:18 EST 2014



"nayobix at nayobix.org" <nayobix at nayobix.org> wrote:

>On 01/23/2014 02:04 PM, Mulyadi Santosa wrote:
>> On Tue, Jan 7, 2014 at 10:09 PM, Boyan Vladinov <nayobix at nayobix.org> wrote:
>>> Hey guys,
>>>
>>> recently I experience .text segment corruption in Kernel space. Could
>>> someone give some hints which can be the reason
>>> for this text segment corruption, because as we know it is read-only?
>> Could you share what evidence did you get so you can get conclusion
>> there happen such corruption?
>>
>> IMHO, I suspect it is due RAM error...so it's not corruption coming
>> from kernel image itself
>>
>I compared the addresses/instructions from the crash with these in 
>vmlinux image and they were different in 1 or 2 bits. Yep, we also 
>suspect RAM issues, but is it possible to be something other like DMA 
>transfers?

Lots of hardware issues cause bit flips.  I've seen it be RAM, CPU cache, motherboard, disk controller, disk cables, and power supply.  I'm sure there are other culprits (or more specific culprits).
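
To confirm that a mismatch really is a handful of flipped bits rather than wholesale overwriting, XOR the expected bytes against the observed ones and list the bits that differ.  Below is a minimal Python sketch.  The function name bit_diffs and the sample bytes are made up for illustration and are not taken from the crash discussed in this thread; how you extract the two byte strings from vmlinux and from the crash dump is left to you.

```python
def bit_diffs(expected: bytes, observed: bytes):
    """Return a list of (byte_offset, bit_index) for every differing bit.

    bit_index 0 is the least significant bit of the byte.
    """
    if len(expected) != len(observed):
        raise ValueError("buffers must have the same length")
    diffs = []
    for offset, (a, b) in enumerate(zip(expected, observed)):
        x = a ^ b  # set bits mark the positions where the bytes disagree
        for bit in range(8):
            if x & (1 << bit):
                diffs.append((offset, bit))
    return diffs


# Hypothetical sample data: the third byte differs in a single bit.
expected = bytes.fromhex("554889e5")
observed = bytes.fromhex("55488be5")
print(bit_diffs(expected, observed))  # [(2, 1)]
```

One or two isolated entries in the result point toward a hardware bit flip.  A long run of differences would suggest something else wrote over the region.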

Start with a RAM test just because it is easy, and a memtest failure is pretty definitive evidence of bad RAM.  Even then it could be the power supply or a motherboard issue.

Otherwise, I find that using md5sum to hash a large file, then copying it and rehashing the copy, will trigger a failure in many cases.  If so, swap out parts until the problem goes away.  My personal number 1 failure is data cables.  They may get you the right data 99.9999% of the time, but that won't even let you reliably copy a 1 GB file.
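
The hash, copy, rehash loop can be scripted so it repeats unattended.  This is a rough Python sketch using hashlib in place of the md5sum command.  The names md5_of and copy_and_verify are made up for illustration.  One caveat: on Linux the copy may be read back from the page cache rather than from the disk, so a file larger than RAM is more likely to exercise the cables and controller.

```python
import hashlib
import shutil


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at path, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src, dst, rounds=1):
    """Copy src to dst `rounds` times and rehash the copy each time.

    Returns a list of (round_number, expected_hash, actual_hash) for
    every round whose copy did not match the source hash.
    """
    expected = md5_of(src)
    mismatches = []
    for i in range(rounds):
        shutil.copyfile(src, dst)
        actual = md5_of(dst)
        if actual != expected:
            mismatches.append((i, expected, actual))
    return mismatches
```

Calling copy_and_verify("big.iso", "/mnt/test/big.iso", rounds=10) with paths of your own returns an empty list when every copy matches.  On the arithmetic: if the 99.9999% figure is read as a per-byte success rate (an assumption, since the unit is not stated), a 1 GB file is about 10^9 bytes, so roughly 10^9 x 10^-6 = 1000 bytes would come out wrong on an average copy.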

I find new memory is often bad, but I rarely get failures after the DOA chips are replaced.

Greg