Determining patch impact on a specific config

Greg KH greg at kroah.com
Wed Aug 17 11:39:27 EDT 2016


On Wed, Aug 17, 2016 at 02:49:22PM +0000, Nicholas Mc Guire wrote:
> On Wed, Aug 17, 2016 at 04:17:19PM +0200, Greg KH wrote:
> > On Wed, Aug 17, 2016 at 02:01:28PM +0000, Nicholas Mc Guire wrote:
> > > On Wed, Aug 17, 2016 at 03:52:16PM +0200, Greg KH wrote:
> > > > On Wed, Aug 17, 2016 at 03:25:44PM +0200, Greg KH wrote:
> > > > > On Wed, Aug 17, 2016 at 12:39:39PM +0000, Nicholas Mc Guire wrote:
> > > > > > 
> > > > > > Hi !
> > > > > > 
> > > > > >  For a given patch I would like to find out if it impacts a
> > > > > >  given configuration or not. Now of course one could compile the
> > > > > >  kernel for the configuration prior to the patch, then apply the
> > > > > >  patch and recompile to find out if there is an impact but I would
> > > > > >  be looking for some smarter solution. Checking files only 
> > > > > >  unfortunately will not do it, due to ifdefs and friends so make
> > > > > >  would detect a change and recompile even if the affected code
> > > > > >  area is actually dropped by the preprocessor.
> > > > > > 
> > > > > >  What I'm trying to find out is how many of the e.g. stable
> > > > > >  fixes of 4.4-4.4.14 would have impacted a given configuration - the
> > > > > >  whole exercise is intended for some statistical analysis of bugs
> > > > > >  in linux-stable.
> > > > 
> > > > Also, are you going to be analyzing the bugs in the stable trees, or
> > > > the ones we just happen to fix?
> > > > 
> > > > Note, that's not always the same thing :)
> > > >
> > > What we have been looking at first is the stable fixes
> > > for which the bug-introducing commit is known via a Fixes: tag.
> > > That is only a first approximation, but it correlates very well
> > > with the overall stable fix rates. And from the regression analysis
> > > of the stable fix rates over versions one can then estimate the
> > > residual bugs if one knows the distribution of the bug
> > > survival times, which one can again estimate based on the
> > > bug fixes that have Fixes: tags.
> > 
> > That is all relying on the Fixes: tags, which are not used evenly across
> > the kernel at all.  Heck, there are still major subsystems that NEVER
> > mark a single patch for the stable trees, let alone adding Fixes: tags.
> > Same thing goes for most cpu architectures.
> 
> Well for the config we studied it was not that bad
> 
> 4.4 - 4.4.13 stable bug-fix commits
>           % of all      % of commits   % of subsystem's
>           fix commits   with Fixes:    fix commits that
>           (n=1643)      tag (n=589)    have a Fixes: tag
> kernel    3.89%         4.75%          43.7%
> mm        1.82%         2.17%          53.3%
> block     0.36%         0.84%          83.3%!
> fs        8.76%         4.92%*         20.1%*
> net       9.31%         12.56%         48.3%
> drivers   47.96%        49.23%         36.8%
> include   6.87%         19.18%         28.3%*
> arch/x86  4.50%         12.56%         33.7%
>  (Note that the percentages here do not add up
>   to 100% because we just picked out x86 and did not 
>   include all subsystems e.g. lib is missing).
> 
>  So fs is significantly below average and include a bit below - block is
>  hard to say simply because it had only 6 stable fixes, of
>  which 5 had Fixes: tags, so that sample is too small.
>  Correlating the overall stable-fixes distribution over sublevels
>  with stable fixes that have a Fixes: tag gives me an R^2 of 0.76,
>  so that does show that for any trending, using Fixes: tags
>  is reasonable. As noted, we are looking at statistical properties
>  to come up with expected values, nothing more.

But you aren't comparing that to the number of changes that are
happening in a "real" release.  If you do that, you will see the
subsystems that never mark things for stable, which you totally miss
here, right?

For example, where are the driver subsystems that everyone relies on
that are changing upstream, yet have no stable fixes?  What about the
filesystems that even more people rely on, yet have no stable fixes?
Are those code bases just so good and solid that there are no bugs to be
fixed?  (hint, no...)

So because of that, you can't use the information about what I apply to
stable trees as an indication that those are the only parts of the
kernel that have bugs to be fixed.
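
A minimal sketch of the comparison described above, assuming the list of
files touched by each commit has already been extracted (for example from
`git log --name-only` over a mainline range and over a stable range). The
function names and the toy input are hypothetical and are not real kernel
data; the point is only to show how subsystems that change upstream but
never receive a stable fix would show up.

```python
from collections import Counter


def subsystem_of(path):
    """Map a file path to a coarse subsystem label.

    Uses the top-level directory, except for arch/, where the
    architecture name is kept (e.g. "arch/x86").
    """
    parts = path.split("/")
    if parts[0] == "arch" and len(parts) > 1:
        return "arch/" + parts[1]
    return parts[0]


def commits_per_subsystem(commits):
    """Count commits touching each subsystem.

    `commits` is an iterable of lists of file paths, one list per commit.
    A commit touching several subsystems is counted once in each of them.
    """
    counts = Counter()
    for files in commits:
        for subsystem in {subsystem_of(f) for f in files}:
            counts[subsystem] += 1
    return counts


def unrepresented(mainline, stable):
    """Subsystems that changed in mainline but have no stable fixes at all."""
    return sorted(s for s in mainline if stable.get(s, 0) == 0)


# Hypothetical toy input, for illustration only.
mainline = commits_per_subsystem([
    ["fs/ext4/inode.c"],
    ["drivers/usb/core/hub.c", "include/linux/usb.h"],
    ["arch/arm/kernel/setup.c"],
])
stable = commits_per_subsystem([
    ["drivers/usb/core/hub.c"],
])

print(unrepresented(mainline, stable))
```

A subsystem appearing in that list is not necessarily bug-free; it may
simply be one whose maintainers never mark patches for stable.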

> > So be careful about what you are trying to measure, it might just be not
> > what you are assuming it is...
> 
> An R^2 of 0.76 does indicate that the commits with Fixes: tags in the 4.4
> series represent the overall stable fixes quite well.

"overall stable fixes".  Not "overall kernel fixes", two very different
things, please don't confuse the two.

And because of that, I would state that "overall stable fixes" number
really doesn't mean much to a user of the kernel.
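
To make the distinction concrete, here is a rough sketch of what an R^2
between two per-sublevel series measures, assuming it is the squared
Pearson correlation of a simple linear fit (the thread does not say exactly
how the 0.76 was computed). The numbers below are hypothetical toy values,
not the 4.4 data. Both series come from the stable tree, so a high value
only says that tagged fixes track stable fixes; it says nothing about fixes
that never reached stable.

```python
def r_squared(xs, ys):
    """Coefficient of determination for a simple linear fit of ys on xs.

    For a one-predictor least-squares fit this equals the squared
    Pearson correlation between xs and ys.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical per-sublevel counts, for illustration only.
all_stable_fixes = [100, 120, 90, 150]
fixes_with_tag = [40, 48, 36, 60]  # exactly 40% of the above

print(r_squared(all_stable_fixes, fixes_with_tag))
```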

> > > I don't know yet how robust these models will be at the end
> > > but from what we have until now I do think we can come up
> > > with quite sound predictions for the residual faults in the
> > > kernel.
> > 
> > Based on what I know about how stable patches are picked and applied, I
> > think you will find it is totally incorrect.  But hey, what do I know?
> > :)
> 
> Well if I look at the overall stable fixes development - not just those
> with Fixes: tags - I get very clear trends if we look at stable fixes
> over sublevels (linear model using gamma-distribution)
> 
> ver   intercept  slope       p-value    DoF  AIC
> 3.2   4.2233783   0.0059133  < 2e-16     79  2714.8
> 3.4   3.9778258  -0.0005657  0.164 *    110  4488
> 3.10  4.3841885  -0.0085419  < 2e-16     98  2147.1
> 3.12  4.7146752  -0.0014718  0.0413      58  1696.9
> 3.14  4.6159638  -0.0131122  < 2e-16     70  2124.8
> 3.18  4.671178   -0.006517   7.34e-5     34  1881.2
> 4.1   4.649701   -0.004211   0.09        25  1231.8
> 4.4   5.049331   -0.039307   7.69e-11    12  571.48
> 
> So while the confidence levels of some (notably 3.4) are not
> that exciting, the overall trend does look reasonably established:
> the slope is turning negative, indicating that the
> number of stable fixes systematically decreases
> with sublevels, which does indicate a stable development process.

I don't understand.  Not everyone uses "fixes:" so you really can't
use that as an indication of anything.  I know I never do for any patch
that I write.

Over time, more people are using the "fixes:" tag, but then that messes
with your numbers because you can't compare the work we did this year
with the work we did last year.

Also, our rate of change has increased, and the number of stable patches
being tagged has increased, based on me going around and kicking
maintainers.  Again, because of that you can't compare year to year at
all.

There's also the "bias" of the long-term and stable maintainer to skew
the patches they review and work to get applied based on _why_ they are
maintaining a specific tree.  I know I do that for the trees I maintain,
and know the other stable developers do the same.  But those reasons are
different, so you can't compare what is done to one tree vs. another one
very well at all because of that bias.

So don't compare 3.10 to 3.4 or 3.2 and expect even the motivation to be
identical to what is going on for that tree.

> > > Some early results were presented at ALS in Japan on July 14th
> > > but this still needs quite a bit of work.
> > 
> > Have a pointer to that presentation?
> >
> They probably are somewhere on the ALS site - but I just dropped
> them to our web-server at
>   http://www.opentech.at/Statistics.pdf and
>   http://www.opentech.at/TechSummary.pdf
> 
> This is quite a rough summary - so if anyone wants the actual data
> or R commands used - let me know - no issue with sharing this and having
> people tell me that I'm totally wrong :)

Interesting, I'll go read them when I get the chance.

But I will make a meta-observation, it's "interesting" that people go
and do analysis of development processes like this, yet never actually
talk to the people doing the work about how they do it, nor how they
could possibly improve it based on their analysis.

We aren't just people to be researched, we can change if asked.
And remember, I _always_ ask for help with the stable development
process, I have huge areas that I know need work to improve, just no one
ever provides that help...

And how is this at all a kernelnewbies question/topic?  That's even
odder to me...

sorry for the rant,

greg k-h


