Determining patch impact on a specific config

Greg KH greg at kroah.com
Wed Aug 17 13:34:02 EDT 2016


On Wed, Aug 17, 2016 at 04:50:30PM +0000, Nicholas Mc Guire wrote:
> > But you aren't comparing that to the number of changes that are
> > happening in a "real" release.  If you do that, you will see the
> > subsystems that never mark things for stable, which you totally miss
> > here, right?
> 
> we are not looking at the runup to 4.4 here - we are looking at
> the fixes that go into 4.4.1++, and for those we look at all
> commits in linux-stable. So that should cover ALL subsystems
> for which bugs were discovered and fixed (either in 4.4.X or
> ported from other 4.X findings).

No, because (see below)...

> > For example, where are the driver subsystems that everyone relies on
> > that are changing upstream, yet have no stable fixes?  What about the
> > filesystems that even more people rely on, yet have no stable fixes?
> > Are those code bases just so good and solid that there are no bugs to be
> > fixed?  (hint, no...)
> 
> that is not what we are claiming - the model here is that
> operation is uncovering bugs and the critical bugs are being
> fixed in stable releases. That there are more fixes and lots
> of cleanups that go into stable is clear, but with respect to
> the usability of the kernel we do assume that if a bug in
> driver X is found that results in this driver being unusable
> or destabilizing the kernel, it would be fixed in the stable
> releases as well (which is also visible in the close to 50% of
> fixes being in drivers) - now if that assumption is overly
> naive then you are right - and the assessment will not hold

No, that's not how bugs normally get found/fixed.  They aren't found in
older kernels for the most part; they are found in the "latest" kernel
and then sometimes tagged that they should be backported.

All of the automated testing/debugging that we have going on to fix
issues are on the latest kernel release, not the older releases.  We
might get lucky and get bug reports from a good distro like Debian or
Fedora that is running the latest stable kernel, but usually those
reports are "backport these fixes to the stable kernel please" as the
fixes have already been made by the community.

But this does depend on the subsystem/area as well.  Some arches don't
even test every kernel; they only wake up once a year and start sending
patches in.  Some subsystems are the same (scsi was known for this...)
So things are all over the place.

Also, you have the subsystems and arches that are just quiet for
long stretches of time (like scsi used to be), where patches would queue
up for many releases before they finally got merged.  Some arches only
send patches in every other year for anything more complex than
build-breakage bugs, because they just don't care.

So please, don't assume that the patches I apply to a LTS kernel are due
to someone noticing it in that kernel.  It's almost never the case, but
of course there are exceptions.

Again, I think you are trying to attribute a pattern to something that
doesn't have it, based on how I have been seeing kernel development work
over the years.

> > So because of that, you can't use the information about what I apply to
> > stable trees as an indication that those are the only parts of the
> > kernel that have bugs to be fixed.
> 
> so a discovered critical bug found in 4.7 that is also found
> to apply to, say, 4.4.14 would *not* be fixed in the 4.4.15 stable
> release?

Maybe, depends on the subsystem.  I know some specific ones that the
answer to that would be no.  And that's the subsystem maintainers
choice, I can't tell him to do extra work just because I decided to
maintain a specific kernel version for longer than expected.

> > > > So be careful about what you are trying to measure, it might just be not
> > > > what you are assuming it is...
> > > 
> > > An R^2 of 0.76 does indicate that the commits with Fixes: tags in the
> > > 4.4 series represent the overall stable fixes quite well.
> > 
> > "overall stable fixes".  Not "overall kernel fixes", two very different
> > things, please don't confuse the two.
> 
> I'm not - we are looking at stable fixes, not kernel fixes, the
> reason for that simply being that for kernel fixes it is not
> possible to say if they are bug-fixes or optimizations/enhancements
> - at least not in any automated way.

I agree, it's hard, if not impossible, to do that :)
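[As an illustration of the R^2 figure being debated here: the check amounts to correlating, per sublevel release, the number of Fixes:-tagged commits with the total number of stable commits. The sketch below uses made-up counts for four hypothetical releases, not the actual 4.4.y data.]

```python
# Per-release counts: Fixes:-tagged commits vs. all stable commits.
# These four data points are made-up placeholders, not real 4.4.y data.
tagged = [10, 20, 30, 40]   # hypothetical Fixes:-tagged commits per release
total  = [30, 40, 70, 60]   # hypothetical total stable commits per release

n = len(tagged)
mean_x = sum(tagged) / n
mean_y = sum(total) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(tagged, total))
sxx = sum((x - mean_x) ** 2 for x in tagged)
syy = sum((y - mean_y) ** 2 for y in total)
r_squared = sxy ** 2 / (sxx * syy)   # square of the Pearson correlation
print(f"R^2 = {r_squared:.2f}")      # -> R^2 = 0.72
```

An R^2 near 1 would mean the tagged subset tracks the overall stable fix volume almost perfectly; the 0.76 quoted in the thread is read as "reasonably representative".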

> The focus on stable dot releases and their fixes was chosen 
>  * because it is manageable
>  * because we assume that critical bugs discovered will be fixed
>  * and because there are no optimizations or added features 

The first one makes this easier for you; the second and third are not
always true.  There have been big patchsets merged into longterm
stable kernel releases that were done because they were "optimizations",
and the maintainer of that subsystem and I discussed it and deemed it
a valid thing to accept.  This happens every 6 months or so if you
look closely.  The mm subsystem is known for this :)

And as for #2, again, I'm at the whim of the subsystem maintainer to
mark the patches as such.  And again, this does not happen for all
subsystems.

> > And because of that, I would state that "overall stable fixes" number
> > really doesn't mean much to a user of the kernel.
> 
> It does for those that are using some LTS release, and it says
> something about the probability of a bug in a stable release
> being detected. Or would you say that 4.4.13 is not to be
> expected to be better off than 4.4.1 ?

Yes, I would hope that it is better, otherwise why would I have accepted
the patches to create that kernel?  :)

But you can't make the claim that all bugs that are being found are
being added to the stable kernels, and especially not the lts kernels.

> From the data we have
> looked at so far - the life-time of a bug in -stable as well as
> the discovery rate of bugs in sublevel releases -
> it seems clear that the reliability of the kernel over
> sublevel releases is increasing, and that this can be utilized
> to select a kernel version more suitable for HA or critical
> systems based on trending/analysis.

That's good, I'm glad we aren't regressing.  But the only way you can be
sure to get all fixes is to always use the latest kernel release.
That's all the kernel developers will ever guarantee.

"critical" and HA systems had better be updating to newer kernel
releases as they have all of the fixes in it that they need.  There
shouldn't be any "fear" of changing to a new kernel any more than they
should fear moving to a new .y stable release.

> > Over time, more people are using the "fixes:" tag, but then that messes
> > with your numbers because you can't compare the work we did this year
> > with the work we did last year.
> 
> sure, why not ? You must look at relative usage and correlation
> of the tags - currently about 36% of the stable commits in the
> dot-releases (sublevels) are a usable basis - if the use of
> Fixes: increases, all the better - it just means we are moving
> towards an R^2 of 1 - results stay comparable, it just means
> that the confidence intervals for the current data are wider
> than for the data of next year.

Depends on how you describe "confidence" levels, but sure, I'll take
your word for it :)
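[For the record, the "wider confidence intervals" point has a standard reading: with the observed share of Fixes:-tagged commits treated as a sample proportion, a normal-approximation interval shrinks as the sample grows. A minimal sketch, where the commit count is a made-up placeholder and only the 36% figure comes from the mail:]

```python
# Normal-approximation 95% confidence interval for the share of
# stable commits carrying a Fixes: tag.  The sample size is a
# made-up placeholder, not a measured value.
import math

n_commits = 3000   # hypothetical number of stable commits examined
p = 0.36           # observed share with a Fixes: tag (figure from the mail)
z = 1.96           # ~95% normal quantile

half_width = z * math.sqrt(p * (1 - p) / n_commits)
print(f"95% CI: {p - half_width:.3f} .. {p + half_width:.3f}")
```

With more releases (and so more tagged commits) the half-width falls with the square root of the sample size, which is why next year's data would give tighter intervals without invalidating this year's.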

> > Also, our rate of change has increased, and the number of stable patches
> > being tagged has increased, based on me going around and kicking
> > maintainers.  Again, because of that you can't compare year to year at
> > all.
> 
> why not ? We are not selecting a specific class of bugs in any
> way - the Fixes: tags are fairly randomly distributed across the
> effective fixes in stable - it may be a bit biased because some
> maintainer does not like Fixes: tags and her subsystem is
> significantly more complex/more buggy/better tested/etc. than
> the average subsystem - so we would get a bit of a bias into it
> all - but that does not invalidate the results.
> You can ask the voters in 3 states who they will elect as president
> and this will give you a less accurate result than if you ask in
> all 50 states, but if you factor that uncertainty into the
> result it is perfectly valid and stays comparable to other results
> 
> I'm not saying that you can simply compare numeric values for
> 2016 with those from 2017, but you can compare the trends and
> the expectations if you model the uncertainties.

Ok, fair enough.  As long as we continue to do better I'll be happy.

> Note that we have a huge advantage here - we can make predictions
> from models - say predict 4.4.16 and then actually check our models

That's good, and is what I've been telling people that they should be
doing for a long time.  Someone actually went and ran regression tests
on all 3.10.y kernel releases and found no regressions for their
hardware platform.  That's a good thing to see.
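[The "predict, then check" approach mentioned above can be sketched as a one-step extrapolation of a fitted linear trend. The per-release fix counts and release numbers below are made-up placeholders for illustration only:]

```python
# Fit a simple linear trend to per-release fix counts and extrapolate
# one sublevel release ahead.  All counts are made-up placeholders.
releases = [1, 2, 3, 4, 5]          # 4.4.1 .. 4.4.5 (illustrative)
fixes    = [50, 48, 44, 45, 38]     # hypothetical fixes per release

n = len(releases)
mean_x = sum(releases) / n
mean_y = sum(fixes) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(releases, fixes))
         / sum((x - mean_x) ** 2 for x in releases))
intercept = mean_y - slope * mean_x

next_release = 6                    # i.e. the upcoming 4.4.6
prediction = slope * next_release + intercept
print(f"predicted fixes in 4.4.{next_release}: {prediction:.1f}")
```

The model's value then comes from comparing the prediction against the count actually observed once that release ships, which is exactly the kind of regression-test-style validation described in the reply.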

> Now if there are really significant changes, like the task struct
> being redone, then that may have a large impact, and the assumption
> that the convoluted parameter "sublevel" is describing a more or
> less stable development might be less correct - it will not be
> completely wrong - and consequently the prediction quality will
> suffer - but does that invalidate the approach ?

I don't know, you tell me :)

> > There's also the "bias" of the long-term and stable maintainer to skew
> > the patches they review and work to get applied based on _why_ they are
> > maintaining a specific tree.  I know I do that for the trees I maintain,
> > and know the other stable developers do the same.  But those reasons are
> > different, so you can't compare what is done to one tree vs. another one
> > very well at all because of that bias.
> 
> If the procedures applied do not "jump" but evolve then bias is
> not an issue - you can find many factors that will increase the
> uncertainty of any such prediction - but if the parameters, which
> all are convoluted - be it by personal preferences of maintainers,
> selection of a specific FS in mainline distributions, etc. - still
> represent the overall development, and as long as your bias, as you
> called it, does not flip-flop from 4.4.6 to 4.4.7, we do not care
> too much.

Ok, but I don't think the users of those kernels will like that, as you
can't represent bias in your numbers and perhaps a whole class of users
is being ignored for one specific LTS release.  Then they would get no
bugfixes for their areas :(

> > > > > Some early results where presented at ALS in Japan on July 14th
> > > > > but this still needs quite a bit of work.
> > > > 
> > > > Have a pointer to that presentation?
> > > >
> > > They probably are somewhere on the ALS site - but I just dropped
> > > them to our web-server at
> > >   http://www.opentech.at/Statistics.pdf and
> > >   http://www.opentech.at/TechSummary.pdf
> > > 
> > > This is quite a rough summary - so if anyone wants the actual data
> > > or R commands used - let me know - no issue with sharing this and having
> > > people tell me that I'm totally wrong :)
> > 
> > Interesting, I'll go read them when I get the chance.
> > 
> > But I will make a meta-observation: it's "interesting" that people go
> > and do analysis of development processes like this, yet never actually
> > talk to the people doing the work about how they do it, nor how they
> > could possibly improve it based on their analysis.
> 
> I do talk to the people - I've been doing this quite a bit - one of
> the reasons for hopping over to ALS was precisely that. We have been
> publishing our stuff all along, including any findings, patches,
> etc.

What long-term stable kernel maintainer have you talked to?

Not me :)

> BUT: I'm not going to go to LinuxCon and claim that I know how
>      to do better - not based on the preliminary data we have now
>  
> Once we think we have something solid - I'll be most happy to sit
> down and listen.
> 
> > 
> > We aren't just people to be researched, we can change if asked.
> > And remember, I _always_ ask for help with the stable development
> > process, I have huge areas that I know need work to improve, just no one
> > ever provides that help...
> 
> And we are doing our best to support that - be it by documentation
> fixes, compliance analysis, type safety analysis, and the appropriate
> patches I've been pestering maintainers with.

You have?  As a subsystem maintainer I haven't seen anything like this,
I guess no one relies on my subsystems :)

> But you do have to give us the time to have SOLID data first
> and NOT rush conclusions - as you pointed out here yourself,
> some of the assumptions we are making might well be wrong, so
> what kind of suggestions do you expect here ?
>  First get the data
>   -> make a model
>    -> deduce your analysis/sample/experiments
>     -> write it all up and present it to the community
>      -> get the feedback and fix the model
> and if after that some significant findings are left - THEN
> we will show up at LinuxCon and try to find someone to listen
> to what we think we have to say...

No need to go to LinuxCon, email works.  And lots of us go to much
better conferences as well (Plumbers, Kernel Recipes, FOSDEM, etc.) :)

thanks,

greg k-h



More information about the Kernelnewbies mailing list