Query on skb buffer (Kumar amit mehta)

Pranay Kumar Srivastava Pranay.Shrivastava at hcl.com
Thu Mar 7 00:51:48 EST 2013



> -----Original Message-----
> From: kernelnewbies-bounces at kernelnewbies.org
> [mailto:kernelnewbies-bounces at kernelnewbies.org] On Behalf Of
> kernelnewbies-request at kernelnewbies.org
> Sent: Thursday, March 07, 2013 8:52 AM
> To: kernelnewbies at kernelnewbies.org
> Subject: Kernelnewbies Digest, Vol 28, Issue 12
> 
> Today's Topics:
> 
>    1. Query on skb buffer  (Kumar amit mehta)
>    2. Re: Query on skb buffer (Valdis.Kletnieks at vt.edu)
>    3. Several unrelated beginner questions. (Konstantin Kowalski)
>    4. Re: Several unrelated beginner questions. (Gaurav Jain)
>    5. Re: Several unrelated beginner questions.
>       (Valdis.Kletnieks at vt.edu)
>    6. zap_low_mappings (ishare)
>    7. Re: zap_low_mappings (Valdis.Kletnieks at vt.edu)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Wed, 6 Mar 2013 10:39:13 -0800
> From: Kumar amit mehta <gmate.amit at gmail.com>
> Subject: Query on skb buffer
> To: kernelnewbies at kernelnewbies.org
> Message-ID: <20130306183913.GA3328 at gmail.com>
> Content-Type: text/plain; charset=us-ascii
> 
> My current understanding is that the skb, while being passed along the various
> layers of the Linux network stack, is manipulated mainly through the
> skb->{head|data|tail|end|len} fields.
> 
> Suppose that my application (say 'ping') sends an ICMP echo request with a
> large packet size of 4k, i.e. $ ping -s 4096 <dest addr>. Now, if alloc_skb(4096,
> GFP_KERNEL) is the routine that gets called to allocate the kernel buffer,
> then how does the kernel handle possible memory allocation failures, and
> how does the kernel manage large packet requests from the application?
> 
> -Amit
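
For reference, the way the head/data/tail/end/len fields move can be sketched in userspace. This is only a toy model: struct toy_skb and the toy_* helpers are hypothetical names invented for illustration, loosely mirroring what skb_reserve(), skb_put(), skb_push() and skb_pull() do to the linear part of a buffer. It is not the kernel's struct sk_buff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace toy model of the linear part of a buffer (hypothetical type). */
struct toy_skb {
    unsigned char *head;  /* start of the allocated buffer */
    unsigned char *data;  /* start of the valid data */
    unsigned char *tail;  /* end of the valid data */
    unsigned char *end;   /* end of the allocated buffer */
    size_t len;           /* bytes of valid data, tail - data in this model */
};

struct toy_skb *toy_alloc(size_t size)
{
    struct toy_skb *skb = malloc(sizeof(*skb));

    if (!skb)
        return NULL;
    skb->head = malloc(size);
    if (!skb->head) {
        free(skb);
        return NULL;
    }
    skb->data = skb->tail = skb->head;
    skb->end = skb->head + size;
    skb->len = 0;
    return skb;
}

void toy_free(struct toy_skb *skb)
{
    if (skb) {
        free(skb->head);
        free(skb);
    }
}

/* Like skb_reserve(): open up headroom in a still-empty buffer. */
void toy_reserve(struct toy_skb *skb, size_t n)
{
    assert(skb->len == 0 && n <= (size_t)(skb->end - skb->tail));
    skb->data += n;
    skb->tail += n;
}

/* Like skb_put(): grow the data area at the tail, return the new area. */
unsigned char *toy_put(struct toy_skb *skb, size_t n)
{
    unsigned char *old_tail = skb->tail;

    assert(n <= (size_t)(skb->end - skb->tail));
    skb->tail += n;
    skb->len += n;
    return old_tail;
}

/* Like skb_push(): prepend n bytes (a header) using the headroom. */
unsigned char *toy_push(struct toy_skb *skb, size_t n)
{
    assert(n <= (size_t)(skb->data - skb->head));
    skb->data -= n;
    skb->len += n;
    return skb->data;
}

/* Like skb_pull(): strip n bytes (a header) from the front. */
unsigned char *toy_pull(struct toy_skb *skb, size_t n)
{
    assert(n <= skb->len);
    skb->data += n;
    skb->len -= n;
    return skb->data;
}
```

On the way down the stack each layer would call something like toy_push() to prepend its header; on the way up each layer would call something like toy_pull() to strip it. The data itself is never copied, only the pointers move.
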
[Pranay Kumar Srivastava] Perhaps you should have a look at linear and
non-linear data (skb_frags, to be specific). That is how large data is
handled; however, I don't think you'll be doing that with ICMP or UDP.
Reading directly from skbuffs for UDP would also give you header
information, whereas with TCP it doesn't. So unless there's a specific need
for it, perhaps this can be done in userland, or by using sock_sendmsg or
sendfile (for zero copy).
	--P.K.S
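
To illustrate the linear/non-linear split, here is a rough userspace sketch. It only computes how a payload would be divided between a linear area and page-sized fragments; it allocates nothing and is not kernel code. The names toy_layout, toy_frag, TOY_PAGE_SIZE and TOY_MAX_FRAGS are hypothetical, and the page size and fragment limit are assumed values. The len/data_len bookkeeping follows the kernel convention that skb->len is the total and skb->data_len is the part held in fragments.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u  /* assumed page size */
#define TOY_MAX_FRAGS 17u    /* assumed fragment limit, for illustration */

struct toy_frag {
    size_t offset;  /* offset of this fragment within the payload */
    size_t size;    /* bytes held by this fragment */
};

struct toy_layout {
    size_t len;       /* total bytes, like skb->len */
    size_t data_len;  /* bytes held in fragments, like skb->data_len */
    size_t nr_frags;  /* number of fragments in use */
    struct toy_frag frags[TOY_MAX_FRAGS];
};

/*
 * Put as much as fits into the linear area, then spill the rest into
 * page-sized fragments. Returns 0 on success, -1 if the payload would
 * need more fragments than the (assumed) limit allows.
 */
int toy_layout_payload(struct toy_layout *l, size_t payload, size_t linear_room)
{
    size_t off = payload < linear_room ? payload : linear_room;

    assert(l != NULL);
    l->len = payload;
    l->data_len = 0;
    l->nr_frags = 0;

    while (off < payload) {
        size_t chunk = payload - off;

        if (chunk > TOY_PAGE_SIZE)
            chunk = TOY_PAGE_SIZE;
        if (l->nr_frags == TOY_MAX_FRAGS)
            return -1;
        l->frags[l->nr_frags].offset = off;
        l->frags[l->nr_frags].size = chunk;
        l->nr_frags++;
        l->data_len += chunk;
        off += chunk;
    }
    return 0;
}

/* Like skb_headlen(): bytes in the linear area. */
size_t toy_headlen(const struct toy_layout *l)
{
    return l->len - l->data_len;
}

/* Like skb_is_nonlinear(): true if any data lives in fragments. */
int toy_is_nonlinear(const struct toy_layout *l)
{
    return l->data_len != 0;
}
```

For example, a 10000-byte payload with 1500 bytes of linear room ends up as 1500 linear bytes plus three fragments of 4096, 4096 and 308 bytes.
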
> 
> 
> 
> ------------------------------
> 
> Message: 2
> Date: Wed, 06 Mar 2013 14:32:27 -0500
> From: Valdis.Kletnieks at vt.edu
> Subject: Re: Query on skb buffer
> To: Kumar amit mehta <gmate.amit at gmail.com>
> Cc: kernelnewbies at kernelnewbies.org
> Message-ID: <9932.1362598347 at turing-police.cc.vt.edu>
> Content-Type: text/plain; charset="us-ascii"
> 
> On Wed, 06 Mar 2013 10:39:13 -0800, Kumar amit mehta said:
> 
> > Now, if alloc_skb(4096, GFP_KERNEL) is the routine that gets called to
> > allocate the kernel buffer, then how does the kernel handle possible
> > memory allocation failures, and how does the kernel manage large
> > packet requests from the application?
> 
> Did you actually look at the source for use of alloc_skb() and how it handles
> error returns?
> 
> (Hint - the kernel doesn't do the same thing at every use of alloc_skb(),
> because an allocation failure needs to be handled differently depending on
> where it happens.  At some places, just bailing out and dropping the packet
> on the floor without any notification to anybody is appropriate.  At other
> places, we need to propagate an error condition to the caller).
> 
> Typical pattern (from net/core/sock.c):
> 
> /*
>  * Allocate a skb from the socket's send buffer.
>  */
> struct sk_buff *sock_wmalloc(struct sock *sk, unsigned long size, int force,
>                              gfp_t priority)
> {
>         if (force || atomic_read(&sk->sk_wmem_alloc) < sk->sk_sndbuf) {
>                 struct sk_buff *skb = alloc_skb(size, priority);
>                 if (skb) {
>                         skb_set_owner_w(skb, sk);
>                         return skb;
>                 }
>         }
>         return NULL;
> }
> EXPORT_SYMBOL(sock_wmalloc);
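
The accounting idea behind sock_wmalloc() can be mimicked in userspace roughly as follows. This is a hedged sketch: toy_sock, toy_buf, toy_wmalloc() and toy_wfree() are hypothetical names, malloc() stands in for alloc_skb(), and the toy charges the requested size where the kernel charges skb->truesize in skb_set_owner_w().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_sock {
    size_t wmem_alloc;  /* bytes currently charged, like sk->sk_wmem_alloc */
    size_t sndbuf;      /* send buffer limit, like sk->sk_sndbuf */
};

struct toy_buf {
    struct toy_sock *owner;  /* socket that was charged for this buffer */
    size_t size;             /* bytes charged */
};

/*
 * Same shape as sock_wmalloc(): allocate only if the socket is below its
 * send buffer limit, unless the caller forces the allocation.
 */
struct toy_buf *toy_wmalloc(struct toy_sock *sk, size_t size, int force)
{
    assert(sk != NULL);
    if (force || sk->wmem_alloc < sk->sndbuf) {
        struct toy_buf *buf = malloc(sizeof(*buf) + size);

        if (buf) {
            buf->owner = sk;
            buf->size = size;
            sk->wmem_alloc += size;  /* stands in for skb_set_owner_w() */
            return buf;
        }
    }
    return NULL;
}

/* Release the buffer and give the charge back to the owning socket. */
void toy_wfree(struct toy_buf *buf)
{
    if (!buf)
        return;
    buf->owner->wmem_alloc -= buf->size;
    free(buf);
}
```

Note that NULL can mean two different things here: the socket is over its limit, or the underlying allocator failed. The caller cannot tell them apart and has to pick one way of reporting the failure.
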
> 
> and then the caller does something like this (net/ipv4/ip_output.c, in
> function __ip_append_data()):
> 
>                          } else {
>                                 skb = NULL;
>                                 if (atomic_read(&sk->sk_wmem_alloc) <=
>                                     2 * sk->sk_sndbuf)
>                                         skb = sock_wmalloc(sk,
>                                                            alloclen + hh_len + 15, 1,
>                                                            sk->sk_allocation);
>                                 if (unlikely(skb == NULL))
>                                         err = -ENOBUFS;
>                                 else
>                                         /* only the initial fragment is
>                                            time stamped */
>                                         cork->tx_flags = 0;
>                         }
>                         if (skb == NULL)
>                                 goto error;
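
The two styles of failure handling mentioned in the hint (drop silently versus propagate an error) can be contrasted in a small userspace sketch. Everything here is hypothetical and for illustration only: toy_alloc_buf() stands in for alloc_skb(), toy_fail_next is a switch for simulating a failed allocation, and toy_rx_packet()/toy_send() are made-up callers, not kernel functions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Set to 1 to make the next allocation fail (simulation switch). */
int toy_fail_next;

/* Stand-in for alloc_skb(): returns NULL on (simulated) failure. */
void *toy_alloc_buf(size_t size)
{
    if (toy_fail_next) {
        toy_fail_next = 0;
        return NULL;
    }
    return malloc(size);
}

struct toy_stats {
    unsigned long rx_dropped;  /* packets dropped for lack of memory */
};

/*
 * Receive-style path: nobody is waiting for an answer, so count the
 * drop and carry on.
 */
void toy_rx_packet(struct toy_stats *st, size_t len)
{
    void *buf;

    assert(st != NULL);
    buf = toy_alloc_buf(len);
    if (!buf) {
        st->rx_dropped++;
        return;
    }
    /* A real path would hand the buffer up the stack here. */
    free(buf);
}

/*
 * Send-style path: a caller is waiting, so turn the NULL into an error
 * code and return it, in the same goto-error style as the excerpt above.
 */
int toy_send(size_t len)
{
    int err = 0;
    void *buf = toy_alloc_buf(len);

    if (buf == NULL) {
        err = -ENOBUFS;
        goto error;
    }
    /* A real path would fill and queue the buffer here. */
    free(buf);
    return 0;

error:
    return err;
}
```

In the send case the negative errno value would eventually reach the application as a failed system call; in the receive case the only trace is the drop counter.
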
> 
> 
> 
> ------------------------------
> 
> Message: 3
> Date: Wed, 06 Mar 2013 18:19:09 -0500
> From: Konstantin Kowalski <kostya-kow at mail.ru>
> Subject: Several unrelated beginner questions.
> To: kernelnewbies at kernelnewbies.org
> Message-ID: <5137CEED.4000807 at mail.ru>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> 
> Hello everyone,
> 
> I am interested in Linux kernel programming (and OS kernels in general),
> and I am currently reading several books about the Linux kernel. I have a few
> questions about it:
> 
> 
> 1.) Currently, I am reading 2 books about Linux kernel: Linux Device Drivers
> (3rd edition) and Linux Kernel Development (3rd edition).
> 
> I like both books and I am learning a lot from them.
> 
> I heard that both of these books are outdated, but so far all the information in
> these books seems valid and applicable. Are there better books you would
> recommend?
> 
> 2.) In Linux Device Drivers, it states that module_exit(function) is discarded if
> module is built directly into kernel or if kernel is compiled with option to
> disallow loadable modules. But what if the module still has to do something
> during shutdown? Releasing memory is unimportant since it does not persist
> over reboot, but what if the module has to write something to a disk file, or
> do some other action?
> 
> 3.) What's the deal with different kernel versions? I heard back in the 2.x
> days, even kernels were stable and odd versions were experimental, but
> with 2.6 it changed.
> 
> So with 3.x kernels, are all of them experimental in the beginning and stable
> in the end? Also, with 3.x new versions seem to be released more often than
> in 2.1-2.5 days. Did the release cycle get smaller or is it just my imagination?
> Also, what does rc number mean?
> 
> 4.) Currently, I am running linux-next, and it works great. Am I correct to
> assume that linux-next is supposed to have newest, shiniest and most
> unstable features? `uname -a` says that I am still running 3.8-next, but there
> is already 3.9 out. So which version is more experimental and least stable?
> Which one is the newest?
> 
> 5.) How exactly does make/.config work? When I run `make oldconfig`, does
> it use everything from the previous .config and only ask how to configure
> new features? And when I run `make`, does it re-use old object files if
> nothing was changed in the specific file, or does it re-compile everything
> from scratch?
> 
> Thank you,
> 
> Kostyantyn Kovalskyy (Konstantin Kowalski)
> 
> 
> 
> 
> ------------------------------
> 
> Message: 4
> Date: Thu, 7 Mar 2013 00:36:31 +0100
> From: Gaurav Jain <gjainroorkee at gmail.com>
> Subject: Re: Several unrelated beginner questions.
> To: Konstantin Kowalski <kostya-kow at mail.ru>
> Cc: Kernel Newbies <kernelnewbies at kernelnewbies.org>
> Message-ID:
> 	<CAAFF8wTV+EqQEPUNijBO67R+3SyKrFKbakA=P-i4vB7rrG3b0w at mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
> 
> Specifically regarding (3) and (4), please refer to this:
> http://unixtravails.blogspot.ch/2012/07/linux-versioning-system-and-development.html
> 
> Best Regards
> Gaurav Jain
> 
> 
> 
> 
> 
> 
> --
> Gaurav Jain
> Associate Software Engineer
> VxVM Escalations Team, SAMG
> Symantec Software India Pvt. Ltd.
> 
> ------------------------------
> 
> Message: 5
> Date: Wed, 06 Mar 2013 19:05:09 -0500
> From: Valdis.Kletnieks at vt.edu
> Subject: Re: Several unrelated beginner questions.
> To: Konstantin Kowalski <kostya-kow at mail.ru>
> Cc: kernelnewbies at kernelnewbies.org
> Message-ID: <78569.1362614709 at turing-police.cc.vt.edu>
> Content-Type: text/plain; charset="us-ascii"
> 
> On Wed, 06 Mar 2013 18:19:09 -0500, Konstantin Kowalski said:
> 
> > 1.) Currently, I am reading 2 books about Linux kernel: Linux Device
> > Drivers (3rd edition) and Linux Kernel Development (3rd edition).
> >
> > I like both books and I am learning a lot from them.
> >
> > I heard that both of these books are outdated, but so far all the
> > information in these books seems valid and applicable. Are there better
> > books you would recommend?
> 
> They're both still mostly applicable.  The concepts listed are still valid - certain
> things need to be locked at certain times, things have lifetimes, and so on.
> The "outdated" is mostly places where the API has changed slightly - for
> instance, where api_foo(struct bar *a, struct baz *b) is now api_quux(struct
> bar *a, struct baz *b, int blat).  So you can't cut-n-paste the code and expect
> it to still work.
> 
> > 2.) In Linux Device Drivers, it states that module_exit(function) is
> > discarded if module is built directly into kernel or if kernel is
> > compiled with option to disallow loadable modules. But what if the
> > module still has to do something during shutdown? Releasing memory is
> > unimportant since it does not persist over reboot, but what if the
> > module has to write something to a disk file, or do some other action?
> 
> If your module has allocated 128M for a graphics buffer, you'll think releasing
> memory is important. :)
> 
> Strictly speaking, a module *should* have already been quiesced and taken
> care of business before module_exit() is called - there shouldn't be much of
> anything left to do at that point.
> 
> (Hint - this is exactly the same question as "why is an empty ->release()
> function considered a Bad Thing" - it's because release() and similar are
> supposed to do the clean-up before the module exits)
> 
> > 3.) What's the deal with different kernel versions? I heard back in
> > the 2.x days, even kernels were stable and odd versions were
> > experimental, but with 2.6 it changed.
> 
> > So with 3.x kernels, are all of them experimental in the beginning and
> > stable in the end? Also, with 3.x new versions seem to be released
> > more often than in 2.1-2.5 days. Did the release cycle get smaller or
> > is it just my imagination? Also, what does rc number mean?
> 
> The 3.x series is exactly the same policy as 2.6 was - Linus just decided that
> 2.6.42 was too much and reset the counter, and he's been holding to pretty
> close to every three months for releases for all that time.
> 
> And 2.1 got up to 2.1.142 or something insane like that in fewer years than it
> took 2.6 to get to .42, so it isn't like releases are more frequent these days
> :)
> 
> > 4.) Currently, I am running linux-next, and it works great. Am I
> > correct
> 
> Lucky you.  I manage to break at least 2-3 things in linux-next per release
> cycle. ;)
> 
> > to assume that linux-next is supposed to have newest, shiniest and
> > most unstable features? `uname -a` says that I am still running
> > 3.8-next, but there is already 3.9 out. So which version is more
> > experimental and least stable? Which one is the newest?
> 
> Do another pull of the linux-next tree, it will say you're on 3.9-rc1-next now.
> And even when it said 3.8-next, that was already "3.8 plus all the patches
> queued for 3.9".  Now that Linus's tree is at 3.9-rc1, (closing the merge
> window for major additions for 3.9) people will be dumping 3.10 material into
> the linux-next tree.
> 
> > 5.) How exactly does make/.config work? When I run `make oldconfig`,
> > does it use everything from the previous .config and only ask how
> > to configure new features?
> 
> Yes, that's what *should* happen.
> 
> >                          And when I run `make` does it re-use old
> > object files if nothing was changed in the specific file, or does it
> > re-compile everything from scratch?
> 
> Try it and see. :)  Note that sometimes, an apparently innocuous config
> change can result in the rebuild of lots of files.  This is because some
> commonly used .h file has a #ifdef CONFIG_FOO in it - and when you change
> FOO, then everybody that includes that .h (even indirectly) ends up
> rebuilding.
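
A small sketch of why a config change can ripple through so many files. The header contents and the names foo_stats, foo_stats_size() and foo_stats_total() are hypothetical, and CONFIG_FOO is just the placeholder option from the text above; the point is only that the struct layout, and therefore the object code of every file that includes the header, depends on the option.

```c
#include <assert.h>
#include <stddef.h>

/* Contents of a hypothetical, widely included header (say foo_stats.h). */
struct foo_stats {
    unsigned long packets;
#ifdef CONFIG_FOO
    unsigned long foo_events;  /* only exists when CONFIG_FOO is set */
#endif
};

/* A hypothetical .c file that includes the header. */
size_t foo_stats_size(void)
{
    /* The value compiled in here changes when CONFIG_FOO changes. */
    return sizeof(struct foo_stats);
}

unsigned long foo_stats_total(const struct foo_stats *s)
{
    unsigned long total;

    assert(s != NULL);
    total = s->packets;
#ifdef CONFIG_FOO
    total += s->foo_events;
#endif
    return total;
}
```

Compiling this once as-is and once with -DCONFIG_FOO on the compiler command line gives different object code for both functions, even though the .c part was not edited. That is the reason every includer has to be rebuilt.
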
> 
> But in general, if you touch only 1 or 2 .c files and no widely used .h files,
> you'll just have to rebuild those .c's if they're modules.  If they're kernel
> builtins, there's another 10 or 12 things that have to happen, but it's still a lot
> faster than a full rebuild.
> 
> ------------------------------
> 
> Message: 6
> Date: Thu, 7 Mar 2013 10:33:18 +0800
> From: ishare <june.tune.sea at gmail.com>
> Subject: zap_low_mappings
> To: kernelnewbies at kernelnewbies.org
> Message-ID: <20130307023318.GA2940 at debian.localdomain>
> Content-Type: text/plain; charset=us-ascii
> 
> 
>   The kernel halts because the page mapping has been modified by
> zap_low_mappings.
> 
> 
>   Why should we call zap_low_mappings in the init procedure? It disrupts
> the page mapping.
> 
>   thanks!
> 
> 
> 
> ------------------------------
> 
> Message: 7
> Date: Wed, 06 Mar 2013 22:19:28 -0500
> From: Valdis.Kletnieks at vt.edu
> Subject: Re: zap_low_mappings
> To: ishare <june.tune.sea at gmail.com>
> Cc: kernelnewbies at kernelnewbies.org
> Message-ID: <8572.1362626368 at turing-police.cc.vt.edu>
> Content-Type: text/plain; charset="us-ascii"
> 
> On Thu, 07 Mar 2013 10:33:18 +0800, ishare said:
> >
> >   The kernel halts because the page mapping has been modified by
> > zap_low_mappings.
> >
> >
> >   Why should we call zap_low_mappings in the init procedure? It disrupts
> > the page mapping.
> 
> You might want to get yourself an up-to-date kernel, as the code you're
> asking about was removed almost 2 1/2 years ago.
> 
> zap_low_mappings was removed in October 2010 by this commit:
> 
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/arch/x86/mm/init_32.c?id=b40827fa7268fda8a62490728a61c2856f33830b
> 
> x86-32, mm: Add an initial page table for core bootstrapping
> 
> This patch adds an initial page table with low mappings used exclusively for
> booting APs/resuming after ACPI suspend/machine restart. After this,
> there's no need to add low mappings to swapper_pg_dir and zap them later
> or create own swsusp PGD page solely for ACPI sleep needs - we have
> initial_page_table for that.
> 
> Signed-off-by: Borislav Petkov <bp at alien8.de>
> LKML-Reference: <20101020070526.GA9588 at liondog.tnic>
> Signed-off-by: H. Peter Anvin <hpa at linux.intel.com>
> 
> 
> 
> ------------------------------
> 
> _______________________________________________
> Kernelnewbies mailing list
> Kernelnewbies at kernelnewbies.org
> http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
> 
> 
> End of Kernelnewbies Digest, Vol 28, Issue 12
> *********************************************

