Linux Kernel contains only C code?

valdis.kletnieks at vt.edu valdis.kletnieks at vt.edu
Fri Feb 2 01:52:30 EST 2018


On Thu, 01 Feb 2018 11:37:26 -0500, Aruna Hewapathirane said:

> Somethings are not so obvious like what could possibly be a *.y file or
> *.tc file ? If you type in find -name "*.y" in my case i see:
>
> aruna at debian:~/linux-4.15$ find -name "*.y"

> Now if I pass that to the 'file' command ...
>
> aruna at debian:~/linux-4.15$ file `find -name "*.y"` // yes you need those
> back ticks :)
>
> ./drivers/scsi/aic7xxx/aicasm/aicasm_macro_gram.y: C source, ASCII text

> So 'file' tells us these are C program files ? Lets verify ? If you 'cat'
> any of these files you will see it is actual C code. Why does it have a
> file extension of .y ?

Actually, if you look more closely at it, it's *not* actually C code.  The 'file'
command makes its best guess, triggered off things like '#include', etc.

The 'tl;dr' answer is "The *.y are input files for bison, and the *.l are
input files for flex".

The more detailed explanation, with 50 years of computer history....

[/usr/src/linux-next] head -20 tools/perf/util/expr.y
/* Simple expression parser */
%{
#include "util.h"
#include "util/debug.h"
#define IN_EXPR_Y 1
#include "expr.h"
#include "smt.h"
#include <string.h>

#define MAXIDLEN 256
%}

%pure-parser
%parse-param { double *final_val }
%parse-param { struct parse_ctx *ctx }
%parse-param { const char **pp }
%lex-param { const char **pp }

%union {
	double num;

That's got a bunch of % in unusual places for C, doesn't it? :)

Let's hit the rewind button back 5 decades or so, when tools for building
programs were just becoming created.  And everybody who wanted to write a
compiler for a language, or parsing data that wasn't strict 'ID is in columns
5-12' formatting, or a whole bunch of other stuff, had to write a parser to do
the parsing.

For those who have never done it, writing lexical scanners and parsers by hand
is a thankless job. I know from experience that the parse table for an LALR
parser for Pascal ends up being essentially a spreadsheet with some 300 rows
and 400 columns that you get to fill in by hand one at a time - and getting one
entry wrong means you have a buggy compiler (I took Compiler Design in college
the last year we had to do it by hand)

The first few compiled languages (COBOL, FORTRAN, and a few others) also had to
make do with hand-coded parsers.  And then in 1958, Algol happened, and it
spawned all sorts of languages - everything from C to PL/I to Pascal and
probably 200+ others (pretty much every language that allows nested
declarations and has begin/end tokens of some sort owes it to Algol).  And the
other thing about Algol was that it was a much "bigger" language than previous
ones, so John Backus invented a meta-language called BNF to provide a formal
specification of the syntax.

(For those who are curious, a EBNF specification for Pascal syntax is here:
http://www.fit.vutbr.cz/study/courses/APR/public/ebnf.html

The interesting thing about BNF is that it has these things called "production
rules" which define what legal programs look like - and the test for "legal"
can be done with a parser using a software/mathematical construct called a
"finite state machine" (and the 3 of you who understand the difference between
a context-sensitive grammar and a context-free grammar can stop giggling right
now.. ;)

So somebody had the bright idea that if you had a formal BNF specification, you
could write a program that would read the BNF, and spit out the C source for a
parser skeleton based on a finite state machine.  And thus were born two
programs called 'lex' (a lexical scanner - something that reads the source, and
outputs code that says "Hey, that's the word 'struct'" or "we just saw a
'for"). and another called 'yacc' (Yet Another Compiler Compiler) which did the
higher level "this is a legal function, but *that* right there is a messed-up
'if' statement that has a syntax error" stuff.  Oh, and generate output code, too.

Of course, that was decades ago, and eventually somebody wrote a faster 'lex' -
and thus was born /usr/bin/flex.  And yacc needed work, so the improved version
was, of course, called bison (think about it for a bit..)


-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 486 bytes
Desc: not available
URL: <http://lists.kernelnewbies.org/pipermail/kernelnewbies/attachments/20180202/ce8d0b93/attachment-0001.sig>


More information about the Kernelnewbies mailing list