Transcript for:
Understanding Context-Free Grammar and BNF

this video introduces the concept of a context-free grammar which is a form of grammar ideally suited to describing the syntax of programming languages the notation typically used to describe such grammars is a Bacchus naur form or BNF which was first used to describe the Algol 60 programming language Bacchus sonow are the names of the two individuals who developed and refine this particular form of presentation let's start off with an example here is a single grammar rule the digit is the left-hand side and the part to the right of this arrow is the right-hand side each of these individual numbers are terminals in the language of grammars they are also called lexemes the left-hand side is always a non-terminal though we'll also see some non terminals on the right-hand side in a moment non terminals have these angled brackets around them indicating that they can be expanded into something else these vertical bars represent or so what this rule says is that the entity named digit can transform into a 0 or a 1 or a 2 or so on basically any of the digits terminals are aspects of the grammar that cannot be simplified any further so this rule defines a single digit but what if we want to define numbers consisting of multiple digits well this next example will be another rule and it will use a non terminal on the right-hand side and in fact it will be a recursive rule so now our grammar has two rules or productions or production rules as they're sometimes called we have one rule for digit and then another rule for gnat which I'm using to represent natural number natural numbers are integers from 0 and up basically the non-negative integers so a natural number can be one of two things notice we still have a vertical bar but because this wouldn't fit nicely on one line the vertical bar is starting the new line but it still means or so a NAT is either a digit which is a non-terminal but we can find out what it did it is by looking at this other rule or a NAT is a digit followed by a net so NAT is defined recursively in terms of itself this first case with only the digit is a base case and this second case where NAT is called again is a recursive case these two rules allow us to create numbers of arbitrary length as shown by the following derivation a derivation starts with a given non terminal and then expands it one step at a time according to the rules now I can pick whatever rules I want but I will choose ones leading to a result which I think is illustrative so there's two cases for an at either digit or digit NAT I picked the second one and so that's one expansion now I'm gonna be doing what is called a leftmost derivation meaning I'm always going to expand things on the left first so a digit has to expand to one of these things so I'm going to arbitrarily pick nine that leaves us with nine that and at the next step I will have to expand than that because nine is the terminal and can no longer be expanded so this will expand to digit gnat and at the next level I'll have to pick a terminal for that digit and I will pick three and we still have the gnat there and then this level I have nine and three and what will I expand this to now well this time I'm gonna pick the digit rule so it simply expands to digit and we're only one step away from being done we expand this out into any of the digits and I'll pick the terminal zero so it was possible to represent the number 930 using our grammar and this is a derivation of this particular grammatical sentence unfortunately this grammar also allows us to produce sentences like the following gnat could expand into digit and that but then at the next level the particular digit I expand here is going to be a zero so I have zero in that and you'll see that the problem is that I have a number in which the very first digit is a zero and we would like to avoid that I only want my numbers to start with nonzero digits that's a more proper representation of a natural number so how do we fix this problem one solution is to transform our grammar as follows so I'm going to start from scratch and redefine a gnat to be either a single digit or something called a non zero followed by digits so these are two new non terminals and also I need to redefine digit because I've erased that and it's going to be slightly different this time around so what are digits so we have a new non terminal called digits with an S and this is going to be the recursive non-terminal so digits is either a digit or a single digit followed by digits so now we need to define digit and non-zero well a digit is either the terminal 0 or a non zero and non zero will be the collection of all the terminal digits from 1 to 9 so this grammar allows us to generate any non-negative integer also known as a natural number the trick is that we will start with a digit or a non zero so if we have a digit that is the only one we have that one could be a standalone zero or any of the values from 1 to 9 however if we're gonna have something that follows that first digit then that first one cannot be zero so we only allow a non zero here therefore will have a value 1 through 9 and then the digits portion of this recursively fills in some arbitrary number of digits which could include a zero because of this definition here and that's how we get all of the non-negative integers we'll be seeing many more examples of grammars as we explore other aspects of syntax in the upcoming videos