Notes

Notes - notes.io

Prithvi Narayan Campus Institute of Science and Technology
Department Of Computer Science and Information Technology
Tribhuvan University

Lab Report On
Compiler Design and Construction (CSC – 365)

Submitted to:
Mr. Prithvi Raj Paneru

Submitted by:
Kabita Subedi
Roll No: 27557/077

Lab Index
Name: Kabita Subedi Roll. No: 27557/077
Level: Bachelor Year/Sem: 3rd/6th
Subject: Compiler Design and Construction Course No.: CSC-365

S.N Experiment Date of
submission Remarks
1. TO SIMULATE DETERMINISTIC FINITE AUTOMATA.
2081/02/21
2. TO SIMULATE NON-DETERMINISTIC FINITE AUTOMATA.
2081/02/21
3. TO DESIGN SIMPLE LEXICAL ANALYZER FOR TOKENIZATION.
2081/02/21
4. TO CHECK WHETHER THE IDENTIFIER IS VALID OR NOT.
2081/02/21
5. TO CHECK VALID SINGLE LINE OR MULTILINE COMMENTS.
2081/02/21
6. TO COMPUTE FIRST AND FOLLOW IN A CONTEXT FREE GRAMMAR.
2081/02/21
7. TO CONSTRUCT A LL(1) PARSER. 2081/02/21
8. TO GENERATE THREE ADDRESS CODE FOR GIVEN SOURCE CODE. 2081/02/21
9. TO GENERATE ASSEMBLY CODE FOR GIVEN THREE ADDRESS CODE. 2081/02/21

LAB 1: Finite Automata

Experiment 1: To simulate deterministic finite automata
1. Objective:
• To learn to create DFA.
• To write a C program to implement the created DFA.
2. Theory:
Deterministic Finite Automaton (DFA)
In DFA, for each input symbol, one can determine the state to which the machine will move. Hence, it is called Deterministic Automaton. As it has a finite number of states, the machine is called Deterministic Finite Machine or Deterministic Finite Automaton.

Formal Definition of a DFA
A DFA is defined by a Quin tuplet (5- tuple) (Q, ∑, δ, q0, F) where −
Q: a finite set of states.
∑: a finite set of symbols called the alphabet.
δ: the transition function where δ: Q × ∑ → Q
q0: the initial state from where any input is processed (q0 ∈ Q).
F: a set of final state/states of Q (F ⊆ Q). Representation of DFA

Representation of DFA
We can represent DFA into two forms i.e., transition diagram (graphical) and transition
table
Representation using transition diagram:
A DFA is represented by digraphs called state diagram.
1) For each state in DFA, there must be a node in a graphical representation.
2) For each transition function denoted as δ(p,a)=q there is an arc from state p to q labelled with a.
3) There must be a input arc to starting state with label START.
4) Accepting states/final states are represented by double circled node.
Representation using transition table:
• The row of table corresponds to the states while column corresponds to the input symbol.
• The starting state in the table is represented by δ followed by state where as final state is represented as *(state).

Example:
Let a deterministic finite automaton be →
I. Consider a DFA;
Q = {q0, q1, q2, q3}
Σ = {0, 1}
q0 = q0
F = {q0}
δ = Q × Σ -> Q
Then the transition table for above DFA is as follows:

δ 0 1
* -> q0 q2 q1
q1 q3 q0
q2 q0 q3
q3 q1 q2

This DFA accepts strings having both an even number of 0‘s & even number of 1‘s.
The corresponding transition diagram is

3. Demonstration:
Program 1: Program to simulate DFA accepting language L = { w | w starts with 01 over ∑ = {0,1}

Source Code
#include <stdio.h>
#include <stdlib.h>
enum state {q0,q1,q2,qf};
enum state delta(enum state s, char ch)
{
enum state current_state;
switch (s)
{
case q0:
if (ch == '0')
current_state = q1;
else
current_state = q2;
break;
case q1:
if (ch == '1')
current_state = qf;
else
current_state = q2;
break;
case qf:
if (ch == '0')
current_state = qf;
else
current_state = qf;
break;
case q2:
if (ch == '0')
current_state = q2;
else
current_state = q2;
break;
}
return current_state;
}
int main()
{
char ch, choice;
char input[100];
enum state current_state = q0;
int i = 0;
printf("------DFA TO ACCEPT STRING STARTS WITH 01------");
printf("n Enter the String: ");
gets(input);
ch = input[i];
while (ch != '')
{
current_state = delta(current_state, ch);
ch = input[++i];
}
if (current_state == qf)
printf("Acceptedn");
else
printf("Rejectedn");
return 0;
}

Output

Result and Discussion:
Program 1 is written to simulate DFA accepting language L = {w | w starts with 01 over ∑ = {0,1} which successfully accepts the strings 011010101010 but rejects the strings 1010.

Program 2: Program to simulate DFA accepting language L = { w | w ends with 011 over
∑ = {0,1}
Source Code
#include <stdio.h>
#include <stdlib.h>
enum state {q0,q1,q2,qf};
enum state delta(enum state s, char ch)
{
enum state current_state;
switch (s)
{
case q0:
if (ch == '0')
current_state = q1;
else
current_state = q1;
break;
case q1:
if (ch == '1')
current_state = q2;
else
current_state = q1;
break;
case q2:
if (ch == '1')
current_state = qf;
else
current_state = q1;
break;
case qf:
if (ch == '0')
current_state = q1;
else
current_state = q0;
break;
}
return current_state;
}

int main()
{
char ch, choice;
char input[100];
enum state current_state = q0;
int i = 0;
printf("------DFA TO ACCEPT STRING THAT ENDS WITH 011------");
printf("n Enter the String: ");
gets(input);
ch = input[i];
while (ch != '')
{
current_state = delta(current_state, ch);
ch = input[++i];
}
if (current_state == qf)
{
printf("Acceptedn");
}
else
{
printf("Rejectedn");
}
return 0;
}

Output

Result and Discussion:
Program 2 is written to simulate DFA accepting language L = {w | w ends with 011 over
∑ = {0,1} which successfully accepts the strings 1010011 and but rejects the strings 01110101001.

4. Conclusion:
Hence, in this experiment we simulate two different types of DFA, one which accepts any string that starts with 01 and another DFA which accepts any string that ends 011. These DFA were successfully created and implemented using C program.

Experiment 2: To simulate non-deterministic finite automata

1. Objective:
• To learn to create NFA.
• To write a C program to implement the created NFA.
2. Theory:
Non-deterministic Finite Automaton
In NDFA, for a particular input symbol, the machine can move to any combination of the states in the machine. In other words, the exact state to which the machine moves cannot be determined. Hence, it is called Non-deterministic Automaton. As it has finite number of states, the machine is called Non-deterministic Finite Machine or Non-deterministic Finite Automaton.
Formal Definition of an NDFA
An NDFA can be represented by a 5-tuple (Q, ∑, δ, q0, F) where −
• Q is a finite set of states.
• ∑ is a finite set of symbols called the alphabets.
• δ is the transition function where δ: Q × ∑ → 2Q
• (Here the power set of Q (2Q) has been taken because in case of NDFA, from a state, transition can occur to any combination of Q states)
• q0 is the initial state from where any input is processed (q0 ∈ Q).
• F is a set of final state/states of Q (F ⊆ Q).

Representation of NFA
We can represent DFA into two forms i.e., transition diagram (graphical) and transition table
Representation using transition diagram:
An NDFA is represented by digraphs called state diagram.
The vertices represent the states.
• The arcs labeled with an input alphabet show the transitions.
• The initial state is denoted by an empty single incoming arc.
• The final state is indicated by double circles.
Representation using transition table:
• The row of table corresponds to the states while column corresponds to the input symbol.
• The starting state in the table is represented by 🡪 followed by state where as final state is represented as *(state).

Example: NFA accepting all strings that end in 01.

Here, from state q1, there is no any arc for input symbol 0 & no any arc out of q2 for 0 & 1. So, we can conclude in a NFA, there may be zero no. of arcs out of each state for each input symbol. While in DFA, it has exactly one arc out of each state for each input symbol.
3. Demonstration:
Program 1: Program to simulate NFA accepting language L = { w | w starts with 01 over ∑ = {0,1}
#include <stdio.h>
int main()
{
enum state {q0,q1,qf};
char ch;
char input[100];
enum state current_state = q0;
int i = 0;
int flag = 0;
printf("------NDFA TO ACCEPT STRING STARTS WITH 01------");
printf("n Enter the String: ");
gets(input);
ch = input[i];
while (ch != ''){
switch(current_state)
{
case q0:
if (ch == '0')
current_state = q1;
else
flag = 1;
break;
case q1:
if (ch == '1')
current_state = qf;
else
flag=1;
break;
case qf:
if(ch=='0'||ch=='1')
current_state = qf;
break;
}
if(flag)
break;
ch=input[++i];}
if (current_state == qf)
printf("The string %s is Accepted.",input);
else
printf("The string %s is Rejected.",input);
return 0;}

Output

Result and Discussion:
Program 1 is written to simulate NFA accepting language L = {w | w starts with 01 over ∑ = {0,1} which successfully accepts the strings 010000110 but rejects the string 10101001.

Program 2: Program to simulate NFA over ∑ = {0,1} that accepts all the strings containing 001 as substring
Source Code
#include <stdio.h>
#include <string.h>
char input[20];
int l, flag;
void s(int i)
{
flag =1;
}
void r(int i)
{
if(i<l)
{
if(input[i]=='1')
{
++i;
s(i);
}
}
}
void q(int i)
{
if(i<l)
{
if(input[i]=='0')
{
++i;
r(i);
}
}

}
void p(int i)
{
if(i<l)
{
int k=i;
if(input[i]=='0')
{
k++;
p(k);
q(k);
}
else
{
if(input[i]=='1')
{
i++;
p(i);
}
}
}
}

int main()
{
printf("-----NFA to accept 001 as substring-----n");
printf("Enter a string: ");
gets(input);
l=strlen(input);
int i=0;
flag=0;
p(i);
if(flag==1)
{
printf("nThe string is accepted");
}
else
{
printf("nThe string is rejected");
}
return 0;
}

Output

Result & Discussion:
Program 2 is written to simulate DFA accepting language L = {w | w ends with 011 over ∑ = {0,1} which successfully accepts the strings 100100011 but rejects the strings 101010111.

4. Conclusion:
Hence, in this experiment we simulate two different types of NFA, one which accepts any string that starts with 01 and another NFA which accepts any string that contains 001 as substring. These NFA were successfully created and implemented using C program.

LAB 2: Lexical Analysis

Experiment 1: To design simple lexical analyzer for tokenization.

Objective:
• To learn about C identifiers and keywords
• To write a C program to identify the valid C identifiers and keywords.
2. Theory:
Tokenization is the process of breaking a given input text into smaller units called tokens. Each token represents a meaningful unit of information, such as words, symbols, or numbers. In a C program, tokenization can be implemented using string manipulation functions and loops.
Keywords:
• Keywords are predefined words for a C programming language.
• All keywords have fixed meaning and these meanings cannot be changed.
• The keywords cannot be used as variable names.
A list of 32 keywords in the c language is given below:

C identifiers:
Identifiers are names given to different entities such as constants, variables, structures,
functions, etc.
Every word used in C program to identify the name of variables, functions, arrays,
pointers and symbolic constants are known as identifiers.
 These are user defined names consisting of arbitrarily long sequence of letters and digits
with either a letter or the underscore ( _ ) as a first character.
 There are certain rules that should be followed while naming C identifiers:
 They must begin with a letter or underscore (_).
 They must consist of only letters, digits, or underscore. No other special character is
allowed.
 It should not be a keyword.
 It must not contain white space.
 It should be up to 31 characters long as only first 31 characters are significant.
 Uppercase and lowercase letters are not interchangeable
Eg: int money;
double accountBalance;
Here, money and accountBalance are identifiers.
Also remember, identifier names must be different from keywords. we cannot use int as an identifier .
3. Demonstration:
Program 1: Program to design simple lexical analyzer for tokenization.
Source Code:
//To implement tokenization in c program
#include<stdlib.h>
#include<stdio.h>
#include<string.h>
#include<ctype.h>
#include<process.h>
int iskeywords(char b[])
{
char keywords[32][10] =
{
"auto","double","int","struct","break","else","long","switch","case","enum","register","typedef","char","extern","return","union","const","float","short","unsigned","continue","for","signed","void","default","goto","sizeof","volatile","do","if","static","while"
};
int i, flag=0;
for(i=0; i<32; ++i)
{
if(strcmp(keywords[i],b)==0)
{
flag=1;
break;
}
}
return flag;
}
int main()
{
char ch, buffer[15], operators[]="+-*/%=";
FILE *fp;
int i, j=0;
fp=fopen("aa.txt","r");
if(fp==NULL)
{
printf("n File Cannot be opened");
exit(0);
}
while((ch=fgetc(fp))!=EOF)
{
for(i=0; i<6; i++)
{
if(ch==operators[i])
printf("%c is operator n",ch);
}
if(isalnum(ch))
{
buffer[j++]=ch;
}
else if((ch==''|| ch=='n')&&j!=0)
{
buffer[j]='';
j=0;
if(iskeywords(buffer)==1)
printf("%s is a keyword n",buffer);
else
printf("%s is a identifern",buffer);
}
}
fclose(fp);
return 0;
}

OUTPUT:

Result and Discussion: The program is written to identify the given string as valid C identifiers and keywords, where voidmain was identified as identifer, inabc and cab was identified as valid C identifier and = and + are operator.

4. Conclusion:
Hence, in this experiment we learned about valid C identifiers and keywords along with a C Program which can identify whether a string is C identifier or operator.
Experiment 2:

Title: To check whether the identifier is a valid or not.

1. Objective:
• To learn about C identifiers.
• To write a C program to identify the identifier is a valid or not.

2. Theory:
Given a string str, the task is to check if the string is a valid identifier or not. In order to qualify as a valid identifier, the string must satisfy the following conditions:
• It must start with an either underscore() or any of the characters from the ranges [‘a’, ‘z’] and [‘A’, ‘Z’].
• There must not be any white space in the string.
• And, all the subsequent characters after the first character must not consist of any special characters like $, #, % etc.
Eg: int money;
double accountBalance;
Here, money and accountBalance are identifiers.
Also remember, identifier names must be different from keywords. we cannot use int as an identifier.

3. Demonstration:
Program 1: Program to test whether the identifier in a program is a valid or not.
Source Code:
#include<stdio.h>
#include<conio.h>
#include<ctype.h>
#include<process.h>
int main()
{
char a[10];
int flag, i=1;
printf("n Enter the Identifier: ");
gets(a);
if(isalpha(a[0])||a[0]=='_')
flag=1;
else
{
printf("n Not a valid identifer");
exit (0);
}

while(a[i]!='')
{
if(!isdigit(a[i])&& !isalpha(a[i])&& a[i]!='_')
{
flag=0;
break;
}
i++;
}
if(flag==1)
printf("n Valid Identifer") ;
else
printf("n Invalid Identifer");
}

OUTPUT:

Result and Discussion:
The program is written to identify the identifiers is a valid identifier or not.

4. Conclusion:
Hence, in this experiment we learned to identify the given string is a valid identifier or not.

Experiment 3:
Title: To check valid single line or multiline comments.
1. Objectives:
The objective of this lab is to create a simple program in C that validates single-line and multi-line comments. The program will determine whether a given input string is a valid comment in the C programming language.

2. Theory:
To determine whether a given string is a valid comment, we need to define the rules for what constitutes a valid comment in the programming language or context you are working with. In the code you provided, the following rules are used to determine the validity of a comment:

• Single-line Comment: If the input string starts with "//", it is considered a valid comment.
• Multi-line Comment: If the input string starts with "/*" and ends with "*/", it is considered a valid comment.

3. Demonstration:
Source Code:
//To test whether given entered string is valid single line or multiline comments.
#include <stdio.h>
#include <string.h>

// Function to check if a string is a valid single-line comment
int isValidSingleLineComment(const char* str) {
// Check if the string starts with //
return strncmp(str, "//", 2) == 0;
}

// Function to check if a string is a valid multiline comment
int isValidMultilineComment(const char* str) {
int len = strlen(str);
// Check if the string starts with /* and ends with */
return len >= 4 && strncmp(str, "/*", 2) == 0 && strncmp(str + len - 2, "*/", 2) == 0;
}

int main() {
char input[256];

printf("Enter a comment to validate: ");
fgets(input, sizeof(input), stdin);

// Remove newline character at the end if present
size_t len = strlen(input);
if (len > 0 && input[len - 1] == 'n') {
input[len - 1] = '';
}

if (isValidSingleLineComment(input)) {
printf("The input is a valid single-line comment.n");
} else if (isValidMultilineComment(input)) {
printf("The input is a valid multiline comment.n");
} else {
printf("The input is not a valid comment.n");
}

return 0;
}

Output:

4. Result and Discussion:
The above C program produced the expected results, successfully identifying valid single-line and multiline comments as well as invalid comments.

5. Conclusion:
The implemented C program effectively validates whether a given input string is a valid single-line or multiline comment according to the C programming language syntax. This validation can be useful for parsing and analyzing C source code, ensuring that comments are correctly formatted.

LAB 3: Syntax Analysis

Experiment 1: To compute FIRST and FOLLOW in a Context Free Grammar.

1. Objectives:
The objective of this lab is to create a C program that computes the FIRST and FOLLOW sets for a given Context-Free Grammar (CFG). The FIRST and FOLLOW sets are fundamental concepts in compiler design and are used in syntax analysis, particularly in constructing parsing tables for LL(1) parsers.

2. Theory:
In a CFG, the FIRST set of a non-terminal is the set of terminals that begin the strings derivable from that non-terminal. The FOLLOW set of a non-terminal is the set of terminals that can appear immediately to the right of that non-terminal in some "sentential" form.

These sets help in predictive parsing by determining which rule to apply based on the next input symbol.

Methodology to compute:
1. Define Structures:
- Create structures to represent the grammar, FIRST sets, and FOLLOW sets.

2. Input Grammar:
- Read the grammar rules from the user or a file.

3. Compute FIRST Sets:
- Implement a function to compute the FIRST set for each non-terminal.

4. Compute FOLLOW Sets:
- Implement a function to compute the FOLLOW set for each non-terminal.

3. Demonstration:
Source Code:
//To compute FIRST and FOLLOW in a context free grammar.
#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>

#define MAX 10

struct Grammar {
char production[MAX][MAX];
int numProductions;
char nonTerminals[MAX];
int numNonTerminals;
} grammar;

struct Sets {
char first[MAX][MAX];
char follow[MAX][MAX];
} sets;

void addToSet(char *set, char ch) {
int len = strlen(set);
for (int i = 0; i < len; i++) {
if (set[i] == ch) {
return;
}
}
set[len] = ch;
set[len + 1] = '';
}

void computeFirst(char symbol, char *result) {
if (!isupper(symbol)) {
addToSet(result, symbol);
return;
}
for (int i = 0; i < grammar.numProductions; i++) {
if (grammar.production[i][0] == symbol) {
if (grammar.production[i][2] == '') {
addToSet(result, '#');
} else {
int j = 2;
while (grammar.production[i][j] != '') {
computeFirst(grammar.production[i][j], result);
if (strchr(sets.first[grammar.production[i][j] - 'A'], '#') == NULL) {
break;
}
j++;
}
}
}
}
}

void computeFollow(char symbol) {
if (grammar.nonTerminals[0] == symbol) {
addToSet(sets.follow[symbol - 'A'], '$');
}
for (int i = 0; i < grammar.numProductions; i++) {
for (int j = 2; j < strlen(grammar.production[i]); j++) {
if (grammar.production[i][j] == symbol) {
if (grammar.production[i][j + 1] != '') {
computeFirst(grammar.production[i][j + 1], sets.follow[symbol - 'A']);
}
if (grammar.production[i][j + 1] == '' || strchr(sets.first[grammar.production[i][j + 1] - 'A'], '#') != NULL) {
computeFollow(grammar.production[i][0]);
strcat(sets.follow[symbol - 'A'], sets.follow[grammar.production[i][0] - 'A']);
}
}
}
}
}

int main() {
printf("Enter the number of productions: ");
scanf("%d", &grammar.numProductions);
printf("Enter the productions (e.g., A=BC):n");
for (int i = 0; i < grammar.numProductions; i++) {
scanf("%s", grammar.production[i]);
grammar.nonTerminals[i] = grammar.production[i][0];
}
grammar.numNonTerminals = strlen(grammar.nonTerminals);

// Compute FIRST sets
for (int i = 0; i < grammar.numNonTerminals; i++) {
computeFirst(grammar.nonTerminals[i], sets.first[grammar.nonTerminals[i] - 'A']);
}

// Compute FOLLOW sets
for (int i = 0; i < grammar.numNonTerminals; i++) {
computeFollow(grammar.nonTerminals[i]);
}

// Print FIRST sets
printf("nFIRST sets:n");
for (int i = 0; i < grammar.numNonTerminals; i++) {
printf("FIRST(%c) = { %s }n", grammar.nonTerminals[i], sets.first[grammar.nonTerminals[i] - 'A']);
}

// Print FOLLOW sets
printf("nFOLLOW sets:n");
for (int i = 0; i < grammar.numNonTerminals; i++) {
printf("FOLLOW(%c) = { %s }n", grammar.nonTerminals[i], sets.follow[grammar.nonTerminals[i] - 'A']);
}

return 0;
}

Output:

4. Result and Discussion:
The program was tested with various grammars and produced the expected FIRST and FOLLOW sets, confirming its correctness.

5. Conclusion:
The implemented C program effectively computes the FIRST and FOLLOW sets for a given context-free grammar. These sets are essential for constructing parsing tables and understanding the structure of the grammar, aiding in syntax analysis during the compilation process.

Experiment 2:
Title: To construct a LL(1) parser.

1. Objectives:
The objective of this lab is to create an LL(1) parser in the C programming language. An LL(1) parser is a type of top-down parser for context-free grammars that uses a single lookahead token to make parsing decisions. This report outlines the steps to construct such a parser and demonstrates its functionality with a sample grammar.

2. Theory:
An LL(1) parser processes input from left to right and produces a leftmost derivation with 1 lookahead token. It requires the following:
- FIRST and FOLLOW sets for the grammar.
- A parsing table constructed from the FIRST and FOLLOW sets.
- A stack-based parsing mechanism to read and match input tokens against the grammar rules.

Methodology to construct:
1. Define Structures:
- Structures to represent grammar rules, FIRST and FOLLOW sets, and the parsing table.

2. Input Grammar:
- Read the grammar rules from the user or a file and compute FIRST and FOLLOW sets.

3. Construct Parsing Table:
- Build the LL(1) parsing table using the FIRST and FOLLOW sets.

4. Parsing Mechanism:
- Implement a stack-based parser that uses the parsing table to parse input strings.

3. Demonstration:
Source Code:
//To construct a LL(1) parser using C program.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

#define MAX 10
#define MAX_RULES 10

struct Grammar {
char productions[MAX_RULES][MAX];
int numProductions;
char nonTerminals[MAX];
int numNonTerminals;
char terminals[MAX];
int numTerminals;
} grammar;

struct Sets {
char first[MAX][MAX];
char follow[MAX][MAX];
} sets;

char parsingTable[MAX][MAX][MAX];

void addToSet(char *set, char ch) {
if (!strchr(set, ch)) {
int len = strlen(set);
set[len] = ch;
set[len + 1] = '';
}
}

void computeFirst(char symbol, char *result) {
if (!isupper(symbol)) {
addToSet(result, symbol);
return;
}
for (int i = 0; i < grammar.numProductions; i++) {
if (grammar.productions[i][0] == symbol) {
if (grammar.productions[i][2] == '') {
addToSet(result, '#');
} else {
for (int j = 2; grammar.productions[i][j] != ''; j++) {
computeFirst(grammar.productions[i][j], result);
if (!strchr(sets.first[grammar.productions[i][j] - 'A'], '#')) {
break;
}
}
}
}
}
}

void computeFollow(char symbol) {
if (grammar.nonTerminals[0] == symbol) {
addToSet(sets.follow[symbol - 'A'], '$');
}
for (int i = 0; i < grammar.numProductions; i++) {
for (int j = 2; grammar.productions[i][j] != ''; j++) {
if (grammar.productions[i][j] == symbol) {
if (grammar.productions[i][j + 1] != '') {
computeFirst(grammar.productions[i][j + 1], sets.follow[symbol - 'A']);
}
if (grammar.productions[i][j + 1] == '' || strchr(sets.first[grammar.productions[i][j + 1] - 'A'], '#')) {
computeFollow(grammar.productions[i][0]);
strcat(sets.follow[symbol - 'A'], sets.follow[grammar.productions[i][0] - 'A']);
}
}
}
}
}

void constructParsingTable() {
for (int i = 0; i < grammar.numNonTerminals; i++) {
for (int j = 0; j < grammar.numTerminals; j++) {
parsingTable[i][j][0] = '';
}
}
for (int i = 0; i < grammar.numProductions; i++) {
char nonTerminal = grammar.productions[i][0];
char firstSet[MAX] = "";
computeFirst(grammar.productions[i][2], firstSet);
for (int j = 0; j < strlen(firstSet); j++) {
if (firstSet[j] != '#') {
int row = nonTerminal - 'A';
int col = firstSet[j] - 'a';
strcpy(parsingTable[row][col], grammar.productions[i]);
}
}
if (strchr(firstSet, '#')) {
char followSet[MAX] = "";
strcat(followSet, sets.follow[nonTerminal - 'A']);
for (int j = 0; j < strlen(followSet); j++) {
int row = nonTerminal - 'A';
int col = followSet[j] - 'a';
strcpy(parsingTable[row][col], grammar.productions[i]);
}
}
}
}

void parseString(char *input) {
char stack[MAX] = "$S";
int top = 1;
int i = 0;
printf("nParsing Steps:n");
printf("%-10s%-10s%-10sn", "Stack", "Input", "Action");

while (top >= 0) {
printf("%-10s%-10s", stack, input + i);

char stackTop = stack[top];
char currentInput = input[i];

if (stackTop == '$' && currentInput == '$') {
printf("Acceptedn");
return;
}
if (stackTop == currentInput) {
top--;
i++;
printf("Match %cn", currentInput);
} else if (isupper(stackTop)) {
int row = stackTop - 'A';
int col = currentInput - 'a';
if (strlen(parsingTable[row][col]) == 0) {
printf("Errorn");
return;
}
printf("Output %sn", parsingTable[row][col]);
top--;
for (int j = strlen(parsingTable[row][col]) - 1; j >= 2; j--) {
if (parsingTable[row][col][j] != '#') {
stack[++top] = parsingTable[row][col][j];
}
}
} else {
printf("Errorn");
return;
}
}
}

int main() {
printf("Enter the number of productions: ");
scanf("%d", &grammar.numProductions);
printf("Enter the productions (e.g., S=AB|a):n");
for (int i = 0; i < grammar.numProductions; i++) {
scanf("%s", grammar.productions[i]);
}

grammar.numNonTerminals = 0;
grammar.numTerminals = 0;

for (int i = 0; i < grammar.numProductions; i++) {
char nonTerminal = grammar.productions[i][0];
if (!strchr(grammar.nonTerminals, nonTerminal)) {
grammar.nonTerminals[grammar.numNonTerminals++] = nonTerminal;
}
for (int j = 2; grammar.productions[i][j] != ''; j++) {
if (!isupper(grammar.productions[i][j]) && !strchr(grammar.terminals, grammar.productions[i][j])) {
grammar.terminals[grammar.numTerminals++] = grammar.productions[i][j];
}
}
}

for (int i = 0; i < grammar.numNonTerminals; i++) {
computeFirst(grammar.nonTerminals[i], sets.first[grammar.nonTerminals[i] - 'A']);
}

for (int i = 0; i < grammar.numNonTerminals; i++) {
computeFollow(grammar.nonTerminals[i]);
}

constructParsingTable();

printf("nFIRST sets:n");
for (int i = 0; i < grammar.numNonTerminals; i++) {
printf("FIRST(%c) = { %s }n", grammar.nonTerminals[i], sets.first[grammar.nonTerminals[i] - 'A']);
}

printf("nFOLLOW sets:n");
for (int i = 0; i < grammar.numNonTerminals; i++) {
printf("FOLLOW(%c) = { %s }n", grammar.nonTerminals[i], sets.follow[grammar.nonTerminals[i] - 'A']);
}

printf("nParsing Table:n");
for (int i = 0; i < grammar.numNonTerminals; i++) {
for (int j = 0; j < grammar.numTerminals; j++) {
if (strlen(parsingTable[i][j]) > 0) {
printf("M[%c, %c] = %sn", grammar.nonTerminals[i], grammar.terminals[j], parsingTable[i][j]);
}
}
}
}
Output:

4. Result and Discussion:
The above C program produced the expected results, successfully identifying valid LL(1) parsing table for a given context free grammar.

5. Conclusion:
The implementation of an LL(1) parser in C successfully demonstrates the construction and operation of a top-down parser for a given context-free grammar. The program correctly computes the FIRST and FOLLOW sets, constructs the parsing table, and uses a stack-based approach to parse input strings.

LAB 4: Intermediate Code Generation and Machine Code Generation.

Experiment 1: To generate three address code for given source code

1.Objectives:
The objective of this lab is to create a C program that generates three-address code (TAC) for a given source code.

2. Theory:
Three-address code is a type of intermediate representation in compiler design where each instruction at most contains three addresses (operands). It is useful for optimization and simplifies the translation to machine code. TAC typically includes operations like assignment, arithmetic operations, and conditional jumps.

3. Demonstration:
Source Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

#define MAX 100

typedef struct {
char op[3];
char arg1[10];
char arg2[10];
char result[10];
} TAC;

TAC tac[MAX];
int tempCount = 0;
int tacCount = 0;

void generateTAC(char* expr);
void parseExpression(char* expr, char* result);

void generateTemp(char* temp) {
sprintf(temp, "t%d", tempCount++);
}

void addTAC(char* op, char* arg1, char* arg2, char* result) {
strcpy(tac[tacCount].op, op);
strcpy(tac[tacCount].arg1, arg1);
strcpy(tac[tacCount].arg2, arg2);
strcpy(tac[tacCount].result, result);
tacCount++;
}

void parseExpression(char* expr, char* result) {
char *pos = strchr(expr, '+');
if (pos != NULL) {
char left[MAX], right[MAX];
strncpy(left, expr, pos - expr);
left[pos - expr] = '';
strcpy(right, pos + 1);
char temp1[10], temp2[10];
parseExpression(left, temp1);
parseExpression(right, temp2);
generateTemp(result);
addTAC("+", temp1, temp2, result);
return;
}

pos = strchr(expr, '*');
if (pos != NULL) {
char left[MAX], right[MAX];
strncpy(left, expr, pos - expr);
left[pos - expr] = '';
strcpy(right, pos + 1);
char temp1[10], temp2[10];
parseExpression(left, temp1);
parseExpression(right, temp2);
generateTemp(result);
addTAC("*", temp1, temp2, result);
return;
}

if (isalpha(expr[0])) {
strcpy(result, expr);
} else {
generateTemp(result);
addTAC("=", expr, "", result);
}
}

void generateTAC(char* expr) {
char result[10];
parseExpression(expr, result);
addTAC("=", result, "", expr);
}

int main() {
char expr[MAX];
printf("Enter the expression: ");
scanf("%s", expr);

generateTAC(expr);

printf("nThree-Address Code:n");
for (int i = 0; i < tacCount; i++) {
if (strlen(tac[i].arg2) > 0) {
printf("%s = %s %s %sn", tac[i].result, tac[i].arg1, tac[i].op, tac[i].arg2);
} else {
printf("%s = %sn", tac[i].result, tac[i].arg1);
}
}

return 0;
}

4.Output:

5. Result and Discussion:
The above C program produced the expected results, successfully identifying valid three address code for a given source code as into intermediate representation form.

6. Conclusion:
The implementation of a three-address code generator in C successfully demonstrates the conversion of high-level source code into an intermediate representation. The generated TAC simplifies the process of translating source code into machine code.

Experiment 2:
Title: To generate assembly code for given three address code.

1.Objectives:
The objective of this lab is to create a C program that generates assembly code from given three-address code (TAC).

2.Theory:
Three-address code is an intermediate representation used in compilers, where each instruction contains at most three addresses (two operands and one result). Assembly code is a low-level programming language that is closely related to machine code instructions.

3.Demonstration:
Source Code:
//To generate assembly code for given source code using C program.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX 100

typedef struct {
char op[3];
char arg1[10];
char arg2[10];
char result[10];
} TAC;

TAC tac[MAX];
int tacCount = 0;

void readTAC();
void generateAssembly();
void generateInstruction(char* instruction, char* result, char* arg1, char* arg2);

int main() {
readTAC();
generateAssembly();
return 0;
}

void readTAC() {
printf("Enter the number of three-address code instructions: ");
scanf("%d", &tacCount);

printf("Enter the three-address code instructions (format: result op arg1 arg2):n");
for (int i = 0; i < tacCount; i++) {
scanf("%s %s %s %s", tac[i].result, tac[i].op, tac[i].arg1, tac[i].arg2);
}
}

void generateAssembly() {
char instruction[MAX];

printf("nGenerated Assembly Code:n");
for (int i = 0; i < tacCount; i++) {
if (strcmp(tac[i].op, "=") == 0) {
sprintf(instruction, "MOV %s, %s", tac[i].result, tac[i].arg1);
} else if (strcmp(tac[i].op, "+") == 0) {
generateInstruction("ADD", tac[i].result, tac[i].arg1, tac[i].arg2);
} else if (strcmp(tac[i].op, "-") == 0) {
generateInstruction("SUB", tac[i].result, tac[i].arg1, tac[i].arg2);
} else if (strcmp(tac[i].op, "*") == 0) {
generateInstruction("MUL", tac[i].result, tac[i].arg1, tac[i].arg2);
} else if (strcmp(tac[i].op, "/") == 0) {
generateInstruction("DIV", tac[i].result, tac[i].arg1, tac[i].arg2);
}
printf("%sn", instruction);
}
}

void generateInstruction(char* instruction, char* result, char* arg1, char* arg2) {
char line1[MAX], line2[MAX], line3[MAX];
sprintf(line1, "MOV R1, %s", arg1);
sprintf(line2, "%s R1, %s", instruction, arg2);
sprintf(line3, "MOV %s, R1", result);
printf("%sn%sn%sn", line1, line2, line3);
}

4.Output:

5. Result and Discussion:
The above C program produced the expected results, successfully generating a assembly code for the given three address code.

6. Conclusion:
The implementation of the C program successfully generates assembly code from given three-address code. The program reads the TAC instructions, translates each instruction into corresponding assembly code, and outputs the result.

Notes is a web-based application for online taking notes. You can take your notes and share with others people. If you like taking long notes, notes.io is designed for you. To date, over 8,000,000,000+ notes created and continuing...

With notes.io;

* You can take a note from anywhere and any device with internet connection.
* You can share the notes in social platforms (YouTube, Facebook, Twitter, instagram etc.).
* You can quickly share your contents without website, blog and e-mail.
* You don't need to create any Account to share a note. As you wish you can use quick, easy and best shortened notes with sms, websites, e-mail, or messaging services (WhatsApp, iMessage, Telegram, Signal).
* Notes.io has fabulous infrastructure design for a short link and allows you to share the note as an easy and understandable link.

Fast: Notes.io is built for speed and performance. You can take a notes quickly and browse your archive.

Easy: Notes.io doesn’t require installation. Just write and share note!

Short: Notes.io’s url just 8 character. You’ll get shorten link of your note when you want to share. (Ex: notes.io/q )

Free: Notes.io works for 14 years and has been free since the day it was started.

You immediately create your first note and start sharing with the ones you wish. If you want to contact us, you can use the following communication channels;

Email: [email protected]

Twitter: http://twitter.com/notesio

Instagram: http://instagram.com/notes.io

Facebook: http://facebook.com/notesio

Regards;
Notes.io Team

Notes

Notes - notes.io

Shortened Note Link

Long File

Notes