CS 4850 - Programming Languages
HW1
Summer II 2007
Given: July 2, 2007
Due: PART1: July 11, 2007 - postponed to 7/13/07 (1:30pm);
PART2: July 13, 2007 - postponed to 7/18/07 (1:30pm)
This assignment
involves recognizing tokens, building a symbol table,
and constructing token strings for a LISP-type language. You
are to write all code in C++/Java. As an example of what your program
should produce for its token strings, consider the following:
Input
( CAR '(B D C))
(+ +15 ( * -20 2))
(CONS '(B D C) (CONS "A (B) 'C D" '(BB BB)))
^Z
Output (NOTE: separate the output lines for each line of input, as shown):
( CAR '(B D C))
( 157 161 ( 250 80 9 ) )
(+ +15 ( * -20 2))
( 35 100 ( 74 102 91 ) )
(CONS '(B D C) (CONS "A (B) 'C D" '(BB BB)))
( 34 161 ( 250 80 9 ) ( 34 209 161 ( 65 65 ) ) )
:
Symbol Table
Address | Contents      | Type          |
9       | C             | Token         |
34      | CONS          | Token         |
35      | +             | Token         |
65      | BB            | Token         |
74      | *             | Token         |
80      | D             | Token         |
81      | (             | LeftParenTok  |
91      | 2             | Integer       |
100     | +15           | Integer       |
102     | -20           | Integer       |
157     | CAR           | Token         |
161     | '             | QuoteTok      |
168     | )             | RightParenTok |
209     | "A (B) 'C D"  | String        |
250     | B             | Token         |
Of course, the Symbol Table
addresses shown are arbitrary; yours may be quite different. Print actual
parentheses, but symbol addresses as shown for other symbols. For now,
the only types you must recognize are "Token", "LeftParenTok",
"RightParenTok", "QuoteTok", "Integer",
and "String".
When testing your program you
may use a terminal for I/O, but for the later runs use the input file
hw1.dat.
Before printing the token string for each line of input, you are to print
the input line (as in the example). Compress the output line to 6 characters
per symbol, except parentheses, which should take two characters (e.g., "( ").
After the EOF (^Z) is encountered, print the non-empty symbol
table elements in order of address (as in the example).
Symbol Table Construction
Use 256 elements numbered 0-255.
Each element should allow for a string of at most 28
characters and a type code. Access the table through a very simple hashing
function. Of course you must be able to handle collisions.
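As an illustration of the table layout described above, here is a minimal C++ sketch, not a required design. The names SymbolTable, Entry, SymType, and lookupOrInsert are hypothetical. The hash (sum of character codes modulo 256), the collision strategy (linear probing), and the policy of truncating text longer than 28 characters are assumptions; the assignment only asks for a "very simple" hash and some way of handling collisions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Type codes named after the types listed in the assignment; Empty marks an unused slot.
enum class SymType { Empty, Token, LeftParenTok, RightParenTok, QuoteTok, Integer, String };

struct Entry {
    std::string text;               // stored contents, at most 28 characters
    SymType type = SymType::Empty;  // type code
};

class SymbolTable {
public:
    static constexpr int kSize = 256;           // slots 0..255
    static constexpr std::size_t kMaxLen = 28;  // maximum stored length

    // Assumed "very simple" hash: sum of character codes modulo the table size.
    static int hash(const std::string& s) {
        unsigned sum = 0;
        for (unsigned char c : s) sum += c;
        return static_cast<int>(sum % 256u);
    }

    // Returns the address of the symbol, inserting it if it is not present.
    // Returns -1 if every slot is occupied by a different symbol.
    int lookupOrInsert(const std::string& raw, SymType type) {
        const std::string text = raw.substr(0, 28);  // truncate long text (assumed policy)
        int addr = hash(text);
        for (int probes = 0; probes < 256; ++probes) {
            Entry& e = slots_[addr];
            if (e.type == SymType::Empty) {  // free slot: store the new symbol here
                e.text = text;
                e.type = type;
                return addr;
            }
            if (e.text == text) return addr;  // symbol already in the table
            addr = (addr + 1) % 256;          // collision: linear probing to the next slot
        }
        return -1;
    }

    // Read-only access, e.g. for printing the non-empty entries in address order.
    const Entry& at(int addr) const { return slots_[addr]; }

private:
    std::array<Entry, kSize> slots_;
};
```

With this sketch, "AB" and "BA" hash to the same slot, so the second one inserted lands in the next free address. Printing the table at EOF is then a loop over addresses 0 to 255 that skips entries whose type is Empty.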
Recognizing Tokens and Constructing Token Strings
Implement this first through
a finite state machine (fsm) and then using lex, both of which will be
discussed in class. Use the following rules.
Parentheses, blanks, single
quotes, and double quotes are delimiters. Each identifies the end of the
current token. In addition, parentheses and single quotes themselves are
tokens. Double quotes delineate strings and should be stored as a part
of the string. Of course blanks, parentheses, and single quotes within
strings are not delimiters.
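The delimiter rules above can be sketched as a small three-state machine in C++. This is an illustration under stated assumptions, not the FSM to be turned in: the names State, Lexeme, isInteger, and tokenize are hypothetical, an Integer is assumed to be an optional sign followed by one or more digits (consistent with "+15", "-20", and "2" in the example), tabs are treated like blanks, and an unterminated string is assumed to end at the end of the line.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

enum class State { Start, InToken, InString };

struct Lexeme {
    std::string text;  // characters of the token (strings keep their double quotes)
    std::string type;  // one of the type names listed in the assignment
};

// Assumed Integer rule: optional '+' or '-' followed by at least one digit.
bool isInteger(const std::string& s) {
    if (s.empty()) return false;
    std::size_t i = (s[0] == '+' || s[0] == '-') ? 1 : 0;
    if (i == s.size()) return false;  // a lone sign is an ordinary Token
    for (; i < s.size(); ++i)
        if (!std::isdigit(static_cast<unsigned char>(s[i]))) return false;
    return true;
}

std::vector<Lexeme> tokenize(const std::string& line) {
    std::vector<Lexeme> out;
    std::string cur;
    State state = State::Start;

    // Ends the current ordinary token, if any, and returns to the Start state.
    auto flush = [&]() {
        if (!cur.empty()) out.push_back({cur, isInteger(cur) ? "Integer" : "Token"});
        cur.clear();
        state = State::Start;
    };

    for (char c : line) {
        if (state == State::InString) {  // inside a string only '"' is special
            cur += c;
            if (c == '"') {
                out.push_back({cur, "String"});
                cur.clear();
                state = State::Start;
            }
            continue;
        }
        switch (c) {
            case '(':  flush(); out.push_back({"(", "LeftParenTok"});  break;
            case ')':  flush(); out.push_back({")", "RightParenTok"}); break;
            case '\'': flush(); out.push_back({"'", "QuoteTok"});      break;
            case '"':  flush(); cur = "\""; state = State::InString;   break;
            case ' ':
            case '\t': flush(); break;                 // blanks only end a token
            default:   cur += c; state = State::InToken; break;
        }
    }
    if (state == State::InString) out.push_back({cur, "String"});  // unterminated string
    else flush();
    return out;
}
```

For the example line (+ +15 ( * -20 2)) this yields nine lexemes, with "+" and "*" classified as Token and "+15", "-20", and "2" as Integer. Each case of the switch corresponds to one column of an FSM transition table.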
C++/Java Design and Implementation
There are two parts to this
assignment. Part1: Implement the FSM directly using a
high-level language such as C/C++/Java. Part2: Use lex
to generate the FSM and the required output. Start by doing Part1 and
then move on to Part2.
You should know that most of
what you develop for this assignment in part2 will also be used in the
semester project, so do a GOOD JOB! Use top-down and object-oriented
design. Use extensive modularization, abstraction, and information hiding.
The main program can be a simple controller for the rest of the program.
Document your program well. All packages/classes should tell where they
are used and who uses them. You do not need to handle any exceptions for
this assignment. Programming assignments that are just spaghetti / monolithic
programs coded in Java/C++ will be heavily penalized or not accepted at
all.
Implement token strings as
linked lists of symbol table addresses.
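A token string held as a linked list of addresses might look like the following C++ sketch. It uses the standard library's std::list (a doubly linked list) rather than a hand-written list, which is an assumption about what is acceptable. The names TokenString and formatTokenString are hypothetical, and left-aligning each address in its 6-character field is also an assumption, since the assignment states only the field widths.

```cpp
#include <cassert>
#include <iomanip>
#include <list>
#include <sstream>
#include <string>

// A token string: symbol-table addresses in the order the tokens were read.
using TokenString = std::list<int>;

// Formats one token string. Addresses equal to lparenAddr or rparenAddr are
// printed as actual parentheses in 2 characters; every other address is
// printed in a 6-character field (left-aligned here, by assumption).
std::string formatTokenString(const TokenString& ts, int lparenAddr, int rparenAddr) {
    std::ostringstream out;
    for (int addr : ts) {
        if (addr == lparenAddr)      out << "( ";
        else if (addr == rparenAddr) out << ") ";
        else                         out << std::left << std::setw(6) << addr;
    }
    return out.str();
}
```

Using the addresses from the example symbol table, the line ( CAR '(B D C)) corresponds to the list 81, 157, 161, 81, 250, 80, 9, 168, 168, and it would be formatted with formatTokenString(ts, 81, 168).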
Turn in: (with your name and WMU-ID number hand printed
on the outside of your hardcopy submission)
- Verified script showing
input and output for assigned test runs.
- Complete listing of program
and documentation.
- A complete table showing
your fsm. This may be computer printed or hand printed, but should be
carefully labeled for readability.
- To get the script/log file
use "script" command of Unix. (After
script, cat the source program files and input files, run the program,
cat the output files - for steps 1 and 2 [and 3 if you typed in your
fsm].)
- Your top-down and object-oriented
design diagrams.
- HIGHLIGHT APPROPRIATE CODE
WITH YELLOW.
- Label the items you turn
in and turn them all together.
- A copy of your zipped file
consisting of your source codes, scripts (to run your program), a
couple of sample executions of your solution and report.
- Use <hw#cs4850_yourlastname_mmddyy.{zip,ppt,doc,tex}>
as the naming convention for your zipped, ppt, MS-Word, or LaTeX files
when emailing your submission to gupta@cs.wmich.edu. Replace '#' with
the appropriate homework number.
Penalty for Late Submission
10% per day (including weekends).
Turn in programs at the start of class when due.
Any student may be asked to
show and discuss his solution in class, so be ready with your presentation.
For programming assignments, submit a zipped file of your source codes,
scripts (to run your program) and report along with a hardcopy of your
source codes, scripts, a couple of sample executions of your solution
and report.
Use <hw#cs4850_yourlastname_mmddyy.{zip,ppt,doc,tex}>
as the naming convention for your zipped, ppt, MS-Word, or LaTeX files
when emailing your submission to ajay.gupta@wmich.edu. Replace '#' with
the appropriate homework number.
REMINDER:
You are responsible for making yourself aware of and understanding the policies
and procedures in the Undergraduate (pp. 268-270) [Graduate (pp. 24-26)]
Catalog that pertain to Academic Integrity. Additionally,
easy availability of information, material, source codes, lecture notes,
etc. on the Internet may make it possible to find solutions to your assignments
on the Internet or elsewhere. It is okay to refer to those, understand
them, and use them to enhance your solutions, generate your own ideas, etc.
However, you must give proper and full credit to original authors of the
work, if you include their ideas. Failing to do so is part of academic
and professional dishonesty. It will not be tolerated in this class. Do
not give in to temptations....