(Frequently Asked Questions)
Currently available in plain text only.
This is the PRECC FAQ, compiled by P.T. Breuer, last updated, June 4
1995, July 20 1995, for preccx 2.42 and 2.43.
ptb@dit.upm.es ptb@comlab.ox.ac.uk
CONTENTS
1. Support/maintenance etc.
1.1) Does precc run/compile on ... Sun3, Sun4, SPARC, VAXen, IBM PC? Gcc,
1.2) I'd like to get the latest version of precc. Can you point me to an
1.3) I'd like to be updated as to the current status of PRECC. Thanks.
1.4) Subject: precc .y files wanted, please
1.5) Are there other documents about PRECC, in particular, tutorial
1.6) There is a typo in the example in the man pages ...
1.7) Can you give me a list of languages which have their grammar already
1.8) Is PRECCX able to use YACC-grammars?
1.9) Some error messages are being sent into the .c generated by PRECC
1.10) Do you have any plans to make PRECCX available for use with C++ (GNU g++) ?
2. Language issues.
2.1) )foo( )bar( is parsed wrong.
2.2) One question - how does one use a : within an action?
2.3) Can I use attributes other than integers, floats and pointers.
2.4) How can I match every identifier except a keyword?
2.5) How can I construct the integer represented by a string of digits?
2.6) How do I access the token buffer?
2.7) Why is an empty line needed between grammar rules?
2.8) It is described nowhere what kind of value [] and {}* and {}+ return.
2.9) Can I write /* empty */ as in yacc grammar rules?
2.10) how do I change the "semantic type" of values as in bison
2.11) Can I have different "semantic types" in the same script?
2.12) Is it possible to access a parse tree structure to enable an enhanced
2.13) Precc doesn't seem to understand a = b("1") properly.
2.14) a=b\foo on its own is rejected by precc.
3. Run-time problems.
3.1) I compile OK but the executable bombs out immediately.
3.2) I get an "Illegal instruction" signal.
3.3) Precc seems to simply stop generating C code near the end of the script.
3.4) My tiny test program crashes horribly with Segmentation fault
3.5) I have a program that works under Unix, but not under DOS.
3.6) I cannot seem to match a newline.
3.7) I cannot parse \\ at the end of a line.
4. Compilation problems.
4.1) I have trouble compiling under Borland/MC IDE on a PC.
4.2) linking fails, because "segment _TEXT exceeds 64K".
4.3) Linking fails because "_atexit" is doubly defined.
4.4) Compilation fails because I don't have "alloc.h"
4.5) Compilation fails because I don't have "coreleft()".
4.6) Compilation fails because make doesn't like the makefile format.
4.7) "getchar/putchar are redefined with wrong type"
4.8) "multiple def. of p_1 in preccx in function hid56, hid64, ..."
4.9) I have totally weird compilation problems.
4.10) I have compilation warnings on a MIPS system.
4.11) There are two included copies of yylex which confuses the loader
4.12) preccx.c:24: warning: implicit declaration of function `printf'
4.13) The compiler warns about "implicit declaration of function brk()"
4.14) Changing the type of PARAM to char causes warning messages and
4.15) I have difficulty installing precc on a DEC ALPHA running OSF 3.0
4.16) Does lex/flex work with precc? yytchar seems to be missing.
4.17) I get macro `p_andparse0n' used with too many (5) args
4.18) I get an F_SCOPY unresolved external message from the linker
4.19) My DOS compiler refuses to compile a part of the preccx source code.
4.20) The code precc produces from @foo = ... (isdigit) ... won't compile.
5. Misc.
5.1) How can I debug precc parsers?
5.2) I have seen that you plan to make the generated parsers type-safe ...
5.3) Does precc pass "purify" memory leak tests?
=========================================================================
1. Support/maintenance etc.
---------------------------
1.1) Does precc run/compile on ... Sun3, Sun4, SPARC, VAXen, IBM PC? Gcc,
TurboC MC, ...
Yes. Provided x is sufficiently ANSI-ish, and whoever coded y or set up
your box did z correctly. If you have gcc, you are home. I have TurboC
4.3.0 and PC-DOS 6.3 on my home PC, Borland C 2.0 and MS-DOS 6.2 at the
office, Sun4s and gcc 2.5.8 in lots of places, and IBM PCs with
Linux/FreeBSD and gcc. It works on all. I used to have no problems with
HP9000's and their proprietary ANSI compiler, but since they stopped
supporting the series I have lost track of how to navigate through
their source file configuration maze. Get gcc.
There are only two known portability problems. Both can be worked around.
One is that I make a call to the brk() function to set the stack size -
if your system doesn't have this call, just erase the reference. Oh - and
ditto for atexit(). Check "man 3 brk" and "man 3 atexit".
The other is that to do some fast work with varargs/stdargs -type functions,
I map the arguments of a varargs call into a local array using low-level
structures. I assumed that all C implementations hold function varargs in
a simple linear array, but I have recently been told that on a DEC ALPHA
this is not the case (they use an array of unions). That needs fixing on
a per-machine basis to avoid sacrificing speed. I will be happy to produce
the fix if you have problems. Check "man 3 stdarg", or read stdarg.h and
tell me if the va_arg macro has a "." or a "->" in it!
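For comparison, the slower but portable route is to copy the arguments out
one at a time with the standard va_arg macro. What follows is only a sketch
of that idea and not the code in the preccx library: collect_args and
MAXARGS are hypothetical names, and a plain long stands in for PARAM.

```c
#include <assert.h>
#include <stdarg.h>

#define MAXARGS 8   /* hypothetical upper bound on the argument count */

/* Copy n variadic long arguments into out[] using only <stdarg.h>.
 * This makes no assumption about how the machine lays out varargs. */
int collect_args(long out[MAXARGS], int n, ...)
{
    va_list ap;
    int i;

    assert(n >= 0 && n <= MAXARGS);
    va_start(ap, n);
    for (i = 0; i < n; i++)
        out[i] = va_arg(ap, long);
    va_end(ap);
    return n;
}
```

A call such as collect_args(a, 3, 10L, 20L, 30L) fills a[0..2] regardless of
whether the platform keeps varargs in a linear array.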
I know for a fact that nobody has ever had problems on a NeXT (:-).
---------------------------
1.2) I'd like to get the latest version of precc. Can you point me to an
ftp site? I currently have 2.40a from May 1993.
The central site is ftp.comlab.ox.ac.uk:/pub/Programs/ . There are many
mirrors.
---------------------------
1.3) I'd like to be updated as to the current status of PRECC. Thanks.
Preccx is supported and will continue to be supported, maintained and
improved in the same way that it has been so far. It is not a
commercial venture (and therefore there is a lot invested in it! -
kudos, mostly). It has been around for about 5 years so far. The
source code is free. Legally binding maintenance agreements are
available for a nominal sum (look at LICENCE.DOC in the documentation).
Commercial redistribution of preccx without consent is forbidden. Other
forms of distribution of preccx are encouraged, provided that the whole
package is kept together.
You have the rights to the code you generate with preccx. If you need
to distribute the library code along with your commercial product (very
likely!), that is OK too. You can sell your product and provide the
library code with it (but you will not be selling the latter, just
providing it). The details are in the licence documentation that should
come with your distribution, or at least be pointed to by it.
---------------------------
1.4) Subject: precc .y files wanted, please
> I got distribution 2.30 & 2.40 of your compiler compiler. I recently
> installed it quite easily under SunOS 4.1.3 using SYSV makefile although
> SunOS is more BSD-like than SYSV-like as far as I know.
> I would be very pleased if you could send me some examples of
> precc .y files (the more the better) as well as any advice or additional
> documentation.
There is a huge examples directory in the DOS sub-directory of preccx
on the ftp site(s). It wasn't included in the Unix distribution in
order to keep the size down for mailers. Look for .../examples/....
Cobol, oberon2 and more.
The DOS directory contains a file preccxe.zip (examples).
Length  Method     Size  Ratio  Date      Time   CRC-32    Attr  Name
------  -------  ------  -----  --------  -----  --------  ----  ----
  3105  Implode    1044    67%  24-08-92  16:30  c631b5b9  --w   FIB/FIB.C
 12728  Stored    12728     0%  17-05-92  21:24  b77e22ca  --w   FIB/FIB.EXE
 12728  Unknown   11998     6%  17-05-92  21:24  b77e22ca  --w   FIB.EXE
  1097  Implode     635    43%  17-05-92  06:06  80769622  --w   FIB/FIB.Y
    83  Shrunk       71    15%  15-08-92  08:59  c80cbe9c  --r   YACC/Y_AUX.Y
   331  Implode     166    50%  14-08-92  08:06  ebe6ec81  --r   YACC/YACC.Y
   821  Implode     425    49%  26-05-92  16:01  8a46db7c  --w   YACC/BISONEX2.Y
  1546  Implode     666    57%  14-06-92  23:08  5d0d63e4  --w   YACC/BISONEX3.Y
  2627  Implode     766    71%  23-08-92  09:05  341ae7b0  --w   YACC/Y_C.Y
  1808  Implode     585    68%  23-08-92  11:15  cc3d8801  --w   YACC/Y_RULES.Y
   436  Implode     245    44%  15-08-92  10:44  f42779cd  --r   YACC/Y_INIT.Y
  4278  Implode    1440    67%  22-08-92  22:47  aa74e1cc  --w   YACC/LEX.Y
  1741  Implode     553    69%  14-08-92  08:08  4b311cd2  --r   YACC/Y_DECL.Y
  3719  Implode    1661    56%  23-08-92  11:26  cfe5a602  --w   YACC/YACC-TOP.Y
  5912  Implode    2038    66%  24-08-92  16:28  059fd1d1  --w   YACC/YACC-TOP.C
   494  Implode     213    57%  24-08-92  16:28  9f3e5ab5  --w   YACC/Y_AUX.C
  1540  Implode     423    73%  24-08-92  16:28  acf2d245  --w   YACC/YACC.C
  8753  Implode    1801    80%  24-08-92  16:28  6d5d1154  --w   YACC/Y_RULES.C
  2194  Implode     590    74%  24-08-92  16:28  129b8872  --w   YACC/Y_INIT.C
 10753  Implode    2194    80%  24-08-92  16:28  ac785791  --w   YACC/Y_C.C
  9113  Implode    2247    76%  24-08-92  16:28  ff661db6  --w   YACC/LEX.C
  8165  Implode    1681    80%  24-08-92  16:28  c2c35e3a  --w   YACC/Y_DECL.C
   607  Implode     355    42%  13-06-92  15:16  c537e0b5  --w   YACC/YACC.H
 16497  Stored    16497     0%  10-05-92  02:00  49677612  --w   OCCAM/OCCAM.EXE
   277  Shrunk      179    36%  08-05-92  06:35  7c8afa7a  --w   OCCAM/TYPE.Y
  1272  Implode     371    71%  08-05-92  06:53  2be04218  --w   OCCAM/EXPRESS.Y
  2987  Implode     774    75%  08-05-92  07:32  1a66b83a  --w   OCCAM/CONSTRUC.Y
   374  Implode     217    42%  08-05-92  07:46  279d0827  --w   OCCAM/ELEMENT.Y
   120  Shrunk      108    10%  08-05-92  07:44  9f893d51  --w   OCCAM/DECLARAT.Y
  1090  Implode     677    38%  08-05-92  07:35  f64c4a16  --w   OCCAM/OCCAM.Y
etc. etc.
---------------------------
1.5) Are there other documents about PRECC? In particular, tutorial
documents which give examples on how to write a PRECC input grammar
etc.
A DOS distribution is in a directory in the same place that you may have
got the Unix tar file from (ftp.comlab.ox.ac.uk:/pub/Programs) and
contains many examples.
There is also a WWW home page (thanks to Jonathan Bowen!):
http://www.comlab.ox.ac.uk/archive/redo/precc.html
which contains four articles on line.
A PREttier Compiler-Compiler: Generating Higher Order Parsers in C, Peter
Breuer and Jonathan Bowen. Programming Research Group Technical Report
PRG-TR-20-92, 25pp, November 1992. Provisionally accepted by Software
- Practice and Experience.
A PREttier Compiler-Compiler: higher order programming in C, Peter Breuer.
In Proc. TOULOUSE 92: Fifth International Conference on Software Engineering
and its Applications, Toulouse, France, 7-11 December 1992. Available
from EC2, 269/287 rue de la Garenne, 92024 Nanterre Cedex, France.
Occam's Razor: The Cutting Edge of Parser Technology, Jonathan Bowen and
Peter Breuer. In Proc. TOULOUSE 92: Fifth International Conference on
Software Engineering and its Applications, Toulouse, France, 7-11 December
1992. Available from EC2, 269/287 rue de la Garenne, 92024 Nanterre Cedex,
France.
The PRECC Compiler-Compiler, Peter Breuer and Jonathan Bowen. In Elwyn
Davies and Andrew Findlay (eds.), Proc. UKUUG/SUKUG Joint New Year 1993
Conference, Oxford, UK, 6-8 January 1993. UKUUG/SUKUG Secretariat, Owles
Hall, Buntingford, Herts SG9 9PL, UK, pp 167-182, 1993.
Here are FTP references:
A PREttier Compiler-Compiler: Generating Higher Order Parsers in C
ftp.comlab.ox.ac.uk:/pub/Documents/techreports/TR-20-92.ps.Z
Occam's Razor: The Cutting Edge of Parser Technology
ftp.comlab.ox.ac.uk:/pub/Documents/techpapers/Jonathan.Bowen/toulouse92.ps.Z
The PRECC Compiler-Compiler
ftp.comlab.ox.ac.uk:/pub/Documents/techpapers/Jonathan.Bowen/preccx-uug.ps.Z
There is at least one more paper that I can send by e-mail as compressed
uuencoded postscript on request. Publication is pending so I cannot put
it online (ptb@dit.upm.es).
---------------------------
1.6) There is a typo in the example in the man pages ...
Yes, yes, I know!
> [BTW, there are some small typos in the example; the line
>
> @ | int
> should be
> @ | anyint
> and
> @ top = expr\x {: printf("=%d0,$x); :}
> should be
> @ top = expr\x {: printf("=%d",$x); :}
>
> Also, the line
> gcc -Wall -ansi -o foo foo.c -L -lcc
> doesn't work out of the box, since the delivered makefile installs the
> library as "libcc1".
etc.
One Mars bar to anyone who finds a new one.
---------------------------
1.7) Can you give me a list of languages which have their grammar already
implemented using Precc?
I don't know the total. I have been personally involved with Cobol,
ratfor, oberon2, uniform, Z, and some other smaller projects
(including precc itself).
---------------------------
1.8) Is PRECCX able to use YACC-grammars?
Not without translation into precc format (:-).
I did use to include a rough-and-ready yacc -> precc translator in the
DOS package (it might still be there) but nowadays I incline to the
opinion that people ought to _think_ about what they are writing, given
that the yacc grammar was probably wrong in the first place.
yacc rules of the form
foo: bar
| boo bee
(actually, I forget what yacc rules look like exactly - it has been so long)
translate into precc rules of the form
@ foo = boo bee
@ | bar
(the order reversal is because yacc grammars typically have the shortest
overlapping match first, if there is an overlap, and you want it to be the
other way round for precc so that it tries the longest possible match first
and gets a chance to resolve which it should be)
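To see why the order of alternatives matters, here is a toy illustration in
C. It is not how precc is implemented (precc backtracks, this does not); it
only shows an ordered choice in which the first alternative that fits wins.
The name first_match is made up for the example.

```c
#include <assert.h>
#include <string.h>

/* Try each alternative in order; return the length of the first one
 * that is a prefix of input, or -1 if none matches. */
int first_match(const char *input, const char *const alts[], int n)
{
    int i;

    assert(input != NULL && alts != NULL);
    for (i = 0; i < n; i++) {
        size_t len = strlen(alts[i]);
        if (strncmp(input, alts[i], len) == 0)
            return (int)len;
    }
    return -1;
}
```

With the alternatives listed as { "in", "int" } the input "int x" is matched
as the two-character "in"; listed as { "int", "in" } the longer match gets
its chance first.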
---------------------------
1.9) Some error messages are being sent into the .c generated by PRECC
This is a bug that was fixed in 2.42 (I believe). If you have it, look
for the printf statement that produces it in the precc source code, and
change it to a fprintf(stderr, ...). Then recompile precc. It's that
simple (sorry!).
---------------------------
1.10) Do you have any plans to make PRECCX available for use with C++ (GNU g++) ?
No.
=========================================================================
2. Language issues.
---------------------------
2.1) )foo( )bar( is parsed wrong.
It's ambiguous. Don't do it that way. Write )foo && bar( if you meant
that. Here's a real life example, "before":
@ ident = )RESET_IDENT(
@ (ISalpha)
@ (ISidentrest)*
@ )CHECK_IDENT(
@ )CHECK_IDENT(
@ whitespace
And "after":
@ ident = )RESET_IDENT(
@ (ISalpha)
@ (ISidentrest)*
@ )CHECK_IDENT && CHECK_IDENT(
@ whitespace
---------------------------
2.2) One question - how does one use a : within an action?
Duh - one couldn't without using a macro to hide it pre-2.41. That was
my mistake. Nowadays, you just make sure that your actions use the
"{: foo; :}" format and not the older ": foo; :" one. (The latter is
supported but not encouraged).
---------------------------
2.3) Can I use attributes other than integers, floats and pointers.
It depends. I haven't found a platform-independent way of doing that
yet. Itemized support per platform is in 2.43, but something less fussy
is being worked on.
It seems that you will get support automatically on older releases if
your platform + compiler do the trick of placing pointers on the C call
stack instead of data (that means, you're OK with gcc for 386 machines)
but not if the compiler reserves space on the call stack for the full
size data (sun4's and gcc).
The truly portable technique is to use pointers and write your own
pointer to data structure dispenser.
---------------------------
2.4) How can I match every identifier except a keyword?
> @ identifier = [keyword] alpha alphanum*
>
> which I think is rather neat! If you present a keyword, it will not be
> recognized as an identifier because it does not have a trailing alpha.
---------------------------
2.5) How can I construct the integer represented by a string of digits?
You will need to pass an accumulator as a parameter, dive into
recursions with an increased accumulator, and finally surface with the
accumulator as the synthesized attribute.
@ dnum(acc) = digit\x dnum(10*acc+$x)
@ | digit\x {@ 10*acc+$x @}
I could have written the attribute constructed in the top line explicitly,
but by default the last term's attribute is used, so it wasn't worth it:
@ dnum(acc) = digit\x dnum(10*acc+$x)\y {@ $y @}
@ ...
> @ dnum = digit
> @ | digit\y dnum\x {@ $x*10+$y @}
>
> did help only marginally (no core dump, but still the wrong answer). Didn't
> think about the ordering of rules at all, but in hindsight it's rather
> obvious, iff you get the idea that order can be significant ;-)
Ye-es. The obvious things are the ones the author can't see!
I have to add that the rewritten dnum(acc) specification has to be used
via a call to dnum(0) in order to enter it with accumulator set to zero.
You may even need dnum((PARAM)0) (see the manual) depending on how your
system treats manifest constants.
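The same accumulator-passing recursion can be written out by hand in C,
which may make the flow of the attribute easier to follow. This is a sketch
of the technique only, not code that precc generates; the function
signature and the use of -1 for "no match" are assumptions made for the
example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* dnum(acc): consume one or more digits from *s, threading the running
 * total through the recursion as acc. Returns the final total, or -1
 * if *s does not start with a digit. On success *s is advanced. */
long dnum(const char **s, long acc)
{
    long r;

    assert(s != NULL && *s != NULL);
    if (!isdigit((unsigned char)**s))
        return -1;
    acc = 10 * acc + (**s - '0');
    (*s)++;
    /* first alternative: a digit followed by more digits */
    r = dnum(s, acc);
    /* second alternative: a single digit, so surface with acc */
    return r >= 0 ? r : acc;
}
```

As with the grammar, the entry call is dnum(&p, 0) so that the accumulator
starts at zero.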
You can also do the trick without using a parameter at all, but I don't
like this so much. You can use an external accumulator and use
side-effects on it, then pick it up as an attribute later, making sure
that you have passed a cut mark (`!'):
int acc;
@ dnum = {: acc=0; :} { digit\x {: acc=10*acc+$x; :} }+ ! {@ acc @}
Notice that you can't backtrack now, because the ! executed the actions.
I don't like killing backtrack possibilities at low levels in a script,
and dnum is really at the token level. On the other hand, this is very
efficient code.
---------------------------
2.6) How do I access the token buffer?
Use the variable
extern TOKEN *pstr; /* parsed stream */
If you really have to. This can always be avoided.
> e.g. in rules such as
> ---------------------------------------------------------------------------
> @ alnum = (isalnum)
>
> @ alpha = (isalpha)
>
> @ pos = {} {@ pstr @}
>
> @ id = pos\beg alpha { alnum }* pos\end {: printf("id ='%s' length=%d\n",beg,end-beg); :}
> ---------------------------------------------------------------------------
>
> mostly does the job; is there a better documented / more "correct" way?
Try
@ alnums = alnum\x alnums\y {@ cons(x,y) @}
@ | {} {@ 0 @}
@ word = alpha\x alnums\y {@ cons(x,y) @}
@ id = word\x {: printf("id = '%s' length=%d\n",x,strlen(x)); :}
where cons() does what you think it should! Or else use a buffer
(sufficiently long):
char ibuff[MAXIDLEN], *iptr;
@ word = {: iptr=ibuff; :}
@ alpha\x {: *iptr++=x; :}
@ { alnum\y {: *iptr++=y; :} }*
@ {: *iptr=0; :}
@ id = word {: printf("id = '%s' length=%d\n",ibuff,strlen(ibuff)); :}
but remember that that is side-effecting. You can't mix calls to word
here with calls to word from other parsers. You need separate buffers
per incantation if you plan on doing that, and let the caller supply
it:
@ alnums(iptr) = alnum\x alnums((iptr[0]=x,iptr[1]=0,iptr+1))
@ | {}
@ word(iptr) = alpha\x alnums((iptr[0]=x,iptr[1]=0,iptr+1))
@ id = word(ibuff) {: printf("id = '%s' length=%d\n",ibuff,strlen(ibuff)); :}
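In case it is not obvious what cons() "should" do, one possible reading is
sketched below: it glues a character onto the front of a string and returns
a fresh copy. This is a guess at a suitable helper, not a function supplied
with precc, and treating a null tail as the empty string is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cons(): return a freshly allocated string made of the
 * character head followed by tail. A NULL tail is treated as the empty
 * string. The caller owns the result and must free() it. */
char *cons(char head, const char *tail)
{
    size_t n = tail ? strlen(tail) : 0;
    char *s;

    assert(head != '\0');       /* a NUL head would truncate the string */
    s = malloc(n + 2);          /* head + tail + terminating NUL */
    if (s == NULL)
        return NULL;
    s[0] = head;
    if (n > 0)
        memcpy(s + 1, tail, n);
    s[n + 1] = '\0';
    return s;
}
```

Every call allocates, so the intermediate strings (and any built on a path
that was later backtracked over) need freeing by some other means.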
---------------------------
2.7) Why is an empty line needed between grammar rules?
Indeed, this is mentioned nowhere in the manual. It is part of the
"literate programming" paradigm. There has to be an empty line before
and after every group of rules (too).
> Re `literate programming': Surely you are joking, Mr Breuer ! :_)
> The "@" convention is probably the most illiterate way of marking embedded
> meta-code from C code Ever Invented By Computer Scientist. Something like
>
> @rule
> .... stuff ...
> @end
>
> would look much nicer, especially when the C chunks within the {@@} and
> {::} get larger. And emacs' C indentation functions might even work within
> it.
Well, I think it looks nice. The code to change is in the yylex()
function in preamble.c if anyone wants to.
---------------------------
2.8) It is described nowhere what kind of value [] and {}* and {}+ return.
These are presently unspecified. For the record, they return the last
attribute of the enclosed sequence of terms if there was one, and zero
otherwise. It is likely that that is the way it will be specified
sometime.
---------------------------
2.9) Can I write /* empty */ as in yacc grammar rules?
Yes. Comments are allowed inside the rules. You can also write "{ }".
I prefer nothing!
---------------------------
2.10) how do I change the "semantic type" of values as in bison
> I want something like: "$x" or "$x.type". Does this make any sense?
Yes, use
#define VALUE foo
#include "cc.h"
etc.
Then recompile the libraries too with -DVALUE=foo as a compiler switch.
Compile your application as normal afterwards.
---------------------------
2.11) Can I have different "semantic types" in the same script?
> My meaning was: how can I have different types of values. For instance:
>
> @ Exp( op ) = Ident\x op\y Exp( op )\z
>
> where x would be a string (char *), op a character, and z a number
> (real, for instance). In the action I would like to write:
>
> {@ $y == '+' ? value( $x ) + $z : ( $y == '-' ? etc... ) @}
a) what you probably ought to do:
# define VALUE union { char *String; char Char; int Int; }
# include "ccx.h"
# define myaction(x,y,z) y.Char=='+'?value(x.String)+ z.Int:y.Char=='-'?etc
@ Exp( op ) = Ident\x op\y Exp( op )\z
@ {@ myaction($x,$y,$z) @}
For precc 2.42 you should CHECK !!!!! that your version of C puts
integer sized unions on the call stack, or uses a pointer. Either way,
you are OK. Watch out that it doesn't put a structure bigger than a
long int on the call stack.
b) what I would do. Just use casts instead of a union.
# define VALUE double
# include "ccx.h"
# define CHAR(x) ((char)(x))
# define STRING(x) ((char*)(x))
# define INT(x) ((int)(x))
# define myaction(x,y,z) CHAR(y)=='+'?value(STRING(x))+ INT(z):CHAR(y)=='-'?etc
@ Exp( op ) = Ident\x op\y Exp( op )\z
@ {@ myaction($x,$y,$z) @}
---------------------------
2.12) Is it possible to access a parse tree structure to enable an enhanced
semantic analysis?
I suggest you build the parse tree first (:). Seriously - a parse tree
is not built by precc itself. It is trivial to add the commands to do
so, however. Just attach a new node as a synthesized attribute to the
parse:
@ foo = boo\a bee\b {@ mknode(a,b) @}
@ | bar
But remember that precc might backtrack. It would be a good idea to garbage
collect abandoned nodes after precc has finished making the tree.
You can use inherited attributes too, and mix them up, or use the synthesized
attributes as inherited ones later in the parse:
@ foo(x) = boo(x)\a bee(x,a)\b {@ mknode(a,b) @}
@ | bar
etc. etc.
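A sketch of what mknode() and the clean-up pass might look like follows. It
is an assumption from start to finish: precc supplies neither, the node
layout and the fixed-size pool are invented for the example, and in a real
script the node pointers would have to be cast to and from VALUE.

```c
#include <assert.h>
#include <stddef.h>

#define MAXNODES 1024           /* hypothetical pool size */

typedef struct node {
    struct node *left, *right;
    int in_use;                 /* slot currently allocated */
    int marked;                 /* reachable from the root (set by mark) */
} node;

node node_pool[MAXNODES];       /* zero-initialised: all slots free */

/* Take the first free slot in the pool and link in the two children. */
node *mknode(node *a, node *b)
{
    size_t i;

    for (i = 0; i < MAXNODES; i++) {
        if (!node_pool[i].in_use) {
            node_pool[i].left = a;
            node_pool[i].right = b;
            node_pool[i].in_use = 1;
            node_pool[i].marked = 0;
            return &node_pool[i];
        }
    }
    return NULL;                /* pool exhausted */
}

/* Flag every node reachable from n. */
void mark(node *n)
{
    if (n == NULL || n->marked)
        return;
    n->marked = 1;
    mark(n->left);
    mark(n->right);
}

/* Free every allocated node that is not part of the tree under root,
 * i.e. the ones built on paths the parser backtracked over.
 * Returns the number of nodes reclaimed. */
size_t collect(node *root)
{
    size_t i, freed = 0;

    assert(root == NULL || root->in_use);
    mark(root);
    for (i = 0; i < MAXNODES; i++) {
        if (node_pool[i].in_use && !node_pool[i].marked) {
            node_pool[i].in_use = 0;
            freed++;
        }
        node_pool[i].marked = 0;    /* reset for the next collection */
    }
    return freed;
}
```

Calling collect() once on the final attribute, after the parse has
finished, leaves only the nodes that made it into the tree.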
---------------------------
2.13) Precc doesn't seem to understand a = b("1") properly.
You are running version 2.40. Upgrade (it was a bug in the findbrkt()
code in preamble.c).
---------------------------
2.14) a=b\foo on its own is rejected by precc.
That's right. The "\foo" is an infix construct. a=b\foo {} would be
legal, though!
Of course there is no sense in introducing the foo parameter if you are not
going to use it subsequently, so this is not a bug, but a feature (it
prevents you writing useless code).
=========================================================================
3. Run-time problems.
---------------------------
3.1) I compile OK but the executable bombs out immediately.
> Check that you compiled the libraries with the same VALUE, TOKEN and PARAM
> types as you are using in your script.
---------------------------
3.2) I get an "Illegal instruction" signal.
> This probably means that the internal program stack or another stack has
> overflowed. There is a limit on how long precc can store actions before
> running out of storage space. The limit is determined by parameters that
> you can adjust, however. See the manual pages.
>
> Try inserting '!' more frequently in your grammar.
Early problems in the 2.3x series were fixed by the patches up to 2.32.
The 2.4x series should never get this error.
---------------------------
3.3) Precc seems to simply stop generating C code near the end of the script.
You are running under DOS using pre-compiled libraries, and you are
trying to divert stdout to your file. This is a DOS bug. Call precc
as "preccx infile outfile" instead of "preccx <infile >outfile".
There is internal code to work around that problem. And try recompiling
the library on your system!
---------------------------
3.4) My tiny test program crashes horribly with Segmentation fault
You probably have discovered left recursion. NEVER write
"foo = foo bar | bar"!!! I think you meant "foo = bar*", or
"foo = bar foo | bar". Preccx parsers are top-down and left to right.
Any recursion on the left hand side of a parse tree is going to take a
looooooooooooooong time to resolve (:-).
> But the first tiny test spec I wrote for myself crashes horribly. The
> grammar is
> ---------------------------------------------------------------------------
> #define TOKEN char
> #define VALUE int
> #define BEGIN printf("\ncrashme!> ");
>
> #include "ccx.h"
> #include
>
> @ digit = (isdigit)\x {@ $x-'0' @}
>
> @ dnum = digit
> @ | dnum\x digit\y {@ $x*10+$y @}
**** **** Aaaaargh!!
>
> @ expr = dnum\x {: printf("%d\n",$x); :}
>
> MAIN(expr)
> ---------------------------------------------------------------------------
BTW - a particularly subtle way of getting a left recursion is writing
something that reduces to
foo = bar**
since this will match against nothing (a possible bar*) infinitely often.
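The bar** trap is easy to reproduce in a hand-written repetition loop, and
the usual defence is to stop as soon as the body makes no progress. The
sketch below illustrates that guard in plain C; it says nothing about what
precc does internally, and the names matcher, bar_star and star are
invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* A matcher returns the number of characters it consumed at s,
 * or -1 on failure. Zero means "matched the empty string". */
typedef int (*matcher)(const char *s);

/* Matches zero or more 'b' characters, so it can succeed on nothing;
 * this plays the role of bar* in foo = bar**. */
int bar_star(const char *s)
{
    int n = 0;

    while (s[n] == 'b')
        n++;
    return n;
}

/* Repeat m as often as it succeeds. Without the n <= 0 test, a body
 * that matches the empty string would be repeated forever. */
int star(matcher m, const char *s)
{
    int total = 0;

    assert(m != NULL && s != NULL);
    for (;;) {
        int n = m(s + total);
        if (n <= 0)             /* failure, or no progress: stop here */
            break;
        total += n;
    }
    return total;
}
```

Here star(bar_star, ...) is the analogue of bar**, and it terminates only
because of the no-progress test.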
---------------------------
3.5) I have a program that works under Unix, but not under DOS.
You are probably missing some casts. Under most Unix systems, size
differences between short, int and long are nil, but DOS setups seem to
prefer having them all different.
For the record, here's the "Fibonacci sequence" parser from the
manual in a format that should be safe against that sort of thing.
Try it on different systems and let me know if it behaves right.
> /** fib.y test file for PRECCX v2.42 P.T. Breuer August 1994 **/
>
> # define TOKEN char
> # define VALUE char*
> # include "ccx.h"
> # include
>
> # define INT(x) ((int)(x))
> # define DIV(m,n) INT(INT(m)/INT(n))
> # define MOD(m,n) INT(INT(m)%INT(n))
> # define DBLE(n) ((double)(n))
> # define LOG10(n) INT(log10(DBLE(n)))
> # define TEN DBLE(10)
> # define ZERO DBLE(0)
> # define FIRSTDIGIT(n) \
> ((PARAM)(0!=(n)?DIV((n),pow(TEN,DBLE(LOG10(n)))):ZERO))
> # define LASTDIGITS(n) \
> ((PARAM)(0!=(n)?MOD((n),pow(TEN,DBLE(LOG10(n)))):ZERO))
>
> MAIN(fibber)
>
> @fibber = { fibs $! }*
>
> @fibs = fib((PARAM)1,(PARAM)1)\k
> @ {: printf("%d terms OK\n",(int)$k); :}
>
> @fib(a,b) = number(a) <','> fib(b,a+b)\k {@ $k+1 @}
> @ | <'.'> <'.'>
> @ {: printf("Next terms are %d,%d,..\n",(int)a,(int)b); :}
> @ {@ 0 @}
>
> @number(n)= digit(n)
> @ | digit(FIRSTDIGIT(n)) number(LASTDIGITS(n))
>
> @digit(n) = /* rep. of 1 digit n */
>
> /************************************************************
> The following are some example inputs and responses:
>
> 1,1,2,3,5,..
> Next terms are 8,13,..
> 5 terms OK
>
> 1,1,2,3,5,8,13,21,34,51,85,..
> error: failed parse: probable error at <>1,85,..
> ************************************************************/
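The FIRSTDIGIT and LASTDIGITS macros above go through log10() and pow() with
a cast at every step. An all-integer way of splitting off the leading digit
is sketched below for comparison. It is an alternative formulation rather
than the code from the manual, and the function names are made up.

```c
#include <assert.h>

/* Largest power of ten not exceeding n (1 when n is 0..9). */
long pow10_below(long n)
{
    long p = 1;

    assert(n >= 0);
    while (n / p >= 10)
        p *= 10;
    return p;
}

/* Leading decimal digit of n, e.g. 1234 gives 1; 0 gives 0. */
long first_digit(long n)
{
    return n / pow10_below(n);
}

/* Everything after the leading digit, e.g. 1234 gives 234. */
long last_digits(long n)
{
    return n % pow10_below(n);
}
```

Like the macros, this yields zero for both parts when n is zero, and no
floating point is involved, so the widths of short, int and long play no
part beyond the single type used throughout.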
---------------------------
3.6) I cannot seem to match a newline.
This will take some explaining. Suppose that you are using char TOKENs
and precc's default parser. Then a construction like
>'}'< { <' '> | <'\t'> | <'\n'> }*
will NOT match a newline. You'd think it would because both the
"not-a-right-brace" and the "explicit new line" look like they should!
But new lines are mapped to the zero token by that particular parser,
and none of the literal constructions in precc will match a zero. No,
not even <'\000'>. You have to use $ or !$ (the first if you don't want
a cut, which will prevent backtracking) to match the newline. These
match the zero token.
Incidentally, $$ matches the EOF marker (the -1 token). Nobody has ever
been caught out by that yet.
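To picture the mapping, here is a small stand-in for a lexer that behaves
the way the default one is described above: newline becomes the zero token
and end of input becomes -1. It is not the code in yystuff.c. It reads from
a string rather than from stdin, and next_token, TOK_EOL and TOK_EOF are
names invented for the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define TOK_EOL   0     /* newline is delivered as the zero token */
#define TOK_EOF (-1)    /* end of input is delivered as -1 */

/* Return the next token from the string at *p and advance *p.
 * The end of the string plays the role of end of input. */
int next_token(const char **p)
{
    unsigned char c;

    assert(p != NULL && *p != NULL);
    c = (unsigned char)**p;
    if (c == '\0')
        return TOK_EOF;
    (*p)++;
    return c == '\n' ? TOK_EOL : c;
}
```

The parser never sees a '\n' token at all, which is why neither <'\n'> nor a
negated literal can match at the end of a line.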
---------------------------
3.7) I cannot parse \\ at the end of a line.
You have TOKEN set to char and you are using precc's default parser.
The latter treats a \ at the end of a line as an escape of the following
newline and will skip both it and the newline. So your \\ is being
seen as a single \.
Comment out the marked lines in the default lexer (in yystuff.c) and
recompile the precc library.
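The effect can be reproduced with a few lines of C. The function below is a
hypothetical reader that splices out a backslash followed by a newline, as
described above; it is not the lexer from yystuff.c, and next_char is an
invented name.

```c
#include <assert.h>
#include <stddef.h>

/* Return the next character from the string at *p, silently skipping
 * any backslash that is immediately followed by a newline (both are
 * dropped). Returns -1 at the end of the string. */
int next_char(const char **p)
{
    assert(p != NULL && *p != NULL);
    for (;;) {
        if ((*p)[0] == '\\' && (*p)[1] == '\n') {
            *p += 2;            /* line continuation: drop both */
            continue;
        }
        if (**p == '\0')
            return -1;
        return (unsigned char)*(*p)++;
    }
}
```

Fed the five characters a, \, \, newline, b it delivers a, \, b: the second
backslash and the newline vanish together, so only a single \ is seen.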
=========================================================================
4. Compilation problems.
---------------------------
4.1) I have trouble compiling under Borland/MC IDE on a PC.
Check those well-hidden set up options in the IDE! Make sure that the
"library" directory is where you placed the preccx library, etc. I
supply ".proj" files with the DOS distribution at the archive sites. It
works for me.
---------------------------
4.2) linking fails, because "segment _TEXT exceeds 64K".
You are using too small a model under DOS. The LARGE model allows any
size data/text. Change your compiler's options.
---------------------------
4.3) Linking fails because "_atexit" is doubly defined.
Your system has atexit on board and you don't need the version I
supplied in case you don't have it. Leave atexit.o out of the library.
Don't bother compiling it.
---------------------------
4.4) Compilation fails because I don't have "alloc.h"
Check where your malloc routines are prototyped and use that instead.
It is either alloc.h or malloc.h. #include <alloc.h> or <malloc.h> as
need be.
---------------------------
4.5) Compilation fails because I don't have "coreleft()".
You don't, and you are on an MSDOS system? Congrats. You have infinitely
much memory. Replace the calls to coreleft in the code with some large
constant and cross your fingers.
---------------------------
4.6) Compilation fails because make doesn't like the makefile format.
Here's a clue:
- some changes to the Makefile, especially the make
does not like the ending ;\ in the line "ar rv ..."
and the -L directive with two parameters (I had to include
a second -L; no problem)
and I recommend you download and install GNU make (gmake) rather
than bothering to read your machine's manual page for make (which,
followed by some twiddling, is an alternative).
---------------------------
4.7) "getchar/putchar are redefined with wrong type"
Omitting the declaration of getchar/putchar from wherever it complains
solves the problem.
---------------------------
4.8) "multiple def. of p_1 in preccx in function hid56, hid64, ..."
I have never seen this. But here is a report:
maybe some compiler switch is not OK; no problem,
because all second declarations of p_1 can be transformed into
assignments.
---------------------------
4.9) I have totally weird compilation problems.
> Thanks for your second response. I had originally fetched 2.40 and tried
> to compile it and had some problem which I now attribute to a cockpit error.
> I then got the preccx.tar.Z file and untar'd it into the same directory.
> That seems to have left some artifacts around which caused my difficulties.
> The version is indeed in precc.h and is 2.30.
>
> Anyhow, I cleaned out the directory, untar'd everything, and it all seems
> to work. Thanks again for your help. It looks like a nice system.
---------------------------
4.10) I have compilation warnings on a MIPS system.
|> preamble.c: In function `yylex':
|> preamble.c:182: warning: cast to pointer from integer of different size
|> preamble.c:224: warning: cast to pointer from integer of different size
|> preamble.c:259: warning: cast to pointer from integer of different size
|> preamble.c:275: warning: cast to pointer from integer of different size
|
|Don't worry. Those are all OK. The warnings are too conservative. As I
|recall they arise because a long integer is written into a pointer. The
|semantics says that only a short integer is stored in it, but the
|compiler can't see that.
---------------------------
4.11) There are two included copies of yylex which confuses the loader
There are two copies in the source code for PRECC itself, because PRECC
needs its own lexer, and clients need one (different, default) lexer for
applications. The default one is in yystuff.c and preccx' is in
preamble.c.
Clients shouldn't ever see the one in preamble.c (it doesn't go into
the library) but maybe your linker is seeing both when you are building
precc. It should take the one that comes first in the list of modules
(i.e. the one in preamble.c) but maybe it isn't that smart. In that
case, JUST for building preccx itself, comment out the one in
yystuff.c.
---------------------------
4.12) preccx.c:24: warning: implicit declaration of function `printf'
> I compiled both preccx230.tar.Z and preccx240.tar.Z on a Sun Sparc IPC
> workstation. Both versions produced a lot of warning messages during
> compilation. For example:
> preccx.c:24: warning: implicit declaration of function `printf'
> preccx.c:34: warning: implicit declaration of function `fprintf'
>
> Is this normal???
It is normal on a Sun. Nobody can tell me how to turn off this silly
warning. Apparently stdlib.h and stdio.h do not have prototypes for
the printf family on a Sun. If I declare them in the scripts, some
other compiler and environment will complain.
The other point of view is:
> It compiled very smoothly on the Suns here (the odd warning but no
> more). I've installed it under /users/news/bin.
---------------------------
4.13) The compiler warns about "implicit declaration of function brk()"
Don't worry about it. You are lacking a prototype. The implicit
declaration is compatible with what the prototype would say. You
probably don't even need this function if you live in Unix (your
process's stack space is big enough that you don't need to worry about
changing it).
---------------------------
4.14) Changing the type of PARAM to char causes warning messages and
subtle misbehaviour.
PARAM has to be at least as big as VALUE, for technical reasons.
This is a good example of what's going on:
> ---------------------------------------------------------------------------
> #define TOKEN char
> #define VALUE long
> #define PARAM long
> #define BEGIN printf("\ncrashme!> ");
>
> #include "ccx.h"
>
> /* this grammar expects strings of the kind (ab)^n, where a and b are
> * arbitrary characters.
> */
>
> @ anychar = ?\x {@ $x @}
>
> @ rest(a,b) = <a> <b> rest(a,b)\n {@ n+1 @}
> @ | {} {@ 0 @}
>
> @ start = ?\x ?\y rest(x,y)\n {: printf("(%c%c)^%ld\n",(char)$x,(char)$y,1+$n); :}
>
> MAIN(start)
> ---------------------------------------------------------------------------
>
> This works fine & recognizes what it should.
>
> Now comes the bug: if the type of PARAM is changed to
> #define PARAM char
> (this should be possible, since the params are both characters, right?)
> I get dozens of error messages like
> ---------------------------------------------------------------------------
> crashme.c: In function `hid1':
> crashme.c:22: warning: extern declaration of `hid0' doesn't match global one
> [..30 more lines like this..]
> --------------------------------------------------------------------------
> and a subtly erroneous program: it counts only up to 255. Why? Because the
> generated functions are typed as
>
> static STATUS hid4(PARAM a,PARAM b,PARAM n){
> return p_atch0((PARAM)(n+1));
> }
>
> But the "n" arguments should be typed as VALUE, i.e.:
>
> static VALUE hid4(PARAM a,PARAM b,VALUE n){
> return p_atch0((VALUE)(n+1));
> }
>
That is why PARAM has to be at least as big as VALUE.
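The truncation can be reproduced outside PRECC. The following is only a
sketch: NARROW_PARAM and the two function names are hypothetical, and
unsigned char is used (and assumed to be 8 bits wide) so that the
wrap-around is well defined. In a real script PARAM and VALUE come from
the #defines placed before #include "ccx.h".

```c
/* Illustrative typedefs only, not the definitions from ccx.h. */
typedef unsigned char NARROW_PARAM;   /* assumed to be 8 bits wide */
typedef long          VALUE;

/* Mimics a generated function whose count argument is typed as a
 * narrow PARAM: the incremented count is squeezed back into 8 bits. */
static VALUE next_via_narrow(NARROW_PARAM n)
{
    return (VALUE)(NARROW_PARAM)(n + 1);
}

/* The same step with the count typed as VALUE: no truncation. */
static VALUE next_via_value(VALUE n)
{
    return n + 1;
}
```

Starting from 255, the narrow version wraps round to 0 while the VALUE
version goes on to 256, which matches the reported symptom of a count
that never gets past 255.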
---------------------------
4.15) I have difficulty installing precc on a DEC ALPHA running OSF 3.0
> /usr/lib/cmplrs/cc/cfe: Error: ccx.c, line 54: Improper cast of non-scalar
> type expression
Yes, I know about this problem on that system. I have not yet
discovered a satisfactory cure. All you can do is edit ccx.h and
un-comment the "generic" form of the CALL macros, replacing the other
used version. It is much slower but will work.
The problem is that "va_list" is a union structure over in alpha/OSF
land, and I never expected it to be. The following conversation may
help give some insight.
> The problem is with the lines
> a = (PARAM*)ap;
>
> (!! Innocent - isn't it!) Try the following code:
>
> # include <stdarg.h>
> main()
> {
> long *a;
> va_list ap;
> a = (PARAM*)ap;
> }
>
> and tell me what happens!!!
>
> In all the implementations of stdargs that I have seen, va_list is a
> pointer, but here it appears to have some union structure to it. So it
> cannot be cast (?).
>
> What has happened is that to speed things up, instead of pulling arguments
> off a function call (stack) one by one, then putting them in the a[] array,
> I simply set the a address to be the same as the stack address of the first
> argument, and hey presto! (normally :-().
>
> This looks difficult with the DEC implementation. !! If you look at the
> comment in the ccx.h by the CALL macro, you will see that it says that the
> macro is implementation dependent -
>
> Instead of a = (PARAM*)ap, you have to replace that line of the CALL
> macro with (m is the number of args waiting to come off the stack):
>
> {int k= -1;
> while(++k < m) {
> a[k] = va_arg(ap,PARAM);
> }
> }
>
> This is the universal way, but you can see how much slower it is! Can you
> not find something faster? The DEC stdargs code looks vaguely comprehensible
> to me, and I think one might be able to work a similar trick.
>
> After replacing
>
> a = (PARAM*)ap;
>
> as suggested in the CALL macro, also eliminate the next line:
>
> ap=(va_list)&a[m];
>
> since ap will now be set to the right value by the repeated va_arg calls.
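For reference, the portable va_arg loop described above can be tried in
isolation. This is a sketch under stated assumptions, not the CALL macro
from ccx.h: collect_params and MAXARGS are hypothetical names, and PARAM
is assumed to be long as in the example in 4.14.

```c
#include <stdarg.h>

typedef long PARAM;     /* assumption: PARAM is long, as in 4.14 */
#define MAXARGS 8       /* hypothetical limit for this sketch    */

/* Copy m variadic arguments into a[] one at a time with va_arg,
 * never treating va_list as a pointer.  Returns the number copied. */
static int collect_params(PARAM a[], int m, ...)
{
    va_list ap;
    int k = -1;

    if (m > MAXARGS)
        m = MAXARGS;
    va_start(ap, m);
    while (++k < m) {
        a[k] = va_arg(ap, PARAM);
    }
    va_end(ap);
    return k;
}
```

A call such as collect_params(a, 3, 10L, 20L, 30L) fills a[0] to a[2].
The arguments must really be of type PARAM (hence the L suffixes),
since va_arg does no conversion.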
---------------------------
4.16) Does lex/flex work with precc? yytchar seems to be missing.
Both lex and flex work with preccx. You have to add the yytchar variable
yourself in your code if they don't set it. Look at the oberon2 example
in the examples archive for preccx (preccxe.zip for MS-DOS and
preccx-examples.tar.gz for Unix at
ftp.comlab.ox.ac:/pub/Programs/precc*). For flex, for example, just use:
int yytchar = 0;
#define yyterminate() yytchar = EOF; return ( YY_NULL )
But preccx works fine as a lexical scanner too. It does sometimes help to use
another scanner first just to get rid of things like white space and comments,
but it isn't really much of a problem.
---------------------------
4.17) I get macro `p_andparse0n' used with too many (5) args
You need #include "ccx.h" and not #include "cc.h". Nowadays there are
very few occasions on which you can get away without using the full
ccx.h definitions. Use it as default.
---------------------------
4.18) I get an F_SCOPY unresolved external message from the linker
You are compiling an application under DOS using the pre-compiled
libraries. Recompile the libraries yourself and the problem should go
away. This is due to incompatibilities between different IDE library
releases.
---------------------------
4.19) My DOS compiler refuses to compile a part of the preccx source code.
There are bugs in many DOS compilers and there are usually work-arounds
involving reformatting the code or expressing it differently. For
example, in engine.c you will find the following workaround for a bug
in the code emitted by TurboC 3.0:
#ifndef __MSDOS__
instr=program[pc++];
n=Opcode(instr);
switch (n)
#else /* do it quickly to avoid MS-DOS bug */
switch (n = Opcode(instr=program[pc++]))
#endif
However, I am told that that code won't compile under TurboC 2.0. You have
to switch the cases round to compile it! I am told that there seem to be
no errors in the output code (under 2.0) if you do this, but I haven't
checked personally.
So the advice is to phone your compiler manufacturers if you find something
like that.
---------------------------
4.20) The code precc produces from @foo = ... (isdigit) ... won't compile.
isdigit may be a macro and not a function on your installation. You need
a function name inside the parentheses. Try #undef isdigit and see if
that makes visible some hidden function. It often seems to be the case
that if isdigit exists in macro form, then it also is provided as a
function. Otherwise you will just have to code a myisdigit() yourself,
and use that instead.
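A minimal sketch of such a replacement follows. The name myisdigit is
the one suggested above; count_matching is a hypothetical stand-in for
any code that needs the address of a predicate function, and is not
part of PRECC.

```c
#include <ctype.h>

/* A real function wrapping isdigit, so that its name can be used
 * wherever the address of a function is required. */
int myisdigit(int c)
{
    return isdigit((unsigned char)c) != 0;
}

/* Hypothetical caller that takes a predicate by address and counts
 * the characters of s that satisfy it. */
static int count_matching(const char *s, int (*pred)(int))
{
    int n = 0;
    while (*s != '\0') {
        if (pred((unsigned char)*s))
            n++;
        s++;
    }
    return n;
}
```

In the script you would then write (myisdigit) in place of (isdigit).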
=========================================================================
5. Misc.
---------------------------
5.1) How can I debug precc parsers?
With gcc, recompile the libraries with a -g flag and run under gdb or
equivalent.
If you are doing this because you got a hard-coded warning message
(statistically, most likely from engine.c), put a breakpoint in the
library function concerned at the point where the error message appears.
Useful items of information that will be required to make a report:
o the pc (program counter) value at that point
o the value of instr (the instruction cache).
o program code in program[pc-1] and program[pc+1]
o the parse string pointer pstr
o tokens in pstr-1 and pstr+1
And don't worry.
---------------------------
5.2) I have seen that you plan to make the generated parsers type-safe ...
Yes. Someday soon PARAM and VALUE will be components of a union
structure PARAM_OR_VALUE. Not yet.
---------------------------
5.3) Does precc pass "purify" memory leak tests?
> As far as I recall there is only one call to malloc. A big one at the
> start in precc_creat_data() to make the buffers it will need. Those
> buffers are released by precc_destr_data() when precc dies.
You can call the create and destroy routines yourself when diving in
and out of other code to save memory.
---------------------------
Acknowledgements. Input from the following correspondents (without their
permission) has gone into producing this FAQ. Their help and information
is gratefully acknowledged.
Rainer Gimnich
Jonathan Karges
Jean-Marie Condom
Jonathan.Bowen@comlab.oxford.ac.uk
P.A.Keller@bath.ac.uk
Peter Breuer
allison@shasta.stanford.edu (Dennis Allison)
Dennis Allison
alweiner@clark.net (Alan Weiner)
Peter Keller
Carl Witty
draco@br.puc-rio.inf (DRACO)
draco@inf.puc-rio.br
glm@libd1.univ-bpclermont.fr ( Equipe Genie LOgiciel)
gnmc@eagle.inesc.pt (Goncalo Nuno Martins Costa)
gnmc@erika.inesc.pt (Goncalo Nuno Martins Costa)
greg@miranda.ba.swin.edu.au (Gregory Dehollander)
konkin@ca.usask.cs (Doug Konkin)
Doug Konkin
lambolez@irit.irit.fr
Pierre-Yves LAMBOLEZ
lma@dayton.stanford.edu Computer Systems Lab
(James Mansion LADS LDN X4923)
Markus Freericks
L. Beth Millar
Michael Koch
(Manoel Pedro Sa)
Uwe Nauerth
nauerth@n1.imsd.uni-mainz.DE
pp@goggins.bath.ac.uk
Peter T. Breuer
Ralf Laemmel
rogoff@metasphere.com (Brian Rogoff)
Wolfgang Grieskamp
Wing Chung