The Generated Scanner

For each scanner description file given as input, gelex generates an Eiffel class as output. By default gelex sends its output to the standard output, but this can be overridden with the option -o filename.

The deferred class YY_SCANNER, which is part of the Gobo Eiffel Lexical Library, provides an abstraction for scanners. Every scanner class generated by gelex is a descendant of this class. Its main feature is the routine read_token which, when called, scans the input buffer until it either reaches the end-of-file or last_token is given a non-negative value. Each subsequent call to read_token resumes from where the previous call left off and stops under the same two conditions. The routine scan, on the other hand, scans the input buffer until it reaches the end-of-file, ignoring successive values of last_token.

Class YY_SCANNER is equipped with procedures which should be used as creation routines in descendant classes: make, which prepares the scanner for scanning from the standard input; make_with_file or make_with_unicode_file, for scanning from a given file; and make_with_buffer, for scanning from a given input buffer. Before scanning a new input buffer, the scanner should be reinitialized using routine reset. Also available to descendants of YY_SCANNER is a set of features which can be called from the semantic actions.
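The token-by-token use of read_token described above could be driven from client code roughly as follows. This is a minimal sketch, not code from the library: the class MY_SCANNER_CLIENT and its local names are hypothetical, it assumes a generated class MY_SCANNER that lists make as a creation procedure, and it assumes that the query end_of_file is how the scanner reports that the end of the input has been reached.

```eiffel
class MY_SCANNER_CLIENT
        -- Hypothetical client class; not part of the Gobo library.

create

    execute

feature -- Execution

    execute
            -- Read tokens one at a time from the standard input
            -- and print the code of each of them.
        local
            a_scanner: MY_SCANNER
        do
            create a_scanner.make
            from
                a_scanner.read_token
            until
                    -- Assumption: `end_of_file' reports end of input.
                a_scanner.end_of_file
            loop
                    -- `last_token' holds the value set by the
                    -- semantic action which has just been executed.
                print (a_scanner.last_token.out)
                print ("%N")
                a_scanner.read_token
            end
        end

end
```

A parser would typically replace the two print instructions with its own handling of last_token. Calling scan instead of running this loop would consume the whole input in a single call.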

An implementation of most of the routines described above is provided in class YY_SCANNER_SKELETON. This class itself has three descendants, each providing a different flavor of the pattern-matching engine (a Deterministic Finite Automaton, or DFA for short) used to implement feature read_token.

The first of these three classes is YY_COMPRESSED_SCANNER_SKELETON, whose DFA is optimized for memory space. In that case gelex compresses the generated tables by taking advantage of similar transition functions for different states. This is what gelex does by default. The second class is YY_FULL_SCANNER_SKELETON, whose DFA is optimized for execution speed but uses big tables. In this case the option full has to be specified for gelex to generate non-compressed tables.

The last class is YY_INTERACTIVE_SCANNER_SKELETON, whose DFA can deal with interactive input such as input from the keyboard. An interactive scanner is one that only looks ahead to decide what token has been matched if it absolutely has to. It turns out that always looking one extra character ahead, even if the scanner has already seen enough text to disambiguate the current token, is a bit faster than only looking ahead when necessary. But scanners that always look ahead give dreadful interactive performance; for example, when one types a newline, it is not recognized as a newline token until one enters another token, which often means typing in another whole line. YY_INTERACTIVE_SCANNER_SKELETON is an heir of YY_COMPRESSED_SCANNER_SKELETON and only requires gelex to generate compressed tables. Because of the performance penalty already implied by this more intuitive interactive behavior, this facility is not available in conjunction with full table scanners.

Gelex does not automatically generate the note, class header, formal generics, obsolete, inheritance, creation and invariant clauses. These have to be specified in Eiffel declarations in the first section and in the user code section of the scanner description file. In particular, one has to specify which of the three scanner skeleton classes described above (namely YY_COMPRESSED_SCANNER_SKELETON, YY_FULL_SCANNER_SKELETON or YY_INTERACTIVE_SCANNER_SKELETON) to inherit from as an implementation for its pattern-matching engine. The following example shows a typical scanner description file:

%{
class MY_SCANNER

inherit

    YY_COMPRESSED_SCANNER_SKELETON
        redefine
            make, reset
        end

create

    make
%}

%%

...patterns...   ...actions...

%%

feature {NONE} -- Initialization

    make
            -- Create a new scanner with
            -- standard input as input file.
        do
            Precursor
            create buffer.make (256)
        end

feature -- Initialization

    reset
            -- Reset scanner before scanning next input file.
        do
            Precursor
            buffer.wipe_out
        end

feature -- Access

    buffer: STRING
            -- Scanner's buffer

invariant

    buffer_not_void: buffer /= Void

end 

The generated scanner class, named MY_SCANNER, uses a pattern-matching engine optimized in terms of memory space. It redefines features make and reset inherited from YY_COMPRESSED_SCANNER_SKELETON to take into account its internal buffer and preserve the invariant. The next two examples are excerpts from descriptions of a scanner optimized in terms of execution speed and of an interactive scanner:

%{
class MY_FAST_SCANNER

inherit

    YY_FULL_SCANNER_SKELETON

create

    make
%}

%option full

%%

...patterns...   ...actions...

%%

end 

Note in particular, in the above example, the %option full directive, which instructs gelex to generate non-compressed tables.

%{
class MY_INTERACTIVE_SCANNER

inherit

    YY_INTERACTIVE_SCANNER_SKELETON

create

    make
%}

%%

...patterns...   ...actions...

%%

end 
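Whichever skeleton class is inherited, a scanner object should be reinitialized with reset before it is given a new input buffer. The following is a minimal sketch of that sequence, under stated assumptions: the class MY_SCANNER_REUSE and the input strings are hypothetical, MY_SCANNER is assumed to list make as a creation procedure, and new_string_buffer and set_input_buffer are assumed to be the features of YY_SCANNER that create a buffer from a string and install it as the current input buffer.

```eiffel
class MY_SCANNER_REUSE
        -- Hypothetical client class; not part of the Gobo library.

create

    execute

feature -- Execution

    execute
            -- Scan two strings in turn with the same scanner object.
        local
            a_scanner: MY_SCANNER
        do
            create a_scanner.make
                -- Assumption: `new_string_buffer' and `set_input_buffer'
                -- are available to clients of the scanner.
            a_scanner.set_input_buffer (a_scanner.new_string_buffer ("first input"))
            a_scanner.scan
                -- Reinitialize the scanner before switching
                -- to the next input buffer.
            a_scanner.reset
            a_scanner.set_input_buffer (a_scanner.new_string_buffer ("second input"))
            a_scanner.scan
        end

end
```

With the MY_SCANNER class shown earlier, the redefined reset would also wipe out the internal buffer attribute at this point, which is why that redefinition calls Precursor first.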

Copyright © 1998-2019, Eric Bezault
mailto:ericb@gobosoft.com
http://www.gobosoft.com
Last Updated: 25 September 2019
