AuD
Lecture 'Algorithmen und Datenstrukturen' (code examples)
aud.util.LexicalScanner Class Reference

Base class for a simple lexical scanner.

Classes

class  Rule
 a rule for lexical scanner
 

Public Member Functions

 LexicalScanner (Rule[] rules, String input)
 create new scanner processing input
 
void setInput (String input)
 set input (resets scanner state)
 
String matchedText ()
 get text of last match or call to next
 
int matchedTokenId ()
 get result of last call to next()
 
String remainder ()
 get remaining text
 
boolean endOfInput ()
 reached end of input?
 
int next ()
 match remainder to rules provided to constructor
 

Static Public Member Functions

static void main (String[] args)
 test code and usage example
 

Static Public Attributes

static final int END_OF_INPUT = -1
 no more input
 
static final int NO_MATCH = -2
 no match (usually implies a syntax error)
 
static final Pattern P_WHITESPACE = Pattern.compile("\\s+")
 white space
 
static final Pattern P_IDENTIFIER
 identifiers
 
static final Pattern P_FLOAT
 floating point number
 

Protected Member Functions

void eatWhiteSpace ()
 ignore white space (called by match)
 
boolean match (Pattern p)
 Match remainder against pattern p.
 
int next (Rule[] rules)
 match remainder to table of rules
 

Detailed Description

Base class for a simple lexical scanner.

A lexical scanner splits input into tokens. Here, token types are referenced by integer constants given in a table of rules. Each token is characterized by a regular expression.

Note: java.util.Scanner is often a better choice. However, it requires a delimiter to be defined, which is problematic in this context.

Definition at line 19 of file LexicalScanner.java.
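
The rule-table idea described above can be sketched without the aud.util classes. The following is a minimal, self-contained illustration, not the LexicalScanner source: the class RuleTableSketch, its nested Rule, and the methods tokenize and demo are hypothetical names introduced here. It assumes that white space separates tokens and that the first rule matching a prefix of the remaining input wins.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for illustration; this is not the aud.util.LexicalScanner source. */
final class RuleTableSketch {
    static final int NO_MATCH = -2; // same value as LexicalScanner.NO_MATCH

    /** One table entry: an integer token id plus the regular expression for that token. */
    static final class Rule {
        final int id;
        final Pattern pattern;
        Rule(int id, String regex) {
            this.id = id;
            this.pattern = Pattern.compile(regex);
        }
    }

    /** Splits input into "id:text" entries; stops at end of input or after the first failure. */
    static List<String> tokenize(Rule[] rules, String input) {
        List<String> tokens = new ArrayList<>();
        String rest = input;
        while (true) {
            rest = rest.replaceFirst("^\\s+", ""); // skip white space between tokens
            if (rest.isEmpty()) return tokens;      // end of input
            Rule hit = null;
            Matcher m = null;
            for (Rule rule : rules) {               // rules are tried in table order
                m = rule.pattern.matcher(rest);
                if (m.lookingAt() && m.end() > 0) { // prefix match; ignore empty matches
                    hit = rule;
                    break;
                }
            }
            if (hit == null) {
                tokens.add(NO_MATCH + ":" + rest);  // no rule applies: report and stop
                return tokens;
            }
            tokens.add(hit.id + ":" + m.group());
            rest = rest.substring(m.end());
        }
    }

    /** Tokenizes with the two rules used in the main() example of this page. */
    static List<String> demo(String input) {
        Rule[] rules = {
            new Rule(1, "[0-9]*\\.?[0-9]+"),
            new Rule(2, "[a-z]+")
        };
        return tokenize(rules, input);
    }

    public static void main(String[] args) {
        System.out.println(demo(" 12.3a 12 bcd 34 ")); // [1:12.3, 2:a, 1:12, 2:bcd, 1:34]
    }
}
```

An input such as "12 ?x" yields one number token followed by a NO_MATCH entry for the remainder "?x", which corresponds to the syntax error case.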

Constructor & Destructor Documentation

◆ LexicalScanner()

aud.util.LexicalScanner.LexicalScanner ( Rule[]  rules,
String  input 
)

create new scanner processing input

Parameters
rules  table of rules used by next; rules are applied sequentially in the given order
input  text to be analyzed

Definition at line 68 of file LexicalScanner.java.

68 {
69   rules_=rules;
70   input_=input;
71 }
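
Because rules are applied sequentially, their order in the table decides which token id is reported when several patterns match the same prefix. The sketch below illustrates this with a hypothetical helper firstMatch and made-up ids (1 for a keyword, 2 for a name); neither is part of LexicalScanner.

```java
import java.util.regex.Pattern;

/** Hypothetical illustration of "first matching rule wins"; not part of aud.util. */
final class RuleOrderSketch {
    /** Returns the id of the first regex that matches a prefix of input, or -2 (NO_MATCH). */
    static int firstMatch(int[] ids, String[] regexes, String input) {
        for (int i = 0; i < regexes.length; i++) {
            if (Pattern.compile(regexes[i]).matcher(input).lookingAt()) {
                return ids[i];
            }
        }
        return -2;
    }

    public static void main(String[] args) {
        final int KEYWORD = 1;
        final int NAME = 2;
        // Specific rule first: "while" is reported as a keyword.
        System.out.println(firstMatch(new int[] {KEYWORD, NAME},
                                      new String[] {"while\\b", "[a-z]+"}, "while x")); // 1
        // General rule first: the keyword rule is never reached.
        System.out.println(firstMatch(new int[] {NAME, KEYWORD},
                                      new String[] {"[a-z]+", "while\\b"}, "while x")); // 2
    }
}
```

In practice this suggests listing more specific patterns before more general ones.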

Member Function Documentation

◆ eatWhiteSpace()

void aud.util.LexicalScanner.eatWhiteSpace ( )
protected

ignore white space (called by match)

Definition at line 74 of file LexicalScanner.java.

74 {
75   if (!endOfInput()) {
76     Matcher m=P_WHITESPACE.matcher(input_);
77     if (m.lookingAt()) {
78       input_=input_.substring(m.end(),input_.length());
79     }
80   }
81 }

References aud.util.LexicalScanner.endOfInput(), and aud.util.LexicalScanner.P_WHITESPACE.

Referenced by aud.util.LexicalScanner.next().
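
The same technique can be tried in isolation. The following sketch mirrors the listing above as a pure function; SkipWhitespaceSketch and skipLeadingWhitespace are hypothetical names, and returning the shortened string instead of updating a field is a choice made only for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-alone variant of the white space skipping shown above. */
final class SkipWhitespaceSketch {
    static final Pattern WS = Pattern.compile("\\s+");

    /** Returns input without its leading white space; null and empty input pass through. */
    static String skipLeadingWhitespace(String input) {
        if (input == null || input.isEmpty()) {
            return input;
        }
        Matcher m = WS.matcher(input);
        // lookingAt() only succeeds if the match starts at index 0,
        // so white space further inside the string is left untouched.
        return m.lookingAt() ? input.substring(m.end()) : input;
    }

    public static void main(String[] args) {
        System.out.println("'" + skipLeadingWhitespace("  \t12 ab ") + "'"); // '12 ab '
        System.out.println("'" + skipLeadingWhitespace("ab cd") + "'");      // 'ab cd'
    }
}
```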

◆ endOfInput()

boolean aud.util.LexicalScanner.endOfInput ( )

reached end of input?

Definition at line 94 of file LexicalScanner.java.

94 {
95   return input_==null || input_.length()==0;
96 }

Referenced by aud.util.LexicalScanner.eatWhiteSpace(), and aud.util.LexicalScanner.next().

◆ main()

static void aud.util.LexicalScanner.main ( String[]  args)
static

test code and usage example

Reimplemented in aud.example.expr.Tokenizer.

Definition at line 143 of file LexicalScanner.java.

143 {
144
145   Rule[] rules={
146     new Rule(1,"[0-9]*\\.?[0-9]+"),
147     new Rule(2,"[a-z]+")
148   };
149
150   LexicalScanner s=new LexicalScanner
151     (rules,args.length==0 ? " 12.3a 12 bcd 34 " : args[0]);
152
153   System.out.println("input = '"+s.remainder()+"'");
154
155   while (s.next()!=END_OF_INPUT) {
156     if (s.matchedTokenId()==NO_MATCH) {
157       System.out.println("syntax error near '"+s.remainder()+"'");
158       break;
159     }
160     System.out.println("next token id = "+s.matchedTokenId());
161     System.out.println("matched text = '"+s.matchedText()+"'");
162     System.out.println("remaining input = '"+s.remainder()+"'");
163   }
164 }

References aud.util.LexicalScanner.END_OF_INPUT, aud.util.LexicalScanner.matchedText(), aud.util.LexicalScanner.matchedTokenId(), aud.util.LexicalScanner.next(), aud.util.LexicalScanner.NO_MATCH, and aud.util.LexicalScanner.remainder().

◆ match()

boolean aud.util.LexicalScanner.match ( Pattern  p)
protected

Match remainder against pattern p.

match skips any preceding white space.

Parameters
p  regular expression pattern
Returns
true if there was a match

Definition at line 103 of file LexicalScanner.java.

103 {
104   text_=null;
106
107   if (endOfInput())
108     return false;
109
110   Matcher m=p.matcher(input_);
111   if (!m.lookingAt())
112     return false;
113
114   int n=m.end();
115   text_=input_.substring(0,n);
116   input_=input_.substring(m.end(),input_.length());
117
118   return true;
119 }
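
The listing relies on Matcher.lookingAt(), which succeeds only if the pattern matches a prefix of the input. The sketch below contrasts it with Matcher.matches() (whole input) and Matcher.find() (anywhere in the input); AnchoringSketch and describe are hypothetical names used only for this comparison.

```java
import java.util.regex.Pattern;

/** Hypothetical comparison of the three java.util.regex.Matcher matching modes. */
final class AnchoringSketch {
    static String describe(String regex, String input) {
        Pattern p = Pattern.compile(regex);
        boolean prefix = p.matcher(input).lookingAt(); // must start at index 0
        boolean whole = p.matcher(input).matches();    // must cover the entire input
        boolean anywhere = p.matcher(input).find();    // may start at any index
        return "lookingAt=" + prefix + " matches=" + whole + " find=" + anywhere;
    }

    public static void main(String[] args) {
        System.out.println(describe("[a-z]+", "abc 12")); // lookingAt=true matches=false find=true
        System.out.println(describe("[0-9]+", "abc 12")); // lookingAt=false matches=false find=true
    }
}
```

A scanner that consumes tokens from the front of the remaining input needs the prefix behaviour: find() would silently skip unrecognized characters, and matches() would fail as soon as more than one token remains.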

◆ matchedText()

String aud.util.LexicalScanner.matchedText ( )

get text of last match or call to next

Definition at line 88 of file LexicalScanner.java.

88 { return text_; }

Referenced by aud.example.expr.Tokenizer.main(), and aud.util.LexicalScanner.main().

◆ matchedTokenId()

int aud.util.LexicalScanner.matchedTokenId ( )

get result of last call to next()

Definition at line 90 of file LexicalScanner.java.

90 { return id_; }

Referenced by aud.example.expr.Tokenizer.main(), and aud.util.LexicalScanner.main().

◆ next() [1/2]

int aud.util.LexicalScanner.next ( )

match remainder to rules provided to constructor

Returns
rule id or NO_MATCH or END_OF_INPUT

Definition at line 139 of file LexicalScanner.java.

139 { return next(rules_); }

References aud.util.LexicalScanner.next().

Referenced by aud.util.LexicalScanner.next().

◆ next() [2/2]

int aud.util.LexicalScanner.next ( Rule[]  rules)
protected

match remainder to table of rules

Returns
rule id or NO_MATCH or END_OF_INPUT

Definition at line 123 of file LexicalScanner.java.

123 {
125
126   if (endOfInput()) return id_=END_OF_INPUT;
127   if (rules==null) return id_=NO_MATCH;
128
129   for (Rule rule : rules) {
130     if (match(rule.pattern_))
131       return id_=rule.id_;
132   }
133   return id_=NO_MATCH;
134 }

References aud.util.LexicalScanner.eatWhiteSpace(), and aud.util.LexicalScanner.endOfInput().

Referenced by aud.example.expr.Tokenizer.main(), and aud.util.LexicalScanner.main().

◆ remainder()

String aud.util.LexicalScanner.remainder ( )

get remaining text

Definition at line 92 of file LexicalScanner.java.

92 { return input_; }

Referenced by aud.example.expr.Tokenizer.main(), and aud.util.LexicalScanner.main().

◆ setInput()

void aud.util.LexicalScanner.setInput ( String  input)

set input (resets scanner state)

Definition at line 84 of file LexicalScanner.java.

84 {
85   input_=input;
86 }

Member Data Documentation

◆ END_OF_INPUT

final int aud.util.LexicalScanner.END_OF_INPUT = -1
static

no more input

Definition at line 22 of file LexicalScanner.java.

Referenced by aud.example.expr.Tokenizer.main(), and aud.util.LexicalScanner.main().

◆ NO_MATCH

final int aud.util.LexicalScanner.NO_MATCH = -2
static

no match (usually implies a syntax error)

Definition at line 24 of file LexicalScanner.java.

Referenced by aud.example.expr.Tokenizer.main(), and aud.util.LexicalScanner.main().

◆ P_FLOAT

final Pattern aud.util.LexicalScanner.P_FLOAT
static
Initial value:
= Pattern.compile("[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?")

floating point number

Definition at line 60 of file LexicalScanner.java.
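
To see what this pattern accepts as a prefix, it can be tested with Matcher.lookingAt(). The sketch below copies the pattern string from the initial value above; the class FloatPatternSketch and the helper prefix are hypothetical names for this example only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical probe for the P_FLOAT pattern string shown above. */
final class FloatPatternSketch {
    static final Pattern P_FLOAT =
        Pattern.compile("[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?");

    /** Returns the prefix of input matched by P_FLOAT, or null if there is none. */
    static String prefix(String input) {
        Matcher m = P_FLOAT.matcher(input);
        return m.lookingAt() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(prefix("3.14abc")); // 3.14
        System.out.println(prefix("-1e-3 "));  // -1e-3
        System.out.println(prefix(".5"));      // .5
        System.out.println(prefix("1."));      // 1 (a trailing dot is not consumed)
        System.out.println(prefix("abc"));     // null
    }
}
```

Note that plain integers such as "12" also match, since both the fractional part and the exponent are optional.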

◆ P_IDENTIFIER

final Pattern aud.util.LexicalScanner.P_IDENTIFIER
static
Initial value:
= Pattern.compile("[_a-zA-Z]?(\\w|_)+")

identifiers

Definition at line 57 of file LexicalScanner.java.
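
The behaviour of this pattern can be checked the same way. In the sketch below, P_IDENTIFIER copies the pattern string from the initial value above, while STRICT is a hypothetical alternative that is not part of LexicalScanner and is shown only for comparison; the class and method names are likewise assumptions. Because the leading character class is optional and \w includes digits, the original pattern also accepts text that starts with a digit.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical probe for the P_IDENTIFIER pattern string shown above. */
final class IdentifierPatternSketch {
    static final Pattern P_IDENTIFIER = Pattern.compile("[_a-zA-Z]?(\\w|_)+");
    // Assumption for comparison only: requires a letter or underscore first.
    static final Pattern STRICT = Pattern.compile("[_a-zA-Z]\\w*");

    /** Returns the prefix of input matched by p, or null if there is none. */
    static String prefix(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.lookingAt() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(prefix(P_IDENTIFIER, "foo_1 = 2")); // foo_1
        System.out.println(prefix(P_IDENTIFIER, "1abc"));      // 1abc
        System.out.println(prefix(STRICT, "1abc"));            // null
    }
}
```

When number and identifier rules are combined in one table, placing the number rule before the identifier rule avoids digits being reported as identifiers.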

◆ P_WHITESPACE

final Pattern aud.util.LexicalScanner.P_WHITESPACE = Pattern.compile("\\s+")
static

white space

Definition at line 55 of file LexicalScanner.java.

Referenced by aud.util.LexicalScanner.eatWhiteSpace().


The documentation for this class was generated from the following file: