Class Terminator
A Terminator specifies how to identify and handle token boundaries in the tokenizer. Each terminator has:
- A regex pattern to match at the current position
- A termination strategy (PRESERVE or DROP)
- An optional group name for categorizing matched tokens
- An active flag that can disable the terminator temporarily
Termination strategies:
PRESERVE - The matched text is returned as a token. Use this for tokens you want to process (keywords, operators, etc.)
DROP - The matched text is discarded silently. Use this for separators you don't need (whitespace, comments, etc.)
Example:
// Drop whitespace (don't return it as a token)
Terminator whitespace = tokenizer.addTerminator(DROP, "\\s+");
// Preserve quoted strings (return them as tokens)
Terminator strings = tokenizer.addTerminator(PRESERVE, "\".*?\"", "string");
// Preserve numbers with a group name
Terminator numbers = tokenizer.addTerminator(PRESERVE, "\\d+", "number");
// Temporarily disable a terminator
whitespace.active = false; // Now whitespace will be returned as tokens
whitespace.active = true; // Back to dropping whitespace
Patterns are anchored to match only at the current position (the "^" anchor is prepended automatically).
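The interplay of PRESERVE, DROP, and position-anchored matching can be sketched with plain java.util.regex. This is a minimal illustration, not the library's implementation: the StrategySketch class, its Rule type, and the tokenize method are hypothetical names, and the choice to emit the text between two boundaries as its own token is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StrategySketch {
    enum Strategy { PRESERVE, DROP }

    // Hypothetical stand-in for a terminator: an anchored pattern plus a strategy.
    static final class Rule {
        final Strategy strategy;
        final Pattern pattern;

        Rule(Strategy strategy, String regexp) {
            this.strategy = strategy;
            this.pattern = Pattern.compile("^" + regexp); // anchor at the current position
        }
    }

    static List<String> tokenize(String source, List<Rule> rules) {
        List<String> tokens = new ArrayList<>();
        StringBuilder pending = new StringBuilder(); // text seen since the last boundary
        int index = 0;
        while (index < source.length()) {
            Rule matched = null;
            int end = index;
            for (Rule rule : rules) {
                // Restrict the matcher to [index, length) so "^" means "at index".
                Matcher m = rule.pattern.matcher(source).region(index, source.length());
                if (m.find() && m.end() > index) { // ignore empty matches to guarantee progress
                    matched = rule;
                    end = m.end();
                    break;
                }
            }
            if (matched == null) {
                pending.append(source.charAt(index++)); // not a boundary: keep accumulating
                continue;
            }
            if (pending.length() > 0) { // assumption: text before a boundary becomes a token
                tokens.add(pending.toString());
                pending.setLength(0);
            }
            if (matched.strategy == Strategy.PRESERVE) {
                tokens.add(source.substring(index, end)); // PRESERVE: emit the matched text
            }
            index = end; // DROP: skip the matched text without emitting it
        }
        if (pending.length() > 0) {
            tokens.add(pending.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<Rule> rules = Arrays.asList(
            new Rule(Strategy.DROP, "\\s+"),
            new Rule(Strategy.PRESERVE, "="),
            new Rule(Strategy.PRESERVE, "\\d+"));
        System.out.println(tokenize("x = 42", rules)); // prints [x, =, 42]
    }
}
```

In this sketch the first rule that matches wins, so rule order matters; whether the real tokenizer resolves competing terminators the same way is not stated in this page.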
-
Nested Class Summary
Nested Classes
Modifier and Type: static enum
Class: Terminator.TerminationStrategy
Description: Defines how matched tokens are handled by the tokenizer.
-
Constructor Summary
Constructors
Constructor: Terminator(Terminator.TerminationStrategy termination, String regexp, String group)
Description: Creates a new terminator with the specified configuration.
-
Field Details
-
termination
The strategy for handling matched tokens. Determines whether matched text is returned as a token (PRESERVE) or silently discarded (DROP).
-
group
An optional group name for categorizing this token type. When set, matched tokens can be identified by this group name using TokenizerMatch.isGroup(String). May be null for uncategorized tokens.
-
active
public boolean active
Flag indicating whether this terminator is active. When false, this terminator is skipped during matching. Set this to temporarily disable a terminator without removing it.
Default value is true.
-
pattern
The compiled regex pattern used for matching. This is the regexp with "^" prepended to anchor it at the start. The pattern is compiled once for efficiency.
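The effect of prepending "^" can be seen with java.util.regex alone. The sketch below uses hypothetical helper names (AnchorSketch, anchored, anchoredGrouped) and only illustrates standard regex behavior; it makes no claim about how the library builds its pattern beyond the "^" prefix described above.

```java
import java.util.regex.Pattern;

public class AnchorSketch {
    // Plain prefixing, as described for the pattern field.
    static Pattern anchored(String regexp) {
        return Pattern.compile("^" + regexp);
    }

    // Assumption for illustration: wrapping in a non-capturing group
    // anchors every alternative, not just the first one.
    static Pattern anchoredGrouped(String regexp) {
        return Pattern.compile("^(?:" + regexp + ")");
    }

    public static void main(String[] args) {
        // Without the anchor, find() scans ahead and locates "12".
        System.out.println(Pattern.compile("\\d+").matcher("ab12").find()); // prints true
        // With the anchor, the match must begin where matching starts.
        System.out.println(anchored("\\d+").matcher("ab12").find());        // prints false
        System.out.println(anchored("\\d+").matcher("12ab").find());        // prints true

        // "^a|b" parses as (^a)|(b), so the second alternative is unanchored.
        System.out.println(anchored("a|b").matcher("xb").find());           // prints true
        System.out.println(anchoredGrouped("a|b").matcher("xb").find());    // prints false
    }
}
```

If a regexp passed to a terminator uses top-level alternation, wrapping it in a group yourself, for example "(?:a|b)", keeps every alternative tied to the anchor under plain prefixing.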
-
-
Constructor Details
-
Terminator
Creates a new terminator with the specified configuration. The regex pattern is anchored with "^" to match only at the current position in the source string.
- Parameters:
termination - how to handle matched tokens (PRESERVE or DROP).
regexp - the regex pattern to match tokens. Must not be null.
group - the group name for categorizing this token type. May be null.
-
-
Method Details
-
match
Creates a matcher for this terminator at the specified position. The matcher is configured to search from the given index to the end of the source string, ensuring matches only occur at that exact position (due to the "^" anchor in the pattern).
- Parameters:
source - the source string to match against. Must not be null.
index - the starting position for matching.
- Returns:
a Matcher configured to search from index to the end.
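One way to obtain a matcher with this behavior is Matcher.region, sketched below. The MatchSketch class and its static match helper are hypothetical; the page does not show how the real method is implemented, so treat this as an assumption-based illustration of the description above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchSketch {
    // Hypothetical equivalent of match(source, index) for an already anchored pattern.
    static Matcher match(Pattern anchoredPattern, String source, int index) {
        Matcher m = anchoredPattern.matcher(source);
        // Limit the search to [index, length). With the default anchoring bounds,
        // "^" then matches at the start of the region, i.e. at index.
        m.region(index, source.length());
        return m;
    }

    public static void main(String[] args) {
        Pattern digits = Pattern.compile("^\\d+");
        String source = "ab12cd";

        Matcher atTwo = match(digits, source, 2);
        if (atTwo.find()) {
            // Offsets stay relative to the whole source, not to the region.
            System.out.println(atTwo.group() + " ends at " + atTwo.end()); // prints 12 ends at 4
        }

        // At index 0 the source starts with letters, so nothing matches.
        System.out.println(match(digits, source, 0).find()); // prints false
    }
}
```

Because end() reports an offset into the full source string, a caller can use it directly as the next index to continue from.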
-
toString
Returns a string representation of this terminator. Includes the regexp, termination strategy, group name, and active status.
-