Class TokenizerMatch
TokenizerMatch contains all information about a token that was extracted from the source string:
- token - The actual text content of the token
- terminator - The Terminator that identified this token
- matcher - The regex Matcher used for matching
Key methods:
- isGroup(String) - Check if this token belongs to a specific group
- getRegExpGroups() - Extract regex capture groups from the match
- getTokenizer() - Get the tokenizer that produced this match
Example usage:
TokenizerMatch match = tokenizer.getNextToken();
System.out.println("Token: " + match.token);
if (match.isGroup("number")) {
    int value = Integer.parseInt(match.token);
}
if (match.isGroup("string")) {
    String[] groups = match.getRegExpGroups();
    // groups[0] might be the string content without quotes
}
For tokens that were accumulated text (not matched by a terminator), the terminator and matcher fields will be null.
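Because the matcher can be null, code that inspects it needs a guard. The following is a minimal sketch of such a guard, not the library's code: the nested Match class is a hypothetical stand-in that mirrors only the token and matcher fields described above, and describe is an illustrative helper name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NullGuardSketch {
    // Hypothetical stand-in: mirrors only the token and matcher fields.
    static final class Match {
        final String token;
        final Matcher matcher; // null for accumulated text

        Match(String token, Matcher matcher) {
            this.token = token;
            this.matcher = matcher;
        }
    }

    // Labels a match without dereferencing a null matcher.
    static String describe(Match match) {
        if (match.matcher == null) {
            return "text: " + match.token;
        }
        // Matcher.pattern() returns the Pattern; Pattern.pattern() returns its source string.
        return "matched by /" + match.matcher.pattern().pattern() + "/: " + match.token;
    }

    public static void main(String[] args) {
        Matcher m = Pattern.compile("\\d+").matcher("42");
        System.out.println(describe(new Match("hello", null)));
        System.out.println(describe(new Match("42", m)));
    }
}
```

The same check applies to the terminator field before reading its group.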
Field Summary
Fields
Modifier and Type | Field | Description
final Matcher | matcher | The regex Matcher used to identify this token.
final Terminator | terminator | The Terminator that identified this token.
final String | token | The text content of the matched token.
Constructor Summary
Constructors
Constructor | Description
TokenizerMatch(String token, Terminator terminator, Matcher matcher, Tokenizer tokenizer) | Creates a new TokenizerMatch with all components.
Field Details
-
token
The text content of the matched token. This is the actual substring from the source that was identified as a token. For accumulated text (no terminator match), this contains all characters accumulated before a terminator was found.
-
terminator
The Terminator that identified this token. May be null if this token was accumulated text rather than matched by a terminator. When not null, you can check the terminator's group to categorize the token.
-
matcher
The regex Matcher used to identify this token. May be null if this token was accumulated text. When not null, you can use this to extract capture groups from the match.
-
-
Constructor Details
-
TokenizerMatch
Creates a new TokenizerMatch with all components.
- Parameters:
- token - the matched text. May be empty but should not be null.
- terminator - the Terminator that matched this token. May be null for accumulated text tokens.
- matcher - the regex Matcher used for matching. May be null for accumulated text tokens.
- tokenizer - the Tokenizer that produced this match.
-
-
Method Details
-
isGroup
Checks if this token belongs to the specified group. This compares the group name of the terminator against the provided group name. Useful for categorizing tokens by type.
Special cases:
- If terminator is null, returns true only if group is also null
- If terminator.group is null, returns true only if group is null
Example:
tokenizer.addTerminator(PRESERVE, "\\d+", "number");
tokenizer.addTerminator(PRESERVE, "\\w+", "word");
TokenizerMatch match = tokenizer.getNextToken();
if (match.isGroup("number")) {
    // Token is a number
} else if (match.isGroup("word")) {
    // Token is a word
}
- Parameters:
- group - the group name to check against. May be null.
- Returns:
- true if this token belongs to the specified group, false otherwise.
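The special cases listed for isGroup amount to a null-safe equality check on the group name. Here is a minimal sketch of that rule, not the library's actual implementation: the nested Terminator class is a hypothetical stand-in that models only the group name, and the static isGroup helper takes the terminator as an explicit argument for illustration.

```java
import java.util.Objects;

class GroupCheckSketch {
    // Hypothetical stand-in for the library's Terminator; only the group name is modeled.
    static final class Terminator {
        final String group; // may be null

        Terminator(String group) {
            this.group = group;
        }
    }

    // A null terminator and a null terminator.group both count as "no group",
    // which equals only a null argument.
    static boolean isGroup(Terminator terminator, String group) {
        String actual = (terminator == null) ? null : terminator.group;
        return Objects.equals(actual, group);
    }

    public static void main(String[] args) {
        System.out.println(isGroup(new Terminator("number"), "number"));
        System.out.println(isGroup(null, "number"));
        System.out.println(isGroup(null, null));
        System.out.println(isGroup(new Terminator(null), null));
    }
}
```

Under this reading, calling isGroup(null) on a match is one way to test for accumulated text, though checking the terminator field directly is more explicit.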
-
getRegExpGroups
Extracts regex capture groups from this match. Returns the captured groups from the regex pattern that matched this token. Group 1 and onwards are returned (not the full match).
Example:
tokenizer.addTerminator(PRESERVE, "(\\d+):(\\d+)", "time");
// Matches "12:30"
TokenizerMatch match = tokenizer.getNextToken();
String[] groups = match.getRegExpGroups();
// groups[0] = "12" (hours)
// groups[1] = "30" (minutes)
- Returns:
- an array of captured group strings. Empty array if matcher is null or no capture groups exist in the pattern.
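The index shift between regex group numbers and array positions can be illustrated with plain java.util.regex. This is a sketch of the described behavior under the assumption that the matcher has already matched; captureGroups is a hypothetical helper name, not a method of the library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaptureGroupsSketch {
    // Collects groups 1..n of a successful match; group 0 (the full match) is skipped.
    static String[] captureGroups(Matcher matcher) {
        if (matcher == null) {
            return new String[0];
        }
        int count = matcher.groupCount(); // number of capture groups in the pattern
        String[] groups = new String[count];
        for (int i = 1; i <= count; i++) {
            groups[i - 1] = matcher.group(i); // regex group i lands at array index i - 1
        }
        return groups;
    }

    public static void main(String[] args) {
        Matcher m = Pattern.compile("(\\d+):(\\d+)").matcher("12:30");
        if (m.matches()) {
            String[] groups = captureGroups(m);
            System.out.println(groups[0] + " h, " + groups[1] + " min");
        }
    }
}
```

Note that Matcher.group(int) returns null for a capture group that did not participate in the match, so individual array entries may be null in this sketch even when the array itself is not empty.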
-
toString
Returns a detailed string representation for debugging. Includes the token text, terminator details, and any regex groups.
-
getTokenizer
Returns the tokenizer that produced this match. This allows continuing tokenization or accessing tokenizer state from within token handling code.
- Returns:
- the Tokenizer instance that created this match.
-