public interface Tokenizer
| Modifier and Type | Method and Description | 
|---|---|
| String | currentTag(): Returns the identifier of the tag the tokenizer is currently in. |
| long | getByteOffset(): Returns the byte offset in the file currently being indexed. |
| boolean | inDocnoTag(): Indicates whether the tokenizer is in a special document number tag. |
| boolean | inTagToProcess(): Indicates whether the tokenizer is in a tag to process. |
| boolean | inTagToSkip(): Indicates whether the tokenizer is in a tag to skip. |
| boolean | isEndOfDocument(): Returns true if the end of the document has been reached. |
| boolean | isEndOfFile(): Returns true if the end of the file has been reached. |
| void | nextDocument(): Proceeds to the next document. |
| String | nextToken(): Returns the next token from the input stream. |
| void | setInput(BufferedReader input): Sets the input of the tokenizer. |
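
The methods in the table suggest a two-level loop: an outer loop over documents until `isEndOfFile()`, and an inner loop over tokens until `isEndOfDocument()`. The sketch below illustrates that pattern under stated assumptions. It is not taken from Terrier: the interface is re-declared locally so the example compiles on its own, `LineTokenizer` and `readAll` are hypothetical names, and the loop assumes that `nextToken()` may return null and that `setInput` positions the tokenizer on the first document.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class TokenizerSketch {

    /** Local copy of the interface listed above, so the sketch is self-contained. */
    interface Tokenizer {
        String currentTag();
        String nextToken();
        boolean inDocnoTag();
        boolean inTagToProcess();
        boolean inTagToSkip();
        boolean isEndOfDocument();
        boolean isEndOfFile();
        void nextDocument();
        long getByteOffset();
        void setInput(BufferedReader input);
    }

    /** Hypothetical toy implementation: one document per line, whitespace-separated tokens, no tags. */
    static class LineTokenizer implements Tokenizer {
        private BufferedReader input;
        private String[] tokens = new String[0];
        private int position = 0;
        private long offset = 0;
        private boolean endOfFile = false;

        public void setInput(BufferedReader input) {
            this.input = input;
            this.offset = 0;
            this.endOfFile = false;
            readLine(); // assumption: setting the input loads the first document
        }

        private void readLine() {
            try {
                String line = input.readLine();
                if (line == null) {
                    endOfFile = true;
                    tokens = new String[0];
                } else {
                    String trimmed = line.trim();
                    tokens = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
                    // Counts characters plus one line terminator: only an approximation of a byte offset.
                    offset += line.length() + 1;
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            position = 0;
        }

        public String nextToken() {
            return position < tokens.length ? tokens[position++] : null;
        }

        public String currentTag() { return "TEXT"; }       // fixed placeholder tag
        public boolean inDocnoTag() { return false; }
        public boolean inTagToProcess() { return true; }
        public boolean inTagToSkip() { return false; }
        public boolean isEndOfDocument() { return position >= tokens.length; }
        public boolean isEndOfFile() { return endOfFile; }
        public void nextDocument() { readLine(); }
        public long getByteOffset() { return offset; }
    }

    /** Collects the tokens of every document, keeping only tokens inside tags to process. */
    static List<List<String>> readAll(Tokenizer tokenizer, BufferedReader reader) {
        List<List<String>> documents = new ArrayList<>();
        tokenizer.setInput(reader);
        while (!tokenizer.isEndOfFile()) {
            List<String> terms = new ArrayList<>();
            while (!tokenizer.isEndOfDocument()) {
                String token = tokenizer.nextToken();
                if (token != null && tokenizer.inTagToProcess() && !tokenizer.inTagToSkip()) {
                    terms.add(token);
                }
            }
            documents.add(terms);
            tokenizer.nextDocument();
        }
        return documents;
    }

    public static void main(String[] args) {
        BufferedReader reader = new BufferedReader(new StringReader("hello world\nfoo\n"));
        System.out.println(readAll(new LineTokenizer(), reader)); // prints [[hello, world], [foo]]
    }
}
```

A real implementation would track tags while parsing, so `currentTag()`, `inDocnoTag()`, `inTagToProcess()` and `inTagToSkip()` would change as tokens are consumed; the toy class returns constants only to keep the loop structure visible.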
String currentTag()
String nextToken()
boolean inDocnoTag()
boolean inTagToProcess()
boolean inTagToSkip()
boolean isEndOfDocument()
boolean isEndOfFile()
void nextDocument()
long getByteOffset()
void setInput(BufferedReader input)
input - BufferedReader, the input stream to tokenize

Terrier Information Retrieval Platform 4.1. Copyright © 2004-2015, University of Glasgow