Terrier IR Platform 2.2.1
public interface Tokenizer
The specification of the interface implemented by tokeniser classes.
Method Summary | |
---|---|
java.lang.String | currentTag() - Returns the identifier of the tag the tokenizer is currently in. |
long | getByteOffset() - Returns the byte offset in the file currently being indexed. |
boolean | inDocnoTag() - Indicates whether we are in a special document number tag. |
boolean | inTagToProcess() - Indicates whether we are in a tag to process. |
boolean | inTagToSkip() - Indicates whether we are in a tag to skip. |
boolean | isEndOfDocument() - Returns true if the end of document is encountered. |
boolean | isEndOfFile() - Returns true if the end of file is encountered. |
void | nextDocument() - Proceeds to the next document. |
java.lang.String | nextToken() - Returns the next token from the input stream. |
void | setInput(java.io.BufferedReader input) - Sets the input of the tokenizer. |
Method Detail |
---|
java.lang.String currentTag()
java.lang.String nextToken()
boolean inDocnoTag()
boolean inTagToProcess()
boolean inTagToSkip()
boolean isEndOfDocument()
boolean isEndOfFile()
void nextDocument()
long getByteOffset()
void setInput(java.io.BufferedReader input)
Parameters:
input - BufferedReader the input stream to tokenize
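The methods above suggest a two-level read loop: an outer loop over documents until isEndOfFile(), and an inner loop over tokens until isEndOfDocument(). The following is a minimal sketch of that pattern, not code taken from Terrier. The Tokenizer interface is redeclared locally (package-private, with the signatures listed on this page) so the example compiles without Terrier on the classpath. LineTokenizer and TokenizerDemo are hypothetical names introduced for illustration. Treating each input line as one document, and returning null from nextToken() once a document is exhausted, are assumptions of this sketch rather than documented behaviour.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Local copy of the signatures listed on this page (assumption: declared here
// only so the sketch is self-contained; the real interface ships with Terrier).
interface Tokenizer {
    String currentTag();
    long getByteOffset();
    boolean inDocnoTag();
    boolean inTagToProcess();
    boolean inTagToSkip();
    boolean isEndOfDocument();
    boolean isEndOfFile();
    void nextDocument();
    String nextToken();
    void setInput(BufferedReader input);
}

// Hypothetical implementation: one line of input is one document,
// and tokens are separated by whitespace. There are no tags in this format.
class LineTokenizer implements Tokenizer {
    private BufferedReader reader;
    private final Deque<String> pending = new ArrayDeque<>();
    private boolean endOfFile = true; // true until an input is set
    private long offset = 0;          // characters consumed; only approximates bytes

    @Override public void setInput(BufferedReader input) {
        reader = input;
        endOfFile = false;
        offset = 0;
        loadLine();
    }

    // Reads the next line and queues its tokens; flags end of file when none is left.
    private void loadLine() {
        pending.clear();
        try {
            String line = reader.readLine();
            if (line == null) {
                endOfFile = true;
                return;
            }
            offset += line.length() + 1; // +1 for the line terminator
            for (String token : line.trim().split("\\s+")) {
                if (!token.isEmpty()) {
                    pending.add(token);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override public String nextToken() { return pending.poll(); } // null when exhausted
    @Override public boolean isEndOfDocument() { return pending.isEmpty(); }
    @Override public boolean isEndOfFile() { return endOfFile; }
    @Override public void nextDocument() { if (!endOfFile) loadLine(); }
    @Override public long getByteOffset() { return offset; }
    @Override public String currentTag() { return null; }
    @Override public boolean inDocnoTag() { return false; }
    @Override public boolean inTagToProcess() { return true; }
    @Override public boolean inTagToSkip() { return false; }
}

class TokenizerDemo {
    // Collects the tokens of every document using only the interface methods.
    static List<List<String>> readAll(Tokenizer tokenizer) {
        List<List<String>> documents = new ArrayList<>();
        while (!tokenizer.isEndOfFile()) {
            List<String> terms = new ArrayList<>();
            while (!tokenizer.isEndOfDocument()) {
                String term = tokenizer.nextToken();
                if (term != null) {
                    terms.add(term);
                }
            }
            documents.add(terms);
            tokenizer.nextDocument();
        }
        return documents;
    }

    public static void main(String[] args) {
        Tokenizer tokenizer = new LineTokenizer();
        tokenizer.setInput(new BufferedReader(new StringReader("the quick fox\nlazy dog\n")));
        System.out.println(readAll(tokenizer)); // prints [[the, quick, fox], [lazy, dog]]
    }
}
```

Because readAll depends only on the interface, the same loop should work with any implementing class, provided that class advances its end-of-document and end-of-file flags in the way assumed here. A tag-aware caller would additionally consult inTagToProcess(), inTagToSkip() and inDocnoTag() before accepting a token.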