Package | Description |
---|---|
org.terrier.indexing | Provides classes and interfaces related to the indexing of documents. |
org.terrier.indexing.tokenisation | Provides classes related to the tokenisation of documents. |

Modifier and Type | Field and Description |
---|---|
protected TokenStream | TaggedDocument.currentTokenStream |
protected TokenStream | FileDocument.tokenStream |

Modifier and Type | Field and Description |
---|---|
static TokenStream | Tokeniser.EMPTY_STREAM: empty stream |

Modifier and Type | Method and Description |
---|---|
TokenStream | UTFTokeniser.tokenise(Reader reader) |
TokenStream | UTFTwitterTokeniser.tokenise(Reader reader) |
TokenStream | IdentityTokeniser.tokenise(Reader reader) |
abstract TokenStream | Tokeniser.tokenise(Reader reader): Tokenises the text obtained from the specified reader. |
TokenStream | EnglishTokeniser.tokenise(Reader reader) |

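
The pattern listed above (an abstract `tokenise(Reader)` method returning a token stream, several concrete tokenisers, and a shared empty stream) can be illustrated with a small self-contained sketch. It does not use Terrier's classes: `SketchTokenStream`, `SketchTokeniser` and `LetterDigitTokeniser` are hypothetical stand-ins, and the choices to model the stream as an iterator over `String` tokens and to lower-case and split on non-alphanumeric characters are assumptions made for illustration only.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class TokeniserSketch {

    /** Hypothetical stand-in for a token stream: iterates over String tokens. */
    static abstract class SketchTokenStream implements Iterator<String>, Iterable<String> {
        @Override
        public Iterator<String> iterator() {
            return this;
        }
    }

    /** Hypothetical stand-in for an abstract tokeniser with a shared empty stream. */
    static abstract class SketchTokeniser {
        /** A stream that never yields a token (it holds no state, so it can be shared). */
        static final SketchTokenStream EMPTY_STREAM = new SketchTokenStream() {
            @Override
            public boolean hasNext() {
                return false;
            }

            @Override
            public String next() {
                throw new NoSuchElementException();
            }
        };

        /** Tokenises the text obtained from the specified reader. */
        abstract SketchTokenStream tokenise(Reader reader);
    }

    /** Illustrative tokeniser: lower-cases and splits on non-alphanumeric characters. */
    static class LetterDigitTokeniser extends SketchTokeniser {
        @Override
        SketchTokenStream tokenise(Reader reader) {
            if (reader == null) {
                return EMPTY_STREAM;
            }
            List<String> tokens = new ArrayList<>();
            StringBuilder current = new StringBuilder();
            try {
                int c;
                while ((c = reader.read()) != -1) {
                    if (Character.isLetterOrDigit(c)) {
                        current.append((char) Character.toLowerCase(c));
                    } else if (current.length() > 0) {
                        // A separator ends the token collected so far.
                        tokens.add(current.toString());
                        current.setLength(0);
                    }
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            if (current.length() > 0) {
                tokens.add(current.toString());
            }
            Iterator<String> it = tokens.iterator();
            return new SketchTokenStream() {
                @Override
                public boolean hasNext() {
                    return it.hasNext();
                }

                @Override
                public String next() {
                    return it.next();
                }
            };
        }
    }

    /** Drains a stream into a list, for inspection. */
    static List<String> collect(SketchTokenStream stream) {
        List<String> out = new ArrayList<>();
        for (String token : stream) {
            out.add(token);
        }
        return out;
    }

    public static void main(String[] args) {
        SketchTokeniser tokeniser = new LetterDigitTokeniser();
        // Prints: [hello, terrier, 4, 1]
        System.out.println(collect(tokeniser.tokenise(new StringReader("Hello, Terrier 4.1!"))));
    }
}
```

Returning the shared empty stream for a missing reader lets callers loop over the result without a null check. Whether the real tokenisers behave this way is not stated in the listing above.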
Terrier Information Retrieval Platform 4.1. Copyright © 2004-2015, University of Glasgow