| Package | Description |
|---|---|
| org.terrier.indexing | Provides classes and interfaces related to the indexing of documents. |
| org.terrier.indexing.tokenisation | Provides classes related to the tokenisation of documents. |
| Modifier and Type | Field and Description |
|---|---|
| protected TokenStream | TaggedDocument.currentTokenStream |
| protected TokenStream | FileDocument.tokenStream |
| Modifier and Type | Field and Description |
|---|---|
| static TokenStream | Tokeniser.EMPTY_STREAM: Empty stream. |
| Modifier and Type | Method and Description |
|---|---|
| TokenStream | UTFTokeniser.tokenise(Reader reader) |
| TokenStream | UTFTwitterTokeniser.tokenise(Reader reader) |
| TokenStream | IdentityTokeniser.tokenise(Reader reader) |
| abstract TokenStream | Tokeniser.tokenise(Reader reader): Tokenises the text obtained from the specified reader. |
| TokenStream | EnglishTokeniser.tokenise(Reader reader) |
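
The `tokenise(Reader reader)` methods listed above share one pattern: a tokeniser takes a `Reader` and hands back a stream of tokens. The sketch below illustrates that pattern only. It does not use Terrier's classes: `SketchTokeniser`, `AlnumTokenStream`, `AlnumTokeniser`, and `TokeniseSketch` are hypothetical stand-ins, and the token stream is assumed here to behave like an `Iterator<String>`. The splitting rule (lower-cased runs of ASCII letters and digits) is also an assumption chosen for illustration, not the behaviour of any tokeniser named in the table.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for an abstract tokeniser: one method from Reader to a token stream.
abstract class SketchTokeniser {
    abstract Iterator<String> tokenise(Reader reader);
}

// Hypothetical token stream: lazily reads lower-cased runs of ASCII letters and digits.
class AlnumTokenStream implements Iterator<String> {
    private final Reader reader;
    private String pending; // next token to return, or null when the input is exhausted

    AlnumTokenStream(Reader reader) {
        this.reader = reader;
        this.pending = readToken();
    }

    private static boolean isAlnum(int c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }

    // Skips separators, then collects characters until the next separator or end of input.
    private String readToken() {
        StringBuilder sb = new StringBuilder();
        try {
            int c;
            while ((c = reader.read()) != -1) {
                if (isAlnum(c)) {
                    sb.append((char) Character.toLowerCase(c));
                } else if (sb.length() > 0) {
                    break;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.length() > 0 ? sb.toString() : null;
    }

    @Override
    public boolean hasNext() {
        return pending != null;
    }

    @Override
    public String next() {
        if (pending == null) {
            throw new NoSuchElementException();
        }
        String token = pending;
        pending = readToken();
        return token;
    }
}

// Hypothetical concrete tokeniser that returns the stream defined above.
class AlnumTokeniser extends SketchTokeniser {
    @Override
    Iterator<String> tokenise(Reader reader) {
        return new AlnumTokenStream(reader);
    }
}

class TokeniseSketch {
    // Drains the token stream produced for the given text into a list.
    static List<String> tokens(SketchTokeniser tokeniser, String text) {
        List<String> out = new ArrayList<>();
        Iterator<String> stream = tokeniser.tokenise(new StringReader(text));
        while (stream.hasNext()) {
            out.add(stream.next());
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [hello, terrier, 4, 1]
        System.out.println(tokens(new AlnumTokeniser(), "Hello, Terrier 4.1!"));
    }
}
```

An empty input yields a stream with no tokens, which is the role a shared constant such as `Tokeniser.EMPTY_STREAM` would plausibly play. For the real signatures and behaviour, consult the Terrier Javadoc for `Tokeniser` and `TokenStream`.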
Terrier Information Retrieval Platform 4.1. Copyright © 2004-2015, University of Glasgow