Interface Document

    • Method Summary

      Modifier and Type Method Description
      boolean endOfDocument()
      Returns true when the end of the document has been reached, and there are no other terms to be retrieved from it.
      java.util.Map<java.lang.String,java.lang.String> getAllProperties()
      Returns the underlying map of all the properties defined by this Document.
      java.util.Set<java.lang.String> getFields()
      Returns the set of fields in which the current term appears.
      java.lang.String getNextTerm()
      Gets the next term of the document.
      java.lang.String getProperty(java.lang.String name)
      Allows access to a named property of the Document.
      java.io.Reader getReader()
      Returns a Reader object so client code can tokenise the document or deal with the document itself.
    • Method Detail

      • getNextTerm

        java.lang.String getNextTerm()
        Gets the next term of the document. NB: A null return from getNextTerm() should be ignored; it does not signify that there are no more terms. Use endOfDocument() to check for that.
        Returns:
        String the next term of the document. Null returns should be ignored.
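
The contract above (skip nulls, stop only on endOfDocument()) can be sketched as a consumer loop. This is a minimal illustration, not the library's own code: TermSource is a local stand-in that mirrors only the two method signatures documented here, and fromTokens is a hypothetical in-memory source used for demonstration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class TermLoopSketch {

    /** Local stand-in mirroring getNextTerm() and endOfDocument(); not the real interface. */
    interface TermSource {
        String getNextTerm();
        boolean endOfDocument();
    }

    /** Collects every non-null term; endOfDocument() is the only stop condition. */
    static List<String> collectTerms(TermSource doc) {
        List<String> terms = new ArrayList<>();
        while (!doc.endOfDocument()) {
            String term = doc.getNextTerm();
            if (term == null) {
                continue; // a null is ignored; it does not mean the document has ended
            }
            terms.add(term);
        }
        return terms;
    }

    /** Hypothetical in-memory source; null entries simulate ignorable returns. */
    static TermSource fromTokens(String... tokens) {
        Iterator<String> it = Arrays.asList(tokens).iterator();
        return new TermSource() {
            public String getNextTerm() { return it.hasNext() ? it.next() : null; }
            public boolean endOfDocument() { return !it.hasNext(); }
        };
    }

    public static void main(String[] args) {
        TermSource doc = fromTokens("quick", null, "brown", null, "fox");
        System.out.println(collectTerms(doc)); // prints [quick, brown, fox]
    }
}
```

Checking for null before using the term, rather than ending the loop on it, is the point of the NB note: a loop written as "while ((term = getNextTerm()) != null)" would stop at the first ignorable null.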
      • getFields

        java.util.Set<java.lang.String> getFields()
        Returns the set of fields in which the current term appears.
        Returns:
        Set the set of fields in which the current term appears.
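
As a rough sketch of how getFields() pairs with getNextTerm(), the example below counts how many terms occur in each field. FieldedTermSource is a local stand-in for the three methods involved, and fromEntries is a hypothetical in-memory source; the field names are invented. It assumes that getFields() describes the term most recently returned and that a term outside any field yields an empty set, neither of which is confirmed by the text above. It uses Map.entry and Set.of, so it needs Java 9 or later.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class FieldCountSketch {

    /** Local stand-in mirroring three documented methods; not the real interface. */
    interface FieldedTermSource {
        String getNextTerm();
        boolean endOfDocument();
        Set<String> getFields();
    }

    /** Counts, per field, how many non-null terms appeared in that field. */
    static Map<String, Integer> countTermsPerField(FieldedTermSource doc) {
        Map<String, Integer> counts = new TreeMap<>();
        while (!doc.endOfDocument()) {
            String term = doc.getNextTerm();
            if (term == null) {
                continue; // ignorable null, keep reading
            }
            // Assumption: getFields() refers to the term just returned.
            for (String field : doc.getFields()) {
                counts.merge(field, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Hypothetical source: each entry pairs a term with the fields it occurs in. */
    static FieldedTermSource fromEntries(List<Map.Entry<String, Set<String>>> entries) {
        Iterator<Map.Entry<String, Set<String>>> it = entries.iterator();
        return new FieldedTermSource() {
            private Set<String> current = Collections.emptySet();
            public String getNextTerm() {
                Map.Entry<String, Set<String>> e = it.next();
                current = e.getValue();
                return e.getKey();
            }
            public boolean endOfDocument() { return !it.hasNext(); }
            public Set<String> getFields() { return current; }
        };
    }

    public static void main(String[] args) {
        FieldedTermSource doc = fromEntries(List.of(
                Map.entry("terrier", Set.of("TITLE", "BODY")),
                Map.entry("search", Set.of("BODY"))));
        System.out.println(countTermsPerField(doc)); // prints {BODY=2, TITLE=1}
    }
}
```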
      • endOfDocument

        boolean endOfDocument()
        Returns true when the end of the document has been reached, and there are no other terms to be retrieved from it.
        Returns:
        boolean true if there are no more terms in the document, otherwise it returns false.
      • getReader

        java.io.Reader getReader()
        Returns a Reader object so client code can tokenise the document or deal with the document itself. Examples include extracting URLs and language detection.
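
To illustrate the URL-extraction use case, here is a minimal sketch that scans any java.io.Reader for http and https links. It is not part of the interface: extractUrls and its regular expression are illustrative choices, and a StringReader stands in for the value getReader() would return. Whether the Reader can be consumed alongside getNextTerm() on the same document is not stated above, so the sketch treats the Reader on its own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReaderUrlSketch {

    // Deliberately simple pattern: scheme, then anything up to whitespace, a quote or an angle bracket.
    private static final Pattern URL = Pattern.compile("https?://[^\\s\"'<>]+");

    /** Reads the whole stream line by line and returns every URL-like match in order. */
    static List<String> extractUrls(Reader reader) {
        List<String> urls = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(reader)) {
            String line;
            while ((line = in.readLine()) != null) {
                Matcher m = URL.matcher(line);
                while (m.find()) {
                    urls.add(m.group());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return urls;
    }

    public static void main(String[] args) {
        // StringReader stands in for the Reader a document would supply.
        Reader reader = new StringReader(
                "See http://example.org/a and\n<a href=\"https://example.com/b?x=1\">b</a>");
        System.out.println(extractUrls(reader));
    }
}
```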
      • getProperty

        java.lang.String getProperty(java.lang.String name)
        Allows access to a named property of the Document. Examples include the URL or the filename.
        Parameters:
        name - Name of the property. It is suggested, but not required, that this name should not be case insensitive.
        Since:
        1.1.0
      • getAllProperties

        java.util.Map<java.lang.String,java.lang.String> getAllProperties()
        Returns the underlying map of all the properties defined by this Document.
        Since:
        1.1.0
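
The two property accessors might be used as sketched below. PropertySource is a local stand-in for getProperty(String) and getAllProperties(), and fromMap and propertyOrDefault are hypothetical helpers. The text above does not say what getProperty returns for an unknown name; the sketch assumes null, and its HashMap backing makes lookups case sensitive.

```java
import java.util.HashMap;
import java.util.Map;

public class PropertySketch {

    /** Local stand-in mirroring the two property methods; not the real interface. */
    interface PropertySource {
        String getProperty(String name);
        Map<String, String> getAllProperties();
    }

    /** Hypothetical implementation backed by a HashMap, so names are case sensitive. */
    static PropertySource fromMap(Map<String, String> props) {
        Map<String, String> copy = new HashMap<>(props);
        return new PropertySource() {
            public String getProperty(String name) { return copy.get(name); }
            public Map<String, String> getAllProperties() { return copy; }
        };
    }

    /** Assumption: a missing property is reported as null, so fall back explicitly. */
    static String propertyOrDefault(PropertySource doc, String name, String fallback) {
        String value = doc.getProperty(name);
        return value != null ? value : fallback;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("filename", "report.txt");
        props.put("url", "http://example.org/report.txt");
        PropertySource doc = fromMap(props);

        System.out.println(propertyOrDefault(doc, "filename", "(unknown)")); // prints report.txt
        System.out.println(propertyOrDefault(doc, "FILENAME", "(unknown)")); // prints (unknown)

        // getAllProperties() exposes every name/value pair at once.
        for (Map.Entry<String, String> e : doc.getAllProperties().entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Because getAllProperties() is described as returning the underlying map, callers that only need to read it may prefer to copy it before making changes.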