Terrier Core

Full pass over documentation

Details

  • Type: Improvement
  • Status: Resolved
  • Priority: Blocker
  • Resolution: Fixed
  • Affects Version/s: 3.0
  • Fix Version/s: 3.0
  • Component/s: None
  • Description:
    Much has changed for Terrier 3.0. We need to spend considerable time improving the documentation before the 3.0 release.

    E.g.:
     * http_terrier.sh etc
     * MetaIndex
     * Changes to inverted index parsing
     * Changes to namespaces
     * TSMs are deprecated
     * Improved fields
     * TrecTerrier can take command line -D options

Activity

Richard McCreadie added a comment - 18/Feb/10 2:55 PM - edited

List of documents needing changes:
Overview:

  • add clueweb09/blogs08
  • hiemstra and croft
  • 50 million documents
  • 6 years of work
  • add web interface
  • frequency occurrences in fields
  • field weighting models?

QuickStart:

  • change version numbers
  • one command for tar
  • add web interface

Components:

  • remove term score modifiers
  • meta index structure
  • update applications

Configure Indexing:

  • remove docno/string.byte.length
  • add meta index

Configure Retrieval:

  • add/remove weighting models
  • TREC output format

Desktop:

  • take out file list

Examples:

  • remove croft
  • check all numbers after changing the stemmer
  • check retrieval performance

Hadoop Indexing:

  • talk about document/term partitioning
  • inverted to direct in MapReduce

Properties:

  • needs a pass

Extend Terrier:

  • change version

Extend Retrieval:

  • remove LM matching
  • examples changed for posting lists

Non-English:

  • use UTF

Future Features:

  • needs a pass

What's New:

  • move TREC resolved issues to TR and update
  • detail what has changed

New Pages:

  • Web Interface
Craig Macdonald added a comment - 24/Feb/10 4:28 PM

I have made my pass at the documentation. Only "What's New" remains to be done.

Please can others read each page, and make alterations. Note here what alterations you complete.

Craig Macdonald added a comment - 10/Mar/10 12:53 PM

Many additional updates have been made, covering choosing an appropriate Collection class, RF, Proximity, field-models, links to wiki, and more.


Dates

  • Created: 17/Feb/10 2:17 PM
  • Updated: 10/Mar/10 12:53 PM
  • Resolved: 10/Mar/10 12:53 PM