  1. Terrier Core
  2. TR-118

SimpleXMLCollection - the term near the closing tag is ignored

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 3.0
    • Fix Version/s: 3.5
    • Component/s: .indexing
    • Labels:
      None

      Description

      When I index an XML collection using SimpleXMLCollection, the term immediately before a closing tag is ignored if no character (space, newline, ...) separates the term from the tag.

      Please find attached:
        * an XML file with its DTD to reproduce the bug
        * a patch that fixes the problem

      The required properties:
      xml.doctag=article
      xml.idtag=docid
      xml.terms=title
      trec.collection.class=SimpleXMLCollection
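
      The symptom can be illustrated with a minimal, self-contained sketch. This is not Terrier's SimpleXMLCollection code and not the attached patch; the class name `ClosingTagTermSketch`, the method `terms`, and the `flushAtTag` switch are hypothetical and exist only for illustration. The sketch assumes a character-level tokeniser that emits a term only when it sees a separator, so a term directly followed by `<` is lost unless the pending term is flushed when the tag starts.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not Terrier's actual implementation. */
public class ClosingTagTermSketch {

    /**
     * Collects the terms found in the character data of an XML fragment.
     * With flushAtTag == false, a term is emitted only when a separator
     * character follows it, which reproduces the reported symptom.
     * With flushAtTag == true, the pending term is emitted when a tag starts.
     */
    public static List<String> terms(String xml, boolean flushAtTag) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inTag = false;
        for (int i = 0; i < xml.length(); i++) {
            char c = xml.charAt(i);
            if (inTag) {
                // Skip everything up to the end of the tag.
                if (c == '>') {
                    inTag = false;
                }
                continue;
            }
            if (c == '<') {
                inTag = true;
                if (flushAtTag) {
                    flush(current, out);   // keep the term next to the tag
                } else {
                    current.setLength(0);  // pending term is silently dropped
                }
                continue;
            }
            if (Character.isLetterOrDigit(c)) {
                current.append(c);
            } else {
                flush(current, out);       // separator ends the current term
            }
        }
        flush(current, out);
        return out;
    }

    /** Emits the pending term, if any, and resets the buffer. */
    private static void flush(StringBuilder current, List<String> out) {
        if (current.length() > 0) {
            out.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        String doc = "<title>hello world</title>";
        System.out.println(terms(doc, false)); // [hello]
        System.out.println(terms(doc, true));  // [hello, world]
    }
}
```

      In this sketch, adding a space before `</title>` makes both variants return the same terms, which matches the behaviour described above.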

        Attachments

        1. 10002.xml
          13 kB
        2. article.dtd
          29 kB
        3. patch.diff
          0.6 kB
        4. TR-118-craigm-v1.patch
          12 kB

          Activity

          dudognon Damien Dudognon created issue -
          craigm Craig Macdonald made changes -
          Field Original Value New Value
          Attachment TR-118-craigm-v1.patch [ 10217 ]
          craigm Craig Macdonald made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Fix Version/s 3.1 [ 10040 ]
          Resolution Fixed [ 1 ]

            People

            • Assignee:
              craigm Craig Macdonald
              Reporter:
              dudognon Damien Dudognon
            • Watchers:
              1

              Dates

              • Created:
                Updated:
                Resolved: