  1. Terrier Core
  2. TR-220

SimpleXMLCollection raises a NullPointerException if the document contains a DOCTYPE with the same name as xml.doctag

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.5
    • Fix Version/s: 3.6
    • Component/s: .indexing
    • Labels:
      None

      Description

      See: http://terrier.org/forum//read.php?3,1669

      An NPE occurs when a document has a DOCTYPE declaration placed before the root element, with the same name as the root element,
      and the xml.doctag property is set to that name.

      In SimpleXMLDocument, the method findDocumentElement(Node n) checks only the name of the node n:
      if (DocumentElements.contains(n.getNodeName().toLowerCase())) {...}
      and, if the name matches, tries to read all the attributes of n.
      But if n is a doctype node, it has no attributes (the DOM returns null for them), hence the NPE.

      My workaround is to check that n is not a DOCUMENT_TYPE_NODE (it can be a DOCUMENT_NODE or an ELEMENT_NODE).
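
To illustrate the workaround described above, here is a minimal standalone sketch using only the JDK's DOM parser. It is not the patch committed to Terrier: the class DoctypeGuardSketch and its methods parse, isDocumentElement and countDocumentElements are hypothetical names chosen for this example, and the set of doc tags stands in for whatever SimpleXMLCollection builds from xml.doctag. It shows that the doctype node shares its name with the root element but has no attribute map, and that a node-type check keeps it from being treated as a document element.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Set;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DoctypeGuardSketch {

    /** Parses an XML string into a DOM tree with the JDK's default parser. */
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    /**
     * Hypothetical guard: a node qualifies only if it is an element AND its
     * name is one of the configured doc tags. A name-only check would also
     * accept a DOCTYPE node that happens to share the name.
     */
    static boolean isDocumentElement(Node n, Set<String> docTags) {
        return n.getNodeType() == Node.ELEMENT_NODE
                && docTags.contains(n.getNodeName().toLowerCase());
    }

    /** Walks the subtree rooted at n and counts the nodes that pass the guard. */
    static int countDocumentElements(Node n, Set<String> docTags) {
        int count = isDocumentElement(n, docTags) ? 1 : 0;
        NodeList children = n.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            count += countDocumentElements(children.item(i), docTags);
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        Document dom = parse(
                "<?xml version=\"1.0\"?><!DOCTYPE test-doctype><test-doctype>test</test-doctype>");
        Node doctype = dom.getDoctype();
        // The doctype node carries the same name as the root element...
        System.out.println("doctype name: " + doctype.getNodeName());
        // ...but, not being an element, it has no attribute map at all.
        System.out.println("doctype attributes are null: " + (doctype.getAttributes() == null));
        // With the node-type guard, only the real root element is counted.
        System.out.println("document elements found: "
                + countDocumentElements(dom, Collections.singleton("test-doctype")));
    }
}
```

Checking for ELEMENT_NODE is slightly stricter than excluding only DOCUMENT_TYPE_NODE; either form avoids dereferencing the missing attribute map on the doctype node.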

      Regards,
      Nicolas


        Attachments

          Activity

          craigm Craig Macdonald added a comment -

          Committed, r3677.

          Thanks Nicolas!

          craigm Craig Macdonald added a comment -

          Perfect, I will take a look.

          Thanks

          Craig

          nfaessel Nicolas Faessel added a comment -

          JUnit tests results.

          nfaessel Nicolas Faessel added a comment - - edited

          To reproduce this problem, the !DOCTYPE name must be the same as xml.doctag (in the previous test, you must use <!DOCTYPE body> instead of <!DOCTYPE html>).
          The following test reproduces the problem:

          	@Test public void testSingleTermSingleDocumentWithDocType() throws Exception
          	{
          		ApplicationSetup.setProperty("xml.doctag", "test-doctype");
          		ApplicationSetup.setProperty("xml.terms", "test-doctype");
          		SimpleXMLCollection c = getCollection("<?xml version=\"1.0\"?><!DOCTYPE test-doctype><test-doctype>test</test-doctype>");
          		assertTrue(c.nextDocument());
          		Document d = c.getDocument();
          		assertNotNull(d);
          		assertFalse(d.endOfDocument());
          		String t = d.getNextTerm();
          		assertEquals("test", t);
          		assertTrue(d.endOfDocument());
          		assertFalse(c.nextDocument());
          		assertTrue(c.endOfCollection());
          	}
          
          craigm Craig Macdonald added a comment -

          Hi Nicolas,

          Thanks for your report. I have tried, unsuccessfully, to reproduce this problem in the JUnit test for SimpleXMLCollection (TestSimpleXMLCollection).

          	@Test public void testSingleTermSingleDocumentWithDocType() throws Exception
          	{
          		ApplicationSetup.setProperty("xml.doctag", "body");
          		ApplicationSetup.setProperty("xml.terms", "body");
          		SimpleXMLCollection c = getCollection("<?xml version=\"1.0\"?><!DOCTYPE html><body>test</body>");
          		assertTrue(c.nextDocument());
          		Document d = c.getDocument();
          		assertNotNull(d);
          		assertFalse(d.endOfDocument());
          		String t = d.getNextTerm();
          		assertEquals("test", t);
          		assertTrue(d.endOfDocument());
          		assertFalse(c.nextDocument());
          		assertTrue(c.endOfCollection());
          	}
          	
          

          Can you revise your patch with a test case that does identify the problem?

          Craig


            People

            • Assignee:
              craigm Craig Macdonald
              Reporter:
              nfaessel Nicolas Faessel
            • Watchers:
              0

              Dates

              • Created:
                Updated:
                Resolved: