Range Index error prevents storing Element containing CDATA that is preceded by Text node, and corrupts db #4825

Open
adamretter opened this issue Mar 23, 2023 · 0 comments
Labels
bug issue confirmed as bug high prio

adamretter commented Mar 23, 2023

It is not possible to store a document into eXist-db if both of the following conditions hold:

  1. The document contains an Element with two children:
    1. a Text node,
    2. followed by a CDATA Section.
  2. There is a Range Index configured on the Element for the Collection in which the document is to be stored.

Attempting to store such a document causes an error like the following:

Caused by: java.lang.NullPointerException
	at java.base/java.lang.System.arraycopy(Native Method)
	at org.exist.util.XMLString.append(XMLString.java:94)
	at org.exist.Indexer.endCDATA(Indexer.java:299)
	at org.exist.collections.triggers.SAXTrigger.endCDATA(SAXTrigger.java:188)
	at org.exist.collections.triggers.DocumentTriggers.endCDATA(DocumentTriggers.java:226)
	at org.apache.xerces.parsers.AbstractSAXParser.endCDATA(Unknown Source)
	at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanCDATASection(Unknown Source)
	at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
	at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
	at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
	at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
	at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
	at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
	at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
	at org.exist.collections.MutableCollection.lambda$8(MutableCollection.java:1142)
	at org.exist.collections.MutableCollection.lambda$12(MutableCollection.java:1229)
	at org.exist.collections.MutableCollection.storeXMLInternal(MutableCollection.java:1376)
	at org.exist.collections.MutableCollection.storeXmlDocument(MutableCollection.java:1229)
	at org.exist.collections.MutableCollection.storeDocument(MutableCollection.java:1148)
	at org.exist.collections.LockedCollection.storeDocument(LockedCollection.java:367)
	at org.exist.storage.NativeBroker.storeDocument(NativeBroker.java:2300)
	at org.exist.xmldb.LocalCollection.lambda$27(LocalCollection.java:642)
	at org.exist.xmldb.function.LocalXmldbCollectionFunction.apply(LocalXmldbCollectionFunction.java:50)
	at org.exist.xmldb.function.LocalXmldbCollectionFunction.apply(LocalXmldbCollectionFunction.java:50)
	at org.exist.xmldb.AbstractLocal.lambda$6(AbstractLocal.java:218)
	at org.exist.xmldb.AbstractLocal.lambda$5(AbstractLocal.java:152)
	at org.exist.xmldb.function.LocalXmldbFunction.apply(LocalXmldbFunction.java:48)
	at org.exist.xmldb.txn.bridge.InTxnLocalCollection.withDb(InTxnLocalCollection.java:58)
	at org.exist.xmldb.txn.bridge.InTxnLocalCollection.withDb(InTxnLocalCollection.java:52)
	at org.exist.xmldb.AbstractLocal.lambda$4(AbstractLocal.java:152)
	at org.exist.xmldb.LocalCollection.lambda$37(LocalCollection.java:808)
	at org.exist.xmldb.LocalCollection.storeXMLResource(LocalCollection.java:607)
	at org.exist.xmldb.LocalCollection.storeResource(LocalCollection.java:550)
	at org.exist.xmldb.LocalCollection.storeResource(LocalCollection.java:539)
	at org.exist.xquery.functions.xmldb.XMLDBLoadFromPattern.evalWithCollection(XMLDBLoadFromPattern.java:202)
...
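
For readers less familiar with the top two frames of the trace: `System.arraycopy` throws a `NullPointerException` whenever its source or destination array is `null`. The following is a minimal, stand-alone Java sketch of that JDK behaviour only. It is not eXist-db code; the class `ArraycopyNull` and the helper `throwsNpe` are hypothetical names invented for illustration, and the sketch makes no claim about which array is `null` inside `XMLString.append`.

```java
public class ArraycopyNull {

    // Hypothetical helper: reports whether System.arraycopy rejects
    // the given arrays with a NullPointerException.
    static boolean throwsNpe(final char[] src, final char[] dest, final int length) {
        try {
            System.arraycopy(src, 0, dest, 0, length);
            return false;
        } catch (final NullPointerException e) {
            return true;
        }
    }

    public static void main(final String[] args) {
        final char[] item = "Item".toCharArray();

        // Both arrays present: the copy succeeds.
        System.out.println("both non-null: " + throwsNpe(item, new char[4], 4));

        // Null source array: the JDK throws NullPointerException.
        System.out.println("null source:   " + throwsNpe(null, new char[4], 4));

        // Null destination array: the JDK throws NullPointerException.
        System.out.println("null dest:     " + throwsNpe(item, null, 4));
    }
}
```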

Because eXist-db lacks ACID transaction semantics, the above error, like any error that occurs whilst storing an XML document in eXist-db, will corrupt the database!


The reproducible Test Case is simple:

declare variable $collection-conf := document {
	<collection xmlns="http://exist-db.org/collection-config/1.0">
	    <index xmlns:int="http://services.parallelgraphics.com/vm/mmr/vm-interactivity-xml/all">
	        <!-- Range index -->
	        <create qname="entry" type="xs:string"/>
	    </index>
	</collection>
};

(: Create Collection, and store the Index config :)
xmldb:create-collection("/db", "test"),
xmldb:create-collection("/db/system/config/db", "test"),
xmldb:store("/db/system/config/db/test", "collection.xconf", $collection-conf),

(: Store the Document :)
xmldb:store-files-from-pattern(
	"/db/test",
	"/tmp",
	"test1.xml",
	(),
	fn:true()
)

The test1.xml document needed by the above query has the content:

<entry>something<![CDATA[Item]]></entry>
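
To see the order of parser callbacks that this document produces, here is a small stand-alone Java sketch that uses only the JDK's built-in SAX parser. It is not eXist-db code, and the class `CdataEvents` and its method `events` are hypothetical names invented for illustration. It records the element, character and CDATA events, merging consecutive `characters` calls because a SAX parser is allowed to deliver one run of text in several chunks.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

public class CdataEvents {

    // Hypothetical helper: parses the XML and returns the sequence of
    // SAX events, with adjacent character chunks merged into one entry.
    static List<String> events(final String xml) throws Exception {
        final List<String> log = new ArrayList<>();
        final StringBuilder pending = new StringBuilder();

        // DefaultHandler2 implements both ContentHandler and LexicalHandler.
        final DefaultHandler2 handler = new DefaultHandler2() {
            private void flush() {
                if (pending.length() > 0) {
                    log.add("characters(" + pending + ")");
                    pending.setLength(0);
                }
            }
            @Override public void startElement(String uri, String localName, String qName, Attributes atts) {
                flush();
                log.add("startElement(" + qName + ")");
            }
            @Override public void endElement(String uri, String localName, String qName) {
                flush();
                log.add("endElement(" + qName + ")");
            }
            @Override public void characters(char[] ch, int start, int length) {
                pending.append(ch, start, length);
            }
            @Override public void startCDATA() {
                flush();
                log.add("startCDATA");
            }
            @Override public void endCDATA() {
                flush();
                log.add("endCDATA");
            }
        };

        final XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        // CDATA boundaries are only reported to a registered LexicalHandler.
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return log;
    }

    public static void main(final String[] args) throws Exception {
        events("<entry>something<![CDATA[Item]]></entry>").forEach(System.out::println);
    }
}
```

The Text node is reported before `startCDATA`, and the CDATA content arrives between `startCDATA` and `endCDATA`, which is the callback where the stack trace above shows the failure in `org.exist.Indexer`.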

I have tested this with eXist-db 7.0.0-SNAPSHOT. I have not yet checked older versions of eXist-db, but having taken a quick look at the Git history, I believe this bug has been hiding for a very long time, and so will likely be present in at least 4.0.0 onwards.

@adamretter adamretter added bug issue confirmed as bug high prio labels Mar 23, 2023
@adamretter adamretter added this to the eXist-7.0.1 milestone Mar 23, 2023
@adamretter adamretter self-assigned this Mar 23, 2023
adamretter added a commit to evolvedbinary/exist that referenced this issue Mar 23, 2023