How to validate an XML file using Java with an XSD having an include?
I'm using the Java 5 javax.xml.validation.Validator to validate an XML file. It works fine for one schema that uses only imports. Now I'm trying to validate against another schema that uses an import and one include. The problem is that elements declared in the main schema are ignored: validation reports that it cannot find their declarations.
Here is how I build the Schema:
InputStream includeInputStream = getClass().getClassLoader().getResource("include.xsd").openStream();
InputStream importInputStream = getClass().getClassLoader().getResource("import.xsd").openStream();
InputStream mainInputStream = getClass().getClassLoader().getResource("main.xsd").openStream();
Source[] sourceSchema = new Source[]{new StreamSource(includeInputStream), new StreamSource(importInputStream),
new StreamSource(mainInputStream)};
Schema schema = factory.newSchema(sourceSchema);
Here is an extract of the declarations in main.xsd:
<xsd:schema xmlns="http://schema.omg.org/spec/BPMN/2.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:import="http://www.foo.com/import" targetNamespace="http://main/namespace" elementFormDefault="qualified" attributeFormDefault="unqualified">
<xsd:import namespace="http://www.foo.com/import" schemaLocation="import.xsd"/>
<xsd:include schemaLocation="include.xsd"/>
<xsd:element name="element" type="tElement"/>
<...>
</xsd:schema>
If I copy the content of the included XSD into main.xsd, it works fine. If I don't, validation doesn't find the declaration of "element".
Solution 1:[1]
The accepted answer is perfectly OK, but it does not work with Java 8 without some modifications. It would also be nice to be able to specify a base path from which the imported schemas are read.
With Java 8 I used the following code, which lets you specify a base path for the embedded schemas other than the classpath root:
import com.sun.org.apache.xerces.internal.dom.DOMInputImpl;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;
import java.io.InputStream;
import java.util.Objects;
public class ResourceResolver implements LSResourceResolver {
private String basePath;
public ResourceResolver(String basePath) {
this.basePath = basePath;
}
@Override
public LSInput resolveResource(String type, String namespaceURI, String publicId, String systemId, String baseURI) {
// note: the XSDs are expected under basePath on the classpath, or at the classpath root when basePath is null
InputStream resourceAsStream = this.getClass().getClassLoader()
.getResourceAsStream(buildPath(systemId));
Objects.requireNonNull(resourceAsStream, String.format("Could not find the specified xsd file: %s", systemId));
return new DOMInputImpl(publicId, systemId, baseURI, resourceAsStream, "UTF-8");
}
private String buildPath(String systemId) {
return basePath == null ? systemId : String.format("%s/%s", basePath, systemId);
}
}
This implementation also gives the user a meaningful message if the schema cannot be read.
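DOMInputImpl lives in a JDK-internal package (com.sun.org.apache.xerces.internal), so it may not be accessible on newer Java versions or with other JAXP implementations. As a minimal sketch, assuming the parser only needs the byte stream and its identifiers, a small hand-written LSInput can take its place. The class name SimpleLSInput is hypothetical and not part of any library:

```java
import java.io.InputStream;
import java.io.Reader;
import org.w3c.dom.ls.LSInput;

/** Hypothetical stand-in for DOMInputImpl: a plain holder for the LSInput properties. */
final class SimpleLSInput implements LSInput {
    private String publicId;
    private String systemId;
    private String baseURI;
    private InputStream byteStream;
    private Reader characterStream;
    private String stringData;
    // Left null by the constructor; the assumption is that the parser then
    // detects the encoding from the document itself.
    private String encoding;
    private boolean certifiedText;

    SimpleLSInput(String publicId, String systemId, String baseURI, InputStream byteStream) {
        this.publicId = publicId;
        this.systemId = systemId;
        this.baseURI = baseURI;
        this.byteStream = byteStream;
    }

    @Override public Reader getCharacterStream() { return characterStream; }
    @Override public void setCharacterStream(Reader characterStream) { this.characterStream = characterStream; }
    @Override public InputStream getByteStream() { return byteStream; }
    @Override public void setByteStream(InputStream byteStream) { this.byteStream = byteStream; }
    @Override public String getStringData() { return stringData; }
    @Override public void setStringData(String stringData) { this.stringData = stringData; }
    @Override public String getSystemId() { return systemId; }
    @Override public void setSystemId(String systemId) { this.systemId = systemId; }
    @Override public String getPublicId() { return publicId; }
    @Override public void setPublicId(String publicId) { this.publicId = publicId; }
    @Override public String getBaseURI() { return baseURI; }
    @Override public void setBaseURI(String baseURI) { this.baseURI = baseURI; }
    @Override public String getEncoding() { return encoding; }
    @Override public void setEncoding(String encoding) { this.encoding = encoding; }
    @Override public boolean getCertifiedText() { return certifiedText; }
    @Override public void setCertifiedText(boolean certifiedText) { this.certifiedText = certifiedText; }
}
```

With this class, the resolver's return statement would read `return new SimpleLSInput(publicId, systemId, baseURI, resourceAsStream);`. Either way, the resolver is registered with `factory.setResourceResolver(new ResourceResolver("xsd"))` before `factory.newSchema(...)` is called, where "xsd" is only an example base path.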
Solution 2:[2]
As user "ulab" points out in a comment on another answer, the approach described in an answer to a separate Stack Overflow question will work for many. Here's a rough outline of that approach:
SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
URL xsdURL = this.getClass().getResource("/xsd/my-schema.xsd");
Schema schema = schemaFactory.newSchema(xsdURL);
The key to this approach is avoiding handing the schema factory a stream and instead giving it a URL. This way it gets information about the location of the XSD file.
One thing to keep in mind: when the "schemaLocation" attribute on include and/or import elements uses a simple relative path such as "my-common.xsd" or "common/some-concept.xsd", it is resolved relative to the classpath location of the XSD file whose URL you handed to the schema factory.
Notes:
- In the example above I've placed the schema file into a jar file under an "xsd" folder.
- The leading slash in the "getResource" argument tells Java to start at the root of the classloader instead of at the "this" object's package name.
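To make the effect of handing over a URL concrete, here is a minimal, self-contained sketch. It assumes two made-up schema files written to a temporary directory (instead of a jar), so that the include in main.xsd can be resolved relative to the URL of main.xsd. The class name, the schema contents and the tElement type are illustrative assumptions, not taken from the question's real schemas:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class UrlSchemaDemo {

    // Hypothetical main schema: declares "element" with a type defined in include.xsd.
    static final String MAIN_XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'"
      + " xmlns='http://main/namespace' targetNamespace='http://main/namespace'"
      + " elementFormDefault='qualified'>"
      + "<xsd:include schemaLocation='include.xsd'/>"
      + "<xsd:element name='element' type='tElement'/>"
      + "</xsd:schema>";

    // Hypothetical included schema: same target namespace, defines tElement as an integer.
    static final String INCLUDE_XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'"
      + " xmlns='http://main/namespace' targetNamespace='http://main/namespace'"
      + " elementFormDefault='qualified'>"
      + "<xsd:simpleType name='tElement'><xsd:restriction base='xsd:int'/></xsd:simpleType>"
      + "</xsd:schema>";

    /** Returns true if the given XML text validates against main.xsd loaded by URL. */
    static boolean isValid(String xml) {
        try {
            Path dir = Files.createTempDirectory("xsd-demo");
            Files.write(dir.resolve("include.xsd"), INCLUDE_XSD.getBytes(StandardCharsets.UTF_8));
            Path main = Files.write(dir.resolve("main.xsd"), MAIN_XSD.getBytes(StandardCharsets.UTF_8));

            // Only the URL of the main schema is passed in; the include is
            // resolved relative to that URL by the schema factory.
            URL mainUrl = main.toUri().toURL();
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(mainUrl);
            try {
                schema.newValidator().validate(new StreamSource(new StringReader(xml)));
                return true;
            } catch (SAXException invalid) {
                // Without a custom ErrorHandler, validation errors surface as exceptions.
                return false;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (SAXException e) {
            throw new IllegalStateException("Schema could not be compiled", e);
        }
    }
}
```

The same idea should carry over to a classpath URL obtained from getResource, since the factory only needs a location from which relative schemaLocation values can be resolved.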
Solution 3:[3]
I had to make some modifications to the answer by AMegmondoEmber.
My main schema file had some includes from sibling folders, and the included files also had some includes from their local folders. I also had to track down the base resource path and relative path of the current resource. This code works for me now, but please keep in mind that it assumes all XSD files have a unique name. If you have XSD files with the same name but different content at different paths, it will probably give you problems.
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;
/**
* The Class ResourceResolver.
*/
public class ResourceResolver implements LSResourceResolver {
/** The logger. */
private final Logger logger = LoggerFactory.getLogger(this.getClass());
/** The schema base path. */
private final String schemaBasePath;
/** The path map. */
private Map<String, String> pathMap = new HashMap<String, String>();
/**
* Instantiates a new resource resolver.
*
* @param schemaBasePath the schema base path
*/
public ResourceResolver(String schemaBasePath) {
this.schemaBasePath = schemaBasePath;
logger.warn("This LSResourceResolver implementation assumes that all XSD files have a unique name. "
+ "If you have some XSD files with same name but different content (at different paths) in your schema structure, "
+ "this resolver will fail to include the other XSD files except the first one found.");
}
/* (non-Javadoc)
* @see org.w3c.dom.ls.LSResourceResolver#resolveResource(java.lang.String, java.lang.String, java.lang.String, java.lang.String, java.lang.String)
*/
@Override
public LSInput resolveResource(String type, String namespaceURI,
String publicId, String systemId, String baseURI) {
// The base resource that includes this current resource
String baseResourceName = null;
String baseResourcePath = null;
// Extract the current resource name
String currentResourceName = systemId.substring(systemId
.lastIndexOf("/") + 1);
// If this resource hasn't been added yet
if (!pathMap.containsKey(currentResourceName)) {
if (baseURI != null) {
baseResourceName = baseURI
.substring(baseURI.lastIndexOf("/") + 1);
}
// we don't need "./" since getResourceAsStream cannot understand it
if (systemId.startsWith("./")) {
systemId = systemId.substring(2, systemId.length());
}
// If the baseResourcePath has already been discovered, get that
// from pathMap
if (pathMap.containsKey(baseResourceName)) {
baseResourcePath = pathMap.get(baseResourceName);
} else {
// The baseResourcePath should be the schemaBasePath
baseResourcePath = schemaBasePath;
}
// Read the resource as input stream
String normalizedPath = getNormalizedPath(baseResourcePath, systemId);
InputStream resourceAsStream = this.getClass().getClassLoader()
.getResourceAsStream(normalizedPath);
// if the current resource is not in the same path with base
// resource, add current resource's path to pathMap
if (systemId.contains("/")) {
pathMap.put(currentResourceName, normalizedPath.substring(0,normalizedPath.lastIndexOf("/")+1));
} else {
// The current resource should be at the same path as the base
// resource
pathMap.put(systemId, baseResourcePath);
}
// Read the whole resource as UTF-8 text ("\\A" makes the Scanner return the entire stream as one token)
Scanner s = new Scanner(resourceAsStream, "UTF-8").useDelimiter("\\A");
String s1 = s.next().replaceAll("\\n", " ") // the parser failed on elements broken over multiple lines, e.g. (<xs:element \n name="buxing">)
.replace("\t", " ") // this and the next replacement are cosmetic whitespace cleanup
.replaceAll("\\s+", " ").replaceAll("[^\\x20-\\x7e]", ""); // some files start with a special character marking them as UTF-8; note this drops every non-ASCII character
s.close();
InputStream is = new ByteArrayInputStream(s1.getBytes(java.nio.charset.StandardCharsets.UTF_8));
return new LSInputImpl(publicId, systemId, is); // same as Input class
}
// If this resource has already been added, do not add the same resource again. It throws
// "org.xml.sax.SAXParseException: sch-props-correct.2: A schema cannot contain two global components with the same name; this schema contains two occurrences of ..."
// return null instead.
return null;
}
/**
* Gets the normalized path.
*
* @param basePath the base path
* @param relativePath the relative path
* @return the normalized path
*/
private String getNormalizedPath(String basePath, String relativePath){
if(!relativePath.startsWith("../")){
return basePath + relativePath;
}
else{
while(relativePath.startsWith("../")){
basePath = basePath.substring(0,basePath.substring(0, basePath.length()-1).lastIndexOf("/")+1);
relativePath = relativePath.substring(3);
}
return basePath+relativePath;
}
}
}
Solution 4:[4]
The accepted answer is very verbose and builds a DOM in memory first. For me, includes seem to work out of the box, including relative references:
SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = schemaFactory.newSchema(new File("../foo.xsd"));
Validator validator = schema.newValidator();
validator.validate(new StreamSource(new File("./foo.xml")));
Solution 5:[5]
For us, resolveResource ended up looking like the code below. We got there after a prolog exception and some odd errors such as: Element type "xs:schema" must be followed by either attribute specifications, ">" or "/>". Element type "xs:element" must be followed by either attribute specifications, ">" or "/>". (These were caused by elements being broken over multiple lines.)
The path history was needed because of the structure of the includes:
main.xsd (this has include "includes/subPart.xsd")
/includes/subPart.xsd (this has include "./subSubPart.xsd")
/includes/subSubPart.xsd
So the code looks like:
String pathHistory = "";
@Override
public LSInput resolveResource(String type, String namespaceURI, String publicId, String systemId, String baseURI) {
if (systemId.startsWith("./")) { systemId = systemId.substring(2); } // strip only a leading "./", which getResourceAsStream cannot understand
InputStream resourceAsStream = Message.class.getClassLoader().getResourceAsStream(systemId);
if (resourceAsStream == null) {
resourceAsStream = Message.class.getClassLoader().getResourceAsStream(pathHistory + systemId);
} else {
pathHistory = getNormalizedPath(systemId);
}
// Read the whole resource as UTF-8 text ("\\A" makes the Scanner return the entire stream as one token)
Scanner s = new Scanner(resourceAsStream, "UTF-8").useDelimiter("\\A");
String s1 = s.next()
.replaceAll("\\n"," ") // the parser failed on elements broken over multiple lines, e.g. (<xs:element \n name="buxing">)
.replace("\t", " ") // this and the next replacement are cosmetic whitespace cleanup
.replaceAll("\\s+", " ")
.replaceAll("[^\\x20-\\x7e]", ""); // some files start with a special character marking them as UTF-8; note this drops every non-ASCII character
s.close();
InputStream is = new ByteArrayInputStream(s1.getBytes(java.nio.charset.StandardCharsets.UTF_8));
return new LSInputImpl(publicId, systemId, is);
}
private String getNormalizedPath(String baseURI) {
return baseURI.substring(0, baseURI.lastIndexOf('/') + 1); // classpath resource names always use "/", whatever the platform's file separator is
}
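The "special character as a first character" mentioned in the comments is most likely a UTF-8 byte order mark (bytes EF BB BF). If the byte stream is handed to the parser untouched and without a forced encoding, the parser can usually deal with it on its own. For cases where it has to be removed by hand, the following is a minimal sketch that strips only the mark and leaves all other characters intact. The class and method names are hypothetical:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;

final class BomStripper {

    /** Returns a stream positioned after a leading UTF-8 byte order mark, if one is present. */
    static InputStream skipUtf8Bom(InputStream in) {
        try {
            PushbackInputStream pushback = new PushbackInputStream(in, 3);
            byte[] head = new byte[3];
            int read = 0;
            // read() may deliver fewer bytes than requested, so loop until
            // three bytes have arrived or the stream ends.
            while (read < 3) {
                int n = pushback.read(head, read, 3 - read);
                if (n < 0) {
                    break;
                }
                read += n;
            }
            boolean hasBom = read == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
            if (!hasBom && read > 0) {
                // Not a byte order mark: put the bytes back so nothing is lost.
                pushback.unread(head, 0, read);
            }
            return pushback;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a resolver like the one above, the idea would be to pass `BomStripper.skipUtf8Bom(resourceAsStream)` to the LSInput instead of rebuilding the document text with regular expressions, which also removes legitimate non-ASCII content.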
Solution 6:[6]
This thread was very useful for parsing complex XML schemas spread over multiple files.
I also had to add:
SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
factory.setFeature("http://apache.org/xml/features/honour-all-schemaLocations", true);
to handle multiple files with the same targetNamespace.
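The feature URI above is defined by Apache Xerces, so a different JAXP implementation may reject it. Here is a minimal sketch, assuming you would rather fall back than fail in that case. The class and method names are hypothetical:

```java
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;

final class HonourAllLocations {

    static final String FEATURE = "http://apache.org/xml/features/honour-all-schemaLocations";

    /**
     * Tries to switch the feature on. Returns the resulting feature state,
     * or false if this SchemaFactory does not recognise or support it.
     */
    static boolean tryEnable(SchemaFactory factory) {
        try {
            factory.setFeature(FEATURE, true);
            return factory.getFeature(FEATURE);
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            return false;
        }
    }
}
```

A caller could log a warning when tryEnable returns false, since schemas that share a target namespace may then be only partially loaded.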
Solution 7:[7]
If an element cannot be found in the XML, you will get an xml:lang exception. Element names are case sensitive.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Gordon Daugherty |
| Solution 3 | Community |
| Solution 4 | teknopaul |
| Solution 5 | AMegmondoEmber |
| Solution 6 | M123 |
| Solution 7 | Ramakrishna |
