XsdParser

<div align="justify"> XsdParser is a library that parses an XML Schema Definition file (.xsd) into a list of Java objects. Each XSD tag has a corresponding Java class, and the attributes of a given XSD type are represented as fields of that class. All these classes derive from the same abstract class, <i>XsdAbstractElement</i>. All Java representations of the XSD elements follow the schema definition for XSD. For example, the <i>xsd:annotation</i> tag only allows <i>xsd:appinfo</i> and <i>xsd:documentation</i> as child nodes and can also have an attribute named <i>id</i>. XsdParser therefore has the following class (simplified for example purposes): <br /> <br /> </div>

public class XsdAnnotation extends XsdAbstractElement {

    private String id;
    private List<XsdAppInfo> appInfoList = new ArrayList<>();
    private List<XsdDocumentation> documentations = new ArrayList<>();
    
    // (...)
}

<div align="justify"> The set of rules followed by this library can be consulted at the following URL: <a href="http://www.datypic.com/sc/xsd/s-xmlschema.xsd.html">XSD Schema</a> </div>

Installation

<div align="justify"> To include the library in your Maven project, add this dependency: <br /> <br /> </div>

<dependency>
    <groupId>com.github.xmlet</groupId>
    <artifactId>xsdParser</artifactId>
    <version>1.2.22</version>
</dependency>

Usage example

<div align="justify"> A simple example: <br /> <br /> </div>

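import java.util.stream.Stream;

import org.xmlet.xsdparser.core.XsdParser;
import org.xmlet.xsdparser.core.XsdParserJar;
import org.xmlet.xsdparser.xsdelements.XsdElement;
import org.xmlet.xsdparser.xsdelements.XsdSchema;

public class ParserApp {
    public static void main(String[] args) {
        // Parse an XSD file from the file system.
        String filePath = "Your file path here.";
        XsdParser parserInstance1 = new XsdParser(filePath);

        // Or parse an XSD file that is packaged inside a jar.
        String jarPath = "Your jar path here.";
        String jarXsdPath = "XSD file path, relative to the jar root.";
        XsdParserJar parserInstance2 = new XsdParserJar(jarPath, jarXsdPath);

        // The parsed result is exposed as streams.
        Stream<XsdElement> elementsStream = parserInstance1.getResultXsdElements();
        Stream<XsdSchema> schemasStream = parserInstance1.getResultXsdSchemas();
    }
}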

<div align="justify"> After parsing the file as shown above, you can start navigating the resulting parsed elements. The class diagram in the image below may be useful before navigating the result. Several abstract classes implement shared features and reduce duplicated code. <br /> <br /> <img src="https://raw.githubusercontent.com/xmlet/XsdParser/master/src/main/java/org/xmlet/xsdparser/xsdelements/xsdelements.png"/> </div>

Navigation

<div align="justify"> A simple example is presented below. After parsing the XSD snippet, the parsed elements can be accessed with the corresponding Java code. <br /> <br /> </div>

<?xml version='1.0' encoding='utf-8' ?>
<xsd:schema xmlns='http://schemas.microsoft.com/intellisense/html-5' xmlns:xsd='http://www.w3.org/2001/XMLSchema'>
	
  <xsd:group name="flowContent">
    <xsd:all>
      <xsd:element name="elem1"/>
    </xsd:all>
  </xsd:group>
	
  <xsd:element name="html">
    <xsd:complexType>
      <xsd:choice>
        <xsd:group ref="flowContent"/>
      </xsd:choice>
      <xsd:attribute name="manifest" type="xsd:anyURI" />
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

<div align="justify"> The result can be consulted as follows: <br /> <br /> </div>

public class ParserApp {
    public static void main(String[] args) {
        //(...)

        // First element of the result: the html element.
        XsdElement htmlElement = elementsStream.findFirst().get();

        XsdComplexType htmlComplexType = htmlElement.getXsdComplexType();
        XsdAttribute manifestAttribute = htmlComplexType.getXsdAttributes().findFirst().get();

        // The complex type has an xsd:choice child, which references the flowContent group.
        XsdChoice choiceElement = htmlComplexType.getChildAsChoice();

        XsdGroup flowContentGroup = choiceElement.getChildrenGroups().findFirst().get();

        // From the group, continue down to its xsd:all child and the elem1 element.
        XsdAll flowContentAll = flowContentGroup.getChildAsAll();

        XsdElement elem1 = flowContentAll.getChildrenElements().findFirst().get();
    }
}

Parsing Strategy

<div align="justify"> To minimize the number of passes over the file, which are time-consuming, this library parses all the elements first and then resolves the references present. To parse the XSD file we use the DOM library, which converts all the XSD elements into <i>Node</i> objects, from which we extract the XSD information into the respective XSD classes. <br /> <br /> The parsing process follows a tree approach: invoking the <i>XsdSchema parse</i> function parses the whole document, because each <i>XsdAbstractElement</i> class extracts its own information and then invokes the respective parse function for each child element present in its current <i>Node</i> object. For example, an <i>XsdSchema</i> instance extracts information from the received xsd:schema <i>Node</i> object and then parses each of that node's children. </div>
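
<div align="justify"> The tree approach described above can be illustrated with a minimal sketch that uses only the JDK's DOM API. The class <i>DomTreeWalkSketch</i> and its methods are hypothetical names chosen for illustration and are not part of XsdParser; the sketch only records element names, whereas the library builds its own XSD objects at each step. <br /> <br /> </div>

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of XsdParser: walks a DOM tree depth-first,
// similar in spirit to each parsed element delegating to its children.
class DomTreeWalkSketch {

    // Parses XML text into a DOM Document (namespace-aware, so local names are available).
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse the XSD text", e);
        }
    }

    // Records the current element, then recurses into each child node.
    static void visit(Node node, int depth, List<String> out) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // skip text nodes, comments, etc.
        }
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  ");
        }
        out.add(line.append(node.getLocalName()).toString());

        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            visit(children.item(i), depth + 1, out);
        }
    }

    // Returns an indented outline of every element in the document.
    static List<String> outline(String xml) {
        List<String> out = new ArrayList<>();
        visit(parse(xml).getDocumentElement(), 0, out);
        return out;
    }

    public static void main(String[] args) {
        String xsd = "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
                + "<xsd:element name='html'><xsd:complexType/></xsd:element>"
                + "</xsd:schema>";
        outline(xsd).forEach(System.out::println);
    }
}
```

<div align="justify"> A single call on the root node reaches every element in one pass, which is why references can only be resolved once the whole tree has been visited. </div>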

Type Validations

<div align="justify"> This library was born with one objective in mind: it should strictly follow the XSD language rules. To guarantee that, we used the Visitor pattern, which adds a layer of control over the interactions between different XSD types. The following code snippet shows how this works: <br /> <br /> </div>

class XsdComplexContentVisitor extends XsdAnnotatedElementsVisitor {

  private final XsdComplexContent owner;

  // (...) constructor that assigns owner

  @Override
  public void visit(XsdRestriction element) {
    owner.setRestriction(ReferenceBase.createFromXsd(element));
  }

  @Override
  public void visit(XsdExtension element) {
    owner.setExtension(ReferenceBase.createFromXsd(element));
  }
}

<div align="justify"> In this example we can see that the <i>XsdComplexContentVisitor</i> class only implements two methods, <i>visit(XsdRestriction element)</i> and <i>visit(XsdExtension element)</i>. This means that <i>XsdComplexContentVisitor</i> only allows these two types, i.e. <i>XsdRestriction</i> and <i>XsdExtension</i>, to interact with <i>XsdComplexContent</i>, since they are the only types allowed as <i>XsdComplexContent</i> child elements. </div> <div align="justify"> The XSD syntax also specifies other restrictions, namely on the possible values or types of attributes. For example, the <i>finalDefault</i> attribute of the xsd:schema element has its value restricted to six distinct values: <br /> <br /> <ul> <li> DEFAULT ("") </li> <li> EXTENSION ("extension") </li> <li> RESTRICTION ("restriction") </li> <li> LIST ("list") </li> <li> UNION ("union") </li> <li> ALL ("#all") </li> </ul> To enforce this type of restriction we use Java <i>Enum</i> classes, which let us verify whether the received value is a possible value for the respective attribute. <br /> There are other validations, such as verifying whether a given attribute is a positiveInteger, a nonNegativeInteger, etc. If any of these validations fails, an exception is thrown with a message detailing the failed validation. </div>
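
<div align="justify"> As a rough illustration of the attribute validations described above, the sketch below restricts a value to the six <i>finalDefault</i> options with an enum and checks a positiveInteger. The names <i>FinalDefaultSketch</i>, <i>AttributeValidationSketch</i>, <i>fromValue</i> and <i>requirePositiveInteger</i>, as well as the use of <i>IllegalArgumentException</i>, are assumptions made for this example and may differ from the library's actual code. <br /> <br /> </div>

```java
import java.util.Arrays;

// Hypothetical enum for illustration; not the library's actual class.
enum FinalDefaultSketch {
    DEFAULT(""),
    EXTENSION("extension"),
    RESTRICTION("restriction"),
    LIST("list"),
    UNION("union"),
    ALL("#all");

    private final String value;

    FinalDefaultSketch(String value) {
        this.value = value;
    }

    String getValue() {
        return value;
    }

    // Maps the raw attribute text to a constant, or fails with a descriptive message.
    static FinalDefaultSketch fromValue(String raw) {
        return Arrays.stream(values())
                .filter(candidate -> candidate.value.equals(raw))
                .findFirst()
                .orElseThrow(() -> new IllegalArgumentException(
                        "finalDefault does not accept the value \"" + raw + "\""));
    }
}

// Hypothetical helper for numeric attribute checks.
class AttributeValidationSketch {

    // Simplification: XSD positiveInteger is unbounded, this sketch only covers the int range.
    static int requirePositiveInteger(String attributeName, String raw) {
        try {
            int parsed = Integer.parseInt(raw);
            if (parsed > 0) {
                return parsed;
            }
        } catch (NumberFormatException ignored) {
            // Not a number at all: handled by the exception thrown below.
        }
        throw new IllegalArgumentException(
                attributeName + " must be a positiveInteger, but was \"" + raw + "\"");
    }

    public static void main(String[] args) {
        System.out.println(FinalDefaultSketch.fromValue("#all"));
        System.out.println(requirePositiveInteger("exampleAttribute", "3"));
    }
}
```

<div align="justify"> Centralizing the lookup in the enum keeps the list of legal values in one place, so an unexpected value fails at parse time with a message naming the attribute. </div>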

Rules Validation

<div align="justify"> Apart from the type validations, the XSD syntax specifies some other rules. These rules are associated with a given XSD type and are therefore verified when an instance of the respective object is parsed. A simple example of such a rule is the following: <br /> <br /> "An xsd:element cannot have a ref attribute if its parent is an xsd:schema element." <br /> <br /> This means that after creating the <i>XsdElement</i> instance and populating its fields we invoke a method to verify this rule. If the rule is violated, an exception is thrown with a message detailing the issue. </div>
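
<div align="justify"> The rule quoted above can be sketched directly against DOM nodes. This is only an illustration under stated assumptions: <i>ElementRefRuleSketch</i>, <i>checkRefRule</i> and <i>checkAll</i> are hypothetical names, the check runs on raw DOM elements rather than on <i>XsdElement</i> instances, and <i>IllegalStateException</i> stands in for whatever exception type the library actually throws. <br /> <br /> </div>

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical rule check, not the library's implementation.
class ElementRefRuleSketch {

    static final String XSD_NS = "http://www.w3.org/2001/XMLSchema";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse the XSD text", e);
        }
    }

    // Rule: an xsd:element cannot have a ref attribute if its parent is xsd:schema.
    static void checkRefRule(Element element) {
        Node parent = element.getParentNode();
        boolean parentIsSchema = parent != null
                && XSD_NS.equals(parent.getNamespaceURI())
                && "schema".equals(parent.getLocalName());
        if (parentIsSchema && element.hasAttribute("ref")) {
            throw new IllegalStateException(
                    "xsd:element cannot have a ref attribute when its parent is xsd:schema");
        }
    }

    // Applies the rule to every xsd:element in the document.
    static void checkAll(String xml) {
        NodeList elements = parse(xml).getElementsByTagNameNS(XSD_NS, "element");
        for (int i = 0; i < elements.getLength(); i++) {
            checkRefRule((Element) elements.item(i));
        }
    }

    public static void main(String[] args) {
        String valid = "<xsd:schema xmlns:xsd='" + XSD_NS + "'>"
                + "<xsd:element name='a'/>"
                + "<xsd:group name='g'><xsd:all><xsd:element ref='a'/></xsd:all></xsd:group>"
                + "</xsd:schema>";
        checkAll(valid); // a nested ref is allowed, so no exception is thrown
        System.out.println("valid schema passed the rule check");
    }
}
```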

Reference solving

<div align="justify"> This is a major feature of this library. In XSD files the <i>ref</i> attribute is used frequently to avoid repeating XML code. This creates two situations that the parser must handle, which are detailed below: either the referred element is missing, or the element is present and an exchange should be performed. To help in this process we created a new layer with four classes: <br /> <br /> <b>UnsolvedReference</b> - Wrapper class for each element that has a <i>ref</i> attribute. <br /> <b>ConcreteElement</b> - Wrapper class for each element that is present in the file. <br /> <b>NamedConcreteElement</b> - Wrapper class for each element that is present in the file and has a <i>name</i> attribute. <br /> <b>ReferenceBase</b> - A common base type shared by <i>UnsolvedReference</i> and <i>ConcreteElement</i>. <br /> <br /> These classes simplify the reference solving process by classifying the element that they wrap. The following short example explains how this works: <br /> <br /> </div>

<?xml version='1.0' encoding='utf-8' ?>
<xsd:schema xmlns='http://schemas.microsoft.com/intellisense/html-5' xmlns:xsd='http://www.w3.org/2001/XMLSchema'>
	
    <xsd:group id="replacement" name="flowContent">         <!-- NamedConcreteElement wrapping a XsdGroup -->
        (...)
    </xsd:group>
	
    <xsd:choice>                                            <!-- ConcreteElement wrapping a XsdChoice -->
        <xsd:group id="toBeReplaced" ref="flowContent"/>    <
