Friday, 7 August 2015

Generating xsds for XML interfaces

XSD is XML Schema Definition, one of those acronyms that contains another acronym, which I like. In this post, I share my notes on generating an xsd file from a given XML file. But first a disclaimer: it’s been quite a while since I played with XML, so my methods may be a bit outdated; if you know of better tools and automation approaches, please let me know.

XML has been a predominant data-exchange format in recent times and probably continues to be one. Typically, the XML data format is finalized by clients and business analysts, and handed over to programmers.

Programmers then generate the Java classes corresponding to these XML files. These JAXB classes are used by the communicating software systems to marshal and unmarshal XML data. To generate them, programmers use tools like xjc, but in order to do that, they need the schema, i.e., the definition, which is the .xsd file.

The XML - XSD steps
The starting point is a tool. A simple one available online is XML Grid [1]. Paste your XML into the text box there and click Save. Download the xsd file and open it in Eclipse. If there are errors, you will immediately see them marked in red. Fix the errors. Wash, rinse, repeat.

Wash, rinse, repeat is needed because, even though automatic tools do a good job of generating the xsd file, they go wrong when the data has a combination of attributes and children.
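If you would rather not depend on Eclipse for the red marks, the JDK's own javax.xml.validation package can compile an xsd and report the same kind of errors. The following is only a rough sketch: the class name XsdErrorLister and both schemas are made up for illustration, and the broken schema is my guess at a typical generator mistake (an attribute declared inside the sequence), not output from any particular tool.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class XsdErrorLister {

    // Hypothetical broken schema: the attribute sits inside the sequence.
    // Single quotes around attribute values are legal XML and avoid escaping in Java strings.
    static final String BROKEN_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>\n"
      + " <xs:element name='item'>\n"
      + "  <xs:complexType>\n"
      + "   <xs:sequence>\n"
      + "    <xs:element name='label' type='xs:string'/>\n"
      + "    <xs:attribute name='id' type='xs:string'/>\n"
      + "   </xs:sequence>\n"
      + "  </xs:complexType>\n"
      + " </xs:element>\n"
      + "</xs:schema>";

    // Fixed by hand: the attribute is declared after the sequence, not inside it.
    static final String FIXED_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>\n"
      + " <xs:element name='item'>\n"
      + "  <xs:complexType>\n"
      + "   <xs:sequence>\n"
      + "    <xs:element name='label' type='xs:string'/>\n"
      + "   </xs:sequence>\n"
      + "   <xs:attribute name='id' type='xs:string'/>\n"
      + "  </xs:complexType>\n"
      + " </xs:element>\n"
      + "</xs:schema>";

    // Compiles the schema and returns every error the parser reports, with line numbers.
    static List<String> schemaErrors(String xsd) {
        final List<String> errors = new ArrayList<>();
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        factory.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { /* warnings are ignored here */ }
            @Override public void error(SAXParseException e) { errors.add(describe(e)); }
            @Override public void fatalError(SAXParseException e) { errors.add(describe(e)); }
        });
        try {
            factory.newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            // Keep the message if the handler did not already record something.
            if (errors.isEmpty()) {
                errors.add(String.valueOf(e.getMessage()));
            }
        }
        return errors;
    }

    static String describe(SAXParseException e) {
        return "line " + e.getLineNumber() + ": " + e.getMessage();
    }

    public static void main(String[] args) {
        for (String error : schemaErrors(BROKEN_XSD)) {
            System.out.println(error);
        }
        System.out.println("errors in fixed schema: " + schemaErrors(FIXED_XSD).size());
    }
}
```

Collecting the errors in a list instead of stopping at the first one makes each fix-and-retry round a little shorter.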

Here are some basics to fix errors. I will illustrate with examples.
Let’s say our data is a collection called persons that holds any number of person elements. Each person has one full_name and zero or more child_name elements. [2]

The structure of the sample XML data is:
persons - person - full_name(1) / child_name(0..many)

The corresponding xsd mirrors that nesting. So persons - person - full_name/child_name in the XML

becomes in the xsd:
element name="persons" - complexType - sequence - element name="person" maxOccurs="unbounded" - complexType - sequence - element name="full_name" / element name="child_name" minOccurs="0" maxOccurs="unbounded"
That is, element - complexType - sequence - element - complexType - sequence - element / element.
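To make the chain concrete, here is a small sketch that writes out a persons schema following this pattern and validates sample data against it with the JDK's javax.xml.validation API. The class name PersonsXsdDemo and the sample names are invented for illustration; the xsd is my reconstruction of the pattern, loosely modelled on the w3schools example in [2], so treat it as one possible shape rather than the exact file a generator would produce.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

class PersonsXsdDemo {

    // element - complexType - sequence - element - complexType - sequence - element / element.
    // Single quotes around attribute values are legal XML and avoid escaping in Java strings.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>\n"
      + " <xs:element name='persons'>\n"
      + "  <xs:complexType>\n"
      + "   <xs:sequence>\n"
      + "    <xs:element name='person' maxOccurs='unbounded'>\n"
      + "     <xs:complexType>\n"
      + "      <xs:sequence>\n"
      + "       <xs:element name='full_name' type='xs:string'/>\n"
      + "       <xs:element name='child_name' type='xs:string'"
      + " minOccurs='0' maxOccurs='unbounded'/>\n"
      + "      </xs:sequence>\n"
      + "     </xs:complexType>\n"
      + "    </xs:element>\n"
      + "   </xs:sequence>\n"
      + "  </xs:complexType>\n"
      + " </xs:element>\n"
      + "</xs:schema>";

    // Illustrative data: one person with two children, one with none.
    static final String XML =
        "<persons>\n"
      + " <person>\n"
      + "  <full_name>Asha Rao</full_name>\n"
      + "  <child_name>Ravi</child_name>\n"
      + "  <child_name>Meera</child_name>\n"
      + " </person>\n"
      + " <person>\n"
      + "  <full_name>Tom Field</full_name>\n"
      + " </person>\n"
      + "</persons>";

    // Invalid on purpose: full_name is mandatory but missing.
    static final String BAD_XML =
        "<persons><person><child_name>Ravi</child_name></person></persons>";

    // Returns true when the xsd compiles and the xml conforms to it.
    static boolean isValid(String xsd, String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("sample data valid: " + isValid(XSD, XML));
        System.out.println("data without full_name valid: " + isValid(XSD, BAD_XML));
    }
}
```

Note that minOccurs='0' alone would allow at most one child_name, because maxOccurs defaults to 1; the maxOccurs='unbounded' is what gives "0..many".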

Let’s take a different example. Suppose there is one element with one attribute and one text value [3].
In the xsd, the element, attribute, and value from the XML become
element - complexType - simpleContent - extension - attribute.
In other words, a nested complexType with simpleContent that extends xs:string.
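As a sketch of this second pattern, the snippet below declares a single element with a text value and one attribute, and checks two documents against it. The element name phone, its type attribute, the sample number, and the class name PhoneXsdDemo are all my own choices for illustration; they are not taken from the example in [3].

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

class PhoneXsdDemo {

    // element - complexType - simpleContent - extension - attribute.
    // Single quotes around attribute values are legal XML and avoid escaping in Java strings.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>\n"
      + " <xs:element name='phone'>\n"
      + "  <xs:complexType>\n"
      + "   <xs:simpleContent>\n"
      + "    <xs:extension base='xs:string'>\n"
      + "     <xs:attribute name='type' type='xs:string'/>\n"
      + "    </xs:extension>\n"
      + "   </xs:simpleContent>\n"
      + "  </xs:complexType>\n"
      + " </xs:element>\n"
      + "</xs:schema>";

    // One element, one attribute, one text value.
    static final String XML = "<phone type='mobile'>555-0100</phone>";

    // Invalid on purpose: simpleContent allows text only, not child elements.
    static final String BAD_XML =
        "<phone type='mobile'><number>555-0100</number></phone>";

    // Returns true when the xsd compiles and the xml conforms to it.
    static boolean isValid(String xsd, String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("text plus attribute valid: " + isValid(XSD, XML));
        System.out.println("child element inside phone valid: " + isValid(XSD, BAD_XML));
    }
}
```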

Let’s complicate it and take a real-world example. We will combine the above two cases and have a data set that has both attributes and multiple children.
The data format is: a customer has contact details. The contact has a name ("first"), a name ("last"), an email, a phone with the attributes type, time, and preferredcontact, an address with street ("1"), street ("2"), city, regioncode, postalcode, and country, and a timeframe with a description.

The xsd turns out to be a combination of both cases.
So, if you remember the two sequences given above:
element - complexType - sequence - element - complexType - sequence - element / element
and
element - complexType - simpleContent - extension - attribute
you can check that your xsd combines these two patterns properly, and easily get rid of your errors.
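Here is a trimmed-down sketch of how the two patterns might combine for the customer data. Several things are assumptions on my part: the attribute names part (on name) and line (on street), all the sample values, and the class name CustomerXsdDemo. To keep it short, regioncode, postalcode, and timeframe are left out; they would follow the same patterns.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

class CustomerXsdDemo {

    // Reusable fragment for the second pattern: text value plus the given attribute declarations.
    static String textWithAttributes(String attributes) {
        return "<xs:complexType><xs:simpleContent><xs:extension base='xs:string'>"
             + attributes
             + "</xs:extension></xs:simpleContent></xs:complexType>";
    }

    static String attribute(String name) {
        return "<xs:attribute name='" + name + "' type='xs:string'/>";
    }

    // Pattern 1 (complexType - sequence) for customer, contact, and address;
    // pattern 2 (simpleContent - extension - attribute) for name, phone, and street.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>\n"
      + "<xs:element name='customer'><xs:complexType><xs:sequence>\n"
      + " <xs:element name='contact'><xs:complexType><xs:sequence>\n"
      + "  <xs:element name='name' maxOccurs='unbounded'>\n"
      + textWithAttributes(attribute("part"))
      + "  </xs:element>\n"
      + "  <xs:element name='email' type='xs:string'/>\n"
      + "  <xs:element name='phone'>\n"
      + textWithAttributes(attribute("type") + attribute("time") + attribute("preferredcontact"))
      + "  </xs:element>\n"
      + "  <xs:element name='address'><xs:complexType><xs:sequence>\n"
      + "   <xs:element name='street' maxOccurs='unbounded'>\n"
      + textWithAttributes(attribute("line"))
      + "   </xs:element>\n"
      + "   <xs:element name='city' type='xs:string'/>\n"
      + "   <xs:element name='country' type='xs:string'/>\n"
      + "  </xs:sequence></xs:complexType></xs:element>\n"
      + " </xs:sequence></xs:complexType></xs:element>\n"
      + "</xs:sequence></xs:complexType></xs:element>\n"
      + "</xs:schema>";

    // Illustrative data only.
    static final String XML =
        "<customer><contact>\n"
      + " <name part='first'>Asha</name>\n"
      + " <name part='last'>Rao</name>\n"
      + " <email>asha@example.com</email>\n"
      + " <phone type='voice' time='evening' preferredcontact='1'>555-0100</phone>\n"
      + " <address>\n"
      + "  <street line='1'>12 Hill Road</street>\n"
      + "  <street line='2'>Flat 3</street>\n"
      + "  <city>Pune</city>\n"
      + "  <country>IN</country>\n"
      + " </address>\n"
      + "</contact></customer>";

    // Invalid on purpose: address carries an undeclared attribute and no children.
    static final String BAD_XML =
        "<customer><contact><name part='first'>Asha</name><email>asha@example.com</email>"
      + "<phone>555-0100</phone><address city='Pune'/></contact></customer>";

    // Returns true when the xsd compiles and the xml conforms to it.
    static boolean isValid(String xsd, String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("customer data valid: " + isValid(XSD, XML));
        System.out.println("address with attribute only valid: " + isValid(XSD, BAD_XML));
    }
}
```

Building the simpleContent fragment through a small helper is just a way to keep the sketch short; in a real .xsd file each element would spell out its own complexType, or refer to a shared named type.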

Final point: you might see an exception that an element has both a type attribute and an anonymous type child. Just make sure that your schema is written as per the following syntax rule:

“On any element declaration use either the type attribute or a type definition child (simpleType or complexType), not both.” [4]
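A quick way to see this rule in action is to compile a schema that breaks it and one that follows it. The sketch below uses made-up schemas and a made-up class name, TypeConflictDemo; the conflicting declaration is the kind of thing that can creep in while hand-fixing generated output.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class TypeConflictDemo {

    static final String HEADER = "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>\n";

    // Anonymous type definition shared by both schemas below.
    static final String ANONYMOUS_TYPE =
        "  <xs:complexType>\n"
      + "   <xs:simpleContent>\n"
      + "    <xs:extension base='xs:string'>\n"
      + "     <xs:attribute name='type' type='xs:string'/>\n"
      + "    </xs:extension>\n"
      + "   </xs:simpleContent>\n"
      + "  </xs:complexType>\n";

    // Breaks the rule: a type attribute AND an anonymous complexType child on the same element.
    static final String CONFLICTING_XSD =
        HEADER
      + " <xs:element name='phone' type='xs:string'>\n"
      + ANONYMOUS_TYPE
      + " </xs:element>\n"
      + "</xs:schema>";

    // Follows the rule: the type attribute is dropped, only the anonymous type remains.
    static final String FIXED_XSD =
        HEADER
      + " <xs:element name='phone'>\n"
      + ANONYMOUS_TYPE
      + " </xs:element>\n"
      + "</xs:schema>";

    // Returns true when the schema itself compiles without errors.
    static boolean compiles(String xsd) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            factory.newSchema(new StreamSource(new StringReader(xsd)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("conflicting schema compiles: " + compiles(CONFLICTING_XSD));
        System.out.println("fixed schema compiles: " + compiles(FIXED_XSD));
    }
}
```

The other way to fix it is to keep the type attribute and delete the anonymous complexType child, which only works if a plain xs:string (no attributes) is really what you want.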

References:
1. http://xmlgrid.net/xml2xsd.html
2. http://www.w3schools.com/schema/schema_complex_indicators.asp
3. http://www.w3schools.com/schema/schema_complex_text.asp
4. http://stackoverflow.com/questions/12908197/element-programmetype-has-both-a-type-attribute-and-a-anonymous-type-child
