Monday, June 25, 2007

Validating XML in .Net

You may need to validate an XML file imported from elsewhere before performing operations on it. There are several ways to do so; probably the most convenient are XmlValidatingReader and XmlReader.Create() with XmlReaderSettings. Both let you supply an XML file, add a schema and validate the file against that schema. Both also expose an event (ValidationEventHandler) which is fired whenever a validation error is found in the XML document. Please note that XmlValidatingReader has been marked obsolete in .NET Framework 2.0, which recommends using XmlReader.Create() with suitable XmlReaderSettings for validating the document. We will look at both cases, i.e. using XmlValidatingReader as well as XmlReader.Create().

Let's assume that the XML file we wish to validate holds details about orders. Each Order contains the Order ID, details about the Customer, the Discount percent given and details about the Products. The following graphic shows the XML file that we are using:



Orders.xml


To validate the XML shown above, we need a schema (Orders.xsd). The schema for the XML can be found here.

Validating XML using XmlValidatingReader:

To validate the XML file against a schema, we can use XmlValidatingReader. The steps involved are:

  1. Create an instance of XmlValidatingReader using the XML file to validate.
  2. Create an XmlSchema object from the schema file to validate against, and add it to the reader's Schemas collection.
  3. Hook up a handler for XmlValidatingReader.ValidationEventHandler.
  4. Read the XML to the end using XmlValidatingReader.

Please note that without the handler mentioned in step 3, an exception is thrown on the first error encountered in the XML file, whereas with a ValidationEventHandler hooked up, all the errors in the XML file can be reported. The following code shows the steps mentioned above:


// Requires: using System; using System.Xml; using System.Xml.Schema;
using (XmlValidatingReader xmlValidatingReader = new XmlValidatingReader(new XmlTextReader("Orders.xml")))
{
    // Create the schema object to validate XML files
    XmlSchema xmlSchema = XmlSchema.Read(new XmlTextReader("Orders.xsd"), new ValidationEventHandler(Schema_ValidationError));

    // Add to the collection of schemas for XmlValidatingReader
    xmlValidatingReader.Schemas.Add(xmlSchema);

    // Attach a handler which will be fired on each validation error
    xmlValidatingReader.ValidationEventHandler += new ValidationEventHandler(xmlValidatingReader_ValidationEventHandler);

    // Read the XML to the end
    while (xmlValidatingReader.Read()) { }

    Console.WriteLine("\nFinished validating XML file....");
}




Validating XML using XmlReaderSettings:

Validating XML using XmlReaderSettings follows almost the same path as XmlValidatingReader, the difference being that the schema details are passed in through XmlReaderSettings. The steps for using XmlReaderSettings to validate an XML file are:

  1. Create an XmlSchema object using the schema file.
  2. Create an XmlReaderSettings object.
  3. Set the ValidationType of XmlReaderSettings to ValidationType.Schema.
  4. Add the XmlSchema to the Schemas collection of XmlReaderSettings.
  5. Attach a handler for XmlReaderSettings.ValidationEventHandler.
  6. Create an instance of XmlReader using XmlReader.Create(), passing in the XML file to validate and the settings.
  7. Read the XML file to the end.

In this case too, the absence of a ValidationEventHandler causes an exception on the first error encountered while reading the XML file. The following code shows the steps mentioned above:

// Requires: using System; using System.Xml; using System.Xml.Schema;
// Create the schema object
XmlSchema xmlSchema = XmlSchema.Read(new XmlTextReader("Orders.xsd"), new ValidationEventHandler(Schema_ValidationError));

// Create reader settings
XmlReaderSettings xmlReaderSettings = new XmlReaderSettings();

// Set validation type to schema
xmlReaderSettings.ValidationType = ValidationType.Schema;

// Add to the collection of schemas in readerSettings
xmlReaderSettings.Schemas.Add(xmlSchema);

// Attach an event handler which will be fired when a validation error occurs
xmlReaderSettings.ValidationEventHandler += new ValidationEventHandler(xmlReaderSettings_ValidationEventHandler);

// Create the XmlReader directly from the file path, using XmlReaderSettings
using (XmlReader xmlReader = XmlReader.Create("Orders.xml", xmlReaderSettings))
{
    // Read XML to the end
    while (xmlReader.Read()) { }

    Console.WriteLine("\nFinished validating XML file....");
}



The code behaves the same in both cases. However, there is a small difference in the error message that is reported. XmlValidatingReader only names the element/attribute whose value is incorrect for its data type, whereas XmlReaderSettings gives a more detailed error showing the incorrect value along with the expected data type.
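
The snippets above refer to handler methods (Schema_ValidationError and xmlReaderSettings_ValidationEventHandler) whose bodies are not shown. The following is a minimal, self-contained sketch of the XmlReaderSettings approach with both handlers written as lambdas that collect messages into a list. The class name ValidationSketch, the method Validate and the small inline schema are hypothetical and used only for illustration; they are not the Orders.xsd from this post.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class ValidationSketch
{
    // Hypothetical miniature schema, NOT the Orders.xsd used in this post.
    public const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='Orders'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='Order' maxOccurs='unbounded'>
          <xs:complexType>
            <xs:sequence>
              <xs:element name='Discount' type='xs:decimal'/>
            </xs:sequence>
            <xs:attribute name='id' type='xs:int' use='required'/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // Validates the XML text against the schema text and returns all messages.
    public static List<string> Validate(string xml, string xsd)
    {
        List<string> messages = new List<string>();

        // First handler: reports problems found while reading the schema itself.
        XmlSchema schema;
        using (StringReader xsdText = new StringReader(xsd))
        {
            schema = XmlSchema.Read(xsdText, (sender, e) => messages.Add("Schema: " + e.Message));
        }

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas.Add(schema);

        // Second handler: reports problems found while validating the document.
        settings.ValidationEventHandler += (sender, e) => messages.Add(e.Severity + ": " + e.Message);

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }
        }
        return messages;
    }
}
```

Because a handler is attached, the reader keeps going after the first problem, so a document with several mistakes should yield several entries in the returned list rather than a single exception.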

I hope I was able to provide most of you with a good insight on validating XML files using schemas. If you have been using some other method, please do share it here.


16 comments:

Anonymous said...

Great example! Do you know how we can do the validation for a bunch of XMLs against a set of schemas; instead of iterating over the schemas and adding them for validation? Hopefully I am clear in my question.
-Thanks
Chak Bireddy.

Rajdeep Kwatra said...

To validate multiple XML files against a set of schemas, you need to add all the schemas to the Schemas collection of XmlReaderSettings. However, the XML files need to be validated individually. You can add the XmlReaders to a list (one reader per file) and use the same XmlReaderSettings object in XmlReader.Create(). Loop over the list and read every XmlReader to the end. That's it! Hope it was of help to you :)
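
A rough sketch of that idea follows. It loads every schema into one XmlSchemaSet and then reads each file to the end. The names BatchValidationSketch and ValidateAll are hypothetical, and as a variation on the reply above the sketch creates one XmlReaderSettings per file (all sharing the same schema set) so that errors can be grouped by file.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.Schema;

public static class BatchValidationSketch
{
    // Returns the validation messages for each XML file, keyed by file path.
    public static Dictionary<string, List<string>> ValidateAll(
        IEnumerable<string> schemaPaths, IEnumerable<string> xmlPaths)
    {
        // Load all schemas once; null means "use the targetNamespace declared in the schema".
        XmlSchemaSet schemas = new XmlSchemaSet();
        foreach (string schemaPath in schemaPaths)
        {
            schemas.Add(null, schemaPath);
        }

        Dictionary<string, List<string>> results = new Dictionary<string, List<string>>();
        foreach (string xmlPath in xmlPaths)
        {
            List<string> messages = new List<string>();

            XmlReaderSettings settings = new XmlReaderSettings();
            settings.ValidationType = ValidationType.Schema;
            settings.Schemas = schemas; // the same schema set is shared by every file
            settings.ValidationEventHandler += (sender, e) => messages.Add(e.Message);

            using (XmlReader reader = XmlReader.Create(xmlPath, settings))
            {
                while (reader.Read()) { }
            }
            results[xmlPath] = messages;
        }
        return results;
    }
}
```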

Anonymous said...

If instead of [Orders](brackets used instead of curlies) someone gives an xml file that changes it to [Orders xmlns="abc"] the validation seems to fail with an error about abc:Orders element not declared.

Is there additional code that would allow validation of the XML for either situation when you don't know ahead of time which you are validating?

Rajdeep Kwatra said...

I am not very sure that I understood your question correctly. Can you please elaborate/provide an example?

Anonymous said...

Sorry for the confusion of my last post I didn't mean curlies.

I noticed if the Orders.xml file is changed and the first element is changed to < Orders xmlns="abc"> instead of just < Orders> the validation will fail. (Added spaces before the word Orders to allow this post).

Is there any simple way you know of to add a few more lines of code that would be able to validate the XML file no matter if the default namespace exists in the XML or not?

I may resort to string replacement and remove the default namespace but I was hoping there is a way to avoid that.

Rajdeep Kwatra said...

The XML file will be validated if you add the targetNamespace to your schema file (in your case, "abc"). If you wish to do without it, you might have to resort to string manipulation as you said.
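
As an illustration of the targetNamespace approach, here is a small sketch with a hypothetical inline schema that declares targetNamespace "abc" (with elementFormDefault set to "qualified" so that child elements live in the same namespace). The names NamespaceSketch and Validate are made up for this example, and the schema is not the Orders.xsd from the post.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class NamespaceSketch
{
    // Hypothetical schema whose elements belong to the namespace "abc".
    public const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
    targetNamespace='abc' elementFormDefault='qualified'>
  <xs:element name='Orders'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='Order' type='xs:string' maxOccurs='unbounded'/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // Validates the XML text against the namespaced schema and returns the errors.
    public static List<string> Validate(string xml)
    {
        List<string> errors = new List<string>();

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        // Register the schema under the namespace it declares.
        settings.Schemas.Add("abc", XmlReader.Create(new StringReader(Xsd)));
        settings.ValidationEventHandler += (sender, e) => errors.Add(e.Message);

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }
        }
        return errors;
    }
}
```

With this schema, a document that starts with an Orders element carrying xmlns="abc" is checked against the declarations above. As far as I know, with XmlReaderSettings a document whose root namespace matches no loaded schema raises only warnings, and those are reported only when XmlSchemaValidationFlags.ReportValidationWarnings is added to ValidationFlags, so it is worth testing both variants of your file.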

Hemendra said...

Please help out in following code snippet, please reply me at hvyas@irevna.com:

For the first time it gives prompt while XmlSchema.Read(). Second time schema_collection_object is not null and so does not enter into this block of code.

Please help............



private static XmlSchemaCollection schema_collection_object;
private static XmlSchema m_ifSchema;
-------
-------
-------

if (schema_collection_object == null)
{
    schema_collection_object = new XmlSchemaCollection();

    // path contains the path of the .xsd file
    XmlTextReader xmlFile = new XmlTextReader(path);

    m_ifSchema = XmlSchema.Read(xmlFile, new ValidationEventHandler(ValidationCallBack));

    //******* error is generated in the above statement

    m_ifSchema.Compile(new ValidationEventHandler(ValidationCallBack));

    schema_collection_object.Add(m_ifSchema);
}

Anonymous said...

suppose I added new tag or attribute to xml file then will it work ?
it should show the error because this tag or attribute is not there in schema

Rajdeep Kwatra said...

Yes! Adding a new element/node will result in an error. Let's say I add a new node named "Date" in Orders.xml; I'll get an error saying "The element 'Orders' has invalid child element 'Date'. List of possible elements expected: 'Order'."

Anonymous said...

ok but do we need to have two event handlers?
like...
Schema_ValidationError
and
xmlReaderSettings_ValidationEventHandler
???
thanks

Anonymous said...

"The element 'Orders' has invalid child element 'Date'. List of possible elements expected: 'Order'."

I can see this error in my XML where I added the new node, but if I put this validation code in my button click event then it doesn't show any errors.

Rajdeep Kwatra said...

@anonymous,
You need to subscribe to xmlReaderSettings.ValidationEventHandler in addition to the handler passed to XmlSchema.Read(). I think it should work.

Dure Sameen said...

Great example.

Unknown said...

Is there a way to validate files that have been sent with incorrect namespace declarations?

Anonymous said...

What if you don't have the XML schema? I just want to validate the XML document alone, i.e. by creating the XML schema on the fly. Is it possible to create the XML schema on the fly?

Rajdeep Kwatra said...

@Anonymous,
You can generate a schema from the XML file, but that won't be a correct way to validate your XML. Since you'll be using the XML file itself to generate the schema, the XML will always validate against it, i.e. validation will never fail. Are you sure this is what you want to do?
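
For reference, one way to generate a schema from a document in .NET 2.0 and later is the XmlSchemaInference class. The sketch below (InferenceSketch, Infer and Validate are hypothetical names) infers a schema set from a sample document and then validates documents against it. It also illustrates the caveat above: the document the schema was inferred from passes by construction, so inference is mainly useful when you infer from a known-good sample and then validate other files.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class InferenceSketch
{
    // Infers a schema set from a sample XML document.
    public static XmlSchemaSet Infer(string sampleXml)
    {
        XmlSchemaInference inference = new XmlSchemaInference();
        using (XmlReader reader = XmlReader.Create(new StringReader(sampleXml)))
        {
            return inference.InferSchema(reader);
        }
    }

    // Validates an XML document against a schema set and returns the errors.
    public static List<string> Validate(string xml, XmlSchemaSet schemas)
    {
        List<string> errors = new List<string>();

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas = schemas;
        settings.ValidationEventHandler += (sender, e) => errors.Add(e.Message);

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }
        }
        return errors;
    }
}
```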