That's it in a nutshell. I'd just like to know where I ought to post this:

We're going to be using XML schemas, either from back-office systems or set up specifically as test data. These will go through HP Exstream to produce PDF documentation. We think testing the XML from the back-office system is 75% of the problem and would like to automate it. We also need to work out how to regression-test the PDF output. In the past this has been checked manually, but we would like to automate that too.
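
To give an idea of the kind of automated XML check we have in mind, here is a minimal Python sketch. The element names (`policy`, `customer/name`, `premium`, `startDate`) and the `check_xml` helper are made up for illustration and are not from our real data. It uses only the standard library, and it only checks that a document is well-formed and that some required leaf elements are present and non-empty; it does not validate against an XSD.

```python
import xml.etree.ElementTree as ET


def check_xml(xml_text, required_paths):
    """Return a list of problems found in one XML document (empty list = pass)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Stop early: nothing else can be checked if the document does not parse.
        return [f"not well-formed: {exc}"]

    problems = []
    for path in required_paths:
        # Paths are relative to the root element and assumed to point at leaf elements.
        node = root.find(path)
        if node is None:
            problems.append(f"missing element: {path}")
        elif not (node.text or "").strip():
            problems.append(f"empty element: {path}")
    return problems


# Hypothetical sample document and required fields, for illustration only.
SAMPLE = "<policy><customer><name>A. Smith</name></customer><premium></premium></policy>"
REQUIRED = ["customer/name", "premium", "startDate"]

if __name__ == "__main__":
    for problem in check_xml(SAMPLE, REQUIRED):
        print(problem)
```

Something along these lines could presumably be run over every XML file before it goes to HP Exstream, with any non-empty result failing the test run.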

Thank you for your help; this is a first-time post.