Complex automatic file compare (tool) question
Within a project that I'm working on, an old system is being replaced by a completely new system. All the data from the old system is being migrated to the new system. Part of the output of both the old and the new system consists of large quantities of XML files. One of the business requirements is that ALL files are to be compared. A 100% match is necessary; any deviations need to be explainable.
But the new system has changed, and the content of the XML files will look somewhat different:
- part of the information has a different location within the file
- part of the information has been deleted or added
- part of the information has changed (for example, first name has become first name + last name)
Because of the large number of files and the limited time we have for comparing them, a comparison by hand is simply impossible.
Are there any automatic file compare tools that can handle such a situation? The cheaper, the better... Other suggestions for handling this are also welcome.
Re: Complex automatic file compare (tool) question
Well, if the business requirement says that deviations need to be explainable, what exactly is a 100% match when the files to be compared are already known to be different? Is it just the data that needs to be the same? And if it's already known that data is going to be deleted, added and otherwise changed, how are you going to match the files or data, since there won't be any straight correlation?
If you are doing straight compares, there are diff tools out there you can use, but if it's the data that you need to verify, then you're probably looking at scripting something in Perl or Python to handle the text and support some regular expressions. I've used Perl successfully in the past to handle text manipulations and compares, but it's all a matter of what you are familiar with. With the scenario you describe, I am not sure any tool, out of the box, is going to do what you want.
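To give an idea of what such a script could look like, here is a minimal Python sketch, not a finished tool. It assumes you can write down, per field, where the value lives in the old file, where it lives in the new file, and how the new value is derived from the old one. The element names (first_name, last_name, name, address/city, contact/city) and the RULES structure are invented for illustration; your real XML layout will differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping rules (all element names are assumptions):
# "old" lists the paths to read from the old file, "new" is the path in the
# new file, and the optional "combine" derives the expected new value.
RULES = [
    # information that moved to a different location within the file
    {"name": "city", "old": ["address/city"], "new": "contact/city"},
    # information that changed: first name + last name merged into one element
    {"name": "full_name", "old": ["first_name", "last_name"], "new": "name",
     "combine": lambda parts: " ".join(parts)},
]


def text_at(root, path):
    """Return the stripped text of the first element at path, or None if absent."""
    node = root.find(path)
    if node is None:
        return None
    return (node.text or "").strip()


def compare(old_xml, new_xml, rules=RULES):
    """Return a list of (field name, explanation) for every rule that fails."""
    old_root = ET.fromstring(old_xml)
    new_root = ET.fromstring(new_xml)
    deviations = []
    for rule in rules:
        parts = [text_at(old_root, p) for p in rule["old"]]
        if None in parts:
            deviations.append((rule["name"], "missing in old file"))
            continue
        # Default: the value is expected to be carried over unchanged.
        combine = rule.get("combine", lambda p: p[0])
        expected = combine(parts)
        actual = text_at(new_root, rule["new"])
        if actual != expected:
            deviations.append(
                (rule["name"], f"expected {expected!r}, got {actual!r}"))
    return deviations
```

Running compare() over every old/new file pair and logging the non-empty results gives you a list of deviations to explain. Information that is known to be deleted or added would need its own list of expected differences, which this sketch leaves out.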