It is really no different from any other testing: you're comparing actual results against expected results. The trick is knowing how and where to look in order to evaluate success or failure.
You might want to discuss with your developers how they have implemented the web services call.
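To make the expected-versus-actual idea concrete, here is a minimal sketch, assuming the service returns XML and that you know which elements should hold which values. The `check_response` helper, the element names, and the sample response are all hypothetical, made up for illustration; your service's real structure is something to get from your developers.

```python
import xml.etree.ElementTree as ET


def check_response(actual_xml, expected):
    """Compare a response body against expected element values.

    `expected` maps an ElementTree path (relative to the root element)
    to the text that element should contain. Returns a list of
    (path, expected_text, actual_text) tuples, empty if everything matches.
    """
    root = ET.fromstring(actual_xml)
    failures = []
    for path, want in expected.items():
        node = root.find(path)
        # A missing element is reported with None as the actual value.
        got = node.text if node is not None else None
        if got != want:
            failures.append((path, want, got))
    return failures


# Hypothetical response body from the service under test.
response = "<order><id>42</id><status>SHIPPED</status></order>"

print(check_response(response, {"id": "42", "status": "SHIPPED"}))  # prints []
```

An empty list means the response matched; anything else tells you which element differed and what came back instead.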
It is a test harness for load-testing web services. It doesn't have a lot of features, but for just slinging XML at your service from concurrent users, it works for me. I'm not sure it meets your requirements, but it may be worth a look.
* multi-threaded load generator
* HTTP and HTTPS (SSL) support
* response verification with regular expressions
* execution/monitoring console
* real-time stats
* results reports with graphs
* GUI mode
* shell/console mode
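
If you want a feel for what the first and third items in that list amount to, a rough sketch follows. It is not how the tool above is implemented, just an illustration using the Python standard library, with threads standing in for concurrent users and a regular expression standing in for response verification. The names `post_xml` and `run_load`, and the URL in the usage comment, are my own assumptions.

```python
import re
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def post_xml(url, payload):
    """POST an XML payload and return the response body as text."""
    req = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def run_load(send, users, requests_per_user, pattern):
    """Call `send` repeatedly from `users` threads and verify each response.

    `send` is any zero-argument callable returning the response text.
    A response passes if `pattern` matches somewhere in it; an exception
    raised by `send` counts as a failure.
    """
    regex = re.compile(pattern)

    def one_user(user_index):
        passed = failed = 0
        for _ in range(requests_per_user):
            try:
                ok = regex.search(send()) is not None
            except Exception:
                ok = False
            if ok:
                passed += 1
            else:
                failed += 1
        return passed, failed

    # One worker thread per simulated user.
    with ThreadPoolExecutor(max_workers=users) as pool:
        results = list(pool.map(one_user, range(users)))
    return {
        "passed": sum(p for p, _ in results),
        "failed": sum(f for _, f in results),
    }


# Example usage against a hypothetical endpoint (not executed here):
# stats = run_load(
#     lambda: post_xml("http://localhost:8080/service", "<ping/>"),
#     users=10, requests_per_user=100, pattern=r"<status>OK</status>",
# )
```

Passing `send` in as a callable keeps the load loop separate from the transport, so you can try the counting logic with a stub before pointing it at a real service.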