I start with the business requirements documentation. It usually contains examples of valid and invalid input. If not, ask your users what a typical business flow looks like and mimic it. Business users can also suggest valid and invalid data inputs that, after the appropriate data massaging, ultimately kick out on a report.
If you have access to the code, you can also tell from its branch and condition statements what counts as valid and invalid data.
As you said, some element of gut instinct and experience is involved too.
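Reading valid and invalid data off branch conditions, as suggested above, might look like the sketch below. The function `validate_transfer`, its limits, and its return strings are all invented for illustration; the point is only that each condition gives you one value on the boundary and one just past it.

```python
# Hypothetical function under test: the name, the 10,000 limit, and the
# return strings are assumptions made up for this example.
def validate_transfer(amount, balance):
    if amount <= 0:
        return "rejected: non-positive amount"
    if amount > balance:
        return "rejected: insufficient funds"
    if amount > 10_000:
        return "needs approval"
    return "accepted"


# Test data read off the conditions above: each branch gets a value on its
# boundary and one just past it. Columns: amount, balance, expected result.
CASES = [
    (0, 500, "rejected: non-positive amount"),   # amount <= 0, on the boundary
    (1, 500, "accepted"),                        # just past that boundary
    (500, 500, "accepted"),                      # amount > balance is still false
    (501, 500, "rejected: insufficient funds"),  # amount > balance becomes true
    (10_000, 20_000, "accepted"),                # amount > 10_000 is still false
    (10_001, 20_000, "needs approval"),          # amount > 10_000 becomes true
]


def run_cases():
    """Return (amount, balance, passed) for every row in CASES."""
    return [
        (amount, balance, validate_transfer(amount, balance) == expected)
        for amount, balance, expected in CASES
    ]
```

Calling `run_cases()` gives one pass/fail flag per row, so a branch that nobody wrote data for shows up as a missing row rather than a hidden gap.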
What about first looking at which test cases must be covered? Once you know which cases to cover, you can prepare the initial data set you need and the expected results.
I do agree that business requirements contain examples, but aren't these just examples of the cases you need to test? What other cases exist for which no examples are given?
Summary: for me, you only have enough information to prepare your initial data set once your test cases and test procedures are prepared. In fact, I see test data preparation as the last step in preparing the test procedures.
Steven is absolutely correct: you don't know what test data you need until you have written the test cases. It is worth giving each test case a Test Data section that details what test data is required to run it. These sections can then be gathered together to prepare the test environment.
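A minimal sketch of that idea, assuming the test cases are kept as simple records: each one carries its own Test Data section, and a helper merges those sections into the data the environment must be loaded with. The names `ManualTestCase` and `gather_test_data`, and the table-to-rows layout, are assumptions for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class ManualTestCase:
    """A test case with its own Test Data section (hypothetical layout)."""
    name: str
    steps: list
    # Test Data section: table name -> list of rows the case needs to exist
    test_data: dict = field(default_factory=dict)


def gather_test_data(cases):
    """Merge every case's Test Data section into one table -> rows mapping."""
    environment = {}
    for case in cases:
        for table, rows in case.test_data.items():
            bucket = environment.setdefault(table, [])
            for row in rows:
                if row not in bucket:  # rows shared between cases load once
                    bucket.append(row)
    return environment
```

For example, two cases that both need the same customer row, where only one needs an order:

```python
cases = [
    ManualTestCase("view profile", ["log in", "open profile"],
                   {"customers": [{"id": 1, "name": "Ann"}]}),
    ManualTestCase("cancel order", ["log in", "cancel order 7"],
                   {"customers": [{"id": 1, "name": "Ann"}],
                    "orders": [{"id": 7, "customer_id": 1}]}),
]
environment = gather_test_data(cases)
```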
To wrap things up: before creating a test bed for your software, you first need to analyze the basic requirements of the application. You need to research both the valid and the invalid values that can be given as input to your AUT (application under test), and to do this you need domain knowledge. In some projects you cannot use data that the AUT will accept but that actually comes out of nowhere, such as feeding values relevant to a banking project into a healthcare project. Hence you need substantial domain knowledge to prepare test data for your project.
1. I decide what to test in a particular case and look at what kind of data comes up when a test case is derived from requirements, specification, or gut instinct.
2. I try to find a general model that would work here, asking myself: why are 1 and 5 in the list while 3 is not tested? I try to understand what kind of bugs “1” tests for, and then use the model to fill the holes in the test cases.
3. I try to find what kinds of testing I missed. For example, in my cases I concentrated on data that tests basic functionality and forgot about configuration or localization. I fill in the holes again.
4. I write a test (data) generator.
5. (This is actually done during steps two through four.) I review the test engine (the engine that accepts the input data), and usually my tests end up divided into a few independent tests.
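The steps above can be sketched as a small generator. This is only one possible reading: it assumes the "model" is an inclusive numeric range, so that 1 and 5 are its boundaries and 3 is the untested interior value, and it assumes localization is the missed dimension. The function names and the locale strings are hypothetical.

```python
from itertools import product


def boundary_values(low, high):
    """Values just outside, on, and inside the inclusive range [low, high]."""
    middle = (low + high) // 2
    return sorted({low - 1, low, middle, high, high + 1})


def generate_cases(low, high, locales):
    """Yield (value, locale, expected_valid) for every value/locale pair.

    Crossing the values with the locales fills the hole left by test cases
    that only covered basic functionality.
    """
    for value, locale in product(boundary_values(low, high), locales):
        yield value, locale, low <= value <= high
```

With the range from step 2, `boundary_values(1, 5)` returns `[0, 1, 3, 5, 6]`, so the missing 3 and both invalid neighbours are covered, and `generate_cases(1, 5, ["en_US", "de_DE"])` yields ten cases.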