Is there a reason you are testing this at the web page level? If the page is database driven, why not run a query against the database itself? That would be a much easier way to find duplicates.
If you do go that route, I would suggest taking a database backup first. You can also seed the database with a couple of duplicates to confirm that your test actually catches them.
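As a rough sketch of what I mean, here is one way to seed a few duplicates and then query for them. I am assuming SQLite and a made-up `items` table with a `name` column purely for illustration; your schema and database engine will differ, and `find_duplicates` is a hypothetical helper, not something from your codebase.

```python
import sqlite3


def find_duplicates(conn, table, column):
    """Return (value, count) pairs for values that appear more than once."""
    # The table and column names are interpolated into the SQL string,
    # so only pass trusted identifiers here, never user input.
    query = (
        f"SELECT {column}, COUNT(*) FROM {table} "
        f"GROUP BY {column} HAVING COUNT(*) > 1 "
        f"ORDER BY {column}"
    )
    return conn.execute(query).fetchall()


# Use a throwaway in-memory database so the real data is never touched.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")

# Seed known duplicates: "apple" twice and "banana" three times.
seed_rows = [("apple",), ("banana",), ("apple",), ("cherry",), ("banana",), ("banana",)]
conn.executemany("INSERT INTO items (name) VALUES (?)", seed_rows)

duplicates = find_duplicates(conn, "items", "name")
print(duplicates)
```

If the seeded values do not come back from the query, the check itself is broken, which is exactly what the seeding is meant to reveal.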
Does this only need to check for exact duplicates, or do you also need fuzzy matching?
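If fuzzy matching turns out to be needed, something along these lines might be a starting point. It uses `difflib.SequenceMatcher` from the Python standard library; the `0.85` threshold, the sample names, and the `normalize` and `find_near_duplicates` helpers are all assumptions of mine that you would need to tune for your data.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text):
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def find_near_duplicates(values, threshold=0.85):
    """Return (a, b, similarity) for each pair at or above the threshold."""
    pairs = []
    # Compare every pair once; this is O(n^2), fine for small result sets.
    for a, b in combinations(values, 2):
        ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
        if ratio >= threshold:
            pairs.append((a, b, round(ratio, 2)))
    return pairs


names = ["Acme Corp", "acme  corp", "Acme Corp.", "Globex"]
near = find_near_duplicates(names)
for a, b, score in near:
    print(f"{a!r} ~ {b!r} ({score})")
```

Pairwise comparison gets slow on large tables, so for anything big you would probably want to narrow the candidates first (for example, by grouping on a normalized prefix) before scoring them.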