You can play around with the Hierarchy settings to get the right blend. You can organize the tree by Tag for a flatter structure, or by DOM, which follows the tree IE uses for rendering. There's also a Hybrid view, which falls somewhere between Tag and DOM.
As for identifying HTML buttons, it should be possible to identify them without using coordinates, provided each button has some unique property. You can hard-code that property in your script, but it's easier to use NameMapping with Aliases.
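
To make the difference between the two approaches concrete, here is a rough, tool-independent sketch in plain Python. It is not the TestComplete API: `Node`, `find_by_props`, `ALIASES`, `by_alias`, and the page contents are all made up for illustration. It only shows the idea of looking a button up by a unique property, either hard-coded at the call site or kept in one alias table.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """A toy stand-in for an element in the page's object tree (hypothetical)."""
    tag: str
    props: Dict[str, str] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)


def find_by_props(root: Node, **wanted: str) -> Optional[Node]:
    """Depth-first search for the first node whose properties match all of `wanted`."""
    if all(root.props.get(k) == v for k, v in wanted.items()):
        return root
    for child in root.children:
        hit = find_by_props(child, **wanted)
        if hit is not None:
            return hit
    return None


# Hypothetical page: two submit buttons that differ only by id and value.
page = Node("BODY", children=[
    Node("FORM", {"id": "login"}, [
        Node("INPUT", {"type": "submit", "id": "btnOk", "value": "OK"}),
        Node("INPUT", {"type": "submit", "id": "btnCancel", "value": "Cancel"}),
    ]),
])

# Option 1: hard-code the identifying property wherever the button is used.
ok_button = find_by_props(page, id="btnOk")

# Option 2: keep the identifying properties in one table and use aliases.
# If the page changes, only this table needs updating.
ALIASES = {
    "okButton": {"id": "btnOk"},
    "cancelButton": {"id": "btnCancel"},
}


def by_alias(root: Node, alias: str) -> Optional[Node]:
    """Resolve an alias to a node using the criteria stored in ALIASES."""
    return find_by_props(root, **ALIASES[alias])


print(by_alias(page, "okButton").props["value"])
```

Neither lookup depends on where the button sits on screen, so a layout change won't break it as long as the unique property stays the same.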