One of the key insights for me was noting the MOSFET source-follower self-bootstrapping of the large gate-source capacitance and how it is affected by the load. The absence of a d.c. drive requirement is misleading to some---the important effects pertain to capacitances and also to the intrinsic high-frequency figures of merit.
But having no d.c. involved is handy, and the absence of saturation effects in heavy conduction compared to bipolars can simplify drive requirements as well.
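To put rough numbers on the bootstrapping point above, here is a minimal sketch using the usual first-order small-signal model of a source follower: gain Av = gm*RL / (1 + gm*RL), and effective input capacitance Cin = Cgd + Cgs*(1 - Av). The function names and the device values (Cgs, Cgd, gm) are assumptions chosen for illustration, not taken from any datasheet or from the thread.

```python
def follower_gain(gm, r_load):
    """Small-signal voltage gain of an idealized source follower."""
    x = gm * r_load
    return x / (1.0 + x)


def input_capacitance(cgs, cgd, gm, r_load):
    """Effective gate input capacitance: Cgd is seen in full,
    Cgs is bootstrapped down by the factor (1 - Av)."""
    return cgd + cgs * (1.0 - follower_gain(gm, r_load))


# Illustrative values only (assumed, not from a real part).
CGS = 1000e-12  # gate-source capacitance, farads
CGD = 100e-12   # gate-drain capacitance, farads
GM = 2.0        # transconductance, siemens

for r_load in (2.0, 8.0, 100.0):
    av = follower_gain(GM, r_load)
    cin = input_capacitance(CGS, CGD, GM, r_load)
    print(f"RL = {r_load:6.1f} ohm  Av = {av:.3f}  Cin = {cin * 1e12:6.1f} pF")
```

The lower the load resistance, the further the gain falls below unity and the more of Cgs the driver actually has to charge, which is the load dependence mentioned above. A real device has strongly voltage-dependent capacitances and gm, so treat this as a trend, not a prediction.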
Against this you have the disadvantage that a batch of parts will rarely be consistent enough to use "out of the box" unless the circuit is very clever and/or has adjustments. The distribution of threshold voltages is just much broader than for bipolars. As well, the mobility of holes in silicon is much lower than that of electrons, so the P parts are much worse than the N for comparable geometries. Thus, the circuits will be hard-pressed to work symmetrically in an output stage: either the gm of the P device will be lower than that of the N, or its capacitances will be higher. The result, for a straightforward complementary output stage design, will be a gain asymmetry and consequent even-order distortion. (EDIT: I originally wrote odd-order here. Brain fart, thanks PRR.)
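A quick numerical check of why a gain asymmetry between the two halves shows up as even-order distortion. This is a deliberately crude sketch: the stage is modeled as open-loop and piecewise-linear, with one gain for positive swings and a slightly lower one for negative swings. The 5% mismatch and the names `asymmetric_stage` and `harmonic_amplitude` are assumptions for illustration only.

```python
import math


def harmonic_amplitude(samples, k):
    """Amplitude of the k-th harmonic of one full period of samples,
    found by direct Fourier projection."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n


def asymmetric_stage(x, gain_pos, gain_neg):
    """Toy output stage: different gain for each signal polarity."""
    return gain_pos * x if x >= 0 else gain_neg * x


N = 4096  # samples per period
sine = [math.sin(2 * math.pi * i / N) for i in range(N)]
out = [asymmetric_stage(x, 1.00, 0.95) for x in sine]  # assumed 5% mismatch

h1 = harmonic_amplitude(out, 1)
h2 = harmonic_amplitude(out, 2)
h3 = harmonic_amplitude(out, 3)
print(f"fundamental: {h1:.4f}")
print(f"2nd harmonic relative to fundamental: {h2 / h1:.2e}")
print(f"3rd harmonic relative to fundamental: {h3 / h1:.2e}")
```

The output can be written as a*x + b*|x|, and a rectified sine contains only d.c. and even harmonics, so the second harmonic is clearly present while the third is zero to within rounding error. A real stage has smooth curvature and feedback around it, so the actual spectrum will be richer than this.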
There are some bipolar power transistors with pretty decent parameters that have been around for a long time, with fT's in the 30-60 MHz range, pretty good safe operating area, and fairly constant beta over a wide range of Ic's. But if you really want to go as fast as possible and have really high loop gain, the DMOS stuff in a good design will beat them, I think. As Samuel says (above), anything that tries to go that fast will be load-sensitive, although there are techniques where the loop compensation ends up being partly controlled by the output loading to make the amp less prone to instability. There's always the old trick of the L-R network in series with the output to isolate capacitive loads, too.
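For the isolation trick, here is a small sketch of how such a network behaves with frequency. It assumes the common arrangement of an inductor in parallel with a damping resistor, with that pair placed in series with the output. The post does not spell out the topology, and the 2 uH and 10 ohm values are assumed purely for illustration.

```python
import math


def isolation_impedance(freq_hz, l_henry, r_ohm):
    """Complex impedance of an inductor in parallel with a resistor."""
    zl = 1j * 2 * math.pi * freq_hz * l_henry
    return zl * r_ohm / (zl + r_ohm)


L_ISO = 2e-6   # henries (assumed value)
R_ISO = 10.0   # ohms (assumed value)

# Frequency at which the inductive reactance equals the resistance.
corner_hz = R_ISO / (2 * math.pi * L_ISO)
print(f"corner frequency: {corner_hz / 1e3:.0f} kHz")

for f in (1e3, 20e3, corner_hz, 10e6):
    z = abs(isolation_impedance(f, L_ISO, R_ISO))
    print(f"f = {f:12.0f} Hz  |Z| = {z:.3f} ohm")
```

Within the audio band the network is a small fraction of an ohm, so it barely affects the load. Well above the corner it approaches the resistor value, which keeps a capacitive load from sitting directly on the output where the loop is running out of phase margin.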