So this weekend I finally got around to building the circuit, and did some preliminary testing. Probably time to move this topic over to the Lab section - I'll start a new thread there soon. For now, I'll just report my results...
First testing was on the diamond buffer output stage, open loop (with 2SC4793 / 2SA1837 instead of BD139 / BD140). To set up the correct DC conditions, I constructed the entire circuit (no servo) and trimmed it approximately for DC offset. I then ran it open loop by grounding the +IN node and injecting signal directly at the Q6 base using a function generator.
Square wave response looked good, with a small well-damped overshoot. However, on the sine sweep I ran into trouble at about 7 MHz. At that point the output suddenly collapsed into a distorted sawtooth, and did not recover until the frequency was backed off all the way down to 4 MHz. One of the limitations of the bootstrap jobbie is that R7 and R10 appear as AC loads on the output, so there's a tradeoff between driver current and output loading. 2 mA seems to be too low (I was optimistically hoping otherwise), as the drivers looked to be going into cutoff while the stage was distorting. The cause of the frequency-related hysteresis isn't entirely clear to me yet, but starved drivers in combo with the bootstrap appear to be the culprit. I'll try and post some waveforms soon. As much as I like the bootstrap aesthetically, biasing the drivers hotter would put more load on the output than I want.
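To put rough numbers on that tradeoff, here's a back-of-envelope sketch in Python. It assumes each bootstrap resistor that loads the output carries the full driver bias current and drops a fixed DC voltage. The 15 V figure is a made-up placeholder, not a value from this circuit, and the function name is just for illustration. It's plain Ohm's law, nothing more.

```python
def bootstrap_load(v_drop, i_driver, n_resistors=2):
    """Return (resistance of each resistor, combined AC load) in ohms.

    v_drop      -- assumed DC voltage across each loading resistor (hypothetical)
    i_driver    -- driver bias current in amps
    n_resistors -- resistors seen in parallel by the output (R7 and R10 here)
    """
    r_each = v_drop / i_driver          # Ohm's law sets the resistor value
    return r_each, r_each / n_resistors # equal resistors in parallel


if __name__ == "__main__":
    V_DROP = 15.0  # hypothetical, NOT taken from the actual schematic
    for i_ma in (2, 4, 6, 10):
        r_each, r_load = bootstrap_load(V_DROP, i_ma * 1e-3)
        print(f"{i_ma:>2} mA -> {r_each:7.0f} ohm each, {r_load:7.0f} ohm on the output")
```

Whatever the real voltage is, the shape is the same: double the driver current and the AC load on the output halves.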
So for now, I'll replace the bootstrap arrangement with CCS-connected JFETs and see how that does. I have 2N5462 and 2SK246BL on hand. The 5462s are only a 40V part, while the K246s are 50V, but the K246 is not as easily available or as inexpensive as the 5462. I was hoping not to have to go this route, because I wanted to avoid using hand-picked parts in the circuit. I guess one can always pad the FET down to the right current with a source resistor, but this still involves part selection. The other alternative would be a couple more BJTs and a bunch of resistors. *sigh*
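For anyone wondering how much selection the source resistor really saves, here's a small Python sketch using the usual square-law JFET model, Id = Idss * (1 - |Vgs|/|Vp|)^2, with the gate tied to the far end of the source resistor so that |Vgs| = Id * Rs. The Idss and Vp numbers below are made-up illustrations of part-to-part spread, not datasheet or measured values for the 2N5462 or 2SK246, and real parts only follow the square law approximately.

```python
import math


def source_resistor(idss, vp, id_target):
    """Source resistor (ohms) that pads a JFET CCS down to id_target.

    idss      -- zero-bias drain current in amps (magnitude)
    vp        -- pinch-off voltage in volts (sign ignored)
    id_target -- wanted current in amps, must not exceed idss
    """
    if not 0 < id_target <= idss:
        raise ValueError("target current must be > 0 and <= Idss")
    # Square law solved for |Vgs| at the target current
    vgs = abs(vp) * (1.0 - math.sqrt(id_target / idss))
    return vgs / id_target  # Rs develops that |Vgs| at id_target


if __name__ == "__main__":
    # Hypothetical spread of parts, all padded to the same 4 mA
    for idss, vp in ((6e-3, 2.5), (8e-3, 3.0), (12e-3, 4.0)):
        rs = source_resistor(idss, vp, 4e-3)
        print(f"Idss={idss * 1e3:4.1f} mA, Vp={vp:.1f} V -> Rs = {rs:6.1f} ohm")
```

The resistor value moves quite a bit with Idss and Vp, which is the point: one fixed resistor won't land every part on the same current, so you're still measuring each FET or trimming each resistor.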
One last note - I think I'm going to servo at the CCS transistor emitter, as this provides the lowest possible noise (lower resistance at both the base and emitter). Further, if I move the servo to Q3 instead of Q9, the noise-dominating input stage is removed from the servo loop, and the two-capacitor opamp network can change to a single-capacitor version.