The "phase reversal" at clipping and other aberrant clipping behavior comes from two places.
The "devil's horns" in positive overload are due to the Q4-Q5 stage and D6 causing the drop across the emitter resistor of Q5 to get much larger than when the loop is closed. This is because both the normal ~9mA and the additional current from Q1 are now flowing through it. The output emitter followers faithfully reproduce the collector voltage of Q5, which more or less follows the voltage at the R5-Q5 junction.
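To put rough numbers on that, here is a quick sketch of how the R5 drop scales with the extra current. It assumes the 1.7 V drop mentioned later in this post is what R5 develops at the normal ~9mA, which implies an R5 of about 190 ohms; that value is inferred, not read off the schematic. The extra currents from Q1 are made-up illustration values, and `r5_drop` is just a hypothetical helper name.

```python
# Illustrative only: R5 is inferred from figures quoted in the post,
# and the overload currents from Q1 are hypothetical.
NORMAL_CURRENT_A = 9e-3   # "normal ~9mA" through Q5's emitter resistor
NORMAL_DROP_V = 1.7       # assumed drop across R5 at the normal current
R5_OHMS = NORMAL_DROP_V / NORMAL_CURRENT_A  # inferred, roughly 189 ohms


def r5_drop(extra_current_a, r5_ohms=R5_OHMS):
    """Voltage across R5 when Q1 adds extra_current_a to the normal current."""
    return (NORMAL_CURRENT_A + extra_current_a) * r5_ohms


if __name__ == "__main__":
    for extra_ma in (0, 2, 5, 10):  # hypothetical extra current from Q1, in mA
        drop = r5_drop(extra_ma * 1e-3)
        print(f"extra {extra_ma:>2} mA -> R5 drop {drop:.2f} V")
```

Since the drop is simply proportional to total current, doubling the current through R5 doubles the drop, and the output followers pass that shift straight along.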
On the other overload swing the thing clips not too badly, but notice the transient instability coming out of clipping. I think this is due to the overload state saturating Q6, which then pulls down on the D4-D5 bias and this nearly shuts down Q2.
Note that, if the output stage could swing a bit higher, on the positive swing the reverse protection diode D1 could get forward-biased and for a highish source Z induce latchup. But this can't happen unless Q5's emitter were sitting a lot lower, as it requires both D1 and the collector-base diode of Q1 to be forward-biased.
What are the fixes? You could contrive a different clamp mechanism for the Q4-Q5 stage (for example, a third transistor with its emitter to +18V and its base tapped down a split R5, though this would then actuate only after a delay with C2 there). You could make R5 something else that biases the emitter as it is now but is stiffer, for example an LED or a couple of diodes, or a Q Vbe multiplier. For the negative swing problem I suspect it would help to separate the I source bias sources, or at least put an R in series with each base and run a bit more current through the diodes. Or, as suggested, use the two-Q I source for each.
You could hard-clamp the bases of Q8 and Q9, but this will still lead to slow recovery due to C2-R5.
Or, you could just avoid clipping :green:
EDIT: Actually I don't think you need as much as 1.7 volts drop to the emitter of Q5 anyway. Q3's collector will pull to within about 0.8V of the rail, which should be plenty enough to operate the Q4-Q5 stage.
EDIT 2: Yes, tie Q5 right to the rail and put 1k or so in series with the base of Q6. Things clean up nicely and the clipping is almost symmetrical.