> use the FET in series with the signal allowing it to flow through the FET or shorting the audio to ground via the FET?
The voltage across the FET for low distortion is very small, say 50 milliVolts.
The level into a limiter is ideally unlimited, and in practice may peak 20 or 30 dB above the limiting level, typically many volts. If you attenuate the input so a hot source can't overdrive the FET, the nominal levels are very low, near noise level.
> put up with the high levels of distortion caused by the difference between the ac potential of the drain (or is it source.... I can never remember) and the gate.
The gate voltage must be compared to the middle of the channel, not just to one end. If your gate voltage is much, much larger than your signal voltage, that may not matter. If your gate voltage were, say, -1V and the two ends of the channel were at 0V and +2V, the effective gate voltage would be -2V; when the signal reversed to 0V and -2V it would be zero V. If the gate/gain control voltage should be -1V but varies from -2V to 0V in the course of the cycle, that wave will be very bent. Smooshed on one side and tall on the other. 2nd-order distortion. You can push-pull and get complicated; it actually (for once) works just as well to throw half the signal voltage onto the gate (does about the same thing). That works to kill the gross 2nd-order distortion; the residual high-order products and the inevitable fall-to-pieces at high level still remain. (My impression is that they are not huge, up to that point; but hugeness is in the ear of the beholder....)
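A rough numerical sketch of the point above. It assumes the textbook square-law (triode-region) FET model, which the post itself does not invoke; the function names and the pinch-off and k values are made up for illustration. It reproduces the -2V / 0V example and checks what happens to the channel resistance on each half-cycle when half the drain signal is added to the gate.

```python
def effective_gate(v_gate, v_drain, v_source=0.0):
    """Gate voltage measured against the middle of the channel."""
    return v_gate - (v_drain + v_source) / 2.0

# The example from the post: gate at -1 V, channel ends at 0 V and +/-2 V.
print(effective_gate(-1.0, 2.0))   # -2.0
print(effective_gate(-1.0, -2.0))  # 0.0

def triode_current(v_gate, v_ds, v_pinch=-3.0, k=1e-3):
    """Assumed square-law model, source at 0 V:
    I = k * ((Vgs - Vp) * Vds - Vds**2 / 2).
    v_pinch and k are illustrative values, not from any datasheet."""
    return k * ((v_gate - v_pinch) * v_ds - v_ds ** 2 / 2.0)

def channel_resistance(v_control, v_ds, feedback=0.0):
    """Vds / I, with an optional fraction of Vds added to the gate."""
    v_gate = v_control + feedback * v_ds
    return v_ds / triode_current(v_gate, v_ds)

for fb in (0.0, 0.5):
    r_pos = channel_resistance(-1.0, +0.05, feedback=fb)
    r_neg = channel_resistance(-1.0, -0.05, feedback=fb)
    print(f"feedback={fb}: R(+50mV)={r_pos:.1f}  R(-50mV)={r_neg:.1f}")
```

In this idealized model the resistance differs by a couple of percent between half-cycles with no feedback (about 506 versus 494 ohms with these numbers), and comes out equal with half the signal on the gate. A real FET is not a perfect square-law device, which is one way to think about the residual products mentioned above.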
Note also that some THD, even 3rd, at and above threshold is not necessarily bad. Limiting the voltage reduces musical dynamics. If it is not just sloppy playing, you may want the dynamics, just not the high levels. A dash of distortion can give the impression of "louder" without actual higher voltages, lessening the squashing effect. The difference between loud shouting and VERY loud shouting is often not a matter of voice power, but voice timbre. Taken to extreme, this explains why the crappy virgin LevelLock in my closet is worth more now than it was back when. It "screams" without clipping the tape/CD.
> use FETs as mute elements
That works because the FET is driven hard-off, and so fast that we don't hear the distortion as it goes from on to off. If we leave the FET part-on, the distortion will be gross.
The only way is to put the FET to ground so that it never sees any more than the limited signal. Of course for that to work, the series resistor must be larger than the FET ON-resistance by the amount of limiting. If the FET can be pushed down to 1,000 ohms, and we want 40dB of excess input, we need a 100K series resistor. When not limiting, this is our noise source. The noise of 100K is about 4uV. Add a uV for post-amp noise, 5uV. Maximum level at this point is the 50mV or 50,000uV that the FET will stand before folding. Maximum signal to noise ratio just below threshold is then 50,000uV/5uV or 80dB. This was ample for tape and LP, may seem small for digi-media. We can fiddle the numbers to look better, but there is no huge improvement with classic FETs. (I have not had a hard look at the super-JETs; my suspicion is that the reduction in ON resistance is matched by a reduction in voltage before distortion, so little or no net gain. One of the FET boys may prove me wrong.) We do know that many people use and love 1176 type FET limiters, and even LevelLock which lacks basic THD-reduction, so a super S/N number must not be the most important thing in life. And the FET limiter is the only great $1 gain-cell. LDRs are fun and natural, but annoyingly inconsistent, proper VCAs cost at least a couple bucks (thank you, THAT!), and a twin-tube gain cell runs closer to $50 even with dubious transformers.
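The arithmetic above can be checked with a few lines. This is only a sketch: the post does not state a noise bandwidth or temperature, so the 10 kHz and 300 K below are assumptions, picked because they land near the quoted 4uV (a 20 kHz bandwidth gives closer to 6uV). The function names are hypothetical.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K

def series_resistor_for(fet_on_ohms, excess_db):
    """The post's round-number rule: series R = FET on-resistance
    times the voltage ratio of the wanted limiting range."""
    return fet_on_ohms * 10.0 ** (excess_db / 20.0)

def thermal_noise_volts(resistance_ohms, bandwidth_hz, temp_kelvin=300.0):
    """Johnson noise: sqrt(4 k T R B). Bandwidth and temperature are assumptions."""
    return math.sqrt(4.0 * BOLTZMANN * temp_kelvin * resistance_ohms * bandwidth_hz)

def snr_db(signal_volts, noise_volts):
    return 20.0 * math.log10(signal_volts / noise_volts)

r_series = series_resistor_for(1_000, 40)          # 1K FET, 40 dB of excess input
noise = thermal_noise_volts(r_series, 10_000)      # assumed 10 kHz bandwidth
print(f"series resistor: {r_series:.0f} ohms")
print(f"resistor noise:  {noise * 1e6:.2f} uV")

# The post's round figures: 50 mV ceiling, 4 uV + 1 uV = 5 uV of noise.
print(f"S/N just below threshold: {snr_db(50e-3, 5e-6):.1f} dB")
```

The post adds the 4uV and 1uV straight; uncorrelated sources strictly add as root-sum-of-squares, which would make the total a little smaller and the ratio a hair better, but not enough to change the conclusion.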
> So should we focus on creating a simple sidechain first and then we can try different elements for the actual attenuation?
No, because....
> well designed detection circuit can be scaled to work with most any modifying element..
Aside from the gross differences (positive or negative, milliVolts or tens of volts), every gain-cell has a different control law.
There are BJT cells that are straight linear.
There are proportions for LDRs and tube cathode followers that come close to straight linear over a limited range.
There are BJT cells that are perfectly exponential V/dB.
There are tube cells that are sloppy exponential V/dB with a broad soft bend right where low-pressure engineers may spend a lot of time.
Yes, you can add or take-away an exponential in the sidechain. But this looks to me like a long muddy road. It may be better strategy to pick a path, any open path, and slog along until you reach a happy answer or are forced to abandon it. Certainly we have a big map of limiter-paths that are not dead-ends.
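To make "add or take-away an exponential" concrete, here is a minimal sketch with two idealized control laws. Everything in it is an assumption for illustration: the function names, the 1V-for-unity-gain scaling of the linear cell, and the 20dB-per-volt slope of the exponential cell. Real cells have offsets, polarity quirks and bends that this ignores.

```python
def linear_cell_gain(v_control, volts_for_unity=1.0):
    """Idealized 'straight linear' cell: gain proportional to control voltage."""
    return v_control / volts_for_unity

def exponential_cell_gain(v_control, db_cut_per_volt=20.0):
    """Idealized V/dB cell: each volt of control removes a fixed number of dB."""
    return 10.0 ** (-db_cut_per_volt * v_control / 20.0)

def control_for_linear(reduction_db, volts_for_unity=1.0):
    """Control voltage a linear cell needs for a given gain reduction.
    This is where the sidechain has to supply an antilog."""
    return volts_for_unity * 10.0 ** (-reduction_db / 20.0)

def control_for_exponential(reduction_db, db_cut_per_volt=20.0):
    """Control voltage a V/dB cell needs: just a scaled copy of the dB figure."""
    return reduction_db / db_cut_per_volt

print("dB cut   linear-cell Vc   exponential-cell Vc")
for reduction_db in (0, 6, 12, 20, 40):
    print(f"{reduction_db:6d}   {control_for_linear(reduction_db):14.3f}"
          f"   {control_for_exponential(reduction_db):19.3f}")
```

The same list of gain reductions calls for control voltages of completely different shape, which is why a sidechain tuned for one cell does not simply scale onto the other.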
And I think some of the "sound" of some types, aside from the time-constants and residual THD that Tedf discusses, is the "imperfect" control law. While the vari-Mu scheme is frustrating if you want a brick-wall limiter, it has a very natural shape that can be easy on the ear, built right into the shape of the vacuum. The BJT gain-cells have that wonderfully exact control law that you can shape as you please.
Also the choice of linear or exponential dictates your time-constant choices. The exponential laws fight the exponential decay of an R-C or I-C network. DBX had an active capacitor that worked inside an exponential sidechain, but I've stared at it for hours and think it is screwy (I don't argue with the results, just baffled by the circuit).
If you think feedforward is always better, control law is vital. Yes, you can trim an imprecise law to give a tolerably constant or curved slope over 10dB or 20dB, but ultimately an FF limiter needs detector and gain-cell very well matched over many dB, which discourages the "sloppy" control laws on LDR, FET, and vari-Mu in heavy-limiting and graceful gross-overload applications.
> (circa 1972) was the first one where I saw the 'auto release' circuit, but I believe that its origins are earlier than that.
I first noticed it far later, but it is present in the Fairchild 660 and as NYDave says it was written up long before.
While early claims mention reducing charge/discharge current ratio, it does not give the same result as a single R-C with a honkin-big driver. It does give a very musical result, though "auto" is surely an exaggeration: Narma gives two dual-RC positions, so even with two RCs no single pair of time constants was truly "auto" enough for him, and maybe not for all of us.
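For anyone who wants to see why two R-Cs behave differently from one, here is a back-of-envelope sketch. The topology is an assumption, not taken from the Fairchild schematic: a fast R-C section in series with a slow R-C section, the stack held at the peak voltage by an ideal stiff driver for the length of the burst, then left to discharge with nothing else loading it. The component values and the function name are invented for illustration.

```python
import math

def dual_rc_release(burst_seconds, t_after, v_peak=1.0,
                    r1=1e6, c1=0.1e-6,   # assumed fast section, 0.1 s
                    r2=5e6, c2=1e-6):    # assumed slow section, 5 s
    """Control voltage left t_after seconds after a burst of given length."""
    # At the first instant the caps split the peak by capacitance,
    # so the small (fast) cap takes most of the voltage.
    v1_start = v_peak * c2 / (c1 + c2)
    v2_start = v_peak * c1 / (c1 + c2)
    # Held long enough, the split drifts to the resistive ratio,
    # moving most of the voltage onto the slow section.
    v1_end = v_peak * r1 / (r1 + r2)
    v2_end = v_peak * r2 / (r1 + r2)
    tau_hold = (r1 * r2 / (r1 + r2)) * (c1 + c2)
    k = math.exp(-burst_seconds / tau_hold)
    v1 = v1_end + (v1_start - v1_end) * k
    v2 = v2_end + (v2_start - v2_end) * k
    # After the burst each section discharges through its own resistor.
    return v1 * math.exp(-t_after / (r1 * c1)) + v2 * math.exp(-t_after / (r2 * c2))

for burst in (0.01, 0.5, 5.0):
    left = dual_rc_release(burst, t_after=0.5)
    print(f"burst {burst:5.2f} s -> {left:.2f} of peak left after 0.5 s")
```

With these made-up values a brief transient has mostly let go after half a second, while a sustained loud passage is still holding most of its gain reduction. That is the "auto" behaviour in this simplified model, and it also shows why the chosen pair of time constants still matters.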
> alter the standing voltage of the bottom 'ground' connection, or even make it dynamic
Yeah, there are a zillion frills and most of them have been tried. "Dynamic" release can include going into Hold for a short time after a big transient, or going into Hold if the current level drops low; these tricks "fix" specific types of pumping, though may also do The Wrong Thing for some program material. (I once watched an old movie on TV that seemed to have been designed to trick the station's level controller every way possible.)