I would have to think about gains. I don't remember whether there is a convention of putting attenuation on the sum (M) signal so you don't get overload problems.
There is still a debate about it. Some sources suggest direct summing, such as M=L+R, without any consideration for the level increase.
For two perfectly correlated L and R signals, which is the case with a single mic picking up a source placed on-axis, the M signal is 6dB louder, which may result in overload. To restore the original levels when decoding, a 0.5 factor must be applied, which gives L=1/2(M+S) and R=1/2(M-S).
To compensate for that, many recommend, aptly IMO, applying the 0.5 factor at coding instead, which prevents overload and gives M=1/2(L+R) and S=1/2(L-R). Decoding is then a plain sum and difference, L=M+S and R=M-S.
The disadvantage of both methods is that the coding and decoding matrices are different.
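To put numbers on the two conventions above, here is a minimal Python sketch. The function names are mine (not from any standard or library), and samples are plain floats with 1.0 taken as full scale.

```python
import math

def encode_sum(l, r):
    """Direct summing at coding: M = L + R, S = L - R."""
    return l + r, l - r

def decode_half(m, s):
    """Decoder matching direct summing: L = (M + S) / 2, R = (M - S) / 2."""
    return 0.5 * (m + s), 0.5 * (m - s)

def encode_half(l, r):
    """0.5 factor at coding: M = (L + R) / 2, S = (L - R) / 2."""
    return 0.5 * (l + r), 0.5 * (l - r)

def decode_sum(m, s):
    """Decoder matching the 0.5-factor coder: L = M + S, R = M - S."""
    return m + s, m - s

def db(x):
    """Level in dB relative to full scale (1.0); x must be non-zero."""
    return 20.0 * math.log10(abs(x))

# On-axis source: L and R identical and at full scale.
l = r = 1.0
m_sum, s_sum = encode_sum(l, r)
m_half, s_half = encode_half(l, r)

print(f"direct summing: M = {db(m_sum):+.2f} dB re full scale")
print(f"0.5 at coding:  M = {db(m_half):+.2f} dB re full scale")

# Both pairs round-trip to the original L and R.
print(decode_half(*encode_sum(0.25, -0.5)))
print(decode_sum(*encode_half(0.25, -0.5)))
```

With direct summing the M channel needs about 6dB of headroom above the L/R peak. With the 0.5 factor at coding it never exceeds the larger of L and R.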
Now, there are some who advocate 3dB attenuation when encoding, accompanied by the same 3dB attenuation at decoding, which gives
Mid = (L+R)-3dB Sides = (L-R)-3dB at coding, followed by decoding Left = (M+S)-3dB Right = (M-S)-3dB
The only advantage is that it makes the MS-to-LR decoder identical to the LR-to-MS coder, but it still presents a risk of overload due to the potential 3dB increase in level.
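A quick way to check the symmetric 3dB scheme is to run the same matrix twice. This is only a sketch: `ms_matrix` is a name I made up, and "3dB" is taken as a gain of exactly 1/sqrt(2).

```python
import math

G = 1.0 / math.sqrt(2.0)  # about -3.01 dB

def ms_matrix(a, b):
    """One matrix for both directions: LR -> MS and MS -> LR."""
    return G * (a + b), G * (a - b)

# Perfectly correlated input (on-axis source), 0.8 of full scale.
l, r = 0.8, 0.8
m, s = ms_matrix(l, r)      # coding
l2, r2 = ms_matrix(m, s)    # decoding with the very same function

gain_db = 20.0 * math.log10(abs(m) / abs(l))
print(f"M is {gain_db:+.2f} dB relative to L")
print(f"round trip: L = {l2:.6f}, R = {r2:.6f}")
```

The round trip returns the original L and R, but M still comes out about 3dB hotter than the input, so the overload risk is halved in dB terms rather than removed.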
I think it's not of much practical use, since one of the major considerations for using MS is the ability to change relative levels.
Wayne Kirkwood, on the Pro Audio Design site, advocates a hybrid solution, where the level of the S signal can be increased by 6 or even 12dB, with corresponding attenuation at decoding.
The possibility of out-of-phase L and R signals, which may overload the resulting S component, must not be discounted. However, most musical recordings show a large correlation between L and R, which tends to confirm the near-6dB increase in the M signal and a significantly smaller S signal.
However, for nature or sound effects recordings, the possibility of largely decorrelated L and R is real.
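To see how the L/R relationship decides where the overload risk lands, here is a rough Python sketch using direct summing (M=L+R, S=L-R). The test signals are assumptions of mine: a full-scale sine for L, and for R either the same sine, its inverse, or seeded uniform noise standing in for a decorrelated channel.

```python
import math
import random

def peak_db(samples):
    """Peak level in dB relative to full scale (1.0); -inf for silence."""
    peak = max(abs(v) for v in samples)
    return 20.0 * math.log10(peak) if peak > 0 else float("-inf")

def encode_sum(left, right):
    """Direct summing: M = L + R, S = L - R, sample by sample."""
    m = [l + r for l, r in zip(left, right)]
    s = [l - r for l, r in zip(left, right)]
    return m, s

n = 1000
left = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]  # full-scale tone
rng = random.Random(0)  # fixed seed so the run is repeatable

cases = {
    "in phase": left,
    "out of phase": [-v for v in left],
    "decorrelated": [rng.uniform(-1.0, 1.0) for _ in range(n)],
}

levels = {}
for name, right in cases.items():
    m, s = encode_sum(left, right)
    levels[name] = (peak_db(m), peak_db(s))
    print(f"{name:>13}: M peak {levels[name][0]:+.2f} dB, "
          f"S peak {levels[name][1]:+.2f} dB")
```

In phase, all of the roughly 6dB increase lands on M and S is silent. Out of phase it is the reverse. With decorrelated material both M and S can come out above the L/R peak, so both channels need headroom.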