> I always thought RMS was the average and that's that...?
Note that in audio, "Average" means the average of the absolute value. The plain average of a high-passed waveform is always zero, which is not interesting. As you said, we rectify (take the magnitude, discarding polarity) and then average. Obvious to analog-heads, but an algorithm-geek may jump right into AVG() and wonder why it doesn't work.
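To make the AVG() trap concrete, here is a minimal sketch in plain Python. The function names and the 1000-sample, one-cycle sine are my own choices for illustration, not anything from a particular meter or library.

```python
import math

def plain_average(samples):
    """Arithmetic mean with polarity kept: the naive AVG()."""
    return sum(samples) / len(samples)

def rectified_average(samples):
    """Mean of the absolute value: what audio people call 'Average'."""
    return sum(abs(s) for s in samples) / len(samples)

# One full cycle of a 1 V-peak sine. It has no DC, like any high-passed signal.
N = 1000
sine = [math.sin(2 * math.pi * k / N) for k in range(N)]

plain = plain_average(sine)
rectified = rectified_average(sine)
print(f"plain average:     {plain:.6f}")
print(f"rectified average: {rectified:.6f}")
```

The plain average comes out at zero to within rounding, while the rectified average lands close to 2/pi of the peak, about 0.637.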
RMS inherently discards polarity, since squaring makes everything positive, so no rectifier is needed (although many practical implementations still use one because their "square" works in only one quadrant).
As you know, average is sum/count and RMS is heating-value: the DC voltage that would heat a resistor just as much. When heating, the high parts count more than the low parts, since heat goes as the square of voltage.
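A small sketch of the two definitions, assuming nothing beyond the Python standard library. The four-sample "waveform" is made up purely to show how one high sample pulls RMS up more than it pulls Average up.

```python
import math

def mean_abs(samples):
    """Average in the audio sense: sum of magnitudes / count."""
    return sum(abs(s) for s in samples) / len(samples)

def rms(samples):
    """Root of the mean of the squares: the heating value."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

# Three low samples and one high one (volts); illustrative numbers only.
mostly_low = [1.0, 1.0, 1.0, 3.0]
avg = mean_abs(mostly_low)    # (1 + 1 + 1 + 3) / 4
heat = rms(mostly_low)        # sqrt((1 + 1 + 1 + 9) / 4)
print(f"Average = {avg:.3f} V, RMS = {heat:.3f} V")

# Squaring discards polarity by itself, so rms() needs no abs().
print(rms([-2.0, 2.0]))
```

The single 3 V sample contributes 9 of the 12 units of "heat" but only 3 of the 6 units of the sum, which is why RMS reads higher than Average here.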
In bench-meters, a $10 meter measures Average, a $20 meter may read Peak-peak. Both are marked (or scaled) so that on SINE waves they read the usually-cited RMS number. The Avg meter has a 1.11 multiplier, my P-P meter has a 1/2.828 divider. But a True RMS bench meter costs $100++, because it has to have a squaring-chip (or heater/thermo) to give actual RMS of arbitrary waveforms. (And even then, any practical True RMS meter has a Crest Factor specification; waveforms with a higher crest factor, meaning peak/RMS, won't read right.)
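Here is a rough model of the three kinds of meter, as a sketch only. The function names are hypothetical, the scale factors are the exact sine-wave values behind the rounded 1.11 and 1/2.828, and the symmetric square wave is my own test case for where the cheap meters go wrong.

```python
import math

AVG_TO_RMS_SINE = math.pi / (2 * math.sqrt(2))   # about 1.11
PP_TO_RMS_SINE = 1 / (2 * math.sqrt(2))          # about 1/2.828

def mean_abs(samples):
    return sum(abs(s) for s in samples) / len(samples)

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def average_meter(samples):
    """Average-responding meter, scaled to read RMS on a sine."""
    return AVG_TO_RMS_SINE * mean_abs(samples)

def peak_to_peak_meter(samples):
    """Peak-to-peak-responding meter, scaled to read RMS on a sine."""
    return PP_TO_RMS_SINE * (max(samples) - min(samples))

def crest_factor(samples):
    """Peak divided by RMS."""
    return max(abs(s) for s in samples) / rms(samples)

N = 1000
sine = [math.sin(2 * math.pi * k / N) for k in range(N)]
square = [1.0 if k < N // 2 else -1.0 for k in range(N)]

for name, wave in (("sine", sine), ("square", square)):
    print(f"{name:6s} true RMS {rms(wave):.3f}  "
          f"avg meter {average_meter(wave):.3f}  "
          f"p-p meter {peak_to_peak_meter(wave):.3f}  "
          f"crest {crest_factor(wave):.3f}")
```

On the sine all three agree. On the 1 V square wave the true RMS is 1 V, but the average-responding meter reads about 11% high and the peak-to-peak meter about 29% low.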
A sine wave is usually quoted in RMS; its Average is 0.9 times its RMS.
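For anyone who wants to see where the 0.9 comes from, here is the standard derivation for a sine of peak value $V_p$ (the symbols are mine), integrating over one rectified half-cycle:

```latex
\begin{aligned}
V_\text{avg} &= \frac{1}{\pi}\int_0^{\pi} V_p \sin\theta \, d\theta = \frac{2}{\pi}\,V_p \approx 0.637\,V_p \\
V_\text{rms} &= \sqrt{\frac{1}{\pi}\int_0^{\pi} V_p^2 \sin^2\theta \, d\theta} = \frac{V_p}{\sqrt{2}} \approx 0.707\,V_p \\
\frac{V_\text{avg}}{V_\text{rms}} &= \frac{2\sqrt{2}}{\pi} \approx 0.900,
\qquad
\frac{V_\text{rms}}{V_\text{avg}} = \frac{\pi}{2\sqrt{2}} \approx 1.111
\end{aligned}
```

The second ratio is the multiplier an average-responding meter applies so that it reads RMS on sines.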
A one-sided square wave that sits at 0V half the time and 2V the other half averages 1V, but heats like 1.4V.
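The arithmetic for that 0V/2V wave can be checked in a few lines. This is just a sketch; `one_sided_pulse` is a hypothetical helper of mine that builds one period of a pulse from a high level and a duty cycle.

```python
import math

def one_sided_pulse(high_volts, duty, n=1000):
    """One period: high_volts for a fraction `duty` of the time, else 0 V."""
    on = round(duty * n)
    return [high_volts] * on + [0.0] * (n - on)

def mean_abs(samples):
    return sum(abs(s) for s in samples) / len(samples)

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

wave = one_sided_pulse(2.0, 0.5)   # 2 V half the time, 0 V the other half
avg = mean_abs(wave)               # 2 V * 0.5
heat = rms(wave)                   # sqrt(4 V^2 * 0.5) = sqrt(2) V
print(f"Average = {avg:.3f} V, RMS = {heat:.3f} V, Avg/RMS = {avg / heat:.3f}")
```

For this one-sided shape Avg/RMS works out to the square root of the duty cycle, so 50% duty gives about 0.707.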
Here is a lopsided clipped sine showing Ave about half of RMS.
It is trivial to devise odd waveforms to give odd Avg/RMS ratios, if you enjoy confusing your meters.
However, if we high-pass an asymmetrical wave, the ratio changes. A very narrow positive spike, hi-passed, gives Avg/RMS near 0.13.
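A sketch of the spike case, with two loud assumptions: the high-pass is modeled as ideal DC removal (subtracting the mean, which is what a DC-block does to a steady repeating wave once it settles), and the spike width of 0.42% of the period is my own pick because it lands near 0.13. The post does not say how narrow the spike actually was.

```python
import math

def mean_abs(samples):
    return sum(abs(s) for s in samples) / len(samples)

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def remove_dc(samples):
    """Idealized DC-block: subtract the mean (assumption, not a real filter)."""
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]

N = 10000
WIDTH = 42                      # assumed spike width in samples: 0.42% duty
duty = WIDTH / N
spike = [1.0] * WIDTH + [0.0] * (N - WIDTH)

raw_ratio = mean_abs(spike) / rms(spike)
blocked = remove_dc(spike)
blocked_ratio = mean_abs(blocked) / rms(blocked)

print(f"Avg/RMS before DC-block: {raw_ratio:.3f}")
print(f"Avg/RMS after DC-block:  {blocked_ratio:.3f}")
```

Under these assumptions the ratio after DC removal follows 2*sqrt(d*(1-d)) for duty cycle d, roughly double the sqrt(d) of the raw one-sided spike, because the long shallow negative baseline now contributes to the rectified average.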
Since DC is not an interesting sound and just makes trouble, we always have some DC-block (a high-pass) somewhere in the chain.
But your singer isn't trying to confound the Avg/RMS ratio. The Avg/RMS ratio of most speech/music will be less than 1 but probably greater than half, and fairly consistent over any long run of mixed music (not pure synth-tones/noises). I would guess 0.8. It isn't very important. A long-term reading of speech/music is inherently a sloppy approximation. Level is bopping up/down constantly. And even if we believed that RMS=Volume, I know I could set two different tracks to identical RMS readings, and they would not sound equally loud.
It isn't precision work. When you cut from music to news, or from newsroom to remote, you don't want level to change "much", like -10dB to -30dB. But you can cut-over from -10dB to -15dB and the listener tolerates the level shift. You get the general level generally similar. And an Average will be as good as an RMS except for very abnormal sounds.
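To put a number on "as good as", here is a sketch of how far a sine-calibrated Average meter would sit from true RMS, in dB, for a given Avg/RMS ratio. The function name is hypothetical, and the 0.8 is only the guess made above, not a measurement.

```python
import math

SINE_AVG_OVER_RMS = 2 * math.sqrt(2) / math.pi   # about 0.90

def average_meter_error_db(avg_over_rms):
    """dB by which a sine-calibrated Average meter misses true RMS."""
    return 20 * math.log10(avg_over_rms / SINE_AVG_OVER_RMS)

for ratio in (0.5, 0.8, 0.9, 1.0):
    print(f"Avg/RMS {ratio:.1f}: meter error {average_meter_error_db(ratio):+.2f} dB")
```

At the guessed 0.8 the Average meter reads about 1 dB under true RMS, which is small next to the 5 dB shift a listener tolerates. Only down near a ratio of 0.5 does the error itself reach about 5 dB.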