Your solution is 99% correct. My only quibble is that 4k7 is a fixed-resistor value, which is not common for potentiometers unless you pay for custom tooling and buy maybe 10,000 or more. Fortunately, a value of 5k is common and readily available.

At the risk of over-simplifying, do you even need stereo if it's just a speech feed?
If mono will do, it's now totally simple:
View attachment 117570
You just need a DPDT switch and a single-gang pot, and drive both headphone channels from one transformer.
(A 4k7 pot will probably do just fine. With a pot driven from a low-impedance source, the maximum output impedance is one-quarter of the total track resistance, which occurs with the wiper at the 50% point of the track.)
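For anyone who wants to check the one-quarter figure, here is a rough sketch. It assumes an ideal zero-ohm source, so the impedance seen looking back into the wiper is just the two track sections in parallel. The function name and the 4k7 value are only for illustration.

```python
R_TOTAL = 4700.0  # illustrative pot value in ohms (assumption)

def wiper_output_impedance(r_total, x):
    """Impedance seen at the wiper with the top of the pot fed from 0 ohms.

    x is the wiper position as a fraction of the track (0.0 to 1.0).
    The upper and lower track sections appear in parallel.
    """
    upper = (1.0 - x) * r_total
    lower = x * r_total
    return upper * lower / r_total  # upper || lower, since upper + lower = r_total

# Scan the track in 1% steps and find the worst-case position.
positions = [i / 100 for i in range(101)]
worst = max(positions, key=lambda x: wiper_output_impedance(R_TOTAL, x))
print(worst, wiper_output_impedance(R_TOTAL, worst))
```

For a 4k7 pot the worst case works out to 1175 ohms at mid-track, i.e. one-quarter of the total.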
Even a linear-taper pot would work decently, since loading it reasonably heavily bows the transfer curve downward, approximating a "linear in dB" log-taper pot. FWIW, I did exactly this just recently for an oddball DI box I'm building, and plotted it out in a spreadsheet as proof.
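The same kind of spreadsheet check can be sketched in a few lines of Python. This is not the original spreadsheet, just a minimal version under stated assumptions: an ideal zero-ohm source, a purely resistive load on the wiper, and made-up values of 5k for the pot and 1k for the load.

```python
import math

R_POT = 5000.0   # linear pot track resistance in ohms (assumed value)
R_LOAD = 1000.0  # resistive load on the wiper in ohms (assumed value)

def loaded_pot_gain(x, r_pot, r_load):
    """Voltage gain of a linear pot at wiper fraction x with a load on the wiper."""
    lower = x * r_pot            # track section from wiper to ground
    upper = (1.0 - x) * r_pot    # track section from input to wiper
    if lower == 0.0:
        return 0.0               # wiper at ground: no output
    lower_loaded = lower * r_load / (lower + r_load)  # lower section || load
    return lower_loaded / (upper + lower_loaded)

def to_db(gain):
    """Convert a voltage ratio to decibels."""
    return 20.0 * math.log10(gain)

# Compare the unloaded linear pot with the loaded one across the rotation.
for pct in range(10, 101, 10):
    x = pct / 100
    unloaded = to_db(x)
    loaded = to_db(loaded_pot_gain(x, R_POT, R_LOAD))
    print(f"{pct:3d}%  unloaded {unloaded:6.1f} dB   loaded {loaded:6.1f} dB")
```

The loaded column sits below the unloaded one everywhere except at full rotation, which is the "bowing down" effect. How closely it tracks a true log taper depends on the load-to-pot ratio, so try a few values of `R_LOAD`.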