The first page or two of National's AN-222 has an OK essay on BJT noise.
Ignore the first part where they toot their own horn. Paralleling identical devices will reduce voltage noise compared to a single device, true. However, one BIG device works the same as many small devices in parallel. Since we can buy hefty switching transistors much larger than anything on an IC, for cheap, that argument won't hold water.
Their idea was to build a dual device with excellent matching by laying out 100 transistors like this:
ABABA
BABAB
ABABA
B
ABAB
ABABA
Note that there is a "defect", a thin spot or speck of dust on the wafer. If they put two big transistors on one wafer, this would usually screw up one side of the "matched" pair. By interleaving many small A and B pieces, any defect will probably screw up both sides equally, so the match stays good.
Their noise essay does not explain why there is voltage noise and current noise (the LM194 isn't especially "special" for current noise). But their theory is plain and correct: for silicon transistors made in the last 30 years, noise is simply predicted by theory plus "Rbb", which is related to base contact area, which is related to overall size, which frankly is a lot like the Max Current spec. And for all normal transistors, at the impedances we would call "medium" or "high", you can calculate the noise without knowing much about the actual transistor.
With any silicon transistor, the optimum emitter current in amps for lowest noise from a specified source resistance Rs is about 0.6/Rs. Or for convenience: milliamps = 600/Rs. So for 600Ω, use 1mA; for 60kΩ, use 0.01mA or 10µA.
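The rule of thumb is easy to put in a few lines of Python. This is only a sketch of the 0.6/Rs approximation above; the function name is made up for illustration, and the 0.6 constant is the rough figure from the text, not a precise device parameter.

```python
def rule_of_thumb_current_amps(source_resistance_ohms):
    """Approximate lowest-noise emitter current: I ~ 0.6 / Rs (amps, ohms)."""
    if source_resistance_ohms <= 0:
        raise ValueError("source resistance must be positive")
    return 0.6 / source_resistance_ohms


if __name__ == "__main__":
    # Print the suggested current in mA for a few source resistances.
    for rs in (150, 600, 6_000, 60_000):
        current_ma = rule_of_thumb_current_amps(rs) * 1e3
        print(f"Rs = {rs:>6} ohm -> about {current_ma:.3g} mA")
```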
Beta (Hfe) affects this a little. But a 10:1 change in Hfe changes the optimum current only 3:1. And the optimum is so broad that a 3:1 "error" in current causes only a small change in noise. What is really happening though is that low-Beta transistors have more current noise. Everything else equal, you can usually do better with a high-Beta transistor. The very small (50mA max) "low noise" transistors of the 1970s were really high-Beta transistors: good for high impedance sources but the cost of processing for high Beta means they are small and won't be best for low-Z sources.
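The 3:1 figure is what a square-root dependence on Beta gives. Here is a rough sketch, assuming the usual simplified textbook model (collector and base shot noise only, with Rbb and 1/f noise ignored), in which the optimum current is Vt*sqrt(Beta)/Rs. The function names, the 25.85mV thermal voltage (about 300K), and the Beta of 540 are assumptions chosen for illustration; they are not taken from AN-222.

```python
import math

THERMAL_VOLTAGE = 0.02585  # kT/q in volts, assuming roughly 300 K


def optimum_current_amps(rs_ohms, beta):
    """Simplified-model optimum current: Vt * sqrt(beta) / Rs."""
    return THERMAL_VOLTAGE * math.sqrt(beta) / rs_ohms


def noise_figure_db(beta, current_ratio):
    """Noise figure when running at current_ratio times the optimum current.

    In the same simplified model the noise factor works out to
    F = 1 + (x + 1/x) / (2 * sqrt(beta)), where x = I / I_opt.
    """
    x = current_ratio
    factor = 1.0 + (x + 1.0 / x) / (2.0 * math.sqrt(beta))
    return 10.0 * math.log10(factor)


if __name__ == "__main__":
    # A 10:1 change in beta moves the optimum current by sqrt(10).
    ratio = optimum_current_amps(600, 1000) / optimum_current_amps(600, 100)
    print(f"optimum current ratio for 10:1 beta change: {ratio:.2f}")
    # How much the noise figure moves when the current is off by 3:1.
    for x in (1 / 3, 1.0, 3.0):
        print(f"I = {x:.2f} x I_opt -> NF = {noise_figure_db(540, x):.2f} dB")
```

With a Beta in the mid hundreds this model lands close to the 0.6/Rs rule, which suggests the rule quietly assumes a fairly high-Beta part.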
All this is fine from fairly low impedances up to quite high impedances. But when we get down to 150Ω, as when trying to get signal from a recording microphone without a transformer, we cannot ignore the base spreading resistance Rbb. In many small transistors it can be over 100Ω, which must be added to our source resistance, but adds only noise, not signal. The LM194 has Rbb below 40Ω, so even two of these (differential) have only a marginal effect with a 150Ω source. It is still the largest "unnecessary" noise source, so we could wish for something better, but there are only a few options. The MAT duals used to be a little better, but not enough to notice, and it seems they may now be made on LM194-type dies. THATcorp has some transistor arrays. And then there are BIG transistors, like the TIP series. These may have low base spreading resistance. But they are pretty fat, and when run at the 0.6/Rs current level (about 8mA for a 150Ω source into a differential input) their Beta may be extremely low, causing not only noise but bias difficulties. They may also have leakage current, which is not a problem for clean silicon working anywhere near a "reasonable" current for the device size, but when running FAR below rating we may drown in Iceo noise. Hence we see a lot of 2N4401-like devices: 500mA parts that don't stink too bad at 1mA-5mA.
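To get a feel for how much Rbb costs with a low-Z source, here is a rough sketch that counts thermal noise only: Rbb is treated as extra resistance in series with the source, and the transistor's shot noise is left out. The function name is hypothetical, and the 40Ω value is used as an upper bound since the text only says the LM194 is "below 40Ω".

```python
import math


def rbb_noise_penalty_db(rs_ohms, rbb_ohms, devices_in_series=1):
    """Thermal-noise penalty in dB from Rbb added in series with the source.

    Thermal noise power scales with resistance, so the penalty is
    10 * log10((Rs + n * Rbb) / Rs). Shot noise is ignored here.
    """
    total_ohms = rs_ohms + devices_in_series * rbb_ohms
    return 10.0 * math.log10(total_ohms / rs_ohms)


if __name__ == "__main__":
    # Differential input: two base resistances in series with a 150 ohm mic.
    for rbb in (10, 40, 100):
        penalty = rbb_noise_penalty_db(150, rbb, devices_in_series=2)
        print(f"Rbb = {rbb:>3} ohm each -> about {penalty:.1f} dB worse")
```

The same function shows why Rbb stops mattering at higher source impedances: try it with Rs of a few kΩ and the penalty shrinks to a small fraction of a dB.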