There are certain things that tubes do well, and these can be understood from what we know about psychoacoustics and physics. There are other things that have come to be associated with tubes and tube sound owing to the constraints that "hollow state," as PRR calls it, has placed on overall circuit design.
Tubes employ rather large structures compared to transistors. Consequently the thermal effects tend to have rather long time constants. There are a lot of charge carriers involved, although they are a lot less densely distributed. There is a lot of thermal jostling of those carriers and probably less interaction of the associated noise with the signal. Some have said that it is easier to ignore the noise in tube electronics, that it seems to float there independent of the signal somehow.
As well, the input capacitances tend to be more constant with operating point, reducing a distortion mechanism compared to solid-state devices used in similarly simple configurations.
To take advantage of the very high input Z of tubes and the highish output Z, we use transformers. Were it not for tubes the art of lower-frequency multi-octave transformer design would probably not have advanced nearly so much as it has.
Transformers are wonderful devices for isolating systems from the vagaries of interference like hum and much EMI. They also tend to be inherent bandpass filters. And like any other component they have limitations at high signal levels, and very distinct ways of responding to such overload that are usually measurably and subjectively benign.
Given transformers and their limited bandwidth, or even without transformers and just with the constraints of R-C coupling, it becomes difficult to apply large amounts of negative feedback around a multistage amplifier. But this also sidesteps some of the problems that come with poorly designed high-feedback systems.
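To put a rough number on why bandwidth limits restrict feedback, here is a minimal sketch. It assumes a simplified loop that is not taken from any particular amplifier: n identical first-order rolloffs and no compensation. The function name is just for illustration. It computes the low-frequency loop gain at which the loop phase reaches 180 degrees with unity gain remaining, which is the edge of oscillation. The same arithmetic applies to the low-frequency rolloffs of R-C or transformer coupling.

```python
import math

def max_stable_loop_gain(n_poles):
    """Largest low-frequency loop gain before an uncompensated loop of
    n identical first-order poles reaches 180 degrees of phase shift
    with unity gain remaining (assumption: all poles equal)."""
    if n_poles < 3:
        # One or two poles never quite reach 180 degrees of lag.
        return math.inf
    # Each pole contributes 180/n degrees at the critical frequency,
    # where it attenuates by cos(pi/n); n poles attenuate by cos(pi/n)**n.
    return (1.0 / math.cos(math.pi / n_poles)) ** n_poles

for n in (3, 4, 5, 6):
    gain = max_stable_loop_gain(n)
    print(f"{n} equal poles: loop gain limit {gain:.2f} "
          f"({20 * math.log10(gain):.1f} dB)")
```

With three equal rolloffs the limit is a loop gain of 8, about 18 dB, and it shrinks as more stages are added. Real designs stagger the corner frequencies to do better, but the trend is the point.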
In addition, triodes, at least with high-Z plate loads, can be remarkably linear voltage amplifiers over a useful signal range. But it is hard to extract this signal and drive a load without spoiling the performance.
Another characteristic of tubes that worked to constrain designs is their existence in only one polarity. This favored balanced designs from the outset when one was trying to achieve high performance, which conferred other advantages in many ways.
Bipolar solid-state devices were the first practical ones available. It was natural to insert them into tube-like circuits as those were all we knew, but they didn't work very well. They do have high transconductance though, and thus local feedback can often linearize their performance. Since they come in two different polarities a number of topologies become available that are difficult or impossible with tubes. But there was much to learn.
The transistor is a microscopic structure compared to a tube, especially when the two handle similar currents. It's instructive to crack open a plastic small-signal transistor and realize just how tiny the actual chip is. And then realize that what you see is a chip where most of the action is occurring very close to the surface, typically.
The transistor also has a variety of parameters that are sensitive functions of temperature. Couple this with the tiny thermal mass and you have a recipe for signal-induced shifts in those parameters---with which, by the way, affordable simulators don't even attempt to deal.
Bipolars do have an amazingly accurate conformance to an exponential dependence of collector current on base-emitter voltage, at a constant temperature. This can be exploited.
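As a small illustration of that exponential law, here is a sketch using the idealized relation Ic = Is * exp(Vbe / Vt), with Vt = kT/q. The saturation current of 1e-14 A is an assumed, typical-order value and not a figure for any particular device. Temperature is held fixed, as stated above; in a real part Is itself moves strongly with temperature.

```python
import math

K_BOLTZMANN = 1.380649e-23    # J/K
Q_ELECTRON = 1.602176634e-19  # C

def thermal_voltage(temp_k=300.0):
    """Vt = kT/q, about 25.9 mV at 300 K."""
    return K_BOLTZMANN * temp_k / Q_ELECTRON

def collector_current(vbe, i_s=1e-14, temp_k=300.0):
    """Idealized exponential law; i_s is an assumed saturation current."""
    return i_s * math.exp(vbe / thermal_voltage(temp_k))

def vbe_for_current(ic, i_s=1e-14, temp_k=300.0):
    """Inverse of collector_current: Vbe is logarithmic in Ic."""
    return thermal_voltage(temp_k) * math.log(ic / i_s)

def delta_vbe(ic_a, ic_b, temp_k=300.0):
    """Vbe difference between two currents; i_s cancels for matched devices."""
    return thermal_voltage(temp_k) * math.log(ic_a / ic_b)

print(f"Vt at 300 K: {thermal_voltage() * 1e3:.2f} mV")
print(f"Vbe change per decade of Ic: {delta_vbe(1e-3, 1e-4) * 1e3:.1f} mV")
```

The difference in Vbe between two matched devices depends only on the ratio of their currents and on temperature, which is the property that log, antilog, and multiplier circuits lean on.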
But for the simpler topologies the easiest thing to do is use lots of local and global feedback. With it come all of the problems of transient response and stability against oscillation under varying conditions. And we know that for smaller amounts of global feedback there is even the effect of creating additional energy at higher harmonics in the distortion spectrum.
The availability of monolithically constructed, inherently matched and thermally coupled devices allows for a number of circuit enhancements. At the same time, putting whole amplifiers on a chip requires close attention to the thermal effects, and usually entails a lot of parasitic capacitances. In this regard a mix of monolithic matched multiples and discrete devices can yield superior performance.
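The higher-harmonic effect can be seen in a toy model. The sketch below assumes a memoryless forward stage with a pure square-law curvature, y = e + k*e^2, and frequency-independent feedback, e = x - T*y, where T is the loop gain. The names and the value k = 0.1 are illustrative only. Open loop, this stage produces a second harmonic and nothing above it. Closing the loop sends the distortion back through the nonlinearity, and third and higher harmonics appear. The drive is scaled by (1 + T) so the output level stays roughly constant across cases.

```python
import cmath
import math

def closed_loop_output(x, loop_gain, k):
    """Solve y = e + k*e**2 with e = x - loop_gain*y (assumes k > 0)."""
    if loop_gain == 0.0:
        e = x
    else:
        t = loop_gain
        # e satisfies t*k*e**2 + (1 + t)*e - x = 0; take the root
        # that tends to x / (1 + t) as the curvature k goes to zero.
        disc = (1.0 + t) ** 2 + 4.0 * t * k * x
        e = (-(1.0 + t) + math.sqrt(disc)) / (2.0 * t * k)
    return e + k * e * e

def harmonic_levels(loop_gain, k=0.1, n_samples=256, max_harmonic=5):
    """Harmonic amplitudes of the output for a cosine input, via a direct DFT."""
    drive = 1.0 + loop_gain  # keeps the output near unit amplitude
    y = [closed_loop_output(drive * math.cos(2 * math.pi * i / n_samples),
                            loop_gain, k)
         for i in range(n_samples)]
    levels = {}
    for h in range(1, max_harmonic + 1):
        acc = sum(v * cmath.exp(-2j * math.pi * h * i / n_samples)
                  for i, v in enumerate(y))
        levels[h] = 2.0 * abs(acc) / n_samples
    return levels

for t in (0.0, 1.0, 20.0):
    lv = harmonic_levels(t)
    print(f"loop gain {t:5.1f}: H2 = {lv[2]:.2e}  H3 = {lv[3]:.2e}  H4 = {lv[4]:.2e}")
```

In this model the second harmonic falls steadily as loop gain rises, while the third is absent open loop, largest at modest loop gain, and pushed back down only when the feedback becomes large.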