Things that don't sound like they look...

JohnRoberts · Sep 26, 2021

To stop corrupting the thread about sum bus technology some recent comments have triggered this veer.

Back in the 70s working with splicing together discontinuous samples of pitch shifted speech I had an epiphany about how audio can "sound" dramatically different than you would expect from looking at scope traces. When pitch shifting audio we grab X mSec of audio then time compress it to fit into less time to pitch shift up, or expand it out over longer time to pitch shift down. Clearly when trying to reassemble these samples in real time there will be overlap (too much audio to fit in original time space), or gaps when pitch shifted samples take up less time than available.

In the case of pitch shifting up, how we splice the valid audio with the dead gaps between samples can cause audible perturbations. Pitch shifting down gives us too much information to fit in the time window neatly and you can't just let them randomly overlap and sum. There are different strategies for dealing with both. For waveforms with redundant information the excess data can be discarded, but the starts and stops must be carefully managed for minimal artifacts. For pitch shifting up, some data can be repeated to fill excessive gaps but again this must be done carefully. I am just scratching the surface of this and there are numerous other factors like optimal sample size, etc.
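To put a rough number on why the starts and stops matter, here is a minimal Python sketch comparing a plain butt splice with a short linear crossfade. It is only an illustration under assumed values: the grain lengths, the 100 Hz test tone, the 8 kHz sample rate, and the helper names (`butt_splice`, `crossfade_splice`, `max_jump`) are all hypothetical and not taken from any of the hardware described in this thread.

```python
import math

def butt_splice(a, b):
    """Join two grains end to end with no smoothing at the seam."""
    return list(a) + list(b)

def crossfade_splice(a, b, fade_len):
    """Overlap the last fade_len samples of a with the first fade_len
    samples of b, fading a out and b in linearly across the overlap."""
    if fade_len < 1 or fade_len > min(len(a), len(b)):
        raise ValueError("fade_len must fit inside both grains")
    keep_a = len(a) - fade_len
    out = list(a[:keep_a])
    for i in range(fade_len):
        w = (i + 1) / (fade_len + 1)  # weight of b, rising across the overlap
        out.append((1.0 - w) * a[keep_a + i] + w * b[i])
    out.extend(b[fade_len:])
    return out

def max_jump(x):
    """Largest sample-to-sample step, a crude stand-in for 'click' size."""
    return max(abs(x[i + 1] - x[i]) for i in range(len(x) - 1))

def sine_grain(n, freq_hz, fs_hz, phase):
    """n samples of a sine wave starting at the given phase (radians)."""
    return [math.sin(2.0 * math.pi * freq_hz * i / fs_hz + phase) for i in range(n)]

# Two hypothetical grains of a 100 Hz tone at 8 kHz whose phases do not line up.
grain_a = sine_grain(400, 100.0, 8000.0, 0.0)
grain_b = sine_grain(400, 100.0, 8000.0, 2.0)

butt = butt_splice(grain_a, grain_b)
faded = crossfade_splice(grain_a, grain_b, 64)

print("butt splice max step:", round(max_jump(butt), 3))
print("crossfade max step:  ", round(max_jump(faded), 3))
```

The crossfade costs `fade_len` samples of output time, which is one reason the choice of grain size and overlap interacts with how much data has to be discarded or repeated.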

Back in the day, the gold standard for pitch shifting was a "rotating head" tape machine. Much like the technology used in VCRs, the rotating head allows the relative tape speed seen by the head to differ from the true tape speed. These audio rotating head machines used 4 heads spaced around the head assembly that seamlessly spliced together audio samples as they approached or withdrew from contact with the tape. In a bit of serendipity, because of how tape heads read magnetic signals, the LF content was picked up before the HF content. This delivered a smoother blending of samples.

The small company I was working for invented a pitch shifter based on BBD delay. Clocking an audio sample in at one clock rate and clocking it out slower or faster changes the apparent pitch. BBD delay chips are too short for simple bi-frequency pitch shift, but clocking the BBD with a constantly ramping clock frequency (faster or slower) delivered the desired pitch shifts with more useful sample sizes. This was too crude for music, but surprisingly effective for speech, with the target market being sped-up talking book recordings for blind people.
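A loose digital analogy, not a model of the actual BBD circuit: reading a delay line while the delay ramps steadily shorter or longer shifts the pitch by a constant ratio for as long as the ramp lasts, and the available delay sets how long each grain can be. The sketch below assumes a read pointer with linear interpolation in place of a ramping clock; `ramped_delay_grain`, `max_delay`, and the test values are hypothetical.

```python
import math

def read_interp(x, pos):
    """Linearly interpolate x at fractional index pos."""
    i = int(math.floor(pos))
    frac = pos - i
    return (1.0 - frac) * x[i] + frac * x[i + 1]

def ramped_delay_grain(x, ratio, max_delay):
    """Read x through a delay that ramps linearly between max_delay and 0.

    ratio > 1: delay shrinks from max_delay toward 0 (pitch up).
    ratio < 1: delay grows from 0 toward max_delay (pitch down).
    The grain ends when the ramp runs out of delay, so the usable grain
    length is max_delay / |ratio - 1| output samples.
    """
    if ratio <= 0.0 or ratio == 1.0:
        raise ValueError("ratio must be positive and not equal to 1")
    slope = 1.0 - ratio                       # change in delay per output sample
    grain_len = int(max_delay / abs(slope))
    if len(x) < grain_len + max_delay + 1:
        raise ValueError("input too short for this grain")
    start_delay = float(max_delay) if ratio > 1.0 else 0.0
    out = []
    for n in range(grain_len):
        delay = start_delay + slope * n
        write_pos = n + max_delay             # first max_delay samples are history
        out.append(read_interp(x, write_pos - delay))
    return out

# Hypothetical test: 100 Hz tone at 8 kHz, 200 samples (25 ms) of delay.
fs = 8000.0
tone = [math.sin(2.0 * math.pi * 100.0 * i / fs) for i in range(700)]
up = ramped_delay_grain(tone, 1.5, 200)

# During the ramp the output should track a 150 Hz tone.
worst = max(abs(up[n] - math.sin(2.0 * math.pi * 150.0 * n / fs))
            for n in range(len(up)))
print("grain length:", len(up), "samples; worst deviation from 150 Hz:", worst)
```

In this simplified picture a shift ratio closer to 1 stretches the same delay over a longer grain, while a larger shift uses it up sooner.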
====
For TMI about managing clicks, another old design trick is to use HF pre/de-emphasis. Boosting the HF content before the switch and restoring it to flat after the switch effectively rolls off just the HF content of the "click". Of course there is no free lunch: the pre-emphasis comes out of headroom, but HF content in complex waveforms is generally much lower in amplitude than LF content, leaving room for pre/de-emphasis.
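A minimal digital sketch of the idea, assuming a first-order pre-emphasis filter and its exact inverse as the de-emphasis (the analog originals would use RC networks; the coefficient `a = 0.9`, the test tone, and the function names are hypothetical). The audio passes through both filters and comes out flat, but the switching step is only seen by the de-emphasis, which low-passes it.

```python
import math

def pre_emphasis(x, a):
    """First-order HF boost: y[n] = x[n] - a * x[n-1]."""
    out, prev = [], 0.0
    for s in x:
        out.append(s - a * prev)
        prev = s
    return out

def de_emphasis(y, a):
    """Exact inverse of pre_emphasis: z[n] = y[n] + a * z[n-1]."""
    out, prev = [], 0.0
    for s in y:
        prev = s + a * prev
        out.append(prev)
    return out

def hard_switch_off(x, off_index):
    """Model an abrupt switch: the signal is cut to zero at off_index."""
    return [s if i < off_index else 0.0 for i, s in enumerate(x)]

def max_jump(x):
    """Largest sample-to-sample step, a crude stand-in for 'click' size."""
    return max(abs(x[i + 1] - x[i]) for i in range(len(x) - 1))

a = 0.9
fs = 8000.0
tone = [math.sin(2.0 * math.pi * 100.0 * i / fs) for i in range(80)]
off_index = 20  # the 100 Hz tone is at its positive peak here, a worst case

# Switch with no emphasis: the full step appears at the output.
plain = hard_switch_off(tone, off_index)

# Switch between pre-emphasis and de-emphasis: the step is smoothed.
emphasized = de_emphasis(hard_switch_off(pre_emphasis(tone, a), off_index), a)

# With no switching at all, the round trip is flat.
round_trip = de_emphasis(pre_emphasis(tone, a), a)

print("plain switch max step:     ", round(max_jump(plain), 3))
print("emphasized switch max step:", round(max_jump(emphasized), 3))
```

The headroom cost shows up in `pre_emphasis`: at the highest frequencies this filter's gain approaches `1 + a`, so the signal between the two filters can be nearly twice as large for HF-heavy material.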

Fourier analysis of the gain steps identifies HF spectral content, but I am still not a mathematician.
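For anyone who wants to see that spectral content without doing the math by hand, here is a small sketch that compares the high-frequency energy of a hard gain step against a ramped one using a brute-force DFT. The gate length, ramp length, and the bin chosen as the start of "HF" are arbitrary assumptions for illustration.

```python
import cmath

def gain_gate(n_total, on_at, off_at, ramp_len):
    """Gain envelope that opens at on_at and is fully closed by off_at.
    ramp_len = 1 is a hard switch; larger values ramp linearly."""
    gains = []
    for n in range(n_total):
        rise = min(1.0, max(0.0, (n - on_at) / ramp_len))
        fall = min(1.0, max(0.0, (off_at - n) / ramp_len))
        gains.append(min(rise, fall))
    return gains

def dft_bin_energy(x, k):
    """Squared magnitude of DFT bin k, computed directly."""
    n_total = len(x)
    acc = sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_total)
              for n in range(n_total))
    return abs(acc) ** 2

def hf_energy(x, k_low):
    """Total energy in bins k_low up to Nyquist."""
    return sum(dft_bin_energy(x, k) for k in range(k_low, len(x) // 2 + 1))

hard = gain_gate(256, 64, 192, 1)     # one-sample gain steps
ramped = gain_gate(256, 64, 192, 16)  # 16-sample linear ramps

hf_hard = hf_energy(hard, 32)
hf_ramped = hf_energy(ramped, 32)
print("HF energy, hard step:  ", hf_hard)
print("HF energy, ramped step:", hf_ramped)
print("ratio ramped/hard:     ", hf_ramped / hf_hard)
```

The gain envelope multiplies the audio, so whatever spectrum the envelope has gets spread around every component of the signal; a slower edge keeps that spread closer to the original frequencies.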

Let the veer continue..

JR

Tubetec · Sep 26, 2021

There's a VST called Serato Pitch 'n Time. It's not something I've used, but people do rave about it.
You can alter pitch without speeding it up, or speed the sound up without affecting pitch. Like anything, the more you shift things in % terms, the more the audio quality starts to degrade.

abbey road d enfer · Sep 26, 2021

IIRC Lexicon started with the first digital implementation of an audio speed shifter; many followed, with the most famous being AutoTune and the most advanced being Celemony, both having their speed/pitch algos incorporated in a dedicated automatic (or not) pitch-correcting suite.

JohnRoberts · Sep 26, 2021

abbey road d enfer said:
IIRC Lexicon started with the first digital implementation of an audio speed shifter; many followed, with the most famous being AutoTune and the most advanced being Celemony, both having their speed/pitch algos incorporated in a dedicated automatic (or not) pitch-correcting suite.

Actually Dr Lee (Lexicon) wasn't first, and his early cassette-based pitch shifter was pretty crude 8-bit digital and far more expensive than my employer's (VSC, Variable Speech Control) technology. IIRC Lexicon also made some very early digital delay units costing thousands of dollars for only tens of mSec of delay.

Of course Lexicon progressed to use more bits and develop studio digital processors (including reverbs) that didn't suck.

A quick internet search reveals several different companies claimed to be first.

I was in the trenches at least for pitch shifted speech playback.

JR

sahib · Sep 27, 2021

JohnRoberts said:
To stop corrupting the thread about sum bus technology some recent comments have triggered this veer.

........

Since I was the one who gave you the headache.

JohnRoberts said:
For TMI about managing clicks, another old design trick is to use HF pre/de-emphasis. Boosting the HF content before the switch and restoring it to flat after the switch effectively rolls off just the HF content of the "click". Of course there is no free lunch: the pre-emphasis comes out of headroom, but HF content in complex waveforms is generally much lower in amplitude than LF content, leaving room for pre/de-emphasis.

Fourier analysis of the gain steps identifies HF spectral content, but I am still not a mathematician.

Let the veer continue..

JR

I have never done anything like this before but I was thinking about it last night after going to bed. So, thank you for the tip.

JohnRoberts · Sep 27, 2021

High frequency pre/de-emphasis is old, and widely used in everything from FM radio to dbx tape noise reduction. Using it to de-click switched signals is less common.

Back in the 80s I designed an AMR recording product (MAP8x4) that was an 8x4 insert switcher. Effectively 4 effects could be punched into any one of 8 mixer insert jacks under midi control. Back then efx like reverbs and compressors were relatively expensive, so being able to use the same effect in multiple channels alternately during one song using midi program changes was useful.

Using effects like reverbs with midi control meant that you could use the same physical reverb on two different parts of the song with different midi presets. Further the MAP8x4 allowed multiple effects to be patched in series in any order you wanted to program.

HF pre/de-emphasis allowed artifact free switching in the middle of songs. Back then we were working with SMPTE time code machine control of AMR's 4T cassette deck. The Syncontroller also provided SMPTE to midi preset programming so SMPTE time cues from a tape track could move efx from channel to channel and change presets on the fly all under midi control.

JR

sahib · Sep 27, 2021

I have recently designed an 8x8 matrix cross point switch using 4051s controlled by a 4532. Very basic design, but I got excellent click-free performance. MIDI control is the next step. These are under-40p chips. I am not sure what they cost back in the '80s.

However, when I have a bit of free time I'll look into HF pre/de-emphasis switching design for my own learning and perhaps pick your brain.

In terms of slowing down/speeding up recorded material, I think that technique was also used in band-limited transmission conditions. The signal material would be slowed down, transmitted, and then sped up to restore it.
