MADI frame synchronization


Andy Peters

I'm implementing a MADI receiver in an FPGA, and I've got the deserializer working. Its output is a byte stream accompanied by a data/sync indicator and a data valid flag.

What's not clear to me, and where the AES10 spec is rather vague, is how to recognize the start of the MADI frame. The spec indicates that the first nybble of a frame includes the frame-sync bit set true, the channel-active bit true, and the AES block-start bit true. But there's nothing to distinguish those bits from the same bit pattern in an arbitrary data word.

The spec does a lot of handwaving about the position of sync words within the frame. It says that a frame needs to be padded out with sync words and that they must appear on a 40-bit (meaning channel, or four decoded data byte) boundary.

It doesn't explicitly say, but the figure (#7) showing "some permissible 4B5B sync symbol positions" seems to imply that you'll see a sync symbol immediately preceding the start-of-frame and Channel 0 data. Is this true, or at least is that the assumption made by everyone implementing a MADI system?
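For concreteness, here's how I'm reading those mode bits, as a quick Python sketch (not my RTL; whether subframe bit 0 lands in the LSB or the MSB of the first decoded byte depends on how your deserializer packs things, so treat the shifts as an assumption):

[code]
# Sketch: pull the MADI mode nibble out of the first byte of a 32-bit subframe.
# Assumes the deserializer delivers subframe bit 0 as the LSB of the first byte
# (flip the masks if your SERDES packs bits the other way around).

def decode_mode_nibble(first_byte: int) -> dict:
    """Return the four MADI mode flags carried in subframe bits 0..3."""
    return {
        "frame_sync":     bool(first_byte & 0x01),  # bit 0: set only on the frame's first subframe
        "channel_active": bool(first_byte & 0x02),  # bit 1: channel carries valid audio
        "subframe_ab":    bool(first_byte & 0x04),  # bit 2: AES3 subframe A/B flag
        "block_start":    bool(first_byte & 0x08),  # bit 3: AES3 channel-status block start
    }

# Example: 0x0B -> frame sync, channel active and block start all true
print(decode_mode_nibble(0x0B))
[/code]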

thanks ...

-a
 
JohnRoberts said:
Do you have access to one or more MADI data streams to look at?  While that might not be definitive either.

A friend has something I can use. I'll get it from him when I rig up a "MADI bus analyzer" which will let me look at the packets as they come across the wire or fiber. (There's a product idea there ...)

The problem is that as you say, it might not be definitive.

I'm going with the assumption that a JK sync word will be transmitted before the frame's first channel data.

The design as it stands deserializes the incoming MADI stream, finds the start of frame, and as channel data come in they are written to a sample-wide ping-pong buffer. "Something else" has to read out the buffer.  Also, the other sync characters are pushed out as they come in, so if one wanted to implement MIDI-over-MADI, the hooks are there.
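For anyone curious, here's roughly how I picture the ping-pong buffer half of that, as a quick Python model (not the actual RTL; the names and the 64-deep banks are just my choices):

[code]
# Rough model of a sample-wide ping-pong (double) buffer: the framer writes
# channel samples into one bank while "something else" reads the other bank;
# the banks swap once per MADI frame. A depth of 64 covers the 64-channel case.

class PingPongBuffer:
    def __init__(self, channels: int = 64):
        self.banks = [[0] * channels, [0] * channels]
        self.write_bank = 0              # framer writes into this bank

    def write(self, channel: int, sample: int) -> None:
        self.banks[self.write_bank][channel] = sample

    def swap(self) -> None:
        """Call at the start of each new frame (when the frame-sync bit is seen)."""
        self.write_bank ^= 1

    def read_frame(self) -> list:
        """Consumer reads the bank the framer is *not* writing."""
        return list(self.banks[self.write_bank ^ 1])
[/code]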

If one were to actually build a 56-channel ADC or DAC with MADI I/O, one would need a pretty big FPGA.

-a
 
Reminds me of back in the day when I suggested to the digital pukes that a good feature for a MIDI rack product would be a troubleshooting mode where the display would just echo and provide specific information about incoming MIDI instructions. OK, not the same...

Good Luck

JR 
 
Kinda new to looking at MADI (in the past week or so; have been focusing on AVB and Ethernet)....  Whilst I am not entirely familiar with FDDI (of which MADI seems mostly to be a "version"?), I see some similarities (as best I can tell) in thinking for MADI with 100Base-TX (and MLT-3 including 4B5B at 125MHz for 100MHz of "stuff" on the wire).....

I would think (guess) that sync is based upon the usual start characters (JK?) repeated X number of times, then data (channel) payload is frame aligned from there?  This gives the receiver time to "PLL" onto the incoming stream or at least start the clock recovery (given 4B5B with MLT-3, three voltage level signaling).

If at first you plug in the BNC for MADI and you only get X-n (where n is <= X and >0)  JK sync "characters" then that frame is discarded until X proper is seen (hopefully on the next frame)?

But it sounds like you have the SERDES part rendered in FPGA....  Now onto the protocol (and/or framing) section....

Please feel free to point out the errors in my thoughts... I do not have a copy of the MADI spec (yet)...  And there seem to be partial annexes of the spec that are not "entire"....
 
Twenty Log said:
Kinda new to looking at MADI (in the past week or so; have been focusing on AVB and Ethernet)....  Whilst I am not entirely familiar with FDDI (of which MADI seems mostly to be a "version"?), I see some similarities (as best I can tell) in thinking for MADI with 100Base-TX (and MLT-3 including 4B5B at 125MHz for 100MHz of "stuff" on the wire).....

That's right. The spec and some AES supporting documentation (which I have, but I won't post due to the obvious copyright restriction) are explicit about that, actually, because they say that they chose the implementation based on the parts available at the time. (Some wags argue that FDDI was a solution in search of a problem.)  It's not MLT-3 (thankfully). It's simple binary NRZI encoding.
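For anyone who hasn't bumped into NRZI before: a transition on the line is a one, no transition is a zero. A toy decoder, just to illustrate (real hardware does this against the recovered bit clock):

[code]
# Toy NRZI decoder: a transition between consecutive line levels decodes as 1,
# no transition decodes as 0.

def nrzi_decode(line_levels, last_level=0):
    decoded = []
    for level in line_levels:
        decoded.append(1 if level != last_level else 0)
        last_level = level
    return decoded

# Example: starting level 0, line levels 1,1,0,0,1 -> data bits 1,0,1,0,1
print(nrzi_decode([1, 1, 0, 0, 1], last_level=0))
[/code]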

Digression: It's also the same as the old AMD TAXI protocol, which I first encountered when working for the Kitt Peak National Observatory (1997-2005). The infrared instruments used a fiber transmission system to get data from the instrument down through the telescope cable wrap-up (a couple hundred meters) into the computer room. The typical system had a sensor with 16 outputs, which were digitized with these Datel hybrid 2 MHz 16-bit flash ADCs. Conversions were all synchronized. The parallel ADC data were all serialized in shift registers, and then the shift register outputs each fed one bit of one of two TAXI transmit modules. The modules then serialized and coded the data and lit up the fiber. The receive end reassembled the image data.

The reason I mention all of that is that once I understood how it worked, my first thought was, "Well, this would be a simple way to implement a digital snake for live sound applications ..."

I would think (guess) that sync is based upon the usual start characters (JK?) repeated X number of times, then data (channel) payload is frame aligned from there?  This gives the receiver time to "PLL" onto the incoming stream or at least start the clock recovery (given 4B5B with MLT-3, three voltage level signaling).

The spec is rather vague. It says that the JK character is the default sync and says that "a 4B5B sync symbol shall be inserted into the data stream at least once per frame period ... sufficient sync symbols shall be inserted by interleaving with the encoded data words to fill the total link capacity." Notably, it doesn't say where sync characters should be inserted, other than that they need to occur on "40-bit channel boundaries," meaning you can't split a channel's data (32 bits encoded into 40) by putting a sync byte in the middle.
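For reference, the 4B5B code is the standard FDDI one, and the JK sync pair is 11000 10001. A quick Python sketch of sorting decoded 5-bit symbols into data vs. sync (table transcribed from the FDDI code set, so double-check it against your copy of the spec; symbols are written MSB-first here, adjust for your shift direction):

[code]
# Standard FDDI-style 4B5B code as used by MADI: each data nibble maps to a
# 5-bit line symbol; JK (11000 10001) is the sync sequence.

FOUR_B_FIVE_B = {
    0x0: 0b11110, 0x1: 0b01001, 0x2: 0b10100, 0x3: 0b10101,
    0x4: 0b01010, 0x5: 0b01011, 0x6: 0b01110, 0x7: 0b01111,
    0x8: 0b10010, 0x9: 0b10011, 0xA: 0b10110, 0xB: 0b10111,
    0xC: 0b11010, 0xD: 0b11011, 0xE: 0b11100, 0xF: 0b11101,
}
FIVE_B_FOUR_B = {v: k for k, v in FOUR_B_FIVE_B.items()}

SYNC_J = 0b11000
SYNC_K = 0b10001

def classify_symbol(sym5: int):
    """Return ('data', nibble), ('sync', 'J'/'K'), or ('invalid', None)."""
    if sym5 in FIVE_B_FOUR_B:
        return ("data", FIVE_B_FOUR_B[sym5])
    if sym5 == SYNC_J:
        return ("sync", "J")
    if sym5 == SYNC_K:
        return ("sync", "K")
    return ("invalid", None)
[/code]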

They provide a picture with "some examples of permissible positions of the sync symbol" which shows the sync symbol as the last byte of the frame, in addition to being interspersed with channel data. Given 56 channels at a 48 kHz frame rate and 32 bits per channel, that's about 86 Mbit/s of payload against the link's 100 Mbit/s data capacity, so the remaining ~14 Mbit/s, about 1748 kbytes per second (roughly 36 bytes per frame), has to be padded out with sync symbols. If it were my choice (and it is, if I'm designing the transmit side, too), I'd just stick all of the sync padding at the end of the data payload. But that's not required.
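Showing my work on that number, since I'll need it again for the transmit side (just arithmetic, with the 125 Mbit/s line rate and the 4B5B overhead, and assuming each sync insertion is the 10-line-bit JK pair):

[code]
# Back-of-envelope for the sync padding at 56 channels / 48 kHz.
# MADI's line rate is fixed at 125 Mbit/s; 4B5B means 100 Mbit/s of decoded capacity.

LINE_RATE  = 125_000_000                  # bits/s on the wire
DATA_RATE  = LINE_RATE * 4 // 5           # 100 Mbit/s after 4B5B decoding
FRAME_RATE = 48_000                       # frames per second
CHANNELS   = 56

payload_bps = CHANNELS * 32 * FRAME_RATE  # 86_016_000 bit/s of channel data
spare_bps   = DATA_RATE - payload_bps     # 13_984_000 bit/s of capacity left for sync
spare_Bps   = spare_bps // 8              # 1_748_000 bytes/s -- the "1748" (kbytes/s) figure

# Per frame, in line bits: 125e6/48e3 = ~2604 bits, minus 56 * 40 = 2240 bits of channels,
# leaves ~364 line bits, i.e. roughly 36 JK sync pairs (10 line bits each) per frame.
spare_line_bits_per_frame = LINE_RATE / FRAME_RATE - CHANNELS * 40

print(payload_bps, spare_bps, spare_Bps, spare_line_bits_per_frame)
# -> 86016000 13984000 1748000 364.166...
[/code]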

If at first you plug in the BNC for MADI and you only get X-n (where n is <= X and >0)  JK sync "characters" then that frame is discarded until X proper is seen (hopefully on the next frame)?

Once the deserializer is locked, I look for the first JK sync character. Then I check the next byte to see whether it is data and whether it includes the MADI frame sync bit. If it doesn't, I go back and look for the next JK sync. If it does, I capture the data and also look at the next three incoming bytes to see if they're also data bytes. If so, I add them to the sample word; if not, something's gone wrong and I start looking for another JK sync. That's how I sync to the first channel sample of the frame.

I am not at all sure how to sync the frame if the first subframe (the one which includes the MADI frame sync indicator) does not immediately follow a JK sync character, hence this thread :)

After capturing all bytes of a channel sample, I just keep looking at what the deserializer gives me. It's perfectly legal for a sync character to come through and if so, I push it out and assert a "got new sync strobe" so if anything cares it can deal with it. If it's a data word, it gets captured the same way as above, and the process repeats.

MADI specifies that you can have up to 64 channels per 48 kHz frame, but there's no requirement (if I read the spec correctly) to actually send 64 subframes. The only requirement is that "inactive channels shall always have a higher number than the highest numbered active channel."  So at some point, you'll see the first byte of a channel data subframe with its MADI Frame Sync bit set, so you know you're into the next frame. The prudent receiver design should be prepared to deal with all 64 subframes, though.
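In case it helps anyone else staring at the same problem, here's a rough Python model of the framing logic I just described (emphatically not the RTL; the state names, the assumption that subframe bit 0 lands in the LSB of the first byte, and the little-endian assembly are all mine):

[code]
# Rough software model of the receive framer described above.
# Input: a stream of (is_sync, byte) pairs from the 4B5B decoder.
# Output: (channel, 32-bit subframe) tuples once frame lock is achieved.

FRAME_SYNC_BIT = 0x01   # assumes subframe bit 0 ends up in the LSB of the first byte

def madi_framer(decoded_stream):
    state = "HUNT_JK"
    subframe = []
    channel = 0
    for is_sync, byte in decoded_stream:
        if state == "HUNT_JK":
            if is_sync:
                state = "EXPECT_FRAME_START"
        elif state == "EXPECT_FRAME_START":
            if is_sync:
                continue                        # more sync padding, keep waiting
            if byte & FRAME_SYNC_BIT:           # first byte of channel 0
                subframe, channel, state = [byte], 0, "COLLECT"
            else:
                state = "HUNT_JK"               # data without frame sync: resync
        elif state == "COLLECT":
            if is_sync:
                if subframe:                    # sync splitting a subframe is illegal
                    subframe, state = [], "HUNT_JK"
                continue                        # sync between subframes is legal padding
            if not subframe and (byte & FRAME_SYNC_BIT):
                channel = 0                     # frame-sync bit again: next frame starts here
            subframe.append(byte)
            if len(subframe) == 4:
                yield channel, int.from_bytes(bytes(subframe), "little")
                subframe = []
                channel += 1
[/code]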

But it sounds like you have the SERDES part rendered in FPGA....  Now onto the protocol (and/or framing) section....

The deserializer is the easy part. The protocol would be easy if the spec were less handwavy and more, uh, specific.
 
I'm new here, and forgive my English (I'm Swiss); I just saw your post by accident. I implemented a working FPGA-based MADI receiver this... uh... last year in a product.
Once you have your data stream aligned on the JK symbol, the next 5-bit symbols are either more sync symbols or the start of a 40-bit AES3 subframe. To find the first audio channel, look at each 40-bit subframe for the frame-synchronization bit (bit 0). As soon as you detect that first subframe you can start a counter to get the other channels, and you have a valid MADI stream. The JK symbol is only there to align your serial data; it doesn't give you any clue where your first subframe is. In the worst case you have to wait nearly one audio sample period to get valid MADI data.

For transmitting, you can place the JK symbols wherever you want, except in the middle of a 40-bit AES3 subframe. So you're right: the easiest way is to send all the needed JK symbols at the end of your encoded audio data, which is absolutely in spec ;)
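Something like this, as a quick Python sketch of that ordering (not my FPGA code, and the line-bit bookkeeping is simplified; real hardware just inserts JK pairs whenever no channel data is queued):

[code]
# Toy model of the transmit side: per 48 kHz frame, send every active channel's
# 40 encoded line bits back-to-back, then fill the rest of the frame with JK pairs.
# 125 Mbit/s / 48 kHz is not an integer, so the per-frame JK count dithers by one.

LINE_BITS_PER_SECOND = 125_000_000
FRAME_RATE = 48_000

def build_frame(channel_subframes_40b, leftover_bits=0.0):
    """Return ([('data', bits40), ..., ('JK',), ...], new_leftover_bits)."""
    budget = LINE_BITS_PER_SECOND / FRAME_RATE + leftover_bits   # ~2604.17 line bits
    out = [("data", sub) for sub in channel_subframes_40b]
    budget -= 40 * len(channel_subframes_40b)
    while budget >= 10:              # each JK sync pair occupies 10 line bits
        out.append(("JK",))
        budget -= 10
    return out, budget               # carry the fractional line bits into the next frame
[/code]

For 56 active channels that works out to 36 or 37 JK pairs per frame, with the fractional line bits carried from one frame to the next.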
 
Good day
I am currently working on an FPGA AES10 transmitter. Unfortunately I don't know exactly how many sync symbols I have to send, or when. At the moment I am sending one sync symbol per 64-channel frame. My receiver recognizes the 64 channels, but the rest is not stable.

They provide a picture with "some examples of permissible positions of the sync symbol" which shows the sync symbol as the last byte of the frame, in addition to being interspersed with channel data. Given 56 channels at a 48 kHz frame rate and 32 bits per channel, that's about 86 Mbit/s of payload against the link's 100 Mbit/s data capacity, so the remaining ~14 Mbit/s, about 1748 kbytes per second (roughly 36 bytes per frame), has to be padded out with sync symbols. If it were my choice (and it is, if I'm designing the transmit side, too), I'd just stick all of the sync padding at the end of the data payload. But that's not required.
How did you come up with the 1748 figure, and what does this mean in number of sync symbols?



I would really appreciate an answer.
 
