Twenty Log said:
Kinda new to looking at MADI (in the past week or so; have been focusing on AVB and Ethernet).... Whilst I am not entirely familiar with FDDI (of which MADI seems to be a "version"?), I see some similarities (as best I can tell) in thinking for MADI with 100Base-TX (and MLT-3 including 4B5B at 125 MHz for 100 MHz of "stuff" on the wire).....
That's right. The spec and some AES supporting documentation (which I have, but I won't post due to the obvious copyright restriction) are explicit about that, actually, because they say that they chose the implementation based on the parts available at the time. (Some wags argue that FDDI was a solution in search of a problem.) It's not MLT-3 (thankfully). It's simple binary NRZI encoding.
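For anyone following along, NRZI is about as simple as line coding gets: a data "1" produces a transition on the line, and a data "0" holds the previous level. Here's a minimal sketch in C of just that step (the 4B5B mapping and the actual serializer are assumed to live elsewhere; the function name is mine, not from the spec):

#include <stdint.h>

/* NRZI-encode one 5-bit 4B5B symbol, MSB first.
 * A '1' data bit toggles the line level; a '0' keeps it.
 * 'level' carries the line state from one symbol to the next. */
uint8_t nrzi_encode5(uint8_t symbol, uint8_t *level)
{
    uint8_t out = 0;
    for (int i = 4; i >= 0; i--) {
        if ((symbol >> i) & 1)
            *level ^= 1;                  /* transition on '1' */
        out = (out << 1) | (*level & 1);  /* hold on '0' */
    }
    return out;                           /* 5 line bits, MSB first */
}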
Digression: It's also the same as the old AMD TAXI protocol, which I first encountered when working for the Kitt Peak National Observatory (1997-2005). The infrared instruments used a fiber transmission system to get data from the instrument down through the telescope cable wrap-up (a couple hundred meters) into the computer room. The typical system had a sensor with 16 outputs, which were digitized with these Datel hybrid 2 MHz 16-bit flash ADCs. Conversions were all synchronized. The parallel ADC data were all serialized in shift registers, and then the shift register outputs each fed one bit of one of two TAXI transmit modules. The modules then serialized and coded the data and lit up the fiber. The receive end reassembled the image data.
The reason I mention all of that is because once I understood how that worked, my first thought was, "Well, this would be a simple way to implement a digital snake for live sound applications ..."
Twenty Log said:
I would think (guess) that sync is based upon the usual start characters (JK?) repeated X number of times, then the data (channel) payload is frame-aligned from there? This gives the receiver time to "PLL" onto the incoming stream or at least start the clock recovery (given 4B5B with MLT-3, three-level voltage signaling).
The spec is rather vague. It says that the JK character is the default sync symbol, and that "a 4B5B sync symbol shall be inserted into the data stream at least once per frame period ... sufficient sync symbols shall be inserted by interleaving with the encoded data words to fill the total link capacity." Notably, it doesn't say where sync characters should be inserted, other than that they need to occur on "40-bit channel boundaries," meaning you can't split a channel's data (32 bits, encoded into 40) by putting a sync symbol in the middle.
They provide a picture with "some examples of permissible positions of the sync symbol," which shows the sync symbol as the last byte of the frame, in addition to being interspersed with channel data. Given 56 channels at a 48 kHz frame rate and 32 bits per channel, that's about 86 Mbit/s of payload against the link's 100 Mbit/s (pre-4B5B) data capacity, so you have to fill the remainder with sync words: roughly 1.75 million sync symbols per second, or about 36 per frame. If it were my choice (and it is, since I'm designing the transmit side too), I'd just stick all of the sync padding at the end of the data payload. But that's not required.
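To make the "padding at the end" idea concrete, here's a rough C sketch of how I picture the transmit-side framing. All the names (tx_emit_data_byte, tx_emit_jk_sync, tx_link_has_room) are invented stand-ins for whatever the FPGA actually provides; the only rules it tries to honor are "sync only on 40-bit channel boundaries" and "fill the leftover link capacity with sync":

#include <stdint.h>

#define CHANNELS     56
#define BYTES_PER_CH 4     /* one 32-bit subframe -> 4 bytes before 4B5B encoding */

/* Hypothetical low-level hooks provided elsewhere in the design. */
void tx_emit_data_byte(uint8_t b);  /* encodes one byte to two 4B5B symbols + NRZI */
void tx_emit_jk_sync(void);         /* emits one 10-bit JK sync symbol */
int  tx_link_has_room(void);        /* nonzero while this frame period can take more symbols */

void tx_send_frame(const uint8_t samples[CHANNELS][BYTES_PER_CH])
{
    /* Channel data first: each channel's 4 bytes go out as an unbroken unit,
     * so no sync symbol ever lands inside a 40-bit channel boundary. */
    for (int ch = 0; ch < CHANNELS; ch++)
        for (int i = 0; i < BYTES_PER_CH; i++)
            tx_emit_data_byte(samples[ch][i]);

    /* Then pad the rest of the frame period with JK sync symbols: at 48 kHz
     * and 56 channels that's about 36 of them per frame (260.4 bytes of link
     * capacity minus 224 bytes of payload). */
    while (tx_link_has_room())
        tx_emit_jk_sync();
}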
Twenty Log said:
If at first you plug in the BNC for MADI and you only get X-n (where n is <= X and > 0) JK sync "characters", then is that frame discarded until the full X is seen (hopefully on the next frame)?
Once the deserializer is locked, I look for the first JK sync character. Then I check the next byte to see if it is data and whether it includes the MADI Frame Sync bit. If it doesn't, I go back and look for the next JK sync. If it does, I capture the data and also look at the next three incoming bytes to see if they're also data bytes. If so, I add them to the sample word; if not, something's gone wrong and I start looking for another JK sync. So that's how I sync to the first channel sample of the frame.
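In C-ish form, that hunt looks roughly like this. The rx_get_symbol() interface, the symbol struct, and the assumption that the Frame Sync flag lands in bit 0 of the first received byte are all mine, so check the spec for the exact mode-bit and byte ordering:

#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool    is_sync;  /* true: JK sync symbol; false: decoded data byte */
    uint8_t data;     /* valid only when is_sync is false */
} rx_symbol_t;

rx_symbol_t rx_get_symbol(void);   /* blocking read from the 4B5B/NRZI decoder */

#define MADI_FRAME_SYNC_BIT 0x01   /* assumed: bit 0 of the first subframe byte */

/* Hunt for the start of a frame; returns the 32-bit channel-0 subframe. */
uint32_t rx_hunt_frame_start(void)
{
    for (;;) {
        /* 1. Wait for a JK sync symbol. */
        if (!rx_get_symbol().is_sync)
            continue;

        /* 2. The byte right after it must be data carrying the Frame Sync flag. */
        rx_symbol_t s = rx_get_symbol();
        if (s.is_sync || !(s.data & MADI_FRAME_SYNC_BIT))
            continue;                          /* not frame start; hunt again */

        /* 3. The next three bytes must also be data to complete the subframe. */
        uint32_t subframe = s.data;
        bool ok = true;
        for (int i = 1; i < 4; i++) {
            s = rx_get_symbol();
            if (s.is_sync) { ok = false; break; }
            subframe |= (uint32_t)s.data << (8 * i);
        }
        if (ok)
            return subframe;                   /* locked onto channel 0 */
    }
}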
I am not at all sure how to sync the frame if the first subframe (the one which includes the MADI Frame Sync indicator) does not immediately follow a JK sync character, hence this thread.
After capturing all bytes of a channel sample, I just keep looking at what the deserializer gives me. It's perfectly legal for a sync character to come through, and if one does, I push it out and assert a "got new sync" strobe so anything that cares can deal with it. If it's a data word, it gets captured the same way as above, and the process repeats.
MADI specifies that you can have up to 64 channels per 48 kHz frame, but there's no requirement (if I read the spec correctly) to actually send 64 subframes. The only requirement is that "inactive channels shall always have a higher number than the highest numbered active channel." So at some point, you'll see the first byte of a channel data subframe with its MADI Frame Sync bit set, so you know you're into the next frame. The prudent receiver design should be prepared to deal with all 64 subframes, though.
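Pulling the above together, a rough steady-state loop might look like the following, continuing the invented rx_get_symbol()/rx_hunt_frame_start() interface from the earlier sketch, with the same caveat about where exactly the Frame Sync bit sits:

#define MAX_CHANNELS 64

void rx_run(uint32_t frame[MAX_CHANNELS])
{
    uint32_t first = rx_hunt_frame_start();     /* channel-0 subframe of some frame */

    for (;;) {
        frame[0] = first;
        int  ch     = 1;
        bool locked = true;

        while (locked) {
            rx_symbol_t s = rx_get_symbol();
            if (s.is_sync)
                continue;                        /* sync is legal between subframes */

            /* First data byte of a new subframe: gather the remaining three. */
            uint32_t subframe = s.data;
            for (int i = 1; i < 4 && locked; i++) {
                s = rx_get_symbol();
                if (s.is_sync)
                    locked = false;              /* subframe split by sync: illegal */
                else
                    subframe |= (uint32_t)s.data << (8 * i);
            }
            if (!locked)
                break;

            if (subframe & MADI_FRAME_SYNC_BIT) {
                /* Frame Sync set: this is channel 0 of the NEXT frame.
                 * Hand off 'frame' (ch active channels) to the consumer here. */
                first = subframe;
                break;
            }
            if (ch < MAX_CHANNELS)
                frame[ch++] = subframe;          /* active channels are contiguous from 0 */
        }

        if (!locked)
            first = rx_hunt_frame_start();       /* lost alignment: go hunt again */
    }
}

In the actual FPGA this would of course be a small state machine clocked off the deserializer rather than blocking calls, but the states map one-to-one onto the loop above.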
Twenty Log said:
But it sounds like you have the SERDES part rendered in FPGA.... Now onto the protocol (and/or framing) section....
The deserializer is the easy part. The protocol would be easy if the spec were less handwaving and more, uh, specific.