Bro, do you even calculate?
To select a computing unit, it’s of fundamental importance to know how many calculations we need to perform.
That is reasonably easy to approximate. We know that we want the output at a 48 kHz sample rate. We also know we want 32 simultaneous voices, and for each voice we need to calculate:
- 7 waveform generators (6 operators + 1 LFO, phase accumulation plus LUT)
- 7 envelope generators (one for the level of each operator, plus one for the voice pitch)
- 7 gain stages (applying EG, LFO, velocity tracking, keyboard tracking, MIDI volume/wheel/after/breath/etc.)
On most platforms, a waveform generator can be optimized down to about 20 atomic instructions (fetch phase, calculate increment from all pitch modulators, add, store phase, look up multiple values from shared ROM/RAM, calculate interpolation). Envelope generators depend on the number of segments and shapes, but can easily exceed 40 atomic instructions for a 6-segment generator with adjustable rates and levels, user-defined curves and interpolation. Gain stages depend on whether the system features hardware multipliers; if it does, they require one instruction per target, about 10 in our case. So one sample of one operator totals about 70 instructions (20 + 40 + 10).
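To make the waveform generator steps concrete, here is a minimal sketch of a phase-accumulator oscillator with a lookup table and linear interpolation. The table size, the 32-bit phase width and all function names are assumptions chosen for illustration, not the XFM implementation, and pitch modulation is reduced to a single fixed increment.

```python
import math

TABLE_BITS = 10                       # assumption: 1024-entry lookup table
TABLE_SIZE = 1 << TABLE_BITS
PHASE_BITS = 32                       # assumption: 32-bit phase accumulator
FRAC_BITS = PHASE_BITS - TABLE_BITS   # low bits drive the interpolation
PHASE_MASK = (1 << PHASE_BITS) - 1

# One full sine cycle, standing in for the shared ROM/RAM table.
SINE_LUT = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]


def phase_increment(freq_hz, sample_rate=48_000):
    """Phase step per output sample for a given frequency."""
    return round(freq_hz * (1 << PHASE_BITS) / sample_rate) & PHASE_MASK


def oscillator_step(phase, increment):
    """Advance the phase once and return (new_phase, interpolated_sample)."""
    phase = (phase + increment) & PHASE_MASK        # add, wrap, store phase
    index = phase >> FRAC_BITS                      # top bits select the entry
    frac = (phase & ((1 << FRAC_BITS) - 1)) / (1 << FRAC_BITS)
    a = SINE_LUT[index]
    b = SINE_LUT[(index + 1) % TABLE_SIZE]          # next entry, wrapping around
    return phase, a + (b - a) * frac                # linear interpolation


def render(freq_hz, n_samples, sample_rate=48_000):
    """Generate n_samples of a fixed-frequency sine, starting at phase 0."""
    increment = phase_increment(freq_hz, sample_rate)
    phase, out = 0, []
    for _ in range(n_samples):
        phase, sample = oscillator_step(phase, increment)
        out.append(sample)
    return out
```

Each call to `oscillator_step` covers the fetch, add, store, lookup and interpolate steps listed above; on real hardware every one of those lines maps to one or more atomic instructions.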
Let’s call the group of one oscillator plus one EG plus one gain stage an “element”. The figures above (sample rate, voice count, elements per voice and instructions per element) can then be multiplied to estimate our requirements:
48,000 samples/sec * 32 voices/sample * 7 elements/voice * 70 instructions/element ≈ 752 MIPS
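The same estimate can be written as a few lines of Python, which makes it easy to see how the budget moves when one of the inputs changes. The constant and function names are just illustrative; the numbers are the ones quoted in the text.

```python
SAMPLE_RATE = 48_000        # output samples per second
VOICES = 32                 # simultaneous voices
ELEMENTS_PER_VOICE = 7      # oscillator + EG + gain stage groups per voice

# Approximate instruction counts per element, as quoted in the text.
INSTRUCTIONS = {"oscillator": 20, "envelope": 40, "gain": 10}


def required_mips(sample_rate=SAMPLE_RATE, voices=VOICES,
                  elements=ELEMENTS_PER_VOICE, per_element=None):
    """Millions of instructions per second needed for the whole synthesizer."""
    if per_element is None:
        per_element = sum(INSTRUCTIONS.values())
    return sample_rate * voices * elements * per_element / 1_000_000


print(f"{required_mips():.2f} MIPS for the baseline configuration")
print(f"{required_mips(voices=16):.2f} MIPS with 16 voices")
```

The budget scales linearly with every factor, so halving the voice count or the sample rate halves the requirement.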
Additionally, extra computing power needs to be accounted for:
So in broad terms, we would need a unit delivering roughly 750 MIPS to fit the synthesizer reasonably well. As we want our output to have 24-bit quality, calculations need to be performed with at least 32-bit precision. That means each of those instructions has to be a 32-bit operation (i.e. 32-bit addition, 32x32 multiplication, etc.).
AVR units execute one instruction per clock cycle (1 MIPS per MHz), while PIC units take four cycles per instruction (0.25 MIPS per MHz). Most of the Arduino boards run below 100 MHz, so they are out of range.
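As a rough sanity check, the clock a single core would need can be derived from the MIPS-per-MHz figures above. This is a back-of-the-envelope sketch: the function name is made up for illustration, and it optimistically assumes every instruction is a full 32-bit operation, which 8-bit cores cannot do in one instruction.

```python
TARGET_MIPS = 752  # estimate from the text, before any extra headroom

# MIPS delivered per MHz of clock, as quoted in the text.
MIPS_PER_MHZ = {"AVR": 1.0, "PIC": 0.25}


def required_clock_mhz(target_mips, mips_per_mhz):
    """Clock frequency in MHz a single core needs to reach target_mips."""
    return target_mips / mips_per_mhz


for family, rate in MIPS_PER_MHZ.items():
    clock = required_clock_mhz(TARGET_MIPS, rate)
    print(f"{family}: {clock:.0f} MHz needed")
```

Both results land far above the sub-100 MHz clocks mentioned above.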
Parallax Propeller chips (which were used to build a functional 16-voice, 2-op prototype) are a notable multi-core option. However, the lack of hardware multipliers substantially reduces their throughput. In the prototype, log/antilog tables combined with interpolation were used to compensate for the lack of multipliers, but the resulting SNR wasn’t acceptable for 24-bit operation.
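The log/antilog idea can be sketched as follows: a product becomes a sum of logarithms, and both the logarithm and its inverse come from interpolated tables. This is not the prototype's code. The table size (1024 segments) and the stored precision (16 fractional bits) are assumptions, and the interpolation itself uses ordinary multiplication for brevity, so the sketch only models the accuracy of the approach, not its instruction count.

```python
import math

TABLE_BITS = 10                 # assumption: 1024 segments per table
TABLE_SIZE = 1 << TABLE_BITS
ENTRY_BITS = 16                 # assumption: fractional bits stored per entry


def quantize(value):
    """Round a table entry to ENTRY_BITS fractional bits, like a ROM word."""
    scale = 1 << ENTRY_BITS
    return round(value * scale) / scale


# log2(1 + t) and 2**t for t in [0, 1]; one extra entry at the end so the
# interpolation can always read index i + 1.
LOG_TABLE = [quantize(math.log2(1 + i / TABLE_SIZE)) for i in range(TABLE_SIZE + 1)]
EXP_TABLE = [quantize(2 ** (i / TABLE_SIZE)) for i in range(TABLE_SIZE + 1)]


def interpolate(table, t):
    """Linearly interpolate a table at t in [0, 1]."""
    pos = t * TABLE_SIZE
    i = min(int(pos), TABLE_SIZE - 1)
    return table[i] + (table[i + 1] - table[i]) * (pos - i)


def table_log2(x):
    """Approximate log2(x) for x > 0."""
    mantissa, exponent = math.frexp(x)   # x = mantissa * 2**exponent, mantissa in [0.5, 1)
    return (exponent - 1) + interpolate(LOG_TABLE, mantissa * 2 - 1)


def table_exp2(y):
    """Approximate 2**y."""
    whole = math.floor(y)
    return math.ldexp(interpolate(EXP_TABLE, y - whole), whole)


def log_multiply(a, b):
    """Multiply a and b by adding table logarithms; the sign is handled separately."""
    if a == 0 or b == 0:
        return 0.0
    sign = -1.0 if (a < 0) != (b < 0) else 1.0
    return sign * table_exp2(table_log2(abs(a)) + table_log2(abs(b)))


def error_db(a, b):
    """Error of log_multiply relative to the exact product, in dB (a * b != 0)."""
    exact = a * b
    err = abs(log_multiply(a, b) - exact)
    return -math.inf if err == 0 else 20 * math.log10(err / abs(exact))


print(f"0.3 * 0.7 ~ {log_multiply(0.3, 0.7)}, error {error_db(0.3, 0.7):.1f} dB")
```

For comparison, 24-bit resolution corresponds to roughly 144 dB of dynamic range, so the printed error figure gives a feel for how large and how precise the tables would have to be.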
So, with the SBCs ruled out because of power, latency and determinism, and most of the Arduino boards ruled out on throughput, the only viable options were DSP boards and FPGAs.
The FPGA option was chosen for XFM, essentially because of price/performance, scalability and future-proofing. DSP chips with similar computational power can easily cost 4x as much. Just as a reference, the DSI Prophet 12 uses the Analog Devices ADSP-21479 (six of those!), which cost $19.95 each.
The synthesizer was then designed in HDL, which can be ported to other FPGA vendors and families with relatively little effort. A bigger unit could eventually be assembled at silicon level (using a bigger FPGA device), in addition to gluing multiple chips together. And as a final step, an ASIC could be derived from the design. The only drawback is the length and complexity of the design.