The BxBFFT for Xilinx Versal

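The BxBFFT is an amazing high-speed streaming Fast Fourier Transform, and Xilinx Versal was its first supported FPGA family.  The advantages of a BxBFFT in Xilinx Versal include lower power consumption, lower resource usage, higher throughput with lower latency, and Xilinx-specific ease-of-use features.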

Some of these advantages apply equally to all BxBFFT implementations.  These are discussed on the main BxBFFT page.  The sections below describe the BxBFFT's performance and advantages that are specific to a Xilinx Versal implementation.

Power Savings

It is not uncommon for an FPGA design to approach either power limits or resource limits.  Even when this is not true of a baseline design, it often becomes true as new product features are introduced.  The power consumption of an FFT can therefore make or break a design, or determine whether a product upgrade is possible.  Power consumption also affects product life and reliability: high consumption puts extra stress on the power supply, and high temperatures and large temperature swings increase the rate of component degradation.  High-speed FFTs require intensive processing, and so may account for a large percentage of a design's total power consumption.  Power reduction in the FFT can thus be of particularly high importance.

The BxBFFT is highly optimized for power consumption.  Multiple customers have found that a switch to the BxBFFT saved significant amounts of power in their designs, making those designs viable where before they were not.

Below are results from Xilinx Vivado synthesis for power consumption of the BxBFFT versus several other FFTs.  They show that BxBFFT power is typically lower than that of other FFTs by a factor of 1.2X to 1.5X in Versal FPGAs.
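To put the 1.2X to 1.5X figure in more familiar terms, the short sketch below converts a power ratio into a percentage saving.  It assumes that "lower by a factor of F" means the other FFT's power divided by the BxBFFT's power equals F; the function name is hypothetical and chosen only for illustration.

```python
def power_saving_fraction(factor: float) -> float:
    """Fraction of FFT power saved, assuming factor = P_other / P_bxbfft.

    This reading of "lower by a factor of F" is an assumption, not a
    definition taken from the BxBFFT documentation.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 - 1.0 / factor


if __name__ == "__main__":
    for factor in (1.2, 1.5):
        saving = power_saving_fraction(factor)
        print(f"{factor}X lower power is a saving of about {saving * 100:.1f}%")
```

Under that assumption, the quoted range corresponds to roughly 17% to 33% less power drawn by the FFT itself.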

Resource Savings

FPGA resources are another common design limitation.  Designs that use fewer resources have more margin for initial implementation and for future upgrades.  The same design may also fit in fewer or smaller FPGAs, making it cheaper to manufacture.

The BxBFFT uses substantially fewer FPGA LUTs than competing FFTs in Versal FPGAs, as shown in the graph below.  Required DSPs and memory are not significantly different among the best FFTs, although for very large FFTs the BxBFFT can save memory by autogeneration of twiddle coefficients rather than storing them in ROM.  The BxBFFT also offers memory savings for the case where a scrambled output data order is acceptable.

Throughput and Latency Advantages

Sometimes designs need to meet strict real-time requirements, either in throughput or in latency.  Both of these improve when an FFT runs faster.  A faster FFT can be achieved with a higher achieved FPGA clock rate (Fmax) or with increased parallelism.  Parallelism is measured by the processed complex data Points Per Clock (PPC), also called SuperSample Rate (SSR).  Throughput is Fmax * PPC.
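As a minimal sketch of the relationship Throughput = Fmax * PPC, the code below computes throughput for a given clock rate and parallelism, and the smallest PPC that reaches a target throughput.  The function names and the example numbers (500 MHz, 8 PPC, 3000 MSPS) are hypothetical illustrations, not measured BxBFFT results, and the sketch does not account for any restriction on allowed PPC values.

```python
import math


def throughput_msps(fmax_mhz: float, ppc: int) -> float:
    """Throughput in mega-samples per second: Fmax (MHz) times PPC."""
    return fmax_mhz * ppc


def required_ppc(target_msps: float, fmax_mhz: float) -> int:
    """Smallest integer PPC whose throughput meets the target.

    Ignores any constraint on which PPC values a given FFT supports.
    """
    return math.ceil(target_msps / fmax_mhz)


if __name__ == "__main__":
    # Hypothetical example values, chosen only to illustrate the arithmetic.
    print(throughput_msps(500.0, 8))    # 500 MHz at 8 points per clock
    print(required_ppc(3000.0, 500.0))  # PPC needed for 3000 MSPS at 500 MHz
```

The second function shows why Fmax matters: if the achieved Fmax drops as PPC rises, the PPC needed to hit a fixed throughput target rises with it.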

One issue is that as PPC increases, more resources are used, there is more resource contention, and thus the achieved Fmax of an FFT goes down.  This may make the desired throughput unachievable.  

For BxBFFTs, Fmax degrades less from resource contention, so a high Fmax and a high PPC are achievable at the same time.  The graph below shows this for Versal FPGAs: the BxBFFT sustains a high Fmax as PPC increases, where the other FFTs do not.  Thus the BxBFFT provides the best throughput and latency.

Xilinx-Specific Ease of Use and Productivity Enhancements

The BxBFFT was designed to get you running quickly.  It has features to make configuration, synthesis, and simulation faster and easier, saving NRE.  Many of these features are mentioned on the main BxBFFT page.  There is one productivity feature specific to Xilinx Versal FPGAs:

IP Integrator

The BxBFFT includes a Xilinx "IP Integrator" model, which allows quick integration of the BxBFFT with Xilinx IP.  With this approach, most major BxBFFT features are controllable from a GUI selection box, which makes the BxBFFT very easy to use.

For those using Xilinx block designs, this is the fastest way to instantiate and configure a BxBFFT.

These results illustrate how the BxBFFT is superior in most ways to other FFTs in Xilinx Versal FPGAs.  It uses less power, uses fewer resources, and attains higher speeds.  It is unmatched at almost all FFT sizes and speeds.  It is unmatched in supported features.  It is also cross-platform, supporting both Xilinx and Altera FPGAs, with a path into ASICs.