Next Generation FPGAs Address Demands of the Zettabyte Era

By Mark Bingeman | Jan 27, 2015

Image: Nuvation developed this Xilinx Virtex-7 driven FDR InfiniBand adapter card for a broadcast video server.

Global IP traffic has increased significantly over the last few years and is expected to keep growing.  Here are a few highlights from Cisco’s VNI June 2016 report [i]:

  • Annual global IP traffic will pass the zettabyte threshold by the end of 2016 and will reach 2.3 zettabytes per year by 2020
  • Global IP traffic will increase threefold over the next 5 years
  • Over half of all IP traffic will originate with non-PC devices by 2020
  • Traffic from wireless and mobile devices will account for two-thirds of total IP traffic by 2020
  • Broadband speed will nearly double by 2020 compared with 2015 speeds
  • Globally, IP video traffic will be 82 percent of all IP traffic by 2020
  • Internet video surveillance traffic will increase tenfold between 2015 and 2020
  • Global mobile data traffic will grow almost three times as fast as fixed IP traffic from 2015 to 2020
  • Globally, monthly IP traffic will reach 25 GB per capita by 2020, up from 10 GB per capita in 2015
  • There will be 3.4 networked devices per capita by 2020, up from 2.2 networked devices per capita in 2015
  • Internet of Everything (IoE): globally, M2M connections will grow nearly 2.5-fold, from 4.9 billion in 2015 to 12.2 billion by 2020, when there will be 1.6 M2M connections per capita

These trends point to the need for significant improvements in the Information and Communications Technology (ICT) sector infrastructure in order to support the demand for IP traffic.  The trend towards mobile devices and increases in broadband speeds will require performance improvements throughout the ICT infrastructure.  These market demands require faster high-speed serial interfaces to transfer data between nodes as well as faster and deeper memory interfaces for data buffering.  Furthermore, these performance increases must occur without increasing power consumption.  Today’s next-gen FPGA devices have risen to these challenges.

Serial Transceivers

The performance of today’s next-gen FPGAs is significantly higher than previous generation devices, opening up opportunities for higher bandwidth and newer protocols. 

The GTY transceivers in the Xilinx Virtex UltraScale and UltraScale+ devices support bandwidths up to 32.75 Gbps.  Similarly, the GTH transceivers found in both the Kintex UltraScale and Virtex UltraScale devices support bandwidths up to 16.3 Gbps.  In comparison, the 7-series transceivers supported bandwidths up to 28 Gbps in the Virtex-7 and 12.5 Gbps in the Kintex-7.  The number of transceivers has also increased, to as many as 128 in the Virtex UltraScale+, 120 in the Virtex UltraScale, and 64 in the Kintex UltraScale [ii].

The UltraScale and UltraScale+ transceivers add support for a number of new protocols, including PCIe Gen4, 16G Backplane, CPRI 10.1G/16G, CEI-28G-LR and Interlaken 25G.

Altera’s Stratix 10 GT transceivers support bandwidths up to an impressive 56 Gbps for chip-to-chip interfaces.  The Stratix 10 GX transceivers support bandwidths up to 32 Gbps for chip-to-chip interfaces and 28 Gbps for backplane applications.  Similarly, the Altera Arria 10 transceivers support bandwidths up to 28.3 Gbps.  In comparison, the Stratix V and Arria V transceivers supported bandwidths up to 28.05 Gbps and 12.5 Gbps respectively.  The number of transceivers has also increased, to as many as 144 in the Stratix 10 and 96 in the Arria 10 [iii].

The increase in bandwidth coupled with an increase in the number of transceivers means that the next generation of FPGAs can provide an unprecedented level of total effective serial bandwidth.
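As a rough illustration of that aggregate figure, the sketch below multiplies the transceiver counts and peak line rates quoted above. It assumes every transceiver on the device runs at the top quoted rate and ignores line-coding and protocol overhead, so the results are optimistic upper bounds rather than achievable throughput. The function and dictionary names are illustrative only.

```python
# Peak aggregate serial bandwidth = transceiver count x max per-lane line rate.
# Counts and rates are the figures quoted in this article. Treating every
# transceiver as running at the top rate is an optimistic assumption.
DEVICES = {
    "Virtex UltraScale+": (128, 32.75),  # (transceivers, Gbps per lane)
    "Kintex UltraScale": (64, 16.3),
}


def peak_aggregate_gbps(lanes: int, rate_gbps: float) -> float:
    """Upper bound on raw one-direction serial bandwidth, in Gbps."""
    return lanes * rate_gbps


for name, (lanes, rate) in DEVICES.items():
    total = peak_aggregate_gbps(lanes, rate)
    print(f"{name}: {lanes} x {rate} Gbps = {total / 1000:.2f} Tbps")
```

Real designs will land below these numbers, since devices often mix transceiver types and encodings such as 64b/66b consume part of the line rate.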

Memory Interfaces

In addition to the need for increased serial interface bandwidth to transfer the increasing IP traffic between nodes, there is also a need to buffer more data at higher rates.  This results in a need for increased external memory interface performance.  The next generation of FPGA devices has made the jump from DDR3 to DDR4 memory in order to address the need for improved memory interface performance.  For example, the Xilinx UltraScale+ devices support DDR4 up to 2666 Mbps and DDR3 up to 2133 Mbps.  The Xilinx UltraScale devices support DDR4 up to 2400 Mbps and DDR3 up to 2133 Mbps.  In comparison, the 7-Series devices only support DDR3 up to 1866 Mbps [iv].

The Altera Stratix 10 supports DDR4 up to 2600 Mbps and the Arria 10 supports DDR4 up to 2400 Mbps.  In comparison, the Stratix V supported DDR3 up to 1866 Mbps and the Arria V supported DDR3 up to 1344 Mbps [v].

The next generation FPGAs have added support for the higher bandwidth DDR4 memory, along with IO changes to more efficiently pack DDR3/4 interfaces, and support for hybrid memory cubes, in order to provide significant increases in memory interfacing bandwidth and memory storage capacity.

For example, the Stratix V and Arria 10 can support up to four x72 DDR interfaces while the Stratix 10 can support up to six x72 DDR interfaces.  This results in a total DDR bandwidth of 1.382 Tbps for Stratix 10 as compared to 0.537 Tbps for Stratix V.  With more than twice the total DDR bandwidth, a single Stratix 10 device can handle more than two times the IP traffic, video frames, etc. [vi]
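The totals above follow from multiplying interface count, bus width and per-pin data rate. The sketch below reproduces them under stated assumptions: the Stratix V figure matches four x72 interfaces at the 1866 Mbps DDR3 rate quoted earlier, while the 1.382 Tbps Stratix 10 figure is reproduced only with a per-pin rate of 3200 Mbps. That rate is inferred from the arithmetic, not stated in the text. The 2600 Mbps rate quoted earlier is shown for comparison. The function name is illustrative.

```python
def ddr_bandwidth_tbps(interfaces: int, width_bits: int, rate_mbps: float) -> float:
    """Peak DDR bandwidth in Tbps: interfaces x bus width x per-pin data rate."""
    return interfaces * width_bits * rate_mbps / 1e6


# Stratix V: four x72 interfaces at the quoted DDR3 rate of 1866 Mbps.
stratix_v = ddr_bandwidth_tbps(4, 72, 1866)

# Stratix 10: six x72 interfaces. 3200 Mbps is an ASSUMED rate that
# reproduces the article's 1.382 Tbps figure; it is not given in the text.
stratix_10 = ddr_bandwidth_tbps(6, 72, 3200)

# Same configuration at the 2600 Mbps DDR4 rate quoted in the article.
stratix_10_at_2600 = ddr_bandwidth_tbps(6, 72, 2600)

print(f"Stratix V:            {stratix_v:.3f} Tbps")
print(f"Stratix 10 @ 3200:    {stratix_10:.3f} Tbps ({stratix_10 / stratix_v:.2f}x)")
print(f"Stratix 10 @ 2600:    {stratix_10_at_2600:.3f} Tbps "
      f"({stratix_10_at_2600 / stratix_v:.2f}x)")
```

Under either rate the improvement over Stratix V stays above 2x, which is consistent with the claim in the text.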

Power

ICT infrastructure often requires that newer equipment be compatible with the existing form factor.  This means that the new equipment must also operate within a similar power budget to the existing equipment.  Maintaining similar power consumption while increasing serial interface and memory bandwidth performance is a significant challenge.  The next generation of FPGAs has made significant efforts to reduce static power, dynamic power, I/O power and transceiver power.

Static and dynamic power have been reduced with smaller process geometries along with material selections and gate construction.  Binning is also used to identify devices that can run at a lower core voltage, which lowers static and dynamic power at the cost of a slight reduction in performance.  Architecture changes in device resources, including internal memory, PLLs, clocking networks, and DSP blocks, have also resulted in lower power.  I/O power has been reduced by supporting the lower voltage DDR4 Pseudo Open Drain standard [vii] as compared with the DDR3 SSTL standard.  IOB low power mode and automatic disabling of input termination also result in I/O power reductions.  Transceivers have also been redesigned for power reduction.  In addition to the device power improvements, vendor tools have also been improved to optimize power usage and take advantage of the device power improvements.

Overall power usage has been decreased by 30%-50%, thus providing additional capacity to improve serial interfacing and memory bandwidth performance.
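To put that range in perspective, the sketch below estimates how much extra work fits in an unchanged power budget. It assumes power scales linearly with throughput, which is a simplification made here for illustration and not a vendor model. The function name is hypothetical.

```python
def throughput_headroom(power_reduction: float) -> float:
    """Throughput multiple available at a fixed power budget.

    Assumes power scales linearly with throughput, so cutting power per unit
    of work by `power_reduction` (a fraction in [0, 1)) frees budget for
    1 / (1 - power_reduction) times the original work.
    """
    if not 0.0 <= power_reduction < 1.0:
        raise ValueError("power_reduction must be in [0, 1)")
    return 1.0 / (1.0 - power_reduction)


for reduction in (0.30, 0.50):
    headroom = throughput_headroom(reduction)
    print(f"{reduction:.0%} lower power -> up to {headroom:.2f}x the work")
```

Under this simplified model, a 30% to 50% power reduction leaves room for roughly 1.4x to 2x the original workload in the same envelope.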

In summary, the next generation of FPGAs has made significant improvements in transceiver performance, memory interface performance and power usage in order to support the market demands of the zettabyte era.