
Summer 2008 Frontpage | Subscribe | Feedback 


 


DSP or FPGA? How to choose the right device

Bamdad Afra, Amit Kapadiya
Nuvation


System designers face a number of key questions during the architecture phase of their project. Increasingly one of these questions is whether to use an FPGA (field programmable gate array) or a DSP (digital signal processor). To answer this question, system designers consider parameters such as:

  • System performance requirements for signal processing
  • Power consumption
  • Component count and form factor
  • Future product/system road-map and upgradeability for the system
  • Economic parameters such as non-recurring engineering (NRE) investment, bill-of-materials (BOM) cost, time-to-market and project risk

The decision also depends on the technology familiarity factor. In some cases, the design team is well-versed in DSP systems but has little FPGA background, or vice-versa. In such cases, the team skill-set may drive the choice between FPGA and DSP. For example, Nuvation recently worked on an algorithm acceleration project where the algorithm lent itself to wide parallel implementation in an FPGA. However, classic FPGA approaches were ruled out due to the lack of FPGA skills in the client's engineering team and the potential barriers this presented to product lifecycle maintenance.

We acknowledge that most engineers and system architects are more familiar with DSP technology due to the simplicity of designing with DSPs. This is a clear advantage for DSPs. However, developer familiarity varies widely across design teams, so it is difficult to say how important this issue is to a "generic" design team. Thus, this article ignores this DSP advantage and assumes that the choice of technology does not depend on developer familiarity.

In order to choose between FPGA and DSP, we look at system performance requirements for signal processing and BOM cost. We consider devices from a major DSP vendor (Texas Instruments) and a major FPGA vendor (Altera) to guide us through this process. We identify some of the signal processing applications in which each specific technology is clearly superior. We also consider where an FPGA may be used as a co-processor to a DSP chip.

DSP devices from Texas Instruments
Table 1 lists principal DSP devices from Texas Instruments (TI) in different cost categories. This table summarizes the cost/performance data of more than 160 DSP devices. As shown in Table 1, DSPs achieve cost/performance in the range of 1.8 to 48 cents per MMAC (millions of multiply-accumulate operations per second).

Note that the table footnotes provide details relating to each device family. The cost/performance data presented in this table should be considered in conjunction with these details. For example, each DaVinci digital media processor incorporates an ARM9 processor (at up to 297 MHz), a TMS320C64x+ DSP core (up to 4752 MIPS, and eight 8-bit MACs per cycle for up to 4752 MMACS), as well as many peripherals and internal memory.
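
The DaVinci figures quoted above can be checked with a little arithmetic. The sketch below assumes that peak MMACS is simply the DSP core clock multiplied by the number of MACs issued per cycle. The 594 MHz clock is inferred from the quoted 4752 MMACS and eight MACs per cycle rather than stated in this article, and the $30 price is a purely hypothetical value used only to show how a cents-per-MMAC figure is formed. Note that it divides by the 8-bit MMACS figure, whereas Table 1 is stated in 32-bit MMAC.

```python
def peak_mmacs(clock_mhz: float, macs_per_cycle: int) -> float:
    """Peak millions of MACs per second, assuming all MAC units are busy every cycle."""
    return clock_mhz * macs_per_cycle


def cents_per_mmac(price_usd: float, mmacs: float) -> float:
    """Cost/performance expressed in cents per MMAC."""
    if mmacs <= 0:
        raise ValueError("mmacs must be positive")
    return price_usd * 100.0 / mmacs


# ASSUMPTION: 594 MHz is inferred as 4752 MMACS / 8 MACs per cycle.
davinci_8bit_mmacs = peak_mmacs(clock_mhz=594, macs_per_cycle=8)

# HYPOTHETICAL price for illustration only; real pricing is at www.ti.com.
example_cost = cents_per_mmac(price_usd=30.0, mmacs=davinci_8bit_mmacs)

print(f"Peak 8-bit rate: {davinci_8bit_mmacs} MMACS")
print(f"Example cost/performance: {example_cost:.2f} cents per 8-bit MMAC")
```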

Table 1 omits specific device names and instead groups each family's overall performance/cost range by device cost. A family whose devices fall into more than one cost category therefore appears more than once in the table.


Table 1. DSP device families from Texas Instruments in different cost categories.

Table notes:
(1): The device cost is based on 100-unit (100u) volumes. Pricing info was obtained from www.ti.com in January 2008.

(2): MIPS is defined as the number of instructions that can be executed per second, in millions. Devices within each family offer a range of MIPS values.

(3): MMAC is defined as the number of single precision floating-point or fixed-point multiply-and-accumulate 32-bit operations that can be executed in millions per second. MMAC performance values increase by a factor of 2x and 4x for 16-bit and 8-bit operations, respectively. All operations assume no truncation of multiplication results (i.e. results are twice the size of operation bit-width: 32-bit multiply generates 64-bit result; 16-bit multiply generates 32-bit results, etc).
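
As a rough illustration of note (3), the sketch below applies the stated 2x and 4x scaling factors to a 32-bit MMAC figure and computes the untruncated result width. The function names and the 1000 MMAC input are hypothetical and are not taken from Table 1.

```python
# Scaling factors from table note (3): 32-bit is the baseline,
# 16-bit operations double the rate and 8-bit operations quadruple it.
DSP_WIDTH_SCALE = {32: 1, 16: 2, 8: 4}


def scaled_mmac(mmac_32bit: float, operand_bits: int) -> float:
    """Scale a 32-bit MMAC figure to a narrower operand width."""
    if operand_bits not in DSP_WIDTH_SCALE:
        raise ValueError(f"unsupported operand width: {operand_bits}")
    return mmac_32bit * DSP_WIDTH_SCALE[operand_bits]


def result_bits(operand_bits: int) -> int:
    """Width of an untruncated product: twice the operand width."""
    return 2 * operand_bits


# HYPOTHETICAL device rated at 1000 MMAC for 32-bit operations.
for bits in (32, 16, 8):
    print(bits, scaled_mmac(1000, bits), result_bits(bits))
```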

Device Family Notes:
The following notes provide a collective information summary on features and capabilities offered within each device family. For specific and complete device capabilities refer to www.ti.com.

DaVinci Digital Media Processors include on-chip ARM9 processor, Ethernet MAC and/or Switch sub-system, Video/Audio Ports, Video Processing Subsystem, High Definition features, PCI Bus Interface, ATA Interface, USB Interface, etc.

C6000 Fixed Point DSPs include Viterbi Decoder Co-processor, Turbo Decoder Co-processor, UTOPIA Slave 2 ATM Controller, PCI Bus Interface, RapidIO, Ethernet MAC, etc.

C6000 Floating Point DSPs include Single/Double precision floating point DSP core, Enhanced CPU core, Audio Port, Dual Access Internal Memory, etc.

C5000 Fixed Point DSPs include Dual/Quad Core DSPs, On-chip ARM7 Processor, Video Hardware Accelerator, ADC, USB, etc. This family includes low power devices and devices targeted for IP-Phone and Client Side Telephony applications.

C2000 Digital Signal Controllers include 16/32-bit Core, Internal Flash Memory, 10/12-bit ADC (multi-channel), PWM, etc. This device family is targeted for Control Applications.

FPGA device families from Altera
Table 2 lists FPGA device families from Altera in different cost categories. This table summarizes cost/performance data for over 100 FPGA devices (including various speed grades for each specific device). Note that details regarding each FPGA family (interface capabilities, internal memory architecture, and versatility of DSP resources) have been excluded from this summary. Moreover, the MMAC performance estimates are based on clock frequencies which are achievable given an overall resource utilization of 70 percent.

Also note that Table 2 does not identify MIPS performance values. A designer may include one or multiple Nios II processors in the FPGA depending on the available resources (registers and memory). However, defining a MIPS performance value for the FPGA as a whole is impossible because the use of an embedded processor (and its specifications) is design specific. Finally, it is important to note that logic resources and registers in the FPGA may be used to create additional signal processing resources to increase the MMAC performance.


Table 2. Various FPGA device families from Altera sorted in different cost categories.

Table notes:
(1): Costs were obtained from www.altera.com in Jan 2008. At this time, pricing was available for only one device, EP3SL150, in the Stratix III device family.

(2): MMAC is defined as the number of fixed-point 32-bit or single-precision floating point multiply-and-accumulate operations that can be executed in units of millions per second. MMAC performance values increase by a factor of 4x for 16-bit operations.

(3): Assuming a clock frequency of 120 MHz for Cyclone II and Stratix II and a clock frequency of 165 MHz for Cyclone III and Stratix III devices. The overall resource utilization is estimated to be 70%. Note that higher resource utilization and performance are achievable in the FPGA.
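
One plausible way to form an FPGA MMAC estimate from these notes is sketched below. It assumes that each 32-bit MAC unit is fully pipelined and completes one operation per clock, so the estimate is the number of MAC units multiplied by the assumed clock frequency. This is an illustration only and not necessarily how Table 2 was produced; the count of ten MAC units is hypothetical, and real counts must come from the device datasheet.

```python
# Clock frequencies assumed in table note (3), in MHz.
ASSUMED_CLOCK_MHZ = {
    "Cyclone II": 120,
    "Stratix II": 120,
    "Cyclone III": 165,
    "Stratix III": 165,
}

# Table note (2): 16-bit operations run at 4x the 32-bit rate.
FPGA_16BIT_SCALE = 4


def fpga_mmac_32bit(family: str, mac_units_32bit: int) -> float:
    """Estimate 32-bit MMAC, ASSUMING one MAC per unit per clock cycle."""
    return ASSUMED_CLOCK_MHZ[family] * mac_units_32bit


def fpga_mmac_16bit(family: str, mac_units_32bit: int) -> float:
    """Estimate 16-bit MMAC using the 4x factor from note (2)."""
    return FPGA_16BIT_SCALE * fpga_mmac_32bit(family, mac_units_32bit)


# HYPOTHETICAL device with ten 32-bit MAC units.
print(fpga_mmac_32bit("Cyclone III", 10))
print(fpga_mmac_16bit("Cyclone III", 10))
```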

Choosing FPGAs and/or DSPs
FPGAs and DSPs are different devices, and they were created for different purposes. DSPs were created to provide an optimized platform for signal processing algorithms implemented in software, while FPGAs were initially created to provide glue logic. Over time, DSPs and FPGAs have grown in performance and resources, and they now provide solutions in overlapping markets. Both have application areas in which they are the optimum solution. For example, FPGAs are by far the superior choice for networking applications that move traffic at Gigabit/second data rates. DSPs are the superior choice for video applications such as surveillance. However, there are many application spaces in which the two devices overlap.

In previous sections, we looked at the cost-performance values for DSPs and FPGAs. Table 3 summarizes these comparisons in three different MMAC performance categories: Low, Medium and High. Table 3 also groups devices according to their cost. For example, Medium-performance devices are sub-categorized into those costing $10-30, and those costing $30-100.

This table shows minimum cost/performance values. Note that FPGAs and DSPs differ in their functionality and features. These features must be kept in mind while considering their cost/performance values. Blindly choosing an FPGA for its cost advantage (which can be as low as 0.2 cents/MMAC) would be a mistake.


Table 3. FPGA/DSP comparison summary.

Table Note: MMAC is defined as the number of fixed-point 32-bit or single-precision floating point multiply-and-accumulate operations that can be executed in units of millions per second.

To contrast DSP and FPGA features, one can use Table 3 and Table 4 as guidelines for choosing between FPGAs and DSPs. The decision process shown in Table 4 considers application-specific DSP features. DSPs often come with bundled features (see Notes for Table 1) that can translate into cost savings. Therefore, a DSP with application-specific features has an advantage over an FPGA with a similar cost/performance value. Table 4 reflects this advantage.


Table 4. Guideline for choosing DSP and/or FPGA.

Table Note: MMAC is defined as the number of fixed-point 32-bit or single-precision floating point multiply-and-accumulate operations that can be executed in units of millions per second.

For designs with an MMAC requirement below 300 MMAC, DSPs are in general the optimum solution from a cost/performance perspective. For designs with an MMAC requirement between 300 and 1000 MMAC, the DSP is generally preferable when it comes with application-specific resources (such as video/audio ports, ARM processor, etc., as is the case with the DaVinci digital media processors). When a DSP with application-specific resources does not exist, other aspects of the design must be considered.

For applications with performance requirements above 1000 MMAC, FPGA/DSP hybrid solutions are often ideal. These applications often include multiple signal processing algorithms, some of which have low performance requirements. In such cases, relatively inexpensive DSPs can implement the algorithms with low-to-medium performance requirements, leaving the higher-performance algorithms to FPGAs.
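
The thresholds described in the two paragraphs above can be captured in a few lines. This is a minimal sketch of the guideline as stated in the text, not a reproduction of Table 4; the function name, the returned strings and the handling of the exact boundary values (300 and 1000 MMAC) are assumptions made for illustration.

```python
def recommend_device(required_mmac: float,
                     dsp_has_app_specific_resources: bool) -> str:
    """First-pass device suggestion from the MMAC guideline in the text."""
    if required_mmac < 0:
        raise ValueError("required_mmac must be non-negative")
    if required_mmac < 300:
        # Low requirement: DSPs are generally optimal on cost/performance.
        return "DSP"
    if required_mmac <= 1000:
        # Medium requirement: bundled features tip the balance toward a DSP.
        if dsp_has_app_specific_resources:
            return "DSP"
        return "undecided: weigh other design aspects"
    # High requirement: split the work between both device types.
    return "FPGA/DSP hybrid"


print(recommend_device(150, False))
print(recommend_device(600, True))
print(recommend_device(600, False))
print(recommend_device(5000, True))
```

A result from this function is only a starting point; the final choice still has to be justified against the full set of design requirements.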

It is important to note that each design is unique. No global solution exists for choosing between DSPs and FPGAs. The aim of this article is to provide the reader with an overview of DSPs and FPGAs along with their overall cost/performance values and features. The data and guidelines presented here may be used as the initial step for choosing a DSP and/or FPGA. The choice must be justified based on the totality of the design requirements. In the following sections, we look at some of the additional requirements one must consider when choosing DSPs and/or FPGAs.

DSP in signal processing applications
A DSP is a specialized CPU for signal processing applications. Its core is designed to optimally execute signal processing algorithms, for which the principal operation (multiply-and-accumulate) is similar across almost all algorithms. DSPs are also packaged with many peripherals and different types of memory in the same device, similar to microcontrollers. In a sense, DSPs offer the flexibility and functionality of microcontrollers while also being optimal for low- and medium-performance signal processing applications. Therefore, DSPs become the device of choice for system architects for a large range of applications given the combination of:

  • Microcontroller functionality,
  • Being optimized for signal processing, and
  • Numerous on-chip peripherals bundled in the same package.
As an example, a Nuvation customer wanted to develop a laser control loop. For that project a DSP-capable microcontroller was optimal for cost and integration reasons. FPGAs lack integrated ADCs (analog to digital converters) and DACs (digital to analog converters), but developers can get that functionality in a DSP. Nuvation used a small DSP-capable microcontroller with a 12-bit ADC, 8-bit DAC, and an Ethernet interface to handle an entire control loop without additional parts. This approach saved considerably on the BOM cost and board complexity.

As another example, Nuvation recently worked on a motor control application with multiple control loops that would work well in an FPGA's parallel architecture. However, a DSP-capable controller from TI was clearly optimal when we considered the cost implications of developing each separate processor-type function (communications, supervising, etc.).

FPGAs in signal processing applications
FPGA devices let designers create custom logic for widely parallel, high-computation-rate signal processing. For example, an 81-tap FIR filter operating at 400 MSPS requires over 32 billion multiply-accumulate operations per second (about 32,400 MMACS). Such performance is approximately an order of magnitude beyond the capabilities of a single DSP. However, a single FPGA can easily provide this performance.
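
The FIR figure above follows from a simple product: a direct-form FIR filter performs one multiply-accumulate per tap for every input sample, so an 81-tap filter at 400 MSPS needs roughly 32.4 billion MACs per second. The sketch below works through that arithmetic, assuming no savings from coefficient symmetry or decimation, and compares the result with the 4752 MMACS peak 8-bit rate quoted earlier for the DaVinci DSP core.

```python
def fir_mmacs(taps: int, sample_rate_msps: float) -> float:
    """MAC rate in MMACS for a direct-form FIR: one MAC per tap per sample."""
    if taps <= 0 or sample_rate_msps <= 0:
        raise ValueError("taps and sample rate must be positive")
    return taps * sample_rate_msps


required = fir_mmacs(taps=81, sample_rate_msps=400)   # in MMACS
required_macs_per_second = required * 1e6             # in MACs per second

# Peak 8-bit figure quoted in this article for the DaVinci C64x+ core.
DAVINCI_PEAK_MMACS = 4752
ratio = required / DAVINCI_PEAK_MMACS

print(f"Required: {required:.0f} MMACS ({required_macs_per_second:.3g} MAC/s)")
print(f"Ratio to one DaVinci DSP core at peak: {ratio:.1f}x")
```

The gap widens further for wider operands, since the DSP figure used here is a peak 8-bit rate.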

It is important to mention that this performance is available at a cost premium, which can be up to an order of magnitude higher than the price of a DSP. The increased cost is due to NRE costs and the cost of the FPGA itself. For example, most signal processing systems require more than just a simple function such as a FIR filter. Most systems perform other types of functions such as data treatment and decision making. When using an FPGA, every additional signal processing function, data treatment and/or algorithm requires its own specialized logic. Therefore, added system complexity may quickly increase the device size, NRE costs and schedule.

FPGA-DSP co-processing architectures
As mentioned in the previous section, each added function in the FPGA has the potential to increase the schedule, NRE and parts cost. If the added functionality is within the capabilities of a DSP chip, it is cost effective to implement that function on a DSP while keeping the MAC-intensive operations in the FPGA.

In general, this means placing functions that consume less than 1000 MMAC on the DSP, and placing functions with higher requirements on the FPGA. For example, Nuvation implemented an envelope detection application with a 500 MSPS sample rate on a DSP-FPGA hybrid. The FPGA performed the initial high-sampling-rate filtering and decimation, and a DSP performed the remaining signal processing functions. This system configuration profits from the advantages offered by each platform.
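
The partitioning rule above can be sketched as a simple split on the 1000 MMAC threshold. The function names and MMAC figures in this example are hypothetical and are not measurements from the envelope detection project; the sketch also treats each function independently and does not check whether the combined DSP load fits on a single device.

```python
from typing import Dict, List, Tuple

DSP_LIMIT_MMAC = 1000  # threshold taken from the guideline in the text


def partition(functions: Dict[str, float],
              limit: float = DSP_LIMIT_MMAC) -> Tuple[List[str], List[str]]:
    """Split functions into (dsp, fpga) lists by their MMAC requirement."""
    dsp = [name for name, mmac in functions.items() if mmac < limit]
    fpga = [name for name, mmac in functions.items() if mmac >= limit]
    return dsp, fpga


# HYPOTHETICAL workload figures, for illustration only.
workload = {
    "front-end filtering and decimation": 20000.0,
    "envelope detection": 400.0,
    "control and reporting": 50.0,
}

dsp_functions, fpga_functions = partition(workload)
print("DSP: ", dsp_functions)
print("FPGA:", fpga_functions)
```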

Application example: IP camera reference design
To illustrate these design decisions in more detail, let us consider an example video application. Video applications such as IP set-top-boxes, digital video recorders (DVRs), entertainment devices, and digital cameras (to name a few) are growing rapidly. In such applications, system architects search for a platform that can address the following areas:

  • Video ports and other interface connectivity
  • Digital signal processing power
  • A CPU that can execute various scheduling, management, and control tasks
  • Volume pricing for multi-unit applications

Often these applications target the consumer electronic market. This translates to cost sensitivity, including a desire to minimize NRE costs. Furthermore, as consumer electronics is a fast moving market, time-to-market becomes a crucial factor. Overall project risk also needs to be minimized. Finally, the trend toward miniaturization of consumer electronics and security applications plays an important role in the system's form factor.

The IP camera market illustrates these concerns. The explosive growth of video surveillance, machine vision and video teleconferencing has created a need for a low-cost camera reference design. Such a design should allow clients to achieve rapid time-to-market with minimum NRE, while also allowing modification of the design to tailor it to a specific application. In response to these needs, Nuvation specified the following requirements for its IP camera reference design:

  • Smallest form factor IP camera
  • Standard optics (CS mount) with wide dynamic range (WDR) imaging
  • Low power with PoE support
  • Low BOM cost
  • DFM/DfX including RoHS compliance and obsolescence risk mitigation
  • Full embedded Linux, real-time
  • TCP/IP and/or analog video output
  • H.264, MPEG-4, MJPEG encoder flexibility, up to D1 at 30 fps
  • Support for custom or licensed video analytics software
  • Field programmable

Figure 1. Nuvation IP camera reference design. Photo Credit: Jason Rothe.

To meet the requirement objectives, Nuvation engineers chose a DaVinci device from TI, the TMS320DM6446. This device is a high performance digital media system on chip (SoC) targeted at high-end video applications. It is a dual processor device that contains a C64x+ DSP core for accelerated video processing and an ARM9 core for co-processing tasks and peripheral management.

As the central device in the IP camera reference design, the DM6446 is responsible for acquiring video data, encoding it in the desired format and outputting it via Ethernet and TCP/IP. As a dual processor device, the DM6446 allows designers to implement signal processing algorithms in the C64x+ DSP core while executing other tasks, such as packet assembly and peripheral management, in the ARM9 microcontroller core.

The availability of a full Linux distribution for the ARM9 is another advantage of the DM6446. Linux allows system designers to use existing firmware in the open source community and quickly integrate third party libraries. The DM6446's Ethernet ports, video ports, small footprint and low power consumption were also driving factors for choosing the device. In brief, the DM6446 DSP made it possible to meet the design requirements while minimizing NRE.

FPGA or DSP? Choose
The choice between an FPGA and a DSP depends on many parameters. There is no global recipe for making the right choice, and there are always trade-offs. It is the understanding of these trade-offs that guides an architect to choose a platform that best meets the requirements of a specific system. We highlighted design examples where an FPGA or a DSP is the superior choice, as well as cases that call for a DSP/FPGA hybrid system. Using these examples, we hope to have given you more insight into choosing the appropriate device for your design. For more information, or help on choosing the correct device, contact Nuvation at sales@nuvation.com.

Additional Quicklinks:

Read this article published in DSP Design Line





Copyright © Nuvation Research Corporation 2008. All rights reserved.