Nuvation HEADLINES
The decision also depends on the technology familiarity factor. In some cases, the design team is well-versed in DSP systems but has little FPGA background, or vice versa. In such cases, the team's skill set may drive the choice between FPGA and DSP. For example, Nuvation recently worked on an algorithm acceleration project where the algorithm lent itself to a wide parallel implementation in an FPGA. However, classic FPGA approaches were ruled out due to the lack of FPGA skills in the client's engineering team and the potential barriers this presented to product lifecycle maintenance.

We acknowledge that most engineers and system architects are more familiar with DSP technology due to the simplicity of designing with DSPs. This is a clear advantage for DSPs. However, developer familiarity varies widely across design teams, so it is difficult to say how important this issue is to a "generic" design team. Thus, this article ignores this DSP advantage and assumes that the choice of technology does not depend on developer familiarity.

To choose between FPGA and DSP, we look at the system's signal processing performance requirements and BOM cost. We consider devices from a major DSP vendor (Texas Instruments) and a major FPGA vendor (Altera) to guide us through this process. We identify some of the signal processing applications in which each technology is clearly superior. We also consider where an FPGA may be used as a co-processor to a DSP chip.

DSP devices from Texas Instruments

Note that the footnotes to Table 1 provide details relating to each device family. The cost/performance data presented in the table should be considered in conjunction with these details. For example, each DaVinci digital media processor incorporates an ARM9 processor (at up to 297 MHz), a TMS320C64x+ DSP core (up to 4752 MIPS, and eight 8-bit MACs per cycle for up to 4752 MMACS) as well as many peripherals and internal memory.
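The DaVinci figures quoted above are related by simple arithmetic: peak MMACS is the number of MACs issued per cycle multiplied by the core clock in MHz. The following is a minimal sketch of that relationship, not vendor data. The function names are illustrative, and the DSP core clock is not stated in this article; it is only inferred here from the quoted 4752 MMACS and eight 8-bit MACs per cycle.

```python
def peak_mmacs(macs_per_cycle, clock_mhz):
    """Peak multiply-accumulate rate in millions per second (MMACS)."""
    return macs_per_cycle * clock_mhz


def implied_clock_mhz(mmacs, macs_per_cycle):
    """Core clock (MHz) implied by a quoted peak MMACS figure."""
    return mmacs / macs_per_cycle


# Figures quoted in the text: 4752 MMACS at eight 8-bit MACs per cycle.
clock = implied_clock_mhz(4752, 8)   # inferred value, not quoted in the article
print(clock)                         # 594.0
print(peak_mmacs(8, 594))            # 4752
```

The inferred 594 MHz is twice the 297 MHz quoted for the ARM9 core. Peak figures of this kind assume every cycle issues the maximum number of MACs, so sustained throughput in a real algorithm will usually be lower.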
Table 1 omits specific device names and lists only overall performance/cost ranges, grouped by the cost of the devices within each family. Some families appear multiple times in the table because they contain multiple devices, some of which fall into different cost categories.
Table 1. DSP device families from Texas Instruments in different cost categories.

Table notes:
(2): MIPS is defined as the number of instructions that can be executed in millions per second. A range of MIPS are available in various devices within each family.
(3): MMAC is defined as the number of single precision floating-point or fixed-point multiply-and-accumulate 32-bit operations that can be executed in millions per second. MMAC performance values increase by a factor of 2x and 4x for 16-bit and 8-bit operations, respectively. All operations assume no truncation of multiplication results (i.e. results are twice the size of operation bit-width: 32-bit multiply generates 64-bit result; 16-bit multiply generates 32-bit results, etc).

Device Family Notes:
DaVinci Digital Media Processors include on-chip ARM9 processor, Ethernet MAC and/or Switch sub-system, Video/Audio Ports, Video Processing Subsystem, High Definition features, PCI Bus Interface, ATA Interface, USB Interface, etc.
C6000 Fixed Point DSPs include Viterbi Decoder Co-processor, Turbo Decoder Co-processor, UTOPIA Slave 2 ATM Controller, PCI Bus Interface, RapidIO, Ethernet MAC, etc.
C6000 Floating Point DSPs include Single/Double precision floating point DSP core, Enhanced CPU core, Audio Port, Dual Access Internal Memory, etc.
C5000 Fixed Point DSPs include Dual/Quad Core DSPs, On-chip ARM7 Processor, Video Hardware Accelerator, ADC, USB, etc. This family includes low power devices and devices targeted for IP-Phone and Client Side Telephony applications.
C2000 Digital Signal Controllers include 16/32-bit Core, Internal Flash Memory, 10/12-bit ADC (multi-channel), PWM, etc. This device family is targeted for Control Applications.

Also note that Table 2 does not identify MIPS performance values. A designer may include one or multiple Nios II processors in the FPGA depending on the available resources (registers and memory).
However, defining a MIPS performance value for the FPGA as a whole is impossible because the use of an embedded processor (and its specifications) is design specific. Finally, it is important to note that logic resources and registers in the FPGA may be used to create additional signal processing resources to increase the MMAC performance.
Table 2. Various FPGA device families from Altera sorted in different cost categories.

Table notes:
(1): Costs were obtained from www.altera.com in Jan 2008. At this time, pricing was available for only one device in the Stratix III family, the EP3SL150.
(2): MMAC is defined as the number of fixed-point 32-bit or single-precision floating point multiply-and-accumulate operations that can be executed in units of millions per second. MMAC performance values increase by a factor of 4x for 16-bit operations.
(3): Assuming a clock frequency of 120 MHz for Cyclone II and Stratix II devices and a clock frequency of 165 MHz for Cyclone III and Stratix III devices. The overall resource utilization is estimated to be 70%. Note that higher resource utilization and performance are achievable in the FPGA.

Choosing FPGAs and/or DSPs

In previous sections, we looked at the cost-performance values for DSPs and FPGAs. Table 3 summarizes these comparisons in three different MMAC performance categories: Low, Medium and High. Table 3 also groups devices according to their cost. For example, Medium-performance devices are sub-categorized into those costing $10-30 and those costing $30-100. This table shows minimum cost/performance values.

Note that FPGAs and DSPs differ in their functionality and features. These features must be kept in mind while considering their cost/performance values. Blindly choosing an FPGA for its cost advantage (which can be as low as 0.2 cents/MMAC) would be a mistake.
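The cost/performance figure quoted above is device price divided by MMAC throughput. The following is a minimal sketch of that calculation under stated assumptions. It assumes an FPGA's MMAC figure can be approximated as the number of MAC units times the clock frequency, derated by the 70% utilization mentioned in the Table 2 notes; the article does not confirm that this is exactly how its table values were derived. The device parameters in the example (100 MAC units, a $50 price) are hypothetical and are not taken from Tables 1 to 3.

```python
def fpga_mmac_estimate(mac_units, clock_mhz, utilization=0.70):
    """Rough FPGA throughput in MMAC.

    Assumes each MAC unit completes one multiply-accumulate per clock
    cycle, derated by the fraction of resources actually used.
    """
    return mac_units * clock_mhz * utilization


def cents_per_mmac(price_usd, mmac):
    """Device cost in US cents per MMAC of throughput."""
    return price_usd * 100.0 / mmac


# Hypothetical device: 100 MAC units clocked at 165 MHz, priced at $50.
estimate = fpga_mmac_estimate(mac_units=100, clock_mhz=165)
print(f"{estimate:.0f} MMAC, {cents_per_mmac(50.0, estimate):.2f} cents/MMAC")
```

A low cents/MMAC value says nothing about bundled peripherals, processor cores, or development effort, which is why the figure should not be used on its own.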
Table 3. FPGA/DSP comparison summary.

Table Note: MMAC is defined as the number of fixed-point 32-bit or single-precision floating point multiply-and-accumulate operations that can be executed in units of millions per second.

To contrast DSP and FPGA features, one can use Table 3 and Table 4 as guidelines for choosing an FPGA, a DSP, or both. The decision process shown in Table 4 considers application-specific DSP features. DSPs often come with bundled features (see the notes for Table 1) that can translate into cost savings. Therefore, a DSP with application-specific features has an advantage over an FPGA with a similar cost/performance value. Table 4 reflects this advantage.
Table 4. Guideline for choosing DSP and/or FPGA.

Table Note: MMAC is defined as the number of fixed-point 32-bit or single-precision floating point multiply-and-accumulate operations that can be executed in units of millions per second.
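The guideline that Table 4 summarizes, using the 300 MMAC and 1000 MMAC thresholds discussed in this section, can be sketched as a small decision function. This is only an illustration of the rule of thumb, not a substitute for evaluating the full design requirements. The function name, the boolean flag, and the returned strings are hypothetical.

```python
def recommend_platform(required_mmac, dsp_has_app_features=False):
    """First-pass platform suggestion from the MMAC requirement.

    required_mmac: signal processing requirement in MMAC.
    dsp_has_app_features: True if a DSP exists with application-specific
        resources (e.g. video/audio ports, an on-chip ARM processor).
    """
    if required_mmac < 300:
        # Low requirement: a DSP is generally the best cost/performance fit.
        return "DSP"
    if required_mmac <= 1000:
        # Medium requirement: bundled application features tip the balance.
        if dsp_has_app_features:
            return "DSP"
        return "DSP or FPGA (consider other aspects of the design)"
    # High requirement: split the work between an FPGA and a DSP.
    return "FPGA/DSP hybrid"


for mmac, features in [(150, False), (600, True), (600, False), (5000, False)]:
    print(mmac, features, "->", recommend_platform(mmac, features))
```

The result should be treated as a starting point only, since no single rule covers every design.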
For designs with an MMAC requirement below 300 MMAC, DSPs are in general the optimum solution from a cost/performance perspective. For designs with an MMAC requirement between 300 and 1000 MMAC, the DSP is generally preferable when it comes with application-specific resources (such as video/audio ports, an ARM processor, etc., as is the case with the DaVinci digital media processors). When a DSP with application-specific resources does not exist, other aspects of the design must be considered.

For applications with performance requirements above 1000 MMAC, FPGA/DSP hybrid solutions are often ideal. These applications often include multiple signal processing algorithms, some of which have low performance requirements. In such cases, relatively inexpensive DSPs can implement the algorithms with low-to-medium performance requirements, leaving the higher-performance algorithms to FPGAs.

It is important to note that each design is unique. No global solution exists for choosing between DSPs and FPGAs. The aim of this article is to provide the reader with an overview of DSPs and FPGAs along with their overall cost/performance values and features. The data and guidelines presented here may be used as the initial step for choosing a DSP and/or FPGA. The choice must be justified based on the totality of the design requirements. In the following sections, we look at some of the additional requirements one must consider when choosing DSPs and/or FPGAs.

DSP in signal processing applications
As another example, Nuvation recently worked on a motor control application with multiple control loops that would work well in an FPGA's parallel architecture. However, a TI DSP-capable controller was clearly optimal when we considered the cost implications of developing each separate processor-type function (communications, supervising, etc.).

FPGAs in signal processing applications

It is important to mention that this performance is available at a cost premium, which can be up to an order of magnitude higher than the price of a DSP. The increased cost is due to NRE costs and the cost of the FPGA itself. For example, most signal processing systems require more than just a simple function such as a FIR filter. Most systems perform other types of functions such as data treatment and decision making. When using an FPGA, every additional signal processing function, data treatment and/or algorithm requires its own specialized logic. Therefore, added system complexity may quickly increase the device size, NRE costs and schedule.

FPGA-DSP co-processing architectures

In general, this means placing functions that consume less than 1000 MMAC on the DSP, and placing functions with higher requirements on the FPGA. For example, Nuvation implemented an envelope detection application with a 500 MSPS sample rate on a DSP-FPGA hybrid. The FPGA performed the initial high-sampling-rate filtering and decimation, and a DSP performed the remaining signal processing functions. This system configuration profits from the advantages offered by each platform.

Application example: IP camera reference design
Often these applications target the consumer electronics market. This translates to cost sensitivity, including a desire to minimize NRE costs. Furthermore, as consumer electronics is a fast-moving market, time-to-market becomes a crucial factor. Overall project risk also needs to be minimized. Finally, the trend toward miniaturization of consumer electronics and security applications plays an important role in the system's form factor.

The IP camera market illustrates these concerns. The explosive growth of video surveillance, machine vision and video teleconferencing has created a need for a low-cost camera reference design. Such a design should allow clients to achieve rapid time-to-market with minimum NRE, while also allowing modification of the design to tailor it to a specific application. In response to these needs, Nuvation specified the following requirements for its IP camera reference design:
Figure 1. Nuvation IP camera reference design. Photo Credit: Jason Rothe.

To meet these requirements, Nuvation engineers chose a DaVinci device from TI, the TMS320DM6446. This device is a high-performance digital media system on chip (SoC) targeted at high-end video applications. It is a dual-processor device that contains a C64x+ DSP core for accelerated video processing and an ARM9 core for co-processing tasks and peripheral management. As the central device in the IP camera reference design, the DM6446 is responsible for acquiring video data, encoding it in the desired format and outputting it via Ethernet and TCP/IP.

As a dual-processor device, the DM6446 allows designers to implement signal processing algorithms in the C64x+ DSP core while executing other tasks, such as packet assembly and peripheral management, in the ARM9 microcontroller core. The availability of a full Linux distribution for the ARM9 is another advantage of the DM6446. Linux allows system designers to use existing firmware from the open source community and quickly integrate third-party libraries. The DM6446's Ethernet ports, video ports, small footprint and low power consumption were also driving factors in choosing the device. In brief, the DM6446 made it possible to meet the design requirements while minimizing NRE.
Copyright © Nuvation Research Corporation 2008. All rights reserved.