
Winter 2007




Maximizing Performance in USB 2.0

Bernardo Elayda
Firmware Engineer
Nuvation


Most USB 2.0 products and devices do not achieve the high data-transfer rates that the high-speed serial protocol makes available.  Designers often overlook important design specifications and application-specific options that can raise performance toward the nominal 480 Mbps.

By understanding the bottlenecks of the USB architecture, and by carefully analyzing your application and the selected USB hardware/software implementation, it is possible to approach maximum USB 2.0 speeds.  We begin with a summary of the USB specification and then describe three main bottlenecks to consider in your design.


USB 101: Understanding the Specification

USB (Universal Serial Bus) is a half-duplex serial communications protocol, carried on a two-wire differential data pair, defined by a consortium of PC-related manufacturers.  The goal of this protocol was to define a standard physical, electrical, and software interface that was customer friendly and provided high data throughput.  In defining USB, the designers wanted to overcome some of the shortcomings of serial and parallel port interfaces, such as the large variation in physical performance.

USB has succeeded in exceeding the performance of the old serial and parallel ports.  However, reaching the USB 2.0 speed of 480 Mbps is often a challenge.  Designers can frequently prototype a USB-based product quickly in hardware and software, yet the result usually falls short of the 480 Mbps maximum.  Sometimes, even when USB 2.0 hardware is used, the product performs only at USB 1.x speeds or slower.


Maximum Throughput

Achieving the maximum throughput via USB 2.0 depends on understanding the USB 2.0 specification, along with the hardware and software relationships of your embedded system.  Conventional models of serial and parallel port behavior do not transfer directly to USB, so it is important to understand the nuances in both software and hardware.

There are actually two maximum performance limits for USB 2.0.  When USB 2.0 was first advertised, the maximum speed was stated as 480 Mbps.  That figure is the raw signaling rate of the USB 2.0 bus.  Actual throughput is lower because of the overhead needed to encapsulate data for the bus.  Furthermore, USB throughput depends on which of two transfer types carries the data: isochronous or bulk.  Depending on the transfer type used, the theoretical maximum is 480 Mbps or 425 Mbps, respectively, less the encapsulation overhead and assuming no other bottlenecks.

Hardware bottlenecks can also severely limit USB performance, even when the hardware is designed for USB 2.0. 

We will explore some of the nuances below.

Isochronous Transfer vs. Bulk Transfer Protocol

Applications where throughput matters more than error-free delivery use the USB isochronous transfer.  Isochronous transfers are designed to send data as fast as possible, near the 480 Mbps rate.  This performance comes at a price: isochronous transfers do not retransmit or correct data, so data may arrive at its destination corrupted.  Because throughput is more important than data quality, isochronous transfers do not confirm that sent data arrives at its destination.  For audio or video, dropping an occasional frame is usually acceptable because of the limits of human perception.

However, applications that must guarantee error-free data use the bulk transfer protocol.  The bulk transfer protocol provides error detection to ensure data quality.  Unlike an isochronous transfer, a bulk transfer checks that sent data arrives at its destination error free.  Bulk throughput is limited to a theoretical maximum of 425 Mbps before the cost of error detection is counted.  Error detection reduces the achievable figure further, depending on how it is implemented in software.
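The 425 Mbps bulk figure can be reproduced with a little arithmetic. The sketch below is a rough model, not a statement of how any particular controller behaves. It assumes the high-speed bus is divided into 8000 microframes per second, that each microframe carries 7500 bytes at 480 Mbps, and that each bulk transaction costs about 55 bytes of protocol overhead. These constants are recalled from the USB 2.0 specification's high-speed bulk limits and should be checked against the specification before being relied on.

```python
# Rough model of the high-speed bulk ceiling.
# All three constants are assumptions to verify against the USB 2.0 specification.
MICROFRAMES_PER_SECOND = 8000    # one microframe every 125 microseconds
MICROFRAME_BUDGET_BYTES = 7500   # 480 Mbit/s * 125 us / 8 bits per byte
BULK_OVERHEAD_BYTES = 55         # assumed per-transaction protocol overhead


def bulk_transactions_per_microframe(payload_bytes: int) -> int:
    """Whole bulk transactions that fit in one microframe."""
    return MICROFRAME_BUDGET_BYTES // (payload_bytes + BULK_OVERHEAD_BYTES)


def bulk_ceiling_mbps(payload_bytes: int) -> float:
    """Best-case payload throughput in Mbit/s for a given packet payload size."""
    per_microframe = bulk_transactions_per_microframe(payload_bytes) * payload_bytes
    return per_microframe * MICROFRAMES_PER_SECOND * 8 / 1e6


for size in (8, 64, 512):
    print(f"{size:4d}-byte payload: {bulk_ceiling_mbps(size):7.3f} Mbit/s")
```

Under these assumptions a 512-byte payload fits 13 times per microframe, which gives roughly 426 Mbit/s, while smaller payloads spend a growing share of the bus on overhead.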

Hardware Bottleneck

There are many hardware options for implementing a USB-based communications protocol in a product.  USB communications are available on ASICs, FPGAs, microcontrollers, and processors that run an RTOS.  These choices accommodate a wide variety of performance needs and costs.  However, even when you find a cost-effective solution for your USB application, the hardware chosen sometimes has an inherent physical limit that rules out maximum performance.  This can happen through a simple misunderstanding of the USB 2.0 terms LS, FS, and HS.

Low-Speed, Full-Speed and High-Speed Bottlenecks

In broad terms, every USB 2.0 communication device can be categorized as an LS, FS, or HS device.  These two-letter abbreviations are shorthand for low-speed, full-speed, and high-speed.  LS devices are typically peripherals that interface directly with humans, such as mice, keyboards, or game controllers.  LS operates at a theoretical maximum of 1.5 Mbps, which is sufficient for these applications.  FS devices run at the original USB 1.x speed, with a maximum throughput of 12 Mbps.  HS devices are capable of running at the theoretical maximum everyone wants: 480 Mbps.  Remember, these are theoretical maximums used as a basis for comparison.

When hardware vendors do not clearly state whether their product is an LS-, FS-, and/or HS-capable device, it is possible to make an architecture-level error that limits performance from the beginning.  And, even when the right type of LS, FS, or HS device is chosen for a USB 2.0 application, additional hardware limits can still push throughput well below FS or HS operation.  For example, hardware can severely limit USB performance if the USB path goes through the hardware’s main clocked path.
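To make the gap between the three classes concrete, the sketch below computes the best-case time to move a payload at each nominal rate. It deliberately ignores all protocol overhead and every other bottleneck, so the results are lower bounds for comparison only. The 100 MB payload and the function name are illustrative choices, not values from any product.

```python
# Nominal signaling rates for the three USB 2.0 device classes, in Mbit/s.
SPEEDS_MBPS = {"LS": 1.5, "FS": 12.0, "HS": 480.0}


def best_case_seconds(size_bytes: int, speed_mbps: float) -> float:
    """Lower bound on transfer time: payload bits divided by the nominal bit rate."""
    return size_bytes * 8 / (speed_mbps * 1e6)


PAYLOAD_BYTES = 100_000_000  # illustrative 100 MB transfer

for name, mbps in SPEEDS_MBPS.items():
    print(f"{name}: {best_case_seconds(PAYLOAD_BYTES, mbps):8.1f} s")
```

Choosing an FS-only part for an application that needs HS costs a factor of 40 before any other inefficiency is counted.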


Software Bottlenecks

The last bottleneck that can severely reduce the performance of a USB-based product is applying software architectures designed for typical serial or parallel port interfaces to USB.  When a USB version of a product is used with a software application that isn’t aware that it is sending data via USB, performance can drop to kilobits per second.  This can make the difference between a download taking seconds and one taking minutes.

In the past, when software was written for the serial or parallel port interface, software designers implemented their algorithms with the ‘bit-banging’ pattern.  In ‘bit-banging’, the programmer explicitly specifies the physical state of each bit of a port.  This approach is easy to understand and implement because the programmer is literally telling the hardware exactly how to behave.  The disadvantage is that it permits bad programming practices, such as a lack of encapsulation, and is hardware specific.

Because applications that use ‘bit-banging’ often rely on bad software engineering practices and are hardware specific, they perform very poorly when a USB software layer is simply added on top.  In this situation, performance is measured in kilobits per second.  Sometimes, operations take even longer in the USB implementation than in the original serial/parallel port implementation.
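A back-of-the-envelope estimate shows why layering USB under byte-at-a-time code lands in the kilobit range. The sketch assumes that each write call becomes its own USB transfer and that the host completes one such transfer per 125-microsecond microframe, i.e. 8000 per second. Both are assumptions for illustration; real hosts and drivers may complete fewer transfers per second, which makes the result worse.

```python
# Assumption: one blocking transfer completes per microframe (8000 per second).
# Real host controllers and drivers may schedule fewer.
ASSUMED_TRANSFERS_PER_SECOND = 8000


def throughput_kbps(bytes_per_transfer: int,
                    transfers_per_second: int = ASSUMED_TRANSFERS_PER_SECOND) -> float:
    """Payload throughput in kbit/s when every call is a separate USB transfer."""
    return bytes_per_transfer * transfers_per_second * 8 / 1000


print(f"1 byte per transfer:    {throughput_kbps(1):9.1f} kbit/s")
print(f"512 bytes per transfer: {throughput_kbps(512):9.1f} kbit/s")
```

With one byte per transfer the bus delivers 64 kbit/s under these assumptions, no matter how fast the link itself is signaling.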

A software application must be made USB aware for efficient transfer of data.  Software that uses the USB API must send and receive data at the maximum payload size to prevent USB protocol overhead from exceeding data payload.
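One way to make an application USB aware is to coalesce small writes into full-sized packets before they reach the USB layer. The sketch below is a minimal illustration. `CoalescingWriter` and its `send` callback are hypothetical names standing in for whatever driver or library call actually submits a transfer, and 512 bytes is used as the high-speed bulk maximum packet size.

```python
from typing import Callable, List


class CoalescingWriter:
    """Buffers small writes and passes them to `send` in max-packet-sized chunks.

    `send` is a hypothetical stand-in for the real transfer-submission call.
    """

    def __init__(self, send: Callable[[bytes], None], max_packet: int = 512) -> None:
        self._send = send
        self._max_packet = max_packet
        self._buffer = bytearray()

    def write(self, data: bytes) -> None:
        """Queue data, emitting a packet each time a full one is available."""
        self._buffer.extend(data)
        while len(self._buffer) >= self._max_packet:
            self._send(bytes(self._buffer[:self._max_packet]))
            del self._buffer[:self._max_packet]

    def flush(self) -> None:
        """Send any remaining partial packet."""
        if self._buffer:
            self._send(bytes(self._buffer))
            self._buffer.clear()


# Usage: 1300 single-byte writes are collected into three packets.
sent: List[bytes] = []
writer = CoalescingWriter(sent.append)
for value in range(1300):
    writer.write(bytes([value % 256]))
writer.flush()
print(len(sent), [len(packet) for packet in sent])
```

The legacy code keeps its byte-oriented `write` calls, while the USB layer sees only full packets plus one final short packet on `flush`.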

There are numerous tricks for optimizing USB performance at the firmware and software driver level.  The tricks vary by device (e.g. Cypress, Philips/NXP, ST Micro, FPGA cores, and others) and operating system (WinXP, Vista, Linux, uC Linux, MacOS, etc.).  Sorry, there isn’t time or room to go into specifics, but rest assured we do share this information during our design engagements.

Conclusion

In summary, achieving maximum data throughput for a USB-based application depends on understanding the realities of the USB specification and of USB hardware/software optimization strategies.  Isochronous transfers and bulk transfers trade speed against accuracy.  Hardware can severely limit USB performance if it is not designed correctly.  Firmware and software must be carefully architected with an understanding of the operating system, the device specifics, and of course USB itself.  Time should be allocated for optimization if the application requires maximum throughput.

Through a careful analysis of your application and hardware/software architecture, performance can be significantly enhanced.  We have developed products with tested USB 2.0 throughput above 40 MB/s.  For more information please contact sales@nuvation.com.






Copyright © Nuvation Research Corporation 2007. All rights reserved.