PCI Express is the latest development in PCI to support adapters and devices in IBM NetVista desktop PCs, IntelliStation workstations, and IBM eServer xSeries servers. The technology is aimed at multiple market segments, meaning that it can provide chip-to-chip, board-to-board, and adapter connectivity. This document provides a technical description of PCI Express and a performance comparison to PCI and PCI-X.
PCI Express uses a serial interface and allows for point-to-point interconnections between devices using directly wired interfaces between these connection points. This differs from previous PCI bus architectures that used a shared, parallel bus architecture.
A single PCI Express serial link is a dual-simplex connection using two pairs of wires, one pair for transmit and one pair for receive, and it can transmit only one bit per cycle. Although this sounds limiting, the link signals at 2.5 Gbps, which equates to a raw burst rate of about 312 MBps (2.5 Gbps divided by 8 bits per byte) on a single connection. These two pairs of wires are called a lane.
A PCI Express link may consist of multiple lanes. In such configurations, the connection is labeled x1, x2, x4, x8, x12, x16, or x32, where the number is the number of lanes. So where PCI Express x1 requires 4 wires to connect, an x16 implementation requires 16 times that amount, or 64 wires. The lane count also determines the size of the slot.
Tip: When you refer to the lane nomenclature, you use the word "by", as in "by eight" for x8.
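The wire counts quoted for x1 and x16 links follow directly from the definition of a lane. As a minimal sketch (the function name `wires_for_link` and the constant are illustrative, not part of any PCI Express tooling), the arithmetic looks like this:

```python
# Each lane is two wire pairs (one transmit, one receive), i.e. 4 wires,
# as described in the text. Names here are hypothetical, for illustration only.
WIRES_PER_LANE = 4


def wires_for_link(lanes: int) -> int:
    """Return the number of signal wires for a link with `lanes` lanes."""
    if lanes < 1:
        raise ValueError("a link needs at least one lane")
    return lanes * WIRES_PER_LANE


if __name__ == "__main__":
    for lanes in (1, 4, 8, 16):
        print(f"x{lanes}: {wires_for_link(lanes)} wires")
```

This counts only the transmit and receive signal wires mentioned above; a real connector carries additional pins that the sketch does not model.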
The figure below shows the slots for 32-bit PCI 2.0, PCI Express x1, and PCI Express x16. It is clear that the PCI Express x16 adapter will not physically fit in the PCI Express x1 slot.
Figure: PCI 2.0 and PCI Express Edge Connectors
You can install PCI Express adapters in larger slots but not in smaller ones. For example, you can install a PCI Express x1 adapter into an x16 slot (it will still operate at x1 speed), but you cannot insert an x16 adapter into an x1 slot. This compatibility is shown in the table below.
Table: PCI Express slot compatibility
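| Adapter | x1 slot | x4 slot | x8 slot | x16 slot |
|---|---|---|---|---|
| x1 adapter | Yes | Yes | Yes | Yes |
| x4 adapter | No | Yes | Yes | Yes |
| x8 adapter | No | No | Yes | Yes |
| x16 adapter | No | No | No | Yes |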
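The compatibility rule stated above (an adapter fits a slot of the same width or wider, and keeps running at its own width) can be expressed in a few lines. This is a sketch under that rule only; the function names are hypothetical, and the adapter widths are assumed to come in the same four sizes as the slots in the table.

```python
# Slot widths taken from the compatibility table; adapter widths are
# assumed (for illustration) to come in the same four sizes.
WIDTHS = (1, 4, 8, 16)


def fits(adapter_lanes: int, slot_lanes: int) -> bool:
    """Rule from the text: an adapter fits a slot of equal or greater width."""
    return adapter_lanes <= slot_lanes


def operating_width(adapter_lanes: int, slot_lanes: int) -> int:
    """Width the adapter runs at; it stays at its own width in a larger slot."""
    if not fits(adapter_lanes, slot_lanes):
        raise ValueError(
            f"x{adapter_lanes} adapter does not fit an x{slot_lanes} slot"
        )
    return adapter_lanes


if __name__ == "__main__":
    # Print one row per adapter width, one column per slot width.
    for adapter in WIDTHS:
        cells = []
        for slot in WIDTHS:
            cells.append("yes" if fits(adapter, slot) else "no")
        print(f"x{adapter} adapter:", ", ".join(cells))
```

For example, `operating_width(1, 16)` returns 1, matching the statement that an x1 adapter in an x16 slot still operates at x1 speed.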
Although the underlying hardware technology differs between PCI-X and PCI Express, the two remain compatible at the software layer. PCI Express supports existing operating systems, drivers, and BIOS without any changes. Because the device driver model and software stacks are compatible, PCI Express devices look just like PCI devices to software.
One of the major benefits of PCI Express is that it is not limited to use as a connector for adapters. Because of its high speed and scalable bus widths, it can be used as a high-speed interface to connect many different devices. PCI Express may be used to connect multiple onboard devices and provide a fabric that is capable of supporting USB 2.0, InfiniBand, Gigabit Ethernet, and others.
PCI Express currently runs at 2.5 Gbps, or 200 MBps of data, per lane in each direction. A 32-lane configuration therefore provides 80 Gbps in each direction, or up to 160 Gbps when both directions of the full-duplex x32 link are counted.
Future frequency increases will scale up total bandwidth to the limits of copper (which is 12.5 Gbps per wire) and significantly beyond that via other media without impacting any layers above the physical layer in the protocol stack. The table below shows the throughput of PCI Express at different lane widths.
Table: PCI Express maximum transfer rate
| Lane width | Clock speed | Throughput (duplex, bits) | Throughput (duplex, bytes) | Initial expected uses |
|---|---|---|---|---|
| x1 | 2.5 GHz | 5 Gbps | 400 MBps | Slots, Gigabit Ethernet |
| x2 | 2.5 GHz | 10 Gbps | 800 MBps | |
| x4 | 2.5 GHz | 20 Gbps | 1.6 GBps | Slots, 10 Gigabit Ethernet, SCSI, SAS |
| x8 | 2.5 GHz | 40 Gbps | 3.2 GBps | |
| x16 | 2.5 GHz | 80 Gbps | 6.4 GBps | Graphics adapters |
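The duplex figures in the table scale linearly with lane count. As a rough sketch, taking the per-lane numbers from the text as given (2.5 Gbps signaling and 200 MBps of data per lane in each direction) and using illustrative names, the rows can be reproduced as follows:

```python
# Per-lane, per-direction figures as stated in the text.
LANE_RATE_GBPS = 2.5   # signaling rate
LANE_DATA_MBPS = 200   # data-only rate used by the table


def duplex_throughput(lanes: int) -> tuple[float, int]:
    """Return (Gbps, MBps) for both directions of a link combined."""
    gbps = lanes * LANE_RATE_GBPS * 2   # transmit + receive
    mbps = lanes * LANE_DATA_MBPS * 2
    return gbps, mbps


if __name__ == "__main__":
    for lanes in (1, 2, 4, 8, 16):
        gbps, mbps = duplex_throughput(lanes)
        print(f"x{lanes}: {gbps:g} Gbps, {mbps} MBps")
```

The sketch only multiplies out the document's own per-lane figures; it does not model protocol overhead beyond what those figures already include.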
PCI Express uses an embedded clocking technique based on 8b/10b encoding. The clock information is encoded directly into the data stream rather than carried as a separate signal. The 8b/10b encoding requires 10 bits per 8-bit character, or about 20% channel overhead. This encoding explains the difference between the published spec speeds of 250 MBps (with the embedded clock overhead) and 200 MBps (data only, without the overhead). For ease of comparison, the table above shows throughput in both bps and Bps.
Compared with a current PCI-X 2.0 adapter running at 133 MHz QDR, the potential sustained throughput of PCI Express x16 is more than double, as shown in the figure below.
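The 20% overhead figure comes from the ratio of data bits to transmitted bits in 8b/10b. The sketch below, with hypothetical names, computes that ratio and applies it to the 250 MBps figure in the same way the text does; it takes the two published figures as given rather than deriving them independently.

```python
# 8b/10b: every 8 data bits are sent as a 10-bit character.
DATA_BITS = 8
CODED_BITS = 10


def overhead_fraction() -> float:
    """Fraction of transmitted bits that carry no data."""
    return (CODED_BITS - DATA_BITS) / CODED_BITS


def data_rate(rate_with_overhead: float) -> float:
    """Scale a rate that includes coding overhead down to data only."""
    return rate_with_overhead * DATA_BITS / CODED_BITS


if __name__ == "__main__":
    print(f"overhead: {overhead_fraction():.0%}")
    # Applying the ratio to the 250 MBps figure quoted in the text.
    print(f"data only: {data_rate(250):g} MBps")
```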
Figure: PCI Express and PCI-X comparison (in Gbps)
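The "more than double" claim can be sanity-checked with peak-rate arithmetic. This is only a sketch: the 64-bit PCI-X bus width is an assumption not stated in the text, the result is a theoretical peak rather than the sustained throughput the figure refers to, and the names are illustrative.

```python
# Figures from the text.
PCIX_CLOCK_MHZ = 133            # PCI-X 2.0 clock
PCIX_TRANSFERS_PER_CLOCK = 4    # QDR: four transfers per clock
PCIE_X16_DUPLEX_GBPS = 80       # from the throughput table

# Assumption (not in the text): a 64-bit wide PCI-X bus.
PCIX_BUS_BITS = 64


def pcix_peak_gbps() -> float:
    """Theoretical peak PCI-X rate in Gbps under the assumptions above."""
    mbps = PCIX_CLOCK_MHZ * PCIX_TRANSFERS_PER_CLOCK * PCIX_BUS_BITS
    return mbps / 1000


if __name__ == "__main__":
    pcix = pcix_peak_gbps()
    ratio = PCIE_X16_DUPLEX_GBPS / pcix
    print(f"PCI-X 2.0 peak: {pcix:.1f} Gbps")
    print(f"PCI Express x16 / PCI-X 2.0: {ratio:.2f}x")
```

Note that the PCI Express figure counts both directions of the link, whereas the shared PCI-X bus moves data in one direction at a time.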
This material has not been submitted to any formal IBM test and is published AS IS. It has not been the subject of rigorous review. IBM assumes no responsibility for its accuracy or completeness. The use of this information or the implementation of any of these techniques is a client responsibility and depends upon the client's ability to evaluate and integrate them into the client's operational environment.