PCI-SIG Finalizes PCIe 5.0 Specification: x16 Slots to Reach 64GB/sec
by Ryan Smith on May 29, 2019 6:30 PM EST
Following the long gap after the release of PCI Express 3.0 in 2010, the PCI Special Interest Group (PCI-SIG) set about a plan to speed up the development and release of successive PCIe standards. Following this plan, in late 2017 the group released PCIe 4.0, which doubled PCIe 3.0’s bandwidth. Now less than two years after PCIe 4.0 – and with the first hardware for that standard just landing now – the group is back again with the release of the PCIe 5.0 specification, which once again doubles the amount of bandwidth available over a PCI Express link.
Built on top of the PCIe 4.0 standard, the PCIe 5.0 standard is a relatively straightforward extension of 4.0. The latest standard doubles the transfer rate once again, which now reaches 32 GigaTransfers/second. For practical purposes, this means PCIe slots can now reach anywhere from ~4GB/sec for a x1 slot up to ~64GB/sec for a x16 slot. For comparison’s sake, 4GB/sec is as much bandwidth as a PCIe 1.0 x16 slot, so over the last decade and a half, the number of lanes required to deliver that kind of bandwidth has been cut to 1/16th the original amount.
The fastest standard on the PCI-SIG roadmap for now, PCIe 5.0’s higher transfer rates will allow vendors to rebalance future designs between total bandwidth and simplicity by working with fewer lanes. High-bandwidth applications will of course go for everything they can get with a full x16 link, while slower hardware such as 40GigE and SSDs can be implemented using fewer lanes. PCIe 5.0’s physical layer is also going to be the cornerstone of other interconnects in the future; in particular, Intel has announced that their upcoming Compute eXpress Link (CXL) cache coherent interconnect will be built on top of PCIe 5.0.
PCI Express Bandwidth (Full Duplex)
Slot Width | PCIe 1.0 (2003) | PCIe 2.0 (2007) | PCIe 3.0 (2010) | PCIe 4.0 (2017) | PCIe 5.0 (2019)
x1 | 0.25GB/sec | 0.5GB/sec | ~1GB/sec | ~2GB/sec | ~4GB/sec
x2 | 0.5GB/sec | 1GB/sec | ~2GB/sec | ~4GB/sec | ~8GB/sec
x4 | 1GB/sec | 2GB/sec | ~4GB/sec | ~8GB/sec | ~16GB/sec
x8 | 2GB/sec | 4GB/sec | ~8GB/sec | ~16GB/sec | ~32GB/sec
x16 | 4GB/sec | 8GB/sec | ~16GB/sec | ~32GB/sec | ~64GB/sec
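The figures in the table follow from each generation's per-lane transfer rate and line encoding. The sketch below recomputes them per direction. The transfer rates for PCIe 1.0 through 4.0 and all of the encodings (8b/10b for 1.0 and 2.0, 128b/130b from 3.0 onward) are not stated in the article; they are the commonly published values and are treated here as assumptions. The function and table names are made up for illustration.

```python
# Per-direction PCIe bandwidth from transfer rate and line encoding.
# Assumed values (not given in the article):
#   generation -> (GT/s per lane, payload bits per block, total bits per block)
GENERATIONS = {
    "1.0": (2.5, 8, 10),      # 8b/10b encoding
    "2.0": (5.0, 8, 10),      # 8b/10b encoding
    "3.0": (8.0, 128, 130),   # 128b/130b encoding
    "4.0": (16.0, 128, 130),
    "5.0": (32.0, 128, 130),
}


def lane_bandwidth_gbs(gen):
    """Usable GB/sec for a single lane in one direction."""
    rate_gt, payload_bits, block_bits = GENERATIONS[gen]
    usable_gbit = rate_gt * payload_bits / block_bits  # strip encoding overhead
    return usable_gbit / 8                             # bits -> bytes


def slot_bandwidth_gbs(gen, lanes):
    """Usable GB/sec for a slot of the given width in one direction."""
    return lane_bandwidth_gbs(gen) * lanes


if __name__ == "__main__":
    print("Width | " + " | ".join(f"PCIe {g}" for g in GENERATIONS))
    for lanes in (1, 2, 4, 8, 16):
        cells = " | ".join(
            f"{slot_bandwidth_gbs(g, lanes):8.2f}" for g in GENERATIONS
        )
        print(f"x{lanes:<4} | {cells}")
```

Under these assumptions a PCIe 5.0 x16 slot works out to roughly 63GB/sec in each direction, which the table rounds to ~64GB/sec.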
Meanwhile the big question, of course, is when we can expect to see PCIe 5.0 start showing up in products. Setting aside the additional complexity of PCIe 5.0’s higher signaling rate, even after PCIe 4.0’s protracted development period we’re only now seeing 4.0 gear arrive in server products, and the first consumer gear technically hasn’t started shipping yet. So even with the quick turnaround time on PCIe 5.0 development, I’m not expecting to see 5.0 show up until 2021 at the earliest – and possibly later than that depending on just what that complexity means for hardware costs.
Ultimately, the PCI-SIG’s annual developer conference is taking place in just a few weeks, on June 18th, at which point we should get some better insight as to when the SIG members expect to finish developing and start shipping their first PCIe 5.0 products.
Source: PCI-SIG
55 Comments
Lord of the Bored - Wednesday, May 29, 2019
Probably both.
edzieba - Thursday, May 30, 2019
Bandwidth != latency. For DDR4, "I want that byte!" has single-digit nanoseconds of latency (usually just below 10ns). For PCIe, even with DMA, you're looking at several hundred to over a thousand nanoseconds (or in worst-case contention, maybe several microseconds). If all you're doing is shoving a big block of data across a bus then raw bandwidth is great. If you want to put a memory interface over that bus, then you would very quickly notice the difference between DDR and PCIe.
xception - Thursday, May 30, 2019
With the sizes of video cards lately makes you wonder why they aren't extending the slots to x32 as well.
Duncan Macdonald - Thursday, May 30, 2019
Power dissipation may be a problem - the faster that transistors switch the more power is needed. As PCIe 5.0 has a data rate around 32Gbits/sec per lane this implies that the chips are going to get hot. (The X570 chipset which uses PCIe 4.0 needs active cooling - this problem will be worse for PCIe 5.0).
It is possible that CPUs may need a separate chip in their package to do a mux/demux job to reduce the data rates to the level that the rest of the silicon can handle (transistors built to operate at 32GHz are unlikely to be able to use the same process as the rest of the CPU running at 5GHz or less).
There is also going to need to be some serious microwave engineering on motherboards, add-in cards and connectors to deal with the 32GHz signals. (At 32GHz a full wavelength is under 1cm!!) Screening of PC cases may well need to be much improved to stop PCs interfering with other users of the 32GHz frequency range.
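A quick back-of-the-envelope check of the wavelength figure, as a sketch only. The relative permittivity of 4.0 is an illustrative assumption for an FR4-like board material, not a value from the thread, and the helper name is made up. The 16GHz row is included because a 32 GT/s NRZ signal has its fundamental (Nyquist) frequency at half the bit rate.

```python
# Wavelength of a signal in free space and in a dielectric.
SPEED_OF_LIGHT_M_S = 299_792_458.0


def wavelength_mm(freq_hz, rel_permittivity=1.0):
    """Wavelength in millimetres in a medium with the given relative permittivity."""
    # Propagation velocity drops by sqrt(relative permittivity) in a dielectric.
    velocity_m_s = SPEED_OF_LIGHT_M_S / rel_permittivity ** 0.5
    return velocity_m_s / freq_hz * 1000.0


if __name__ == "__main__":
    ASSUMED_BOARD_PERMITTIVITY = 4.0  # assumption: FR4-like material
    for label, freq_hz in (("32 GHz", 32e9), ("16 GHz", 16e9)):
        free_space = wavelength_mm(freq_hz)
        on_board = wavelength_mm(freq_hz, ASSUMED_BOARD_PERMITTIVITY)
        print(f"{label}: {free_space:.2f} mm in free space, "
              f"{on_board:.2f} mm with permittivity {ASSUMED_BOARD_PERMITTIVITY}")
```

On these numbers a full wavelength at 32GHz is a little over 9mm in free space and roughly half that on the assumed board material.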
brakdoo - Thursday, May 30, 2019
Your numbers are one direction, not duplex...
Tigran - Thursday, May 30, 2019
So "actual bandwidth" in graph is duplex (bidirectional), whereas I/O bandwidth is bandwidth for one direction? It's in my question below. In this case you're right, there is a mistake in Anand's table ("Full Duplex").
James5mith - Thursday, May 30, 2019
They aren't.
unidirectional = 64GB/s x16 PCIe 5.0
Full Duplex = 64GB/s x16 PCIe 5.0
Aggregate (The crap that marketing teams do that adds the bandwidth of both directions of a full duplex link) = 128GB/s x16 PCIe 5.0
1GbE = 1Gbps full duplex. I.e. 1Gbps in/out simultaneously. It seems like nobody thinks about the fact that speeds in PCIe USED to be reported the same way as network gear, I.e. full duplex.
Kudos to the writer of the article reporting it correctly. I.e. 64GB/s FULL DUPLEX.
Tigran - Thursday, May 30, 2019
Quote from PCI-SIG's press release: "PCIe 5.0 Specification Highlights - Delivers 32 GT/s raw bit rate and up to 128(!!!) GB/s via x16 configuration". Is it a misprint?
Tigran - Thursday, May 30, 2019
I can see now it's not a misprint - the same in graph. What's the difference between actual and I/O bandwidth?
arashi - Thursday, May 30, 2019
GT measures raw data Transfer. Bytes or bits measure effective data transferred.
Due to checksums GT/s > Gb/s.
8GT/s ~= 1GB/s
32GT/s * ~1 b per T * 16 lanes * 1/8 b per B * 2 directions = ~128GB/s
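The one-line estimate above can be written out with units. This is a sketch that assumes PCIe 5.0's 128b/130b line encoding (a detail not stated in the thread) in place of the "~1 b per T" approximation:

```latex
\begin{aligned}
B_{\text{per direction}}
  &= 32\,\frac{\text{GT}}{\text{s}}
     \times \frac{128}{130}\,\frac{\text{b}}{\text{T}}
     \times 16\ \text{lanes}
     \times \frac{1}{8}\,\frac{\text{B}}{\text{b}}
   \approx 63\ \text{GB/s} \\
B_{\text{aggregate}}
  &= 2 \times B_{\text{per direction}}
   \approx 126\ \text{GB/s}
\end{aligned}
```

Rounding the encoding factor up to 1 bit per transfer gives the 64GB/s per-direction and 128GB/s aggregate figures quoted in the thread.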