Hello and welcome to TI Precision Labs. In this series, we're going to discuss Peripheral Component Interconnect Express, more commonly known as PCIe. PCIe is a motherboard expansion bus standard introduced in 2003 to enable high-speed serial communication between the CPU and its peripheral components. Today, it has become the primary motherboard expansion bus standard and a popular communication method for many other on-board applications. Let's explore how this high-speed interface developed and how it works. The predecessor to PCIe is PCI. The PCI bus existed on many motherboards in the '90s, along with a few other expansion bus technologies. Motherboard expansion bus standards are designed for communication between the CPU and devices plugged into the motherboard's expansion slots. Initially, all expansion bus standards used a parallel bus, which means that data is sent and received over multiple channels. At the time of its introduction in 2003, the PCIe serial bus standard was meant to replace these older parallel buses to enable a higher data rate and to simplify system design. The PCIe standard was defined by the PCI-SIG organization. Since then, the PCIe standard has iteratively improved to accommodate the latest bandwidth needs of modern computers. This year, in 2021, the PCIe 6.0 specification will be introduced, enabling 64 gigatransfers per second, or 64 gigabits per second per lane. One unique feature of the PCIe standard is the ability to increase the number of lanes from 1 up to 32 to increase its throughput, a feature inspired by its parallel bus predecessor. A PCIe 6.0 link that is 16 lanes wide would have a data rate of 128 gigabytes per second, which is extremely fast by today's standards.
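The 128 gigabytes per second figure follows from simple arithmetic: per-lane rate times lane count, divided by 8 bits per byte. Here is a minimal Python sketch of that calculation. The function name and the rate table are illustrative and not part of the video. The table uses the commonly cited per-lane rates for each generation (only the Gen 1 and Gen 6 values are stated in this video), and the sketch treats one transfer as one bit and ignores encoding and protocol overhead, so it gives a raw one-direction ceiling rather than usable throughput.

```python
# Per-lane transfer rates in GT/s for each PCIe generation.
# Gen 1 (2.5) and Gen 6 (64) are stated in the video; the others are the
# commonly cited values and are included here as an assumption.
GT_PER_S = {1: 2.5, 2: 5.0, 3: 8.0, 4: 16.0, 5: 32.0, 6: 64.0}


def raw_link_rate_gbytes(gen: int, lanes: int) -> float:
    """Raw one-direction link rate in gigabytes per second.

    Treats one transfer as one bit and ignores encoding overhead.
    """
    if gen not in GT_PER_S:
        raise ValueError(f"unknown PCIe generation: {gen}")
    if not 1 <= lanes <= 32:
        raise ValueError("lane count must be between 1 and 32")
    # gigabits per second across all lanes, converted to gigabytes
    return GT_PER_S[gen] * lanes / 8


if __name__ == "__main__":
    # The example from the video: PCIe 6.0, 16 lanes wide.
    print(raw_link_rate_gbytes(6, 16))  # 128.0
```

The same function shows why lane count matters as much as generation: doubling the lanes at a fixed generation doubles the raw rate.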
PCIe communication is hierarchical, meaning there is a single source through which all the data passes, which is the root complex. The information passes to the root complex from multiple PCIe endpoints. Let's examine in detail how PCIe initializes and communicates over a typical link such as the one highlighted here. Here we have a diagram containing devices that are common in a PCIe link. A root complex is the interface between the system CPU and memory and the rest of the PCIe structure. The root complex is either integrated into the CPU directly, or is external to the CPU as a discrete component. A repeater is a signal conditioning device. For more information regarding signal conditioners, please refer to the "What is a signal conditioner?" video. Repeaters fall into two categories: retimers and redrivers. Both are common PCIe components used to maintain the signal quality of high-speed links. In our example, we'll be using a PCIe 4.0 compliant retimer. An endpoint is a general term for a PCIe end component. This could represent many different types of PCIe devices. In our case, let's assume that the endpoint is a graphics processing unit, or GPU. Before we examine how a PCIe link is established and data is transferred through the PCIe protocol, let's go over the function of some common PCIe control signals. PERST is referred to as a fundamental reset. It should be held low until all the power rails in the system are stable. A transition from low to high on this signal usually indicates the beginning of link initialization. The WAKE and CLKREQ signals are both used for transitioning to and from low power states, which are beyond the scope of this video. Please refer to the upcoming link training and status state machine video for more information. A REFCLK is a prerequisite for a PCIe device to begin data transmission. This 100 megahertz reference clock signal is used by the PCIe device to generate the high-speed PCIe data within the link.
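To make the role of these control signals concrete, here is a small Python sketch of the preconditions described above: power rails stable, PERST released from low to high, and a 100 megahertz REFCLK present. The class, field, and function names are hypothetical, and the 300 ppm frequency tolerance is an assumption chosen for illustration, not a value taken from this video. Treat it as a rough model of the checks, not as device firmware.

```python
from dataclasses import dataclass

REFCLK_HZ = 100_000_000  # 100 MHz reference clock, as stated in the video


@dataclass
class SidebandState:
    """Hypothetical snapshot of the signals a PCIe device depends on."""
    power_rails_stable: bool
    perst_high: bool   # True once PERST has gone from low to high
    refclk_hz: float   # measured REFCLK frequency, 0 if no clock is present


def may_start_link_init(state: SidebandState, tolerance_ppm: float = 300.0) -> bool:
    """Return True if the preconditions for link initialization are met.

    tolerance_ppm is an assumed frequency tolerance for this sketch.
    """
    # PERST must stay low until the rails are stable, so a released reset
    # with unstable power is not a valid starting point.
    if not state.power_rails_stable or not state.perst_high:
        return False
    allowed_error_hz = REFCLK_HZ * tolerance_ppm / 1_000_000
    return abs(state.refclk_hz - REFCLK_HZ) <= allowed_error_hz


if __name__ == "__main__":
    print(may_start_link_init(SidebandState(True, True, 100_000_000)))   # True
    print(may_start_link_init(SidebandState(True, False, 100_000_000)))  # False
```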
After all devices in a PCIe link are powered and have a reference clock provided, a PCIe device will use a receiver detect circuit on each lane to determine if it has a link partner to pair with. Assuming that the PCIe Rx detect circuit sees the other device, each individual lane will then begin to transmit serial data at 2.5 gigabits per second. This is the lowest and most fundamental PCIe data rate, which was specified in the original PCIe 1.0 specification. PCIe 1.0, also called PCIe Gen 1, is compatible with any PCIe device, so every PCIe link will begin with the same link initialization process. In this example, the root complex, the retimer, and the endpoint will all begin transmitting ordered sets of data called training sequences at PCIe Gen 1 speeds in order to establish bit and symbol lock. This stage of PCIe link initialization is referred to as the polling state. At the end of this process, each device will be able to interpret received data and respond accordingly. This allows the PCIe connection to begin the link training process and proceed into the configuration state. In the configuration state, a lane-to-lane de-skew process takes place in which any misalignment in the data due to varying channel length is compensated for. The PCIe link width is also determined at this stage. At the end of this process, each lane will be associated with a specific link number, and a lane number within that link. If there are multiple links, the PCIe connection is referred to as bifurcated. However, in our example, we have a single non-bifurcated connection, and all lanes will be assigned to link number 0. Keep in mind that the link is split into two parts by the PCIe retimer. The links on both sides of the retimer undergo link initialization separately. After determining link and lane numbers, the PCIe link can move into a number of states.
But for this example, it will move into what is called the L0 state, which is the normal operational state where data and packets are sent and received. Once we've reached L0, the root complex and endpoint can successfully communicate with each other. Alternatively, the PCIe link could transition into a number of low power states, or into another link training state called recovery. However, these states are beyond the scope of this video. Please stay tuned for the next video, which will cover the link training and status state machine in more detail. If all the devices in the PCIe link support PCIe Gen 2 or higher data rates, then the link speed may also be increased up to the highest data rate supported. If the new data rate is to be PCIe Gen 3 or higher, the PCIe link will need to go through an additional link optimization process called link equalization. In link equalization, or Link Eq, the goal is to modify the characteristics of the transmitted data waveform for each port in a way that results in the most stable PCIe link. The PCIe specification defines the ways in which the signal can be modified by providing preset transmitter configurations. These configurations are aptly called presets. For PCIe Gen 3 and Gen 4, there are 11 presets numbered from 0 to 10 that may be used, each with its own unique signal characteristics. The preset values for each port are negotiated through Link Eq until the ideal preset is chosen. The downstream port begins this Link Eq process by sending its desired transmitter preset values for each lane to the upstream device. This is referred to as Phase 0 of Link Eq. Shortly after receiving the downstream port's request, the upstream port increases the data rate of the link to Gen 3 and begins transmitting training sequences back to the downstream port using the desired presets.
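The speed and preset rules just described can be summarized in a short Python sketch. It assumes the link settles on the highest generation that every device supports, that equalization applies from Gen 3 upward, and that Phase 0 carries one preset in the range 0 to 10 per lane, as stated above. All function names and the example device capabilities are hypothetical, and the real negotiation involves more steps than this simplification shows.

```python
GEN3_GEN4_PRESETS = range(0, 11)  # the 11 presets, numbered 0 to 10


def negotiated_generation(max_gen_per_device: list[int]) -> int:
    """Highest PCIe generation supported by every device in the link."""
    if not max_gen_per_device:
        raise ValueError("at least one device is required")
    return min(max_gen_per_device)


def needs_equalization(target_gen: int) -> bool:
    """Link Eq is required when moving to Gen 3 or higher."""
    return target_gen >= 3


def phase0_preset_request(presets_per_lane: list[int]) -> dict[int, int]:
    """Model the Phase 0 request as a {lane number: preset} mapping.

    Rejects any preset outside the 0 to 10 range used for Gen 3 and Gen 4.
    """
    for lane, preset in enumerate(presets_per_lane):
        if preset not in GEN3_GEN4_PRESETS:
            raise ValueError(f"lane {lane}: preset {preset} is out of range")
    return dict(enumerate(presets_per_lane))


if __name__ == "__main__":
    # Hypothetical capabilities: root complex Gen 4, retimer Gen 4, GPU Gen 3.
    gen = negotiated_generation([4, 4, 3])
    print(gen, needs_equalization(gen))         # 3 True
    print(phase0_preset_request([7, 7, 5, 5]))  # {0: 7, 1: 7, 2: 5, 3: 5}
```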
After the link has been increased to Gen 3 speeds, the Link Eq process continues to optimize the link by sending preset values back and forth to negotiate the preset configuration for each port. The goal of Phase 1 of the Link Eq process is to optimize the link enough that it can exchange training sequences and complete the remaining Link Eq phases for fine tuning. During this phase, identical training sequences will be sent repeatedly to ensure the correct presets are received, despite the possibility of poor link quality. After Phase 1 has achieved a link with a bit error rate of less than 10 to the negative 4, the link is ready for fine tuning. The negotiation continues with Phase 2, further optimizing the preset values for the upstream port, while Phase 3 performs the same negotiation for the downstream port. After completing Phase 3 of the Link Eq process, Link Eq is completed, and the Gen 3 PCIe link should have a bit error rate of less than 10 to the negative 12. In some motherboard designs, particularly those with long channel links, this level of signal quality is not possible, and additional signal conditioning may be required. Fortunately for us, we have a retimer in the link and should have no issues accommodating almost any channel length. The link now moves into a Gen 3 L0 state, and can communicate reliably at Gen 3 speeds. If you have any questions about PCIe or PCI, please visit our engineer-supported forums at e2e.ti.com, and look for us in the Interface section. If you want more information on signal conditioning, please check out our other presentations and our TIPL series.