Ultra Low Voltage Logic Design For High-Speed Processing


Ole Herman Schumacher Elgesem

Thesis submitted for the degree of Master in Informatics: Nanoelectronics & Robotics (Nanoelectronics)
60 credits

Department of Informatics
Faculty of Mathematics and Natural Sciences
UNIVERSITY OF OSLO

August 15, 2017



© 2017 Ole Herman Schumacher Elgesem
Ultra Low Voltage Logic Design
http://www.duo.uio.no/
Printed: Reprosentralen, University of Oslo

Abstract

Digitalization of modern society and industry has changed the way we communicate, travel, work, and trade. Globally there are now more connected devices than people, and it is common for consumers to own several internet-enabled products. Researchers expect a large increase in connected devices in the near future. The Internet of Things (IoT) will consist of wearables, smart homes, vehicles, sensors, cameras, appliances, et cetera. For practical reasons, many of these devices cannot have wired power supplies. Batteries are inconvenient and require significant energy and resources to manufacture. Energy harvesting is a promising opportunity for low-power connected devices. Small amounts of solar, heat, or kinetic energy can be used for simpler devices and sensors. To decrease the power consumption of electronics, the supply voltage is often lowered; however, this negatively impacts speed. Low voltage (low power) technology with high (relative) speed is a necessity for many IoT applications.

This thesis presents the Ultra Low Voltage Dual Rail (ULVDR) logic style, a technology aimed at achieving high processing speeds at low supply voltage. Implementations of ULVDR inverters, NAND/NOR gates, XOR gates, and adders are shown. Useful simulations, principles, and guidelines for creating ULVDR circuits are introduced. When compared to equivalent circuits implemented in Cascode Voltage Switch Logic (CVSL), the ULVDR NAND gates were 57 times faster, the ULVDR XOR gates were 28 times faster, and the ULVDR full-adder was 52 times faster. Simulations were done on long chains (30-32 elements) using a supply voltage of 300 mV. The increase in speed can enable new types of applications where high processing speeds are essential, or allow lower power consumption by further decreasing the supply voltage or by putting circuits to sleep when they are done processing.


Preface

This thesis is submitted for a Master of Science degree in Informatics: Nanoelectronics and Robotics, with a specialization in nanoelectronics. After finishing a Bachelor of Science degree in 2015, I started a master’s degree in August 2015. Research and thesis work were conducted over a 2 year period, while I also took courses in the fields of informatics and natural sciences.

I would like to thank my supervisors and professors, Omid Mirmotahari and Yngvar Berg, for teaching me digital electronics, logic and ULV circuits. Their help with getting started in research, as well as their valuable feedback towards the end of the thesis, is greatly appreciated.

My good friend, Tor Jan Derek Berstad, also started his master’s degree in 2015 and I’d like to thank him for the collaboration during the first year. We’ve studied together for 5 years now, and they’ve all been good. I appreciate his continued support and input on the thesis.

Finally, I’d like to thank my family for their love, support, and inspiration. My parents, Anita Schumacher and Ole Herman Elgesem, have always supported me. I wouldn’t be where I am today if it weren’t for the opportunities and encouragement they have given me. My siblings, Maria, Kristina, Theresa, and Johannes, have inspired and challenged me. I respect and appreciate all of them.

There are many more friends and family members who deserve to be mentioned, but in the interest of keeping this preface short, I have limited it to close family and people involved with the thesis work.


Contents

1 Introduction
  1.1 Motivation
  1.2 Outline of this thesis
  1.3 Background
    1.3.1 Previous research
    1.3.2 Terminology, notation and conventions
    1.3.3 CMOS technology
    1.3.4 Pseudo nMOS logic
    1.3.5 Domino logic
    1.3.6 Cascode Voltage Switch Logic (CVSL)
    1.3.7 Floating gate logic
    1.3.8 Low voltage operation

2 Inverters
  2.1 CMOS Inverter
  2.2 CVSL Inverter
  2.3 Transistor drain currents at low voltage
  2.4 ULVDR Inverter
    2.4.1 ULV Inverters
    2.4.2 Precharge Matching
    2.4.3 Circuit topology
    2.4.4 Simulation & speed
    2.4.5 Floating gate voltages
    2.4.6 Input capacitors

3 NAND/NOR gates
  3.1 Flexibility of dual rail gates
  3.2 CVSL NAND gate
  3.3 ULVDR NAND gate
    3.3.1 Circuit topology
    3.3.2 Precharge Matching
    3.3.3 Simulations & speed
  3.4 Monte Carlo simulation
  3.5 Chain delay

4 XOR gates
  4.1 CVSL XOR
  4.2 XOR implemented using other gates
    4.2.1 Logic verification
    4.2.2 Evaluation delay
  4.3 ULVDR XOR gate
    4.3.1 Circuit topology
    4.3.2 Keeper transistors in simplified ULVDR XOR
    4.3.3 Precharge Matching
    4.3.4 Simulation and Speed
    4.3.5 Problems with the ULVDR XOR
  4.4 Monte Carlo simulation
  4.5 Chain delay

5 Adder logic
  5.1 Logic operations needed for bigger circuits
  5.2 CVSL full-adder
  5.3 ULVDR adder
  5.4 32-bit Ripple-Carry Adder

6 Conclusion
  6.1 Summary and results
  6.2 My contributions
  6.3 Suggestions for further research

7 Appendix
  7.1 Schematic figures
  7.2 Articles
  7.3 Scripts
  7.4 Plotting
    7.4.1 scripts/plot_csv.py
    7.4.2 scripts/plotter.py

Bibliography

List of Figures

1.1 Standard CMOS inverter, NAND and NOR gates
1.2 Pseudo nMOS inverter and NOR gate
1.3 A 4 input domino CMOS gate - X = AC + BD
1.4 Regular (left) and dynamic (right) CVSL gate structure
1.5 Floating gate logic, the right figure shows the recharge transistor
1.6 A simple floating gate inverter (NP-domino, precharge to 1)
1.7 Drain current, ID, as a function of supply voltage
1.8 ID for different nMOS lengths (VDD = 300 mV)

2.1 CVSL inverter transient simulation
2.2 Iterations of the ULV inverter (1,3,5,7) - P1 version shown
2.3 ULVDR Inverter 0P1
2.4 ULVDR Inverter 1P0
2.5 ULVDR 0P1 inverter floating gate voltage peaks at 575 mV
2.6 ULVDR 1P0 inverter floating gate bottoms at −263 mV
2.7 The impact of capacitor sizes on chain delay

3.1 AND, OR, NOR equivalents using one dual-rail NAND gate
3.2 CVSL NAND gate delay (best case)
3.3 A ULVDR NAND gate - 0P1 version
3.4 A ULVDR NAND gate - 1P0 version
3.5 Logic verification of ULVDR NAND 0P1
3.6 Logic verification of ULVDR NAND 1P0
3.7 Monte Carlo simulation of CVSL NAND
3.8 Monte Carlo simulation of ULVDR NAND
3.9 NAND gate chain
3.10 Output delay for chain of 30 CVSL NAND gates
3.11 Output delay for chain of 30 ULVDR NAND gates

4.1 A CVSL XOR gate
4.2 A simplified CVSL XOR gate
4.3 CVSL XOR parasitic delay (worst case)
4.4 XOR implementation using other gates
4.5 Logic verification of ULVDR XOR 0P0
4.6 Logic verification of ULVDR XOR 1P1
4.7 Worst case delay for ULVDR XOR 0P0
4.8 ULVDR XOR 0P1 (not simplified)
4.9 ULVDR XOR 0P1 (simplified)
4.10 Circuits used for matching Full XOR
4.11 Circuits used for matching simplified XOR
4.12 Logic verification of ULVDR XOR 0P1
4.13 Worst case delay of ULVDR XOR 0P1
4.14 ULVDR XOR 0P1 floating gates
4.15 Monte Carlo simulation of CVSL NAND
4.16 Monte Carlo simulation of ULVDR XOR 1P0
4.17 Monte Carlo simulation of ULVDR XOR 0P1
4.18 Output delay for chain of 30 CVSL XOR gates
4.19 Output delay for chain of 30 ULVDR XOR (0P1) gates

5.1 Half adder (left) and full adder (right)
5.2 CVSL full-adder response to binary counting sequence
5.3 ULVDR 0P0 full-adder
5.4 ULVDR 0P0 full-adder response to binary counting sequence
5.5 Carry propagation through a 32-bit CVSL ripple-carry adder
5.6 Carry propagation through a 32-bit ULVDR ripple-carry adder

List of Tables

1.1 Common symbols used in electronics
1.2 Symbols specific to transistors and sizing
1.3 nMOS drain currents at different supply voltages

2.1 Matched CMOS inverter sizes at VDD = 300 mV
2.2 Minimal CVSL inverter
2.3 Sizes needed for matching pMOS/nMOS drain currents
2.4 Matched transistors (minimum) for 0P1 (left) and 1P0 (right)
2.5 Matched transistors (optimal) for ULV inverters
2.6 ULVDR inverter sizes at VDD = 300 mV

3.1 Minimal CVSL NAND
3.2 Matching the pull-up and pull-down networks of ULVDR NAND
3.3 Minimal ULVDR NAND dimensions
3.4 ULVDR NAND delay for all 4 possible inputs
3.5 Delays for a chain of 30 NAND gates

4.1 Minimal CVSL XOR delay, all output transitions
4.2 ULVDR XOR precharge matching results
4.3 Delays for both versions of ULVDR XOR
4.4 Delays for a chain of 30 XOR gates

5.1 All useful 2-input gates
5.2 32-bit ripple-carry adder delays (worst case)


List of Acronyms

CMOS    Complementary Metal Oxide Semiconductor
CVSL    Cascode Voltage Switch Logic
FG      Floating Gate
IoT     Internet of Things
MEMS    Micro Electro-Mechanical Systems
MOSFET  Metal-Oxide-Semiconductor Field-Effect-Transistor
MSB     Most Significant Bit
nMOS    n-channel MOSFET
pMOS    p-channel MOSFET
ULV     Ultra Low Voltage
ULVDR   Ultra Low Voltage Dual Rail
UV      Ultraviolet


Glossary

domino logic: Logic style where each element can only perform 1 fast evaluation, and then has to be reset (slowly).

dynamic logic: Clocked logic or logic otherwise dependent on timing. Dynamic logic typically has a setup/reset time and specific timing requirements for the inputs. Opposite of static logic.

effective width: Physical width of a transistor multiplied by its number of fingers.

evaluation: The evaluation period is when ϕ is low. The evaluation transistor is used to drive the output during evaluation.

floating gate: A transistor gate node which is (at some times) cut off from the rest of the circuit. Capacitive coupling is used to affect the floating gate voltage. During precharge these nodes are driven by the recharge transistors, and thus not floating.

GND: Short for ground. Zero-voltage reference.

keeper transistor: Used in feedback during evaluation to shut off either the pull-up or pull-down network, based on the output.

layout: The physical/spatial implementation of a circuit on a silicon wafer. Describes the structure and size of the real world product.

mean, µ: Arithmetic mean. Average. Sum all elements and divide by element count.

minterm: Systematic approach to defining boolean algebraic expressions. Minimal terms needed to express a function. Conventionally, each term is a product of all inputs (or their inverses). For example, logic OR can be defined by 3 minterms: AB + A′B + AB′.

ϕ: Clock signal. During precharge: ϕ = 1, during evaluation: ϕ = 0. ϕ′ is the inverse of ϕ.

polarity: The direction of rising/falling edges for a signal/gate. A signal which can have a rising edge during evaluation is considered positive, a signal which can have a falling edge during evaluation is considered negative. A gate has, by definition, the same polarity as its output signal. Gates usually have opposite input and output polarity, but some complex gates can have the same polarity for both input and output.

precharge: The precharge period is when ϕ is high. The precharge transistor is used to drive the output during precharge.

programming: Charging a floating gate, setting its voltage to a specific value to alter the behavior of the circuit. Originally used in RAM and later for computation.

recharge: The recharge transistor charges the floating gate during precharge.

static logic: Circuits without a clock input (or similar) to change modes of operation. Timing of inputs is not important, a static circuit always evaluates. Opposite of dynamic logic.

transconductance, gm: ∂ID/∂VG, the ratio of the change in drain current to the change in gate voltage of a Field-Effect-Transistor. A high transconductance means a large increase in conductance between source and drain for a small increase in gate voltage.

VDD: Supply voltage. 300 mV for the purposes of this thesis.

|VGS|: Gate-to-source voltage (usually referring to the evaluation transistor).

Chapter 1

Introduction

1.1 Motivation

Over the past 30 years, electronics have become faster, cheaper and much more prevalent. This has pushed the industry toward faster and smaller electronic devices with lower supply voltages and lower power consumption. In 1965, Moore predicted an exponential increase in components per integrated circuit [16]. This trend proved accurate for several decades, and also applied to other areas of the silicon electronics industry, like RAM density, power consumption, and fabrication plant costs [20]. Currently, 90nm, 60nm or even smaller processes are common. As feature sizes approach a few atoms in length, further miniaturization becomes impossible. In 2010, Moore stated that these fundamental limits will bring an end to Moore’s law in 10-20 years [6].

Typically, lowering the supply voltage reduces both power consumption and speed. Lower power consumption is desirable, but lower speed is an obstacle for creating low-power devices. Faster logic at low supply voltage is an exciting opportunity for reduced energy requirements and new applications.

Miniaturization and low-power technology enable small connected devices to operate without a wired power supply. Sensors, cameras, appliances, wearable technology, vehicles and infrastructure can connect to the internet, in addition to traditional computers, servers and network infrastructure. All of these are referred to colloquially as connected devices. In 2015, Cisco estimated that the number of connected devices would double by 2020, from 25 to 50 billion [18]. The expected rapid increase in connected devices, where physical objects are digitally enabled to capture, generate, and process unprecedented amounts of information, is referred to as the Internet of Things (IoT). All of these devices have requirements in terms of functionality and power consumption. For many applications, it is inconvenient to replace or recharge batteries. Producing large amounts of batteries also requires significant energy and resources. The "greening of IoT" (making edge- and network-devices as energy efficient as possible) has been recognized as a key challenge for IoT [9].

Energy harvesting can be used to supply low-power electronics, eliminating the need for batteries. Solar, mechanical, or thermal energy can be utilized to power small devices. These methods have been demonstrated to work in watches, shoes, wireless remotes, and Micro Electro-Mechanical Systems (MEMS) [17]. The limited power output of energy harvesting solutions restricts where they can be used. Electronics with low power requirements can enable the use of energy harvesting in more areas, preserving valuable resources.

In the biomedical field, implanted sensors and devices can aid humans in a number of ways. Hearing and vision implants can benefit from increased energy efficiency. It is often desirable to perform low-power/high-speed computations on these devices. Implanted sensors can collect health data, like glucose levels for people with diabetes. Better biomedical devices can significantly improve patients’ quality of life.

All of the mentioned applications and trends illustrate the need for high-speed/low-power electronics. Advances in this area are necessary to enable new types of devices and use cases. Minimizing energy usage and moving away from batteries offer convenience and significant environmental benefits.

1.2 Outline of this thesis

Chapter 1 (Introduction) gives an introduction to the Ultra Low Voltage (ULV) logic field, in terms of motivation, history, previous research and concepts. The Background section explains core concepts used in ULVDR logic. That includes Complementary Metal Oxide Semiconductor (CMOS), CVSL, and Floating Gate (FG) logic, as well as low voltage / sub-threshold transistor operation. ULV and ULVDR inverters have been extensively covered in previous papers. In Chapter 2, CMOS, CVSL, ULV, and ULVDR inverters are covered and compared. Chapters 3 and 4 present implementations of NAND and XOR gates respectively. These gates are then used in Chapter 5 to build adders and show the performance of the ULVDR logic style. All ULVDR circuits are compared to CVSL equivalents. The conclusion (Chapter 6) discusses the importance of the results and elaborates on the future of ULVDR research. Additional information and resources are in the Appendix, including papers ready for publication, scripts used, the license for figures, etc.

The main focus of this thesis is developing high-speed ULVDR technology. Most previous research in ULV and ULVDR has focused on inverters. Thus, exploring the potential of ULVDR in bigger circuits, like NAND/NOR, XOR, and adders, is important. Transistor dimensions, logic verification and evaluation speed are considered essential. Concepts like power consumption, layout, manufacturing, etc. are covered in other papers and can be explored further in future research.

1.3 Background

The ULV logic style combines several techniques to achieve high speed at low supply voltage. It has been developed over several years. This section gives an overview of the techniques used and previous iterations of the technology.

1.3.1 Previous research

ULV logic is based on CMOS [21], domino logic [10], and Floating Gate (FG) logic [11]. The evolution of the ULV logic family can be seen in [4], [2], [3], and [14]. These papers show different iterations of the single rail ULV inverter (ULV1-ULV7). Domino logic was introduced in ULV2. ULV3 and ULV4 were inspired by pseudo-nMOS, eliminating the duplicate evaluation networks by having the input only connected to the nMOS transistor. ULV5 improved the stability and robustness of the ULV3 by adding a keeper transistor, and had lower power consumption than ULV4 as it didn’t use the static precharge transistor. Thus ULV5 was named the Low Power ULV inverter. The final iteration of the ULV inverter was the ULV7, based on the ULV5. In this iteration another keeper transistor was added to turn off the precharge transistor when the output switched.

Bechmann covers the single rail ULV7 inverter in his master’s thesis [1]. The thesis includes simulations, layout and real world measurements of the ULV7 inverter. He also discussed the potential of this logic style in adder circuits, and did some simulations on adders. ULVDR is a dual rail version of the ULV7, inspired by CVSL [7]. It has two complementary circuits, and during evaluation exactly one of the outputs will switch. ULVDR inverters and NOR gates have been shown in [15] and [5].

1.3.2 Terminology, notation and conventions

The supply voltage, unless otherwise noted, is VDD = 300 mV. All simulations were done using a TSMC 90 nm process, with low threshold voltage transistors (nch_lvt & pch_lvt). Cadence Design Systems’ Virtuoso (schematic design) and Spectre (simulation) were used.
This thesis focuses on dynamic logic, which is dependent on a clock signal. ϕ is the clock signal and evaluation happens when ϕ is low. When ϕ is high, circuits are reset and outputs are precharged to a predetermined voltage. This time period is called precharge. Circuit inputs are labelled A, B, C and so on, while outputs are typically labelled X, Y, Z. For the sake of consistency, input labels are always ordered based on their evaluation transistors: the input transistor closest to the output node is called A. In 2-input circuits, input transistor A is always connected directly to the output (drain), while B is connected to the voltage source. Inverted signals are typically referred to using a prime symbol: A′ is the inverse of A. Due to technical limitations, labels in the simulation software do not use the prime symbol. Instead, 1 is used to indicate non-inverted and 2 indicates inverted. Thus, A1 = A, A2 = A′, B1 = B, B2 = B′, X1 = X, X2 = X′. In some plots, labels IN (input) and OUT (output) are used for clarity. Note that for ULVDR and other precharged logic, complementary signals only differ (are inverses of each other) during the evaluation period.

When talking about transistors and sizes, many abbreviations are used in variable names. Tables 1.1 and 1.2 show common symbols and abbreviations used in formulae and calculations. For example, when discussing transistor sizes, WpP is the width of a p-channel MOSFET (pMOS) precharge transistor. See the Glossary and List of Acronyms for more information.

Table 1.1: Common symbols used in electronics

Symbol  Meaning            Unit
t       Time               s
P       Power              W
V       Voltage            V
I       Current            A
R       Resistance (1/G)   Ω
G       Conductance (1/R)  S
X       Reactance (1/B)    Ω
B       Susceptance (1/X)  S
Z       Impedance (1/Y)    Ω
Y       Admittance (1/Z)   S

Table 1.2: Symbols specific to transistors and sizing. Entries marked * are used in subscripts.

Symbol  Meaning
W       Width
L       Length
N       Fingers
p       pMOS *
n       nMOS *
G       Gate *
S       Source *
D       Drain *
E       Evaluation (transistor) *
P       Precharge (transistor) *
s       Series *
∥       Parallel *
gm      Transconductance
r       Rising edge *
f       Falling edge *
e       Edge (input rise/fall time) *
d       Delay (parasitic) *
µ       Arithmetic mean (Average) *

1.3.3 CMOS technology

CMOS logic can be traced back to 1963 [21] and should be a familiar concept for anyone involved in modern electronics and circuit design. The general principle is to have two combinational transistor networks: a pull-up network of pMOS transistors and a pull-down network of n-channel MOSFET (nMOS) transistors. The networks are complementary (one pins the output to logic high, the other to logic low) and designed in such a way that exactly one of them is conducting at any time. In practice, this means that any series connection of transistors in the pull-down network should be implemented as a parallel connection in the pull-up network, and vice versa. As pMOS and nMOS transistors perform opposite functions with respect to the input, this means that when both transistors in a series configuration are conducting, the corresponding parallel transistors are both non-conducting. Similarly, when one or both transistors in a series configuration are non-conducting, one or both corresponding parallel transistors are conducting.

Figure 1.1: Standard CMOS inverter, NAND and NOR gates.
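The series/parallel duality described in this section can be checked exhaustively for a small gate. The following is a minimal sketch in Python and is not taken from the thesis: transistors are modelled as ideal switches, and all function names are hypothetical, chosen only for this illustration. It models a 2-input CMOS NAND gate (series nMOS pull-down, parallel pMOS pull-up) and verifies that exactly one network conducts for every input combination.

```python
from itertools import product


def nmos(gate):
    """Ideal nMOS switch (assumption): conducts when its gate input is high."""
    return bool(gate)


def pmos(gate):
    """Ideal pMOS switch (assumption): conducts when its gate input is low."""
    return not gate


def nand_pull_down(a, b):
    """Two nMOS transistors in series between the output and ground."""
    return nmos(a) and nmos(b)


def nand_pull_up(a, b):
    """Two pMOS transistors in parallel between the supply and the output."""
    return pmos(a) or pmos(b)


def check_complementary(pull_up, pull_down, n_inputs=2):
    """Return True if exactly one network conducts for every input combination."""
    return all(
        pull_up(*bits) != pull_down(*bits)
        for bits in product([0, 1], repeat=n_inputs)
    )


def nand_output(a, b):
    """The output is high when the pull-up network conducts, low otherwise."""
    return int(nand_pull_up(a, b))


if __name__ == "__main__":
    print("complementary:", check_complementary(nand_pull_up, nand_pull_down))
    for a, b in product([0, 1], repeat=2):
        print(a, b, nand_output(a, b))
```

Replacing the parallel pull-up with a series one breaks the property: for mixed inputs neither network conducts and the output is left undriven, which is why the dual structure is required.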

1.3.4 Pseudo nMOS logic

Each CMOS gate in Figure 1.1 implements its boolean expression twice. At any given moment a CMOS gate evaluates whether the pull-up network should be open/closed and whether the pull-down network should be open/closed. This is not strictly necessary: one of the networks can be replaced by a constant resistance, as long as the other network is significantly more conductive when switched on.

Figure 1.2: Pseudo nMOS inverter and NOR gate.

In pseudo nMOS logic, the pull-up network is replaced by a single, always conducting pMOS transistor. Figure 1.2 shows pseudo nMOS gates. Pseudo nMOS gates use fewer transistors than CMOS (N + 1 ≤ 2N, where N is the number of inputs) and have lower input/output capacitances (inputs are only connected to nMOS transistors, outputs are only connected to 1 pMOS). The biggest drawback of pseudo nMOS logic is the increase in static power consumption. Dynamic pseudo nMOS addresses this issue, and domino logic presents a more useful logic style than the simplistic static pseudo nMOS.

1.3.5 Domino logic

Domino logic is an extension of dynamic pseudo nMOS [10]. Like physical dominoes, each domino gate requires a relatively slow setup phase (precharge) before it can quickly fall (pull down) in a chain (evaluation). The average delay of a logic gate in a circuit/chain, from input to output, is called evaluation delay. The setup (precharge) can be done in parallel and thus does not depend on the logic depth of the circuit. Big-O notation is commonly used in software to characterize the time scaling of different algorithms, but the same concept applies to hardware as well. The precharge time is O(1), constant time, while the evaluation time is O(N): it scales linearly with the number of elements in a chain. For long chains of logic elements it is therefore crucial to optimize for short evaluation delay.
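The two scaling arguments above (transistor count and chain timing) can be illustrated with a short sketch. This is a simplified model under stated assumptions, not a result from the thesis: every gate in the chain is assumed to have the same evaluation delay, the delay values in the example are arbitrary placeholders, and the function names are hypothetical.

```python
def transistor_count(n_inputs):
    """Return (CMOS, pseudo nMOS) transistor counts for an n-input gate.

    CMOS needs one pMOS and one nMOS per input (2N), pseudo nMOS needs
    one nMOS per input plus a single always-on pMOS (N + 1).
    """
    if n_inputs < 1:
        raise ValueError("n_inputs must be at least 1")
    return 2 * n_inputs, n_inputs + 1


def chain_times(n_gates, t_precharge, t_eval_per_gate):
    """Return (precharge time, evaluation time) for a chain of domino gates.

    All gates precharge in parallel, so the precharge time does not depend
    on chain length, O(1). Evaluation ripples through the chain one gate at
    a time, so it grows linearly with the number of gates, O(N).
    """
    if n_gates < 1:
        raise ValueError("n_gates must be at least 1")
    return t_precharge, n_gates * t_eval_per_gate


if __name__ == "__main__":
    # Placeholder delays in arbitrary time units (assumed, not measured).
    for n in (1, 10, 30):
        precharge, evaluation = chain_times(n, t_precharge=10.0, t_eval_per_gate=1.0)
        print(f"{n:2d} gates: precharge {precharge}, evaluation {evaluation}")
```

In this model, halving the per-gate evaluation delay halves the evaluation time of the whole chain, while a slower precharge only adds a fixed cost per clock cycle.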

Figure 1.3: A 4 input domino CMOS gate - X = AC + BD.

Figure 1.3 shows a simple 4-input domino gate. The static power consumption is eliminated by clocking the pull-up transistor and pull-down network. When CLK is low, only the pull-up transistor can be active (precharge). When CLK is high, only the pull-down network can be activated (evaluation). The inputs are connected only to the pull-down nMOS network and should be low when evaluation starts. The CMOS inverter is added so the output has the same polarity as the input, enabling it to connect to the next gate in a chain. Alternatively, one can alternate between pseudo nMOS and pseudo pMOS based domino gates; this is called NP domino. At higher supply voltages, VDD = 1200 mV for example, a CMOS inverter is relatively fast and avoiding pMOS networks is valuable, so the nMOS based style in Figure 1.3 is preferred. As CMOS inverters become extremely slow at low supply voltages, ULV and ULVDR utilize the NP domino logic style instead.

1.3.6 Cascode Voltage Switch Logic (CVSL)

CVSL is a dual rail logic style. Like in CMOS logic, every gate contains two complementary logic networks. In CVSL, both logic networks are pull-down networks of nMOS transistors. Each side looks like a pseudo nMOS circuit, except the pMOS transistor is controlled by the opposite output. When inputs change, one of the nMOS networks opens while the other closes. The conducting network pulls one rail low and activates the pMOS for the other side, pulling that rail high. This results in two very stable outputs, one driven by an open pMOS pull-up transistor and another driven by an active nMOS pull-down network. The other transistors are

not conducting, so there is negligible static power consumption. In complex CVSL gates, parts of the nMOS network can be shared, reducing the number of transistors as well as the input capacitance [7].

In dynamic CVSL, CMOS inverters are added to the outputs, and the feedback is moved so that each rail is independent. Clocked CVSL has a precharge and an evaluation period. It is almost identical to domino logic, as explained in Section 1.3.5, but it is dual rail: inputs and outputs come in complementary pairs, and nMOS transistors can still be shared. Similar to ULVDR, both outputs start at a precharged voltage. They should be complementary, so one is already correct and the other needs to pull down. Both dynamic and static CVSL styles are shown in Figure 1.4.

Figure 1.4: Regular (left) and dynamic (right) CVSL gate structure.

1.3.7 Floating gate logic

Floating gate programming

In Metal-Oxide-Semiconductor Field-Effect-Transistor (MOSFET) logic (CMOS, CVSL, or otherwise) it is common to connect the input(s) directly to the transistor gate(s). This way the input voltage directly controls the conductivity of the transistors. A network of transistors connected to inputs is used to evaluate what the output should be. For the purposes of this thesis, transistors which are connected to inputs (and used to evaluate outputs) are called evaluation transistors. As the inputs are connected directly to the outputs of another circuit, the gate voltage, |VGS|, is limited to:

$$0 + V_{DSn} \le V_{GS} \le V_{DD} - V_{SDp}$$

By electrically isolating the gate node, it becomes floating and is no longer restricted to this voltage range. The floating gate concept was introduced in 1967 by Kahng and Sze [8]. An isolated gate node could

(27) be programmed, charging it to a specific voltage. Various techniques were used to charge the floating gate, including Fowler-Nordheim tunneling, hot-electron injection and Ultraviolet (UV) radiation [13]. The precursor to ULV circuits, FLOGIC, was based on UV radiation programming. Exposing the gate to UV radiation could charge the gate and effectively reduce the threshold voltage of both nMOS and pMOS transistors. At low supply voltages this has a great impact on speed [11].

Rapid reprogramming - recharge

Figure 1.5: Floating gate logic; the right figure shows the recharge transistor.

In floating gate logic, the MOSFET gate is not directly connected to the input. Instead, the input affects the gate voltage through a capacitance. Rather than programming the gate once during manufacturing, programming can be done while the circuit is operating, using a transistor. This recharge transistor is connected to the floating gate and controlled by the clock signal. Figure 1.5 shows an nMOS floating gate, with and without a recharge transistor. When the rising/falling edge arrives at the input, capacitive coupling causes the gate voltage to approach 2·VDD for nMOS or −VDD for pMOS evaluation transistors. This requires that the input is precharged to a specific value and that the polarity of the input is known. This rapid reprogramming of the gate eliminates the extra manufacturing step (programming). Additionally, the gate can be recharged to different voltages, and discharged if needed. The problem of threshold voltages shifting over time due to leakage is also eliminated. Charging the gate node is referred to as recharge, while setting the output (connected to the input of the next gate) is called precharge. Both of these events happen during the precharge period, i.e. when ϕ is high. A simple floating gate inverter can be implemented as in Figure 1.6. During precharge, when ϕ is high, everything is high. 
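The capacitive coupling described above can be approximated with an ideal capacitive divider. The following is a minimal sketch under stated assumptions, not the simulation setup used in this thesis: the function name, the lumped parasitic capacitance `c_par` and the example capacitor values are hypothetical; only VDD = 300 mV and the limits 2·VDD (nMOS) and −VDD (pMOS) come from the text.

```python
def floating_gate_voltage(v_recharge, delta_v_in, c_in, c_par):
    """Estimate the floating gate voltage after an input edge.

    Assumes an ideal capacitive divider: the input capacitor c_in couples
    the input step delta_v_in onto the floating node, which is loaded by a
    lumped parasitic capacitance c_par (hypothetical, not from the thesis).
    """
    coupling = c_in / (c_in + c_par)
    return v_recharge + coupling * delta_v_in


VDD = 0.300  # supply voltage in volts

# nMOS evaluation transistor: gate recharged to VDD, rising input edge of +VDD.
ideal_n = floating_gate_voltage(VDD, +VDD, c_in=7e-15, c_par=0.0)
lossy_n = floating_gate_voltage(VDD, +VDD, c_in=7e-15, c_par=0.7e-15)

# pMOS evaluation transistor: gate recharged to 0 V, falling input edge of -VDD.
ideal_p = floating_gate_voltage(0.0, -VDD, c_in=7e-15, c_par=0.0)

print(f"nMOS, no parasitics:   {ideal_n * 1e3:.1f} mV")
print(f"nMOS, with parasitics: {lossy_n * 1e3:.1f} mV")
print(f"pMOS, no parasitics:   {ideal_p * 1e3:.1f} mV")
```

In this simplified model the gate only reaches the full 2·VDD when the parasitic capacitance is zero; any parasitic load reduces the coupled step, which is one way to read the phrase "almost 2·VDD".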
When ϕ goes low, the recharge transistor is turned off and the gate becomes floating. When a rising edge arrives at the input, the gate becomes supercharged due to capacitive coupling. The inverter in Figure 1.6 can reach a gate voltage, |VGS|, of almost 2·VDD. Section 1.3.8 explains the

(28) significance of this increase in |VGS|.

Figure 1.6: A simple floating gate inverter (NP-domino, precharge to 1).

1.3.8 Low voltage operation

Figure 1.7: Drain current, ID, as a function of supply voltage (axes: supply voltage [mV], current [µA]).

Before creating logic gates at low supply voltage it is important to look at how transistors operate at these voltages. Figure 1.7 shows the difference in drain current at supply voltages between 300 mV and 1200 mV. This data is also presented in Table 1.3. A CMOS inverter at VDD = 600 mV is expected to be about 70 times faster than at VDD = 300 mV (provided sizes and capacitances are similar). Note that this is not a simple linear or quadratic increase: doubling the supply voltage from 300 mV to 600 mV increases the current by a factor of about 70. For supply voltages above 600 mV, the drain current has an almost

(29) linear increase. This thesis will focus on the supply voltage VDD = 300 mV, utilizing floating gates to achieve gate voltages close to |VGS| = 600 mV.

Supply voltage    ID          ID/ID300
VDD = 300 mV      0.13 µA     1
VDD = 600 mV      9.26 µA     71.23
VDD = 900 mV      32.13 µA    247.15
VDD = 1200 mV     58.73 µA    451.77

Table 1.3: nMOS drain currents at different supply voltages.

Figure 1.8: ID for different nMOS lengths (VDD = 300 mV; curves for L = 100 nm, 240 nm and 300 nm).

A minimal size nMOS in the TSMC process is typically W/L = 120 nm/100 nm. The effective width of the transistor can be adjusted to increase current draw, either by increasing the physical width or the number of transistor fingers. Normally the smallest possible length is beneficial, as it gives the highest transconductance. However, at VDD = 300 mV this is not the case. A parametric sweep of drain current versus input voltage at different transistor lengths shows an interesting effect. Figure 1.8 highlights the results for some lengths (lengths with lower drain currents were removed for clarity). For Vi > 500 mV the minimum length (L = 100 nm) is optimal. At Vi = 300 mV the longer 240 nm transistor has significantly higher drain current than the 100 nm version. The results are similar for pMOS transistors. Thus, the default size used in this thesis will be W/L = 120 nm/240 nm. In order to increase transconductance, the effective width will be scaled by adding transistor fingers. Transconductance can be lowered by decreasing the length, or by increasing the width (up to 240 nm).
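The ratios in Table 1.3 follow directly from the listed drain currents. As a small worked check (the variable names are chosen here for illustration), the ID/ID300 column can be recomputed as:

```python
# Drain currents from Table 1.3 in microamperes, keyed by supply voltage in mV.
drain_current_ua = {300: 0.13, 600: 9.26, 900: 32.13, 1200: 58.73}

# Normalize every current to the value at VDD = 300 mV.
reference = drain_current_ua[300]
ratios = {vdd: i_d / reference for vdd, i_d in drain_current_ua.items()}

for vdd, ratio in ratios.items():
    print(f"VDD = {vdd:4d} mV: ID = {drain_current_ua[vdd]:6.2f} uA, "
          f"ID/ID300 = {ratio:7.2f}")
```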


(31) Chapter 2. Inverters

2.1 CMOS Inverter

Variable    VGS = VDD     VGS = 2·VDD
Wp          500 nm        340 nm
Lp          240 nm        240 nm
Wn          120 nm        120 nm
Ln          240 nm        240 nm
tdf         1.1812 ns     55.9014 ps
tdr         1.1769 ns     55.8089 ps

Table 2.1: Matched CMOS inverter sizes at VDD = 300 mV (an identical inverter was used as load).

A simple CMOS inverter was simulated to set some transistor dimensions for the more complicated circuits. Section 1.3.8 explains why the transistor length L = 240 nm was chosen. The results can be seen in Table 2.1. At VDD = 300 mV a pMOS of dimensions W/L = 500 nm/240 nm matches the default nMOS discussed in Section 1.3.8 (W/L = 120 nm/240 nm). Transistor fingers will be utilized later, to dimension more complicated gates. As the evaluation transistors will have a gate-to-source voltage, |VGS|, of 500 mV to 600 mV, the CMOS inverter was also simulated with higher |VGS|. At |VGS| = 600 mV the pMOS has a different characteristic, and W/L = 340 nm/240 nm matches the aforementioned nMOS. Note that the supply voltage, VDD, is still 300 mV, and will be for the rest of the thesis.

Traditional CMOS gates are not used for comparison in this thesis. Instead, CVSL gates are shown, simulated and compared against. CVSL has lower delays than CMOS, but the two are still within one order of magnitude. As Heller et al. state: "CVSL has been found to offer a performance advantage of up to 4X compared to CMOS" [7]. ULVDR shares many similarities with CVSL, and it is much easier to create fair comparisons as both are dual rail technologies.

(32) 2.2 CVSL Inverter

Variable    Value
Wp          120 nm
Lp          240 nm
Wn          120 nm
Ln          240 nm
Nn          2
te          1 ps
tdf         0.4708 ns
tdr         3.9165 ns

Table 2.2: Minimal CVSL inverter (an identical inverter was used as load).

CVSL is the main technology used for comparison to ULVDR. See Section 1.3.6 for a detailed account of the CVSL technology and its origin. As for CMOS, simulations were run at VDD = 300 mV in order to gain a better understanding of CVSL sizing. The results in Table 2.2 are particularly useful for creating more complex CVSL circuits. A close to ideal input, with a falling edge transition of te = 1 ps, was utilized.

Figure 2.1: CVSL inverter transient simulation (axes: time [ns], voltage [mV]; traces A1, X1, X2; markers at 32.0 ns, 32.471 ns and 35.917 ns).

CVSL is different from standard CMOS, and it is impossible to match the delays of the output rising and falling edges. This is because the inputs affect a pull-down network, which in turn enables the pull-up transistor of the other output. Pull-down of one output node always comes before pull-up of the other; this is confirmed in Figure 2.1. The nMOS pull-down network also needs to be significantly stronger than the pMOS, in

(33) order to effectively pull down while the pMOS is conducting. With this in mind, the focus was to create a minimally sized, logically functional CVSL inverter. Dynamic (clocked) CVSL [7] is another alternative, shown in Figure 1.4. This version uses a clock signal and precharge/domino logic, similar to ULVDR. Clocked CVSL was designed for higher voltages, and includes a CMOS inverter on the output, as well as a series nMOS down to ground. Both of these elements are problematic at VDD = 300 mV, and thus the static CVSL will be used. The author considers a comparison of clocked and static CVSL useful and interesting, but it falls outside the scope of this thesis.

2.3 Transistor drain currents at low voltage

Section 2.1 attempts to find some standard transistor sizes for building logic at VDD = 300 mV by simulating a CMOS inverter. However, as other gates might have very different transient behavior, it is more appropriate to measure transconductance (or drain current) directly on the transistor level. Table 2.3 shows the sizes and matched currents for an nMOS and a pMOS. At |VGS| = 300 mV the resulting pMOS width is the same as when doing CMOS inverter matching (Section 2.1). For |VGS| = 600 mV the pMOS width is a little higher than that found in Section 2.1.

Variable    |VGS| = VDD     |VGS| = 2·VDD
Wp          500 nm          380 nm
Lp          240 nm          240 nm
Wn          120 nm          120 nm
Ln          240 nm          240 nm
Ip          236.5854 nA     6.6705 µA
In          237.677 nA      6.6844 µA

Table 2.3: Sizes needed for matching pMOS/nMOS drain currents, VDD = VDS = 300 mV.

This is, of course, an approximation: as |VDS| approaches 0, nonlinearities will occur. Drain currents are not matched for the whole range, 0 ≤ |VDS| ≤ 300 mV. The sizes in Table 2.3 can be useful to create pMOS and nMOS transistors with matching currents at |VGS| = 300 mV. For example, matching the recharge transistor currents might be desirable. However, recharge and keeper transistors are usually kept at minimum sizes, to reduce layout size and capacitances. Precharge and evaluation transistors require more careful matching (the precharge transistor should be much stronger); this is done in Section 2.4.2.

(34) 2.4 ULVDR Inverter

2.4.1 ULV Inverters

Figure 2.2: Iterations of the ULV inverter (1, 3, 5, 7), P1 version shown.

Figure 2.2 shows the different iterations of the ULV inverter (see Section 1.3.1 for more information on the different iterations and the articles published about them). Every iteration after the first includes a similar pMOS based version. The ULV7 inverter, shown to the far right, is the predecessor to the ULVDR inverter. Non dual-rail ULV inverters are not the focus of this thesis, but they are useful for understanding ULVDR. Simulations of ULV inverters are not included, but references and results of previous research are discussed in Section 1.3.1.

The ULV7 consists of many elements and signals, so an explanation of the individual parts is appropriate. V+ and V− (called Voffset+ and Voffset− in other articles) are the voltage offsets for the floating gates. It should be noted that these do not have to be equal to VDD and GND. In this thesis, they are usually replaced with VDD and GND for simplicity's sake. The transistor connected directly to the output, but not to the input, is called a precharge transistor. While ϕ is high it precharges the output to a known value. The transistor with a floating gate and its drain connected to the output is called the evaluation transistor. During the evaluation period, when ϕ is low, it evaluates the boolean expression of the gate based on the input(s). A precharge to 1 (0P1) gate has an nMOS network of pull-down evaluation transistors. Similarly, a precharge to 0 (1P0) gate has a pull-up network of pMOS evaluation transistors. (For the ULV inverter this "network" is just one transistor. Precharge to 1 is called 0P1 as the input is 0 and the output is 1 during evaluation; similarly, the precharge to 0 version is called 1P0.) During precharge, the transistors with their drains connected to the floating gates will recharge them. These are intuitively called the recharge transistors. Finally, two more transistors are added, called keeper

(35) transistors. They are used to shut off the pull-up or pull-down network depending on the output. If one output changes, the keeper transistors will discharge the appropriate floating gates. For the side (rail) which changed, the precharge transistor should be turned off. For the side that did not change, the evaluation transistor(s) should be turned off. Both keeper transistors improve the robustness/stability of the output and reduce power consumption.

2.4.2 Precharge Matching

Once evaluation starts, the ULV circuits are in a state similar to a CMOS inverter with both transistors conducting. Intuitively, if sizes from Section 2.3 are used for the precharge and evaluation transistors, the output should stabilize at around VO = VDD/2.³ This is unacceptable: the output should remain within 10% of the precharge value, that is, at or above 270 mV for 0P1 and at or below 30 mV for 1P0.

Variable    0P1          1P0
Wp          120 nm       120 nm
Lp          240 nm       100 nm
Np          3            1
Wn          240 nm       120 nm
Ln          100 nm       100 nm
VO          274.23 mV    28.62 mV
IS          89.77 nA     51.04 nA

Table 2.4: Matched transistors (minimum) for 0P1 (left test circuit: pMOS precharge transistor Pp, nMOS evaluation transistor En) and 1P0 (right test circuit: pMOS evaluation transistor Ep, nMOS precharge transistor Pn).

Starting with minimal transistors (W/L = 120 nm/100 nm), simulations were performed to scale the transistor dimensions. The resulting sizes can be found in Table 2.4. The precharge transistors have to be considerably stronger than the evaluation transistors to be within the 10% target. As the nMOS transistors are inherently stronger, the finger count and length of the pMOS precharge transistor, Pp, were increased.⁴ By coincidence, the minimal nMOS and pMOS achieve an output voltage very close to 30 mV and can be used without modification. However, optimal parameters should be better than 10%, to account for mismatch [15]. Optimal sizes for ULV and ULVDR precharge and evaluation transistors are listed in Table 2.5.⁵

³ Not exactly, as the nMOS and pMOS behave nonlinearly and differently at various voltages.
⁴ Increasing length up to 240 nm increases transconductance (at VDD = 300 mV). See Section 1.3.8 for more information.
⁵ These sizes are from [15]. Simulations were performed; source currents and output voltages were added.
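The 10% precharge criterion used above can be written as a simple check. This is a minimal sketch; the function and constant names are chosen here for illustration, and the voltages are the VO values from Table 2.4.

```python
VDD_MV = 300.0
TOLERANCE = 0.10  # the output may deviate by at most 10% of VDD


def precharge_ok(gate_type, v_out_mv):
    """Return True if the output holds its precharge level within the target."""
    if gate_type == "0P1":   # precharged high: must stay at or above 270 mV
        return v_out_mv >= (1.0 - TOLERANCE) * VDD_MV
    if gate_type == "1P0":   # precharged low: must stay at or below 30 mV
        return v_out_mv <= TOLERANCE * VDD_MV
    raise ValueError(f"unknown gate type: {gate_type}")


# Output voltages reported in Table 2.4 (minimum sizes).
table_2_4 = {"0P1": 274.23, "1P0": 28.62}
for gate_type, v_out in table_2_4.items():
    print(gate_type, v_out, "mV ->", precharge_ok(gate_type, v_out))

# An output settling near VDD/2 fails the criterion for either gate type.
print("0P1", 150.0, "mV ->", precharge_ok("0P1", 150.0))
```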

(36)
Variable    0P1          1P0
Wp          120 nm       120 nm
Lp          240 nm       100 nm
Np          4            1
Wn          240 nm       120 nm
Ln          100 nm       240 nm
VO          281.20 mV    11.11 mV
IS          90.92 nA     52.33 nA

Table 2.5: Matched transistors (optimal) for ULV inverters.

2.4.3 Circuit topology

Figure 2.3: ULVDR Inverter 0P1.

Figures 2.3 and 2.4 show the precharge to 1 (0P1) and precharge to 0 (1P0) versions of the ULVDR inverter. ULVDR is based on ULV7 and is very similar to it. The new rail evaluates the complement function based on the inverted input; thus, for an inverter, it is an identical transistor network with the opposite input. The main difference is that the keeper transistor for the input (evaluation) floating gate is now connected to the opposite output. Once one output goes low, the keeper transistor on the opposite side will turn off the floating gate on that side.

(37) Figure 2.4: ULVDR Inverter 1P0.

Transient plot accompanying Table 2.6 (axes: time [ns], voltage [mV]; traces X1, X2, A1, PHI1; markers at 114.0 ns, 114.2 ns and 114.233 ns).

Variable    0P1          1P0
C           7 fF         11 fF
Wp          120 nm       120 nm
Lp          100 nm       100 nm
LpP         240 nm       -
Wn          120 nm       120 nm
WnE         240 nm       -
Ln          100 nm       100 nm
LnP         -            240 nm
N           1            1
NpP         4            -
te          1 ps         1 ps
td          32.9 ps      73.4 ps
VGS         578.8 mV     589.8 mV

Table 2.6: ULVDR inverter sizes at VDD = 300 mV (another inverter of opposite type used as load).

(38) 2.4.4 Simulation & speed

Using the sizes from Section 2.4.2, the ULVDR inverters in Figure 2.3 and Figure 2.4 were simulated. Recharge and evaluation transistors are minimum size, W/L = 120 nm/100 nm. This gives the smallest layout size and capacitances possible. Section 2.4.6 discusses the sizes of the input capacitors. The results of two transient simulations can be found in Table 2.6. Each test case used an ideal input edge, te = 1 ps, and the other type of inverter as load. The parasitic delay (with load) from 50% on the input to 50% on the output was measured, as well as the maximum floating gate voltage, |VGS|.

2.4.5 Floating gate voltages

Figure 2.5: ULVDR 0P1 inverter floating gate voltage peaks at 575 mV (axes: time [ns], voltage [mV]; traces X1, A1, FG1, FG2).

In Section 1.3.7 it was predicted that the floating gate voltages can reach almost |VGS| = 2·VDD. Transients of the floating gates can be seen in Figures 2.5 and 2.6. Both inverters' gate-to-source voltages peak at more than 90% of 2·VDD:

0P1: |VGS| = 575 mV ≈ 1.92·VDD
1P0: |VGS| = 563 mV ≈ 1.88·VDD

The other floating gate is also correctly discharged via the keeper transistor. This slowly shuts off the evaluation transistor which is not supercharged, limiting static power usage. A stronger keeper transistor could discharge the floating gate faster, if needed. Note that even though the pMOS recharge transistors are weaker than their nMOS counterparts, 0P1 (Figure 2.5) has the fastest floating gate recharge. This is because of the higher input capacitance and gate

(39) capacitance (4 finger pMOS evaluation transistor) of the 1P0 inverter. That also explains the slower discharge of the floating gate after the input has arrived. While the transistors can be scaled to match these timings, it is not necessary. The precharge and evaluation periods (clock frequency) must be chosen to accommodate the slowest circuit; having other circuits with faster timings is unproblematic.

Figure 2.6: ULVDR 1P0 inverter floating gate bottoms at −263 mV (axes: time [ns], voltage [mV]; traces X1, A1, FG1, FG2).

2.4.6 Input capacitors

Figure 2.7: The impact of capacitor sizes on chain delay. Contour plot of average delay (ps) per gate for an np-chain of 40 elements (min: 8.065 ns, max: 20.006 ns); axes: input capacitance Cip for precharge to "0" [fF] and input capacitance Cin for precharge to "1" [fF].

In order to set optimal sizes for input capacitances a 40 inverter chain

(40) was simulated. Figure 2.7 shows that the optimal sizes are approximately Cip = 11 fF and Cin = 7 fF. At these sizes, the minimum delay per inverter was 201.6 ps. Another important takeaway is that there is not much variation as long as both capacitors are above approximately 7 fF. The scale on the right side of Figure 2.7 is divided into 1000 subdivisions, so each line in the contour plot represents a 0.3 ps change in delay:

$$t_{scale} \approx 300\,\text{ps}, \qquad t_{\Delta} \approx \frac{300\,\text{ps}}{1000} = 0.3\,\text{ps}$$
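The per-inverter delay and the contour resolution quoted above can be recomputed from the numbers in Figure 2.7. This is only a small arithmetic check; the variable names are chosen here for illustration.

```python
chain_length = 40             # inverters in the simulated chain
min_total_delay_ns = 8.065    # minimum total chain delay (Figure 2.7)
max_total_delay_ns = 20.006   # maximum total chain delay (Figure 2.7)

# Average delay per inverter at the best and worst capacitor sizes.
min_per_gate_ps = min_total_delay_ns * 1000 / chain_length
max_per_gate_ps = max_total_delay_ns * 1000 / chain_length

# The color scale spans roughly the difference between these two values.
scale_span_ps = max_per_gate_ps - min_per_gate_ps

# Contour resolution using the rounded 300 ps span and 1000 subdivisions.
contour_step_ps = 300.0 / 1000

print(f"min delay per inverter: {min_per_gate_ps:.1f} ps")
print(f"max delay per inverter: {max_per_gate_ps:.1f} ps")
print(f"scale span: {scale_span_ps:.1f} ps, contour step: {contour_step_ps} ps")
```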

(41) Chapter 3. NAND/NOR gates

3.1 Flexibility of dual rail gates

In dual rail logic, the complement of a digital signal is always available. Thus inverters are not needed to simply invert signals. Inverters are still useful for buffering a signal or switching polarity to match outputs from other gates. A dual rail NAND gate can perform any of the linear logic operations (NAND, NOR, AND, OR). When designing bigger circuits it is useful to have access to all these gates. The ULVDR NAND gate performs all of these operations (with the same delay). Whenever a ULVDR NOR/OR/AND gate is mentioned, it is not a separate gate, just the NAND gate with switched inputs/outputs. For example, an OR gate is a NAND gate with inverted (switched) inputs. This concept is illustrated in Figure 3.1 and applies to both CVSL and ULVDR.

Figure 3.1: AND, OR, NOR equivalents using one dual-rail NAND gate.

3.2 CVSL NAND gate

A CVSL NAND gate and its dimensions are shown in Table 3.1. Intuitively, the nMOS transistors in series should have double width, to account for the doubled series resistance. This is accomplished by doubling Nns, the finger count of the nMOS series transistors.¹ This NAND was simulated for all 6 possible transitions and parasitic delays were recorded.² Average delays are shown in Table 3.1. The CVSL NAND gate is approximately 1.7 times slower than the CVSL inverter (Table 2.2).

¹ Doubling the physical width of the transistor from 120 nm to 240 nm would not increase transconductance.
² The input changes which cause a transition on the output are: 00 → 11, 11 → 00, 11 → 01, 01 → 11, 11 → 10, 10 → 11.
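The rail-swapping idea from Section 3.1 can be illustrated with a purely behavioral model. The sketch below is not a circuit simulation: signals are modelled as (true, complement) pairs, and all function names are chosen here for illustration.

```python
from itertools import product


def rail(x):
    """Encode a bit as a dual rail pair (true rail, complement rail)."""
    return (x, 1 - x)


def swap(signal):
    """Invert a dual rail signal by swapping its two rails (no gate needed)."""
    return (signal[1], signal[0])


def dual_rail_nand(a, b):
    """Dual rail NAND: one rail evaluates A' + B', the other evaluates AB."""
    a_t, a_c = a
    b_t, b_c = b
    return (a_c | b_c, a_t & b_t)


def dual_rail_and(a, b):
    return swap(dual_rail_nand(a, b))              # switched outputs


def dual_rail_or(a, b):
    return dual_rail_nand(swap(a), swap(b))        # switched inputs


def dual_rail_nor(a, b):
    return swap(dual_rail_nand(swap(a), swap(b)))  # switched inputs and outputs


for a, b in product((0, 1), repeat=2):
    print(a, b,
          "NAND", dual_rail_nand(rail(a), rail(b))[0],
          "AND", dual_rail_and(rail(a), rail(b))[0],
          "OR", dual_rail_or(rail(a), rail(b))[0],
          "NOR", dual_rail_nor(rail(a), rail(b))[0])
```

All four functions use the same NAND evaluation, which is why the text treats AND, OR and NOR as the NAND gate with switched inputs or outputs rather than as separate gates.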

(42) Schematic of the CVSL NAND gate (inputs A, A', B, B'; complementary outputs Y and Y').

Variable    Value
Wp          120 nm
Lp          240 nm
Wn          120 nm
Ln          240 nm
Nnk         2
Nns         4
tie         1 ps
tdfµ        0.826 ns
tdrµ        6.80 ns

Table 3.1: Minimal CVSL NAND (an identical NAND was used as load).

Figure 3.2: CVSL NAND gate delay (best case) (axes: time [ns], voltage [mV]; traces A1, B1, X1, X2; markers at 150.001 ns, 150.924 ns and 155.455 ns).

(43) The transient for the best case situation (01 → 11) is plotted in Figure 3.2. This is considered the best case as the rising edge has the lowest delay, tdr = 6.87 ns. The transition 11 → 00 has the fastest falling edge delay, tdf = 0.31 ns, as both parallel nMOS transistors are pulling down.

3.3 ULVDR NAND gate

3.3.1 Circuit topology

Figure 3.3: A ULVDR NAND gate - 0P1 version.

Figures 3.3 and 3.4 show the 0P1 and 1P0 versions of the ULVDR NAND gate. As 2 evaluation transistors are now connected in series, the finger count of these is doubled, NE = 2. The ULVDR NAND gate has many additional capacitances connected to the outputs and a stronger (parallel) evaluation network.

3.3.2 Precharge Matching

Simplified circuits were used to match precharge and evaluation transistors, as for inverters in Section 2.4.2. ULVDR inverter sizes (Table 2.6) were used as a starting point, and only finger count was varied.³ Four circuits were used, one for each rail of the ULVDR 0P1 and 1P0 circuits. Both versions have one parallel (k) and one series (s) evaluation network. The series evaluation transistors should have double finger count, to compensate for the series resistance. The precharge transistors connected to parallel evaluation transistors should also be doubled, to account for the increased current through two parallel transistors.

³ The transistor sizes used can also be found in Table 3.3.

(44) Figure 3.4: A ULVDR NAND gate - 1P0 version.

0P1      INV sizes    NAND sizes   Current matching
NnEk     1            1            1
NnEs     1            2            4
NpPk     4            8            8
NpPs     4            4            8
VP1k     253.95 mV    280.17 mV    280.17 mV
VP1s     292.36 mV    280.67 mV    276.43 mV
IP1k     172.85 nA    181.50 nA    181.50 nA
IP1s     40.760 nA    93.010 nA    208.52 nA

1P0      INV sizes    NAND sizes   Current matching
NnPk     1            2            4
NnPs     1            1            4
NpEk     1            1            2
NpEs     1            2            8
VP0k     24.993 mV    9.9348 mV    8.5008 mV
VP0s     4.3724 mV    8.5342 mV    6.3465 mV
IP0k     102.60 nA    104.83 nA    197.20 nA
IP0s     21.915 nA    41.178 nA    150.04 nA

Table 3.2: Matching the pull-up and pull-down networks of ULVDR NAND.

(45) Table 3.2 shows the resulting finger counts, output voltages, and current through the precharge transistors. See Section 1.3.2 for an overview of all the symbols and subscripts used. For example, NnPk means the finger count of the nMOS precharge transistor connected to parallel evaluation transistors (1P0 only). Using unedited inverter sizes gives very different precharge values for the 2 rails. However, after applying the mentioned optimizations (NAND sizes column) the precharge to 1 (0P1) outputs are both close to 280 mV and the precharge to 0 (1P0) outputs are both close to 9 mV.

The last column (Current matching) of Table 3.2 shows an attempt to match the current for the different rails.⁴ Finger counts of all transistors in the series rails were doubled, to match the precharge transistors in the parallel rail. As the currents in both rails of the precharge to 0 (1P0) version were still low, the finger count was doubled to match the other version. Before these changes the currents were (approximately) in the range 41 nA - 182 nA; the changes reduced the range to 150 nA - 208 nA. It is also useful to note that when changing finger count, the results are not linear, and they differ for pMOS and nMOS. The nMOS transistors seem to scale better, as both 0P1 and 1P0 voltages decrease with the upscaled configuration in the "Current matching" column.

3.3.3 Simulations & speed

0P1 Symbol   Value      1P0 Symbol   Value
C            7 fF       C            11 fF
Wp           120 nm     Wp           120 nm
Lp           100 nm     Lp           100 nm
LpP          240 nm     Wn           120 nm
Wn           120 nm     Ln           100 nm
WnE          240 nm     LnP          240 nm
Ln           100 nm     NnPk         2
NnEk         1          NnPs         1
NnEs         2          NpEk         1
NpPk         8          NpEs         2
NpPs         4

Table 3.3: Minimal ULVDR NAND dimensions.

The dimensions in Table 3.3 give satisfactory results. The NAND gate gives the correct output for all possible inputs, as can be seen in Figure 3.5.

(46) Figure 3.5: Logic verification of ULVDR NAND 0P1 (axes: time [ns], voltage [mV]; traces A1, B1, X1, PHI1).

Figure 3.6: Logic verification of ULVDR NAND 1P0 (axes: time [ns], voltage [mV]; traces A1, B1, X1, PHI1).

Input pattern   0P1 td       0P1 |VGS|    1P0 td        1P0 |VGS|
00              32.108 ps    578.8 mV     77.257 ps     589.8 mV
01              62.436 ps    578.6 mV     179.39 ps     589.7 mV
10              62.468 ps    578.7 mV     178.045 ps    589.8 mV
11              72.687 ps    565.4 mV     163.4 ps      589.0 mV

Table 3.4: ULVDR NAND delay for all 4 possible inputs (another NAND of opposite type used as load).

(47) Figure 3.7: Monte Carlo simulation of CVSL NAND (200 samples; evaluation period marked).

Figure 3.8: Monte Carlo simulation of ULVDR NAND (200 samples; precharge and evaluation periods marked).

(48) 3.4 Monte Carlo simulation

Monte Carlo sampling can be used to pick samples from a population of device parameters to simulate mismatch induced variance [12]. For each parameter of a transistor (width, length, layout shape, doping concentration, etc.) means and standard deviations are computed based on empirical data (measurements). The means and standard deviations define normal distributions. For every modelled variable a value is picked at random from its normal distribution. All of these values for the entire circuit constitute one sample. The same simulation is run on all samples. Modern circuit design and simulation software includes this functionality. Results from Monte Carlo simulations can be used to determine yield (conformance to specific requirements) as well as variance in output voltage, power consumption, delay etc. In the real world each manufactured chip is slightly different (mismatch), and maximizing the number of usable devices (yield) becomes important.

A 200 sample Monte Carlo sweep was run to show the effects of mismatch and process parameters (variance). The results for CVSL and ULVDR NAND gates are shown in Figures 3.7 and 3.8. Specific requirements in terms of delay and noise margins are necessary to calculate yield. For CVSL the delay varies a lot between the different samples. The ULVDR simulations show that a few samples evaluate incorrectly, but most are correct and very fast in comparison.

3.5 Chain delay

Figure 3.9: NAND gate chain.

In previous chapters, the parasitic delay has been measured using very fast (ideal) input signals. Thus, the results obtained are good for comparison, but not realistic. The input of one gate will be driven by the output of another, and the outputs of ULVDR NAND gates are not 1 ps ideal edges. In order to get more realistic estimates, the delay should be measured in a chain. The easiest way to achieve this is to connect NAND gates in one single chain, acting as inverters, as illustrated in Figure 3.9. 30 ULVDR NAND gates were connected in this manner, the first one being a 0P1 version. Input patterns 00 and 11 from Table 3.4 show that these NAND gates, especially the 1P0 version, are slower than the inverter equivalent (Table 2.6). These two cases are the best and worst case and should give a good approximation of NAND speed in larger circuits. The transient response of both cases can be found in Figure 3.11.

⁴ This is an experiment and not strictly necessary.
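The Monte Carlo procedure described in Section 3.4 can be sketched in a few lines. This is an illustrative toy, not the simulator setup used for Figures 3.7 and 3.8: the parameter names, standard deviations, delay requirement and the `toy_delay_ps` function are all hypothetical stand-ins for a real circuit simulation.

```python
import random
import statistics


def sample_parameters(rng, nominal, sigma):
    """Draw one sample: each modelled parameter from its own normal distribution."""
    return {name: rng.gauss(nominal[name], sigma[name]) for name in nominal}


def toy_delay_ps(params):
    """Placeholder for a circuit simulation. NOT a physical delay model."""
    return 100.0 * params["length_nm"] / params["width_nm"]


def monte_carlo(n_samples, nominal, sigma, max_delay_ps, seed=0):
    """Run the same 'simulation' on every sample and estimate the yield."""
    rng = random.Random(seed)
    delays = [toy_delay_ps(sample_parameters(rng, nominal, sigma))
              for _ in range(n_samples)]
    passed = sum(delay <= max_delay_ps for delay in delays)
    return delays, passed / n_samples


# Hypothetical means and standard deviations for two device parameters.
nominal = {"width_nm": 120.0, "length_nm": 240.0}
sigma = {"width_nm": 3.0, "length_nm": 3.0}

delays, yield_fraction = monte_carlo(200, nominal, sigma, max_delay_ps=210.0)
print(f"mean delay: {statistics.mean(delays):.1f} ps")
print(f"std dev:    {statistics.stdev(delays):.1f} ps")
print(f"yield:      {yield_fraction:.1%}")
```

The yield only becomes meaningful once a requirement is fixed (here the hypothetical `max_delay_ps`), which mirrors the remark that specific delay and noise margin requirements are needed to calculate yield.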

(49) Figure 3.10: Output delay for chain of 30 CVSL NAND gates (axes: time [ns], voltage [mV]; traces IN1, X5, X10, X15, X20, X25, X30; markers at 300.0, 346.871, 394.938, 445.345, 493.411, 543.818 and 589.747 ns).

Figure 3.11: Output delay for chain of 30 ULVDR NAND gates (axes: time [ns], voltage [mV]; traces IN1, X6, X12, X18, X24, X30; markers at 122.0, 122.766, 123.65, 124.53, 125.412 and 126.24 ns).

(50) Figures 3.10 and 3.11 show transients from the chain delay simulations. For ULVDR, two cases were simulated, 00 → 11 and 11 → 00. This is important, as the NAND gates alternate in both gate polarity and output value. How the outputs align with the polarity affects the delay.

Logic    Input pattern   Total delay    Average delay
CVSL     00              292.1395 ns    9.738 ns
CVSL     11              289.7461 ns    9.658 ns
ULVDR    00              5.0912 ns      0.1697 ns
ULVDR    11              4.2399 ns      0.1413 ns

Table 3.5: Delays for a chain of 30 NAND gates.

The resulting delays can be found in Table 3.5. A speedup factor, s, can be calculated (worst case delays used):

$$s = \frac{t_{CVSL}}{t_{ULVDR}} = \frac{9.738\,\text{ns}}{0.1697\,\text{ns}} \approx 57$$

This chain test indicates that ULVDR NAND gates can be more than 50 times faster than static CVSL NAND gates. The tradeoff is, of course, the complexity and size in silicon, as well as specific clock and timing requirements.
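The average delays and the speedup factor can be reproduced from the total delays in Table 3.5. The snippet below is a small arithmetic check; the variable names are chosen here for illustration.

```python
chain_length = 30  # NAND gates in the chain

# Total chain delays from Table 3.5 in nanoseconds: (logic style, input pattern).
total_delay_ns = {
    ("CVSL", "00"): 292.1395,
    ("CVSL", "11"): 289.7461,
    ("ULVDR", "00"): 5.0912,
    ("ULVDR", "11"): 4.2399,
}

# Average delay per gate for every simulated case.
average_ns = {case: total / chain_length for case, total in total_delay_ns.items()}

# Worst case (largest) average delay per logic style.
worst_ns = {
    logic: max(avg for (style, _), avg in average_ns.items() if style == logic)
    for logic in ("CVSL", "ULVDR")
}

speedup = worst_ns["CVSL"] / worst_ns["ULVDR"]

for case, avg in average_ns.items():
    print(case, f"{avg:.4f} ns per gate")
print(f"speedup (worst case): {speedup:.1f}")
```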

(51) Chapter 4. XOR gates

4.1 CVSL XOR

Figure 4.1: A CVSL XOR gate (outputs Y = A ⊕ B and Y' = AB + A'B').

The XOR gate should drive the output high for inputs 01 and 10, and low for 11 and 00. These four inputs can be expressed as minterms: A'B, AB', AB, and A'B'. In dual rail (CVSL or ULVDR) this means that one rail should pull down for A'B + AB'. The other rail should pull down for AB + A'B'. These 4 minterms can be seen as branches in Figure 4.1.

The CVSL XOR in Figure 4.1 can be simplified. As A and A' are complementary, those transistors should never be on simultaneously. Because of this, the transistors with inputs B and B' can be shared by both rails.¹ This simplified configuration is shown in Figure 4.2. Layout size and input capacitances are reduced. For any input, two transistors in series will pull one of the rails low. Thus, all transistors are sized the same as the series rail of the CVSL NAND (Table 3.1). This is the same as using CVSL inverter sizes (Table 2.2)

¹ Heller et al. show similar simplifications in [7].
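The claim that the B and B' transistors can be shared without changing the logic can be checked with a small switch-level model. The sketch below treats every nMOS as an ideal switch and tests which output node has a conducting path to ground. The node names (`Yn` for Y', `n1`..`n4`, `nB`, `nBn`) and the exact wiring are assumptions made for illustration, based on the description of Figures 4.1 and 4.2.

```python
from itertools import product

# Each transistor is (gate signal, node 1, node 2); it conducts when its gate is 1.
FULL = [
    ("A", "Y", "n1"), ("B", "n1", "GND"),      # AB pulls Y low
    ("A'", "Y", "n2"), ("B'", "n2", "GND"),    # A'B' pulls Y low
    ("A'", "Yn", "n3"), ("B", "n3", "GND"),    # A'B pulls Y' low
    ("A", "Yn", "n4"), ("B'", "n4", "GND"),    # AB' pulls Y' low
]
SIMPLIFIED = [
    ("B", "nB", "GND"), ("B'", "nBn", "GND"),  # shared by both rails
    ("A", "Y", "nB"), ("A'", "Yn", "nB"),
    ("A'", "Y", "nBn"), ("A", "Yn", "nBn"),
]


def pulled_low(network, a, b):
    """Return (Y pulled low, Y' pulled low) for ideal switches."""
    signals = {"A": a, "A'": 1 - a, "B": b, "B'": 1 - b}
    reached = {"GND"}
    changed = True
    while changed:  # propagate ground through all conducting transistors
        changed = False
        for gate, n1, n2 in network:
            if signals[gate] and ((n1 in reached) != (n2 in reached)):
                reached |= {n1, n2}
                changed = True
    return ("Y" in reached, "Yn" in reached)


for a, b in product((0, 1), repeat=2):
    print(a, b, "full:", pulled_low(FULL, a, b),
          "simplified:", pulled_low(SIMPLIFIED, a, b))
print("transistors:", len(FULL), "vs", len(SIMPLIFIED))
```

In this model both networks pull exactly one rail low for every input, and the simplified version needs 6 instead of 8 pull-down transistors.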

(52) Figure 4.2: A simplified CVSL XOR gate (outputs Y = A ⊕ B and Y' = AB + A'B').

Case       Delay         Case       Delay
00 → 01    13.5488 ns    01 → 00    13.5214 ns
11 → 10    13.5321 ns    10 → 11    13.5215 ns
00 → 10    6.9022 ns     10 → 00    6.842 ns
11 → 01    6.901 ns      01 → 11    6.8434 ns
tdµ        10.2016 ns    tdw        13.5488 ns

Table 4.1: Minimal CVSL XOR delay, all output transitions (1 ps input edges, identical gate as load).

(53) and doubling the nMOS finger count. Parasitic delays for this simplified CVSL XOR gate can be found in Table 4.1, and the worst case is shown in Figure 4.3.

Figure 4.3: CVSL XOR parasitic delay (worst case) (axes: time [ns], voltage [mV]; traces A1, B1, X1, X2; markers at 50.0 ns, 51.397 ns and 63.549 ns).

4.2 XOR implemented using other gates

The easiest way to create an XOR gate is to reuse the NAND gate from Section 3.3. Figure 4.4 shows two different approaches to creating an XOR gate. As discussed in Section 3.1, the ULVDR NAND can perform AND/OR operations, and inverted inputs are always inherently available. The most effective XOR implementation should be the 3-gate implementation using 2 AND gates and 1 OR gate.

Figure 4.4: XOR implementation using other gates.

Two versions of an XOR gate were created based on Figure 4.4, namely a 0P0 and a 1P1 version. Since this circuit has a 2 gate logic depth, the polarities of the input and output are equal. Thus, for 0P0, inputs and outputs are low during precharge, and for 1P1, inputs and outputs are high during

(54) precharge. As in previous chapters, it was simulated using an appropriate version of the gate as an external load. Sizes from Table 3.3 were reused.

4.2.1 Logic verification

Figure 4.5: Logic verification of ULVDR XOR 0P0 (axes: time [ns], voltage [mV]; traces A1, B1, X1, X2, PHI1).

Figure 4.6: Logic verification of ULVDR XOR 1P1 (axes: time [ns], voltage [mV]; traces A1, B1, X1, X2, PHI1).

Figures 4.5 and 4.6 show that the XOR gates work for all possible inputs. Counting up on the 2 input bits (00, 01, 10, 11) gives the XOR output sequence 0110.
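The expected output sequence of the 3-gate implementation (2 AND gates feeding 1 OR gate) can be confirmed with a purely logical model. This sketch only checks the boolean function, not the ULVDR circuit itself; the function name is chosen here for illustration.

```python
def xor_from_gates(a, b):
    """XOR built from two AND gates and one OR gate: AB' + A'B."""
    a_n, b_n = 1 - a, 1 - b          # complements are free in dual rail logic
    return (a & b_n) | (a_n & b)


# Count up on the two input bits: 00, 01, 10, 11.
outputs = "".join(str(xor_from_gates(a, b)) for a in (0, 1) for b in (0, 1))
print(outputs)  # prints 0110
```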

(55) [Figure 4.7: Worst case delay for ULVDR XOR 0P0. Traces A2, X1, X2, PHI1; voltage (mV) versus time (ns); time markers at 120.0 ns, 121.0 ns, and 121.409 ns.]

4.2.2. Evaluation delay

Similar to the NAND gates in Chapter 3, parasitic delay (with load) was simulated using ideal supply and inputs. The ULVDR XOR 0P0 and 1P1 gates have a worst case delay of 408 ps, shown in Figure 4.7. On average (for the 4 different input combinations), the 0P0 and 1P1 gates have delays of 378.6 ps and 366.41 ps, respectively.

4.3. ULVDR XOR gate

According to the simple parasitic delay simulation, the XOR presented in Section 4.2 is about 4 times slower than the ULVDR NAND. Some of this is because the first gates receive an ideal signal; it is reasonable to assume that the XOR implemented with AND/OR gates is about 2-3 times slower than the ULVDR NAND, on average. Its logic depth is 2, and its input capacitance is greater, as inputs are connected to 2 parallel gates. A proper XOR gate implemented in ULVDR, with a delay similar to the ULVDR NAND, would be ideal. This section introduces the ULVDR XOR.

4.3.1. Circuit topology

In a dual rail XOR gate, one rail should go low for A'B + AB', while the other should go low for AB + A'B'. Two ULVDR XOR precharge to 1 (0P1) gates were designed and simulated. The first iteration evaluates all 4 minterms separately and is shown in Figure 4.8. Similar to CVSL, this circuit can be simplified by realizing that A and A' are never true at the same time. Thus, the B and B' (evaluation) transistors are shared. This is shown in Figure 4.9, which is similar to the simplified CVSL XOR (Figure 4.2).
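The sharing argument can be illustrated with an idealized switch-level model in Python. The sketch below treats every evaluation transistor as an ideal switch and ignores precharge, clocking, keepers, and floating gates entirely. The node names (R1, R2, m1..m4, nB, nBn) and the placement of the B and B' switches next to ground are assumptions made for illustration, not details taken from the schematics.

```python
from itertools import product

def conducts(switches, start, goal, signals):
    """Return True if the closed switches connect node `start` to node `goal`."""
    reached = {start}
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for u, v, control in switches:
            if not signals[control]:
                continue  # open switch
            if u == node and v not in reached:
                reached.add(v)
                frontier.append(v)
            elif v == node and u not in reached:
                reached.add(u)
                frontier.append(u)
    return goal in reached

# Each switch is (node, node, controlling signal). R1 should go low for
# A'B + AB', R2 for AB + A'B'. All node names are hypothetical.
FULL = [  # one series pair per minterm
    ("R1", "m1", "A'"), ("m1", "GND", "B"),
    ("R1", "m2", "A"), ("m2", "GND", "B'"),
    ("R2", "m3", "A"), ("m3", "GND", "B"),
    ("R2", "m4", "A'"), ("m4", "GND", "B'"),
]
SIMPLIFIED = [  # B and B' switches shared between both rails
    ("nB", "GND", "B"), ("nBn", "GND", "B'"),
    ("R1", "nB", "A'"), ("R1", "nBn", "A"),
    ("R2", "nB", "A"), ("R2", "nBn", "A'"),
]

def rails_low(network, a, b):
    signals = {"A": a, "A'": not a, "B": b, "B'": not b}
    return (conducts(network, "R1", "GND", signals),
            conducts(network, "R2", "GND", signals))

for a, b in product((False, True), repeat=2):
    print(int(a), int(b), rails_low(FULL, a, b), rails_low(SIMPLIFIED, a, b))
```

Because A and A' are never closed at the same time, a shared node can never connect the two rails to each other, so both networks pull exactly one rail low for every input combination.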

(56) [Figure 4.8: ULVDR XOR 0P1 (not simplified). Schematic with clock signals φ and φ', inputs A, A', B, B', and the two output rails.]

[Figure 4.9: ULVDR XOR 0P1 (simplified). Schematic with clock signals φ and φ', inputs A, A', B, B', and outputs X and X'.]

(57) 4.3.2. Keeper transistors in simplified ULVDR XOR

In the simplified ULVDR XOR circuit (Figure 4.9), keeper transistors have been removed for the shared evaluation transistors. As these are shared, they can be used to drive both outputs, and thus the outputs cannot be used in feedback to a keeper. The alternative is to use the intermediary nodes between evaluation transistors in feedback. There are multiple problems with this approach, mainly that these nodes do not reach voltages close to 0 or 300 mV. Controlling a keeper gate with this voltage is ineffective. Connecting keeper source or gate contacts to these nodes would also add extra load to the evaluation network (see footnote 2). Additionally, a keeper is not strictly necessary for these shared transistors; once one rail goes low, both parallel (input A/A') evaluation transistors on the other rail are pulled low by their keepers.

Footnote 2: Simulations were run to verify these claims; however, the results are not very interesting. The keeper transistors did not pull the floating gates low and prevented the outputs from reaching acceptable evaluation voltages.

4.3.3. Precharge Matching

[Figure 4.10: Circuits used for matching Full XOR. Test circuits with precharge transistors Pp and Pn, evaluation transistors Ep and En, and output voltage Vo.]

Sections 2.4.2 and 3.3.2 used simple test circuits to match the precharge and evaluation transistors. The matching was done to achieve acceptable output voltages even after evaluation starts (before inputs arrive). At this point in time all floating gates are charged, and thus both evaluation and precharge transistors are conducting. A similar methodology was used to match the ULVDR XOR gates; however, the setup is a little more involved. The circuits used are shown in Figures 4.10 and 4.11. As the full (not simplified) XOR gate has 2 parallel transistors in series, their resistance/conductance should roughly cancel out. The conductance of this network should be similar to the ULVDR inverter. Thus, for this

(58) version of the XOR, sizes were identical to the ULVDR inverter. One could argue that when inputs arrive, only one branch will be active, so the gate will be slower than an inverter and all transistors should be scaled (doubled) to account for this. However, it is desirable to find minimal configurations that work well and can be scaled for different requirements.

[Figure 4.11: Circuits used for matching simplified XOR. Test circuits with precharge transistors Pp and Pn, evaluation transistors Ep and En, shared evaluation transistors EpS and EnS, and output voltage VO.]

For the simplified XOR circuit, not much has changed. As the B and B' evaluation transistors are now combined (shared), they should have a doubled finger count, NEshared. Intuitively, they should conduct the same current as both original transistors and thus have double the conductance. Apart from finger count, N, the sizes (W/L) are identical to ULVDR inverters. One alternative approach is decreasing the width of the nMOS evaluation transistors, WnE, instead of increasing the finger count. A narrower width here should increase conductance, since this transistor was widened to limit the current in the ULVDR inverter. The results of matching are found in Table 4.2.

4.3.4. Simulation and Speed

Section 3.1 explains why the ULVDR NAND gates can be used as NOR/OR/AND gates. The same argument applies to dual rail XOR gates: by reversing the outputs, the gate performs an XNOR operation. Figure 4.12 shows the logic operation of the ULVDR XOR gate. The plot confirms that output X1 correctly evaluates the XOR function, while X2 evaluates XNOR. Table 4.3 shows delays for the ULVDR XOR gates. Both versions of the 0P1 gate work very well in terms of delay. The simplified version (Figure 4.9) is about 50% faster than the first iteration (Figure 4.8) and more than 2 times faster than the XOR created with NAND gates in Section 4.2. The average delay (for all 4 inputs) of the simplified ULVDR XOR is 103.446 ps for the 0P1 gate and 305.483 ps for the 1P0 gate. The worst case delay for the 0P1 variant is 105.456 ps, which is also shown in Figure 4.13.
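The averages and worst cases quoted above can be recomputed from the per-input delays listed in Table 4.3. The Python sketch below does so and also prints two delay ratios; the dictionary keys and variable names are illustrative only.

```python
# Per-input delays (inputs 00, 01, 10, 11) from Table 4.3, in picoseconds.
delays_ps = {
    "0P1 full": [182.171, 190.178, 169.127, 190.314],
    "0P1 simplified": [104.486, 105.456, 98.498, 105.344],
    "1P0 full": [618.3627, 573.459, 562.452, 573.27],
    "1P0 simplified": [319.33, 301.6, 299.5, 301.5],
}

# (mean, worst case) per gate version.
stats = {name: (sum(d) / len(d), max(d)) for name, d in delays_ps.items()}

for name, (mean_ps, worst_ps) in stats.items():
    print(f"{name:15s} mean {mean_ps:8.3f} ps   worst {worst_ps:8.3f} ps")

# Ratios of average delays between gate versions.
full_vs_simplified = stats["0P1 full"][0] / stats["0P1 simplified"][0]
p0_vs_p1 = stats["1P0 simplified"][0] / stats["0P1 simplified"][0]
print(f"0P1 full / 0P1 simplified:       {full_vs_simplified:.2f}")
print(f"1P0 simplified / 0P1 simplified: {p0_vs_p1:.2f}")
```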

(59)
Parameter   0P1 Full      0P1 Simplified  1P0 Full    1P0 Simplified
W           120 nm        120 nm          120 nm      120 nm
L           100 nm        100 nm          100 nm      100 nm
WnE         240 nm        240 nm          -           -
LnP         -             -               240 nm      240 nm
LpP         240 nm        240 nm          -           -
N           1             1               1           1
NpP         4             4 (6*)          -           -
NpEshared   -             -               -           2
NnE         1             1               -           -
NnEshared   -             2               -           -
VO          283.6127 mV   282.002 mV      9.1039 mV   8.78141 mV
IP          80.98 nA      87.663 nA       43.696 nA   42.2741 nA

Table 4.2: ULVDR XOR precharge matching results.
*: The pMOS precharge transistor was increased to 6 fingers for the chain test, see Sections 4.3.5 and 3.5.

[Figure 4.12: Logic verification of ULVDR XOR 0P1. Traces A1, B1, X1, X2, PHI1; voltage (mV) versus time (ns).]

Inputs  0P1 Full    0P1 Simplified  1P0 Full     1P0 Simplified
00      182.171 ps  104.486 ps      618.3627 ps  319.33 ps
01      190.178 ps  105.456 ps      573.459 ps   301.6 ps
10      169.127 ps  98.498 ps       562.452 ps   299.5 ps
11      190.314 ps  105.344 ps      573.27 ps    301.5 ps

Table 4.3: Delays for both versions of ULVDR XOR.

(60) [Figure 4.13: Worst case delay of ULVDR XOR 0P1. Traces A2, X1, X2, PHI1; voltage (mV) versus time (ns); time markers at 160.0 ns, 161.0 ns, and 161.106 ns.]

4.3.5. Problems with the ULVDR XOR

The delays for the precharge to 0 (1P0) version are around 3 times longer than for precharge to 1 (0P1). This is because of nonlinearities and the fact that pMOS transistors do not scale as well with fingers as nMOS transistors. It becomes exceedingly difficult to match the evaluation network both before and after inputs arrive. Recall from Section 2.4.2 that if the precharge transistor is too weak, it cannot hold the precharge value; if it is too strong, the evaluation network cannot change the output efficiently.

[Figure 4.14: ULVDR XOR 0P1 floating gates. Traces A2, X1, X2, PHI1, FGA1X2, FGA2X2, FGA2X1; voltage (mV) versus time (ns).]

Looking at the floating gates (Figure 4.14) reveals another issue. As
