An FPGA Based System for Acquisition and Storage of Neural Bioelectric Signals

Thomas Lundby Heggland

Master’s Thesis, Spring 2018

I Hardware
3 Hardware Evaluation
3.1 LVDS
3.2 Comparing FPGAs
3.3 Choosing the FPGA
3.3.1 Xilinx
3.3.2 Altera
3.4 Storage Solutions
3.5 FIFO
3.6 SRAM
4 Hardware Design
4.1 Power Supplies
4.2 FPGA
4.3 Device Clock
4.4 Potential Noise Sources
4.5 Printed Circuit Board
4.6 Miscellaneous Design Decisions
4.7 Assembly

II Firmware and Testing
5 Firmware
5.1 Platform Designer Project
5.2 Description of SRAM-controller
5.3 The Wishbone Bus
5.4 FIFO Controller
5.5 Avalon to Wishbone Bridge
5.6 SD Card controller
5.7 SD Card Driver
6 Verification and Testing
6.1 Initial Hardware Verification
6.2 Power Supply Benchmarks
6.3 Testbenches
6.3.1 SRAM Controller
6.3.2 FIFO Controller
6.3.3 SD Card Controller
6.4 Software
6.4.1 Avalon to Wishbone Bridge
6.4.2 SD Card Controller Driver

III Future Work and Conclusion
7 Future Work
7.1 Firmware
7.2 Processor
7.3 Additional Interfaces
8 Conclusion

IV Appendices
A Schematic and Layout
B Pinout
C Miscellaneous Test Data
D Listings
D.1 SRAM
D.2 FIFO
D.3 SD Card Testbench

References

List of Figures

2.1 Simplified System Overview
3.1 LVDS Termination
4.1 Example Power Supply Sharing [13]
4.2 Layer Stackup for 6 Layer PCB [16]
4.3 Omnetics PZN-12-AA Footprint
4.4 Ball Grid Array on an NVIDIA IC
5.1 Suggested System Overview
5.2 Platform Designer Overview
5.3 OE Controlled Read Cycle
5.4 WE Controlled Write Cycle
5.5 Wishbone transaction with a synchronous slave
5.6 Card Identification and Initialization Sequence [21, p. 28]
5.7 Transfer Mode [21, p. 33]
6.2 Current Measurement Setup

List of Tables

2.1 Minimum Storage for given experiment durations
3.1 RAM in Altera devices
4.1 Master Side Pinout for Omnetics 12-Pin Nano Strip Connector [5]
5.1 Minimal Required Signals for Wishbone Compliance
5.2 Wishbone to Avalon signal translation
5.3 SD Card Controller Register Map [20]
6.1 Current Draw from individual power rails (*average current)
B.1 List of Pin connections
C.1 Buck Boost Input Cutoff voltage

1 Summary

A standalone, battery powered neural recording system for use with an Intan Technologies headstage was made. The purpose of the system is to record data and store it on an SD card. The Printed Circuit Board was functional, but there was insufficient time to complete the firmware required for complete operation.

2 Introduction

This project attempts to design and implement a battery powered portable recording device for use in brain research experiments involving rats. The purpose of the project is to provide a system that offers freedom of movement for the animal.

In the existing system used by the Hafting-Fyhn group, the recording device is connected to an animal with implanted electrodes. The electrodes are connected to a headstage, which in turn is connected to a recording device via a long cable.

This cable is held by a counterweight to reduce the strain on the animal.

If a portable system can be made that is carried entirely by the animal, it will provide the researchers with flexibility for new experiments. A previous attempt at such a system was made in 2015. The project was partially successful, but encountered problems with data rate and signal noise [1].

The objectives of the project are:

• Attain faster data processing than the original project

• Minimal Signal Noise

2.1 System Overview

This section will explain the different subsystems required for a working system.

A working system will have to interface with the electrodes implanted in the brain of the animal. The first step in the signal chain will then be to amplify and digitize the bioelectric signals from the electrodes. Then the digitized signal needs to be transferred to a central controller and be held in temporary storage until sufficient data has been gathered to perform a write to permanent storage.

This will include several intermediate steps described further below. Figure 2.1 shows a rough outline of how the system may be connected.

Figure 2.1: Simplified System Overview

2.1.1 Neural Recorder

The first steps of the signal chain will all be performed by a single integrated circuit (IC) which connects directly to the electrodes. The device is made by Intan Technologies, a company that specializes in low power ICs for use in bioelectric applications. More specifically, the device that will be used is the RHD2132.

The Intan RHD2132 is a low power, low noise integrated amplifier and Analog to Digital Converter [2]. By performing both amplification and digitization it provides a complete recording solution which is simple to interface with through digital communication.

It was used in the previous version of this project, with some success. However, a slightly different approach will be used. While previously the entire system was on the same Printed Circuit Board (PCB), this time the recording section will be on a separate board.

The advantages of this are twofold. Firstly, a larger distance between the recording device and the parts of the circuit which generate a fair amount of high speed, and thereby noisy, signals should reduce the amount of noise coupled to the recording device.

Secondly, keeping the headstage small and mounting the rest of the system on the animal's back should increase the comfort of the animal by reducing the load on its head. An added possibility is using an existing headstage that contains the RHD2132, such as the one used with the Open Ephys recording system.

2.1.2 Central Controller

Next the data will be transferred to the central controller which will consist of an FPGA. An FPGA or a Field Programmable Gate Array is a type of programmable device which excels at accelerating operations by parallelization.

In this case it will need to implement communication over the Serial Peripheral Interface (SPI) to communicate with the RHD2132 [2], as well as flexible temporary storage to account for differences in data rates between the recording device and the permanent storage. Depending on which FPGA I choose to use, it might also be necessary to implement a controller for external storage.

2.1.3 Storage and Memory

Finally, the data will be stored on either a removable SD card or an on-board storage device. If an on-board storage device is chosen, it is also necessary to include USB support or something similar to transfer the data to a computer for analysis.

2.2 Previously Encountered Challenges

The existing version of this project uses a microprocessor for controlling the different parts of the system, and an SD card for storage. Problems were encountered in this version with noise on the sampled signals. It was hypothesized that the source of the noise was activity on the SD card arising from a shared power supply. The suggested solution to this was to include additional power supply filtration by using separate regulators.[1]

Additionally, the system could not write data to the SD card at a consistent speed. Even though the maximum write speed achieved was sufficient to keep pace with the data, the SD card would occasionally take longer than usual to complete an operation. Over time this would add up, putting the SD card further and further behind, until the system eventually ran out of memory.

In tests running the system with 32 channels and a per channel sample rate of 8 kHz, it would take 523 seconds before running into errors [1]. This is far below the maximum performance of the RHD2132 used in the system, which is capable of a per channel sample rate of 30 kHz [2].

2.3 Field Programmable Gate Arrays

A Field Programmable Gate Array (FPGA) is an integrated circuit architecture which is user configurable [3]. FPGAs consist of predefined logic cells, usually containing one or more of the following: a lookup table, a multiplexer, arithmetic blocks and output registers. Being configurable devices, FPGAs are highly versatile. They usually support a wide variety of Input/Output (I/O) standards as well as a wide supply voltage range to correspond to these standards. This makes them highly suitable for applications where several different I/O types are needed.

FPGAs are widely used in applications where high speed, or a high amount of parallelization is needed and in small enough numbers that fabricating an ASIC is prohibitively expensive.

Compared to a microprocessor, an FPGA is a more complicated device that usually requires more complicated support circuitry.

FPGAs are usually configured at power-up, meaning that additional external circuitry is required for most devices, as they do not commonly contain flash based configuration storage.

2.4 Suggested Specifications

Currently, the system used by the research group records from 16 channels due to the headstages they have. This means that the system will have to process the data at a minimum rate of 16 bits × 16 channels × 30 kHz = 7.68 Mbps. Preferably the system should be able to process twice that to allow for up to 32 electrodes. If better performance can be achieved, the recording device from Intan Technologies is available in packages with up to 64 channels [2].
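The required throughput follows directly from the sample width, the channel count and the sample rate. A minimal sketch of that arithmetic in Python; the function name `raw_data_rate_bps` is illustrative only, and protocol overhead is ignored:

```python
def raw_data_rate_bps(channels: int,
                      sample_rate_hz: int = 30_000,
                      bits_per_sample: int = 16) -> int:
    """Raw payload rate in bits per second, ignoring protocol overhead."""
    return bits_per_sample * channels * sample_rate_hz


# Channel counts mentioned in the text: 16 (current), 32 (preferred), 64 (largest package).
for channels in (16, 32, 64):
    rate_mbps = raw_data_rate_bps(channels) / 1e6
    print(f"{channels} channels: {rate_mbps:.2f} Mbps")
```

Doubling the channel count doubles the rate, so the 32 channel case requires 15.36 Mbps.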

The amount of storage required depends on the length of the experiment, the sampling rate of the system and the number of electrodes. A typical experiment length can be around 15 minutes, but it would be preferable to allow for more.

Using a sample rate of 30 kHz at 16 bits per channel, the amount of storage necessary to store the entire experiment can be seen in table 2.1. As can be seen, the system produces a fair amount of data, and it is apparent that aiming for a total storage capacity of at least 8 GB should be enough for most circumstances.

Table 2.1: Minimum Storage for given experiment durations

Channels    15 Minutes    60 Minutes
16          864 MB        3.46 GB
32          1.73 GB       6.91 GB
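The entries in table 2.1 can be reproduced with a few lines of Python. This is a sketch of the arithmetic only; the function name `storage_bytes` is illustrative, and decimal units (1 MB = 10^6 bytes) are assumed since they match the table:

```python
def storage_bytes(channels: int, minutes: int,
                  sample_rate_hz: int = 30_000,
                  bytes_per_sample: int = 2) -> int:
    """Bytes of raw sample data produced over the given duration."""
    return channels * sample_rate_hz * bytes_per_sample * minutes * 60


for channels in (16, 32):
    for minutes in (15, 60):
        size_gb = storage_bytes(channels, minutes) / 1e9
        print(f"{channels} channels, {minutes} min: {size_gb:.3f} GB")
```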

There are also requirements regarding weight and size, although not as clear.

Since the system has to be carried by the animal, the weight should be fairly low. In the previous design it is suggested that it should not weigh more than 20 g [1]. That is, however, for a design that is meant to be carried entirely on the head of the animal. If this system is to be divided into two different parts, it should be possible to allow for a somewhat heavier device.

Part I

Hardware

3 Hardware Evaluation

This section discusses different solutions for the implementation of several subsystems.

3.1 LVDS

To achieve the best possible noise and speed performance with the RHD2132 it is required to use a bus standard called Low Voltage Differential Signalling, or LVDS [2]. LVDS is, as the name implies, a signal standard that uses a smaller voltage swing than the full swing of the signal line, e.g. 0V to 3.3V [4]. The lower voltage swing reduces noise and increases the achievable transmission speed. Additionally, it yields lower power consumption, since the lower signal swing requires less energy.

Rather than driving the line directly as a voltage mode signal, an LVDS transmitter drives a constant current through a termination resistor [4]. This ensures almost constant power consumption, since the differential signalling uses the same amount of current in both directions. The termination scheme is very simple, requiring only a single resistor at the receiver, as can be seen in figure 3.1.

Figure 3.1: LVDS Termination
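The voltage seen by the receiver follows from Ohm's law applied to the termination resistor. A small worked example in Python, assuming the commonly quoted nominal LVDS values of 3.5 mA drive current and a 100 Ω termination resistor. These figures are assumptions and are not taken from this document; the actual values for a given device should be checked against its datasheet.

```python
def termination_voltage(drive_current_a: float = 3.5e-3,
                        termination_ohm: float = 100.0) -> float:
    """Differential voltage across the termination resistor (V = I * R)."""
    return drive_current_a * termination_ohm


def termination_power(drive_current_a: float = 3.5e-3,
                      termination_ohm: float = 100.0) -> float:
    """Power dissipated in the termination resistor (P = I^2 * R)."""
    return drive_current_a ** 2 * termination_ohm


print(f"Differential swing: {termination_voltage() * 1e3:.0f} mV")
print(f"Termination power:  {termination_power() * 1e3:.3f} mW")
```

With these assumed values the swing is about 350 mV, roughly a tenth of a full 3.3V logic swing.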

LVDS is a physical layer specification and is completely independent of the communication protocol. It can be used with most protocols so long as both transmitter and receiver support LVDS. For instance, the RHD2132 used in this project uses the SPI protocol for digital communication [2].

3.2 Comparing FPGAs

The FPGA to be used in the system needs to fulfill certain requirements. Initially I intended to use the internal memory on the FPGA for the FIFO buffer required to store the data between writes to the SD card. The minimum amount of memory needed for the FIFO is 16 bits × 32 channels × 32 kHz × 250 ms = 4.096 Mbits. This should ideally be enough memory to prevent data loss due to FIFO overflow, but a larger FIFO would be preferable to provide more margin.

For the RHD2132 to reach its specified noise performance at high speeds it is required to use LVDS [2]. Therefore, the FPGA must support a minimum of 4 LVDS interfaces to support one RHD2132. With a total of 5 it can, however, support two devices simultaneously, as the connector used on the Open Ephys headstage contains a total of 12 pins [5]. A case where two headstages are used at the same time might be unlikely, but if there is little difference in cost, adding support for it may be useful in the long run.

For the initial prototype it would also be useful if the FPGA is large enough to support a soft processor to offer more flexibility if that is required, as it will most likely be far simpler to use the SD card through a software interface rather than performing everything in hardware. An advantage of using an FPGA with a large number of logic cells is that there will be little chance of running out of space. If it then proves to be much larger than necessary, a smaller and cheaper device can be used for subsequent iterations of the hardware.

In some cases the smaller devices are pin compatible with the larger devices, meaning that they can be used as a direct replacement without redesigning the board. Additionally, to reduce the size of the system the FPGA should either have an internal configuration memory or be flash based. Otherwise, it is necessary either to include external devices on the board to program the FPGA on startup, or to require the user to program the device every time they use it. Programming the device every time it is used is not particularly user-friendly and should be avoided in a final product, but may be acceptable in a testing environment.

3.3 Choosing the FPGA

When choosing which FPGA to use for the project, the following criteria were taken into account:

• Available IO

• Amount of block RAM

• Power consumption

• Internal Configuration or FLASH based

• Availability

There are several manufacturers of FPGAs, of which Altera and Xilinx are the largest and best known. Because the size of the FPGA image is uncertain before the firmware is completed, it was deemed appropriate to mostly consider the largest available devices. This way, if the FPGA proves to be much larger than necessary, it will be possible to use a smaller device in a future version of the system.

3.3.1 Xilinx

Xilinx offers a large range of different devices in different performance grades.

For the purposes of this project where the data rate is fairly modest by most standards a low power device should be sufficient.

This leaves the Spartan 7 and Artix 7 series as the most relevant device families.

Unfortunately, at the time of writing the Spartan 7 was not available for purchase and only the Artix series was available. Looking mostly at the largest available device in the series, with the intention that it will be used for the prototype and then exchanged for a smaller device, the Artix series looks like a good candidate with regard to IO.

If the largest available device is chosen there should be plenty of available memory for the FIFO implementation as well as the implementation of a soft processor[6, p. 3]. However, the smaller versions barely have enough memory to fit the FIFO, so if choosing to use a smaller device the system will have to be partially redesigned.

The Artix 7 series does not have an internal configuration memory meaning that additional circuitry must be added for storage of the configuration image when powered down.[7]

3.3.2 Altera

Altera, now a part of Intel, has a very interesting low cost FPGA family called the Max 10 series. The Max 10 series is meant to be a low cost alternative to their high performance offerings.

It is the only series from Altera or Xilinx that offers internal configuration flash memory (CFM). In terms of memory, the largest device can have enough memory to support 16 channels, but will not be sufficient for all 32 channels. See table 3.1 for an overview of RAM in the available devices. If any of these are to be used, external memory will be required for the FIFO.

All the devices in the series are available with two different power supply modes.

The first alternative is to use a single supply mode where the devices only require a 3.3V external voltage to operate[8] and provide the internal voltages themselves. This is the simplest available power scheme, but comes at the cost of using less efficient internal switching regulators.

The second alternative is the dual supply devices. These require a 2.5V and 1.2V external supply to operate as well as whatever voltages are necessary for the IO depending on the specific application. They offer better power efficiency at the cost of a slightly more complex power distribution network[8].

Compared to the Xilinx devices the Max 10 should prove simpler to use, with the advantage of needing only external memory, a clock and power supply network.

In the end a 10M50 without the ADC feature was chosen due to the convenience of not needing to include additional circuitry for programming the device. Since external memory would also need to be used with any but the largest Xilinx Artix 7 device, it should prove simpler to use a Max 10.

Table 3.1: RAM in Altera devices

Device    M9K (kbits)    M9K (kbytes)
10M02     108            13.5
10M04     189            23.625
10M08     378            47.25
10M16     549            68.625
10M25     675            84.375
10M40     1260           157.5
10M50     1638           204.75

3.4 Storage Solutions

The previous solution used an SD card to store the recorded data before transferring it to a computer [1]. There are several other options that could be viable. As it is entirely possible that the system might run out of power during a prolonged experiment or during several experiments in rapid succession, it will be necessary to store the data in non-volatile memory. In cases where a large amount of data needs to be stored in this fashion, the usual solution is to use flash based memory.

The most commonly available type of flash based storage devices are so called NAND devices, named so because the basic cell resembles a NAND gate. In this case NAND storage is quite convenient, as the devices are available in larger sizes and are significantly faster than NOR devices for sequential access. NAND requires block based access, which makes it unsuitable for RAM and program storage, but very well suited to continuous mass storage devices such as SD cards and SSDs [9].

A downside to flash storage is the fact that each block can only be written to a limited number of times before it breaks down [9]. This leads to sectors of memory which are non-functional, which can be mitigated using what is called wear-levelling. Wear-levelling attempts to even out the number of times each block is written to in order to increase the lifetime of each block. In the case of SD cards it is entirely up to each manufacturer whether they include this feature, but in general the more expensive SD cards usually have it.

The previously mentioned types of NAND storage are what is called managed storage. This means that they have an integrated controller which takes care of the details behind memory operations. They usually contain a processor or some kind of logic controller to perform device maintenance such as error correction and the previously mentioned wear-levelling.

A different type of NAND storage called unmanaged or RAW storage is also available. In these devices the system is given direct access to the memory.

This can be faster, but there is no error checking so this will either have to be implemented elsewhere or errors must be tolerated. For this project where the data-rate is not that high this is unnecessary and a managed device is more suitable.

There are several different options when it comes to managed NAND devices.

The most common one is the regular SD card, commonly used in everything from laptops to mobile phones and digital cameras. SD cards are popular because they offer reasonable performance and a very wide range of storage sizes in a small package at a relatively affordable price.

For a portable recording system like this one, SD cards offer several advantages over a discrete NAND IC. Firstly, they are widely available and user replaceable, thereby increasing the lifetime of the recording device. In the case where an onboard solution is used, the storage may wear out before the rest of the system. Using an SD card completely eliminates this problem, as the user can provide a new card themselves when the old one starts to wear out, whereas exchanging a NAND IC would require soldering equipment and experience.

Secondly, using a NAND IC requires a tethered transfer solution, for instance USB. Transferring this way forces the recording system to be out of use until the transfer is complete. If it is desired to run several experiments in rapid succession, this is somewhat impractical. The SD card offers the advantage that when an experiment is complete the user can remove the SD card and immediately continue with a new SD card for more experiments.

Because of this it appears to be far more convenient to use an SD card approach for this project.

3.5 FIFO

A First In First Out buffer (FIFO) is a way to transfer data between two subsystems in a manner that preserves the order of the data and allows for asymmetrical data rates. This is useful in my case as there will be a constant flow of data from the Intan device, whereas the memory card controller functions best when given enough data to fill one block at a time, that is 512 bytes.

This is normally done with a generic memory module and a controller which keeps track of the current read and write positions as well as whether the buffer is full or empty. Given the additional requirement of being able to store enough data to ensure no data is lost during the SD card busy time, the buffer is required to hold at least 250 ms of data, and preferably more.
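The read and write bookkeeping described above can be modelled in software. The following is a behavioural sketch in Python and not the hardware FIFO controller itself; the class `RingFifo`, the helper `drain_block` and the choice of 16-bit words (256 words per 512 byte block) are assumptions made for illustration.

```python
class RingFifo:
    """Fixed-depth FIFO built on a plain memory array with read/write pointers."""

    def __init__(self, depth: int) -> None:
        if depth <= 0:
            raise ValueError("depth must be positive")
        self._mem = [0] * depth   # stands in for the generic memory module
        self._depth = depth
        self._rd = 0              # next position to read from
        self._wr = 0              # next position to write to
        self._count = 0           # number of words currently stored

    def __len__(self) -> int:
        return self._count

    @property
    def empty(self) -> bool:
        return self._count == 0

    @property
    def full(self) -> bool:
        return self._count == self._depth

    def push(self, word: int) -> bool:
        """Store one word; returns False (word dropped) if the FIFO is full."""
        if self.full:
            return False
        self._mem[self._wr] = word
        self._wr = (self._wr + 1) % self._depth
        self._count += 1
        return True

    def pop(self) -> int:
        """Remove and return the oldest word."""
        if self.empty:
            raise IndexError("pop from empty FIFO")
        word = self._mem[self._rd]
        self._rd = (self._rd + 1) % self._depth
        self._count -= 1
        return word


BLOCK_WORDS = 256  # one 512 byte block, assuming 16-bit words


def drain_block(fifo: RingFifo):
    """Return one full block in arrival order, or None if too little data is buffered."""
    if len(fifo) < BLOCK_WORDS:
        return None
    return [fifo.pop() for _ in range(BLOCK_WORDS)]
```

The producer side calls `push` at the constant sample rate, while the consumer side only calls `drain_block` when a whole block is available, which is what decouples the two data rates.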

This means that the FIFO needs to hold at least 30 kS/s × 16 channels × 2 bytes × 0.25 s = 240 kB of data for the current set of electrodes. If all 32 channels are to be used, it is necessary to have a total size of at least 480 kB. In the case where only an FPGA is used with a softcore processor this will require using most of the RAM, leaving little for program memory. Therefore, it is necessary to include an external storage device to use as the actual memory in the FIFO.
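The buffer sizes above can be checked with a short calculation. This is a sketch in Python; the function names are illustrative, and the 512 byte block size is the SD card block size mentioned earlier in this section.

```python
def fifo_bytes(channels: int, hold_time_ms: int = 250,
               sample_rate_hz: int = 30_000,
               bytes_per_sample: int = 2) -> int:
    """Bytes that accumulate while the SD card is busy for hold_time_ms."""
    return channels * sample_rate_hz * bytes_per_sample * hold_time_ms // 1000


def blocks_needed(size_bytes: int, block_bytes: int = 512) -> int:
    """Number of whole blocks needed to hold size_bytes (rounded up)."""
    return -(-size_bytes // block_bytes)


for channels in (16, 32):
    size = fifo_bytes(channels)
    print(f"{channels} channels: {size / 1000:.0f} kB, "
          f"{blocks_needed(size)} blocks of 512 bytes")
```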

Because of the power constraints and relatively low memory capacity needed the most suitable solution is to use an SRAM device. Compared to other technologies SRAM uses less power, and has a simpler interface, making it well suited for use in battery powered embedded systems.

3.6 SRAM

An SRAM (Static Random Access Memory) device is a storage device based around flip-flop cells [10, p. 1320]. As the name implies SRAM is static, which means that the data is retained for as long as the device is powered or until new data is written to the same cell, as opposed to DRAM where the data needs to be periodically refreshed [11, p. 1015]. This leads to SRAM usually having lower power consumption except when operating at very high frequencies. A standard SRAM cell consists of six transistors, which is significantly larger than for instance a DRAM cell. Therefore, SRAM is usually more expensive than DRAM and not available with the same storage capacity. However, since the contents do not have to be refreshed, it has a much simpler communication interface than DRAM. This combined with the lower power consumption makes SRAM the default choice for small battery powered systems that have no need for very large external memory devices. A basic SRAM cell is shown in figure 3.2a; it can be compared with a basic DRAM cell in figure 3.2b.

(a) 6 Transistor SRAM cell (b) Basic DRAM cell

4 Hardware Design

4.1 Power Supplies

This system is intended to be a battery powered system. Batteries come with some inherent non-ideal properties, the most problematic of which is that the output voltage drops as the stored energy is released. This voltage drop needs to be compensated for to ensure stable operation of the device. The usual way to do this is to use a buck-boost converter, which lowers ("bucks") the input voltage when it is above a specific value, and raises ("boosts") it when it is lower than the threshold. Because the highest voltage needed to power the devices in the circuit is 3.3V, the output voltage of the buck-boost converter should be slightly above 3.3V to give some headroom for other regulators.

Both the SRAM and the SD card have fairly flexible power supply requirements.

Therefore, to keep things simple they are powered from the same regulator that provides 3.3V to the FPGA. In particular the SRAM can be powered at a lower voltage to reduce the power consumption, however the gain in power efficiency will most likely be negligible compared to the amount of power the FPGA requires [12].

The RHD2132 is the most noise sensitive part, so a Low Dropout linear regulator (LDO) was used to power it. However, after choosing a part it became apparent that most noise from the switch mode power supplies is at a frequency high enough that the LDO will give little protection against it. I have not had the opportunity to verify whether this is a problem, but the LDO should at the very least not make anything worse than if the RHD2132 were powered directly from a switch mode power supply (SMPS).

4.2 FPGA

To get better power efficiency a dual power supply version of the MAX10 FPGA was chosen. This version requires externally supplied 1.2V and 2.5V power supplies to operate, as opposed to the single supply version which only needs 3.3V to function.

Using external voltage regulators allows better power efficiency by using devices specifically suited for the task. In addition to 1.2V and 2.5V, a 3.3V regulator is necessary to provide the correct voltage to the IO banks. The IO banks can be powered by any voltage between 1.2V and 3.3V depending on which IO levels are required for the application. In this case all external devices operate on 3.3V, which is therefore the required IO voltage. The manufacturer suggested power distribution network [13, p. 115] is shown in figure 4.1. The suggested network uses the three different power supplies mentioned. Not shown in the figure is the power supply filtering, which can be found in the complete schematic included in Appendix A.

Altera suggests using a series of high efficiency switch mode power supplies that they provide called the EP53357. The devices are relatively small compared to solutions where an external inductor is needed, requiring a total of only 14 mm² for a complete solution, as only a few external capacitors are needed for the recommended setup [14].

Using a compact power solution is an additional bonus as it reduces the size of the entire PCB, making it better suited for mounting on an animal. Because they have the inductor integrated into the IC, using these devices is much simpler than using a conventional switcher where the efficiency of the device is highly dependent on the external components. Depending on load they may have an efficiency of up to 93%, which is desirable when working with a battery powered solution.

Figure 4.1: Example Power Supply Sharing[13]

4.3 Device Clock

For the device to operate properly a clock is required. This can be provided either through the internal oscillator in the MAX 10 FPGA, or from an external clock generator or crystal. The internal oscillator is accessible through the use of a preexisting module which Altera provides. The main advantage of using the internal oscillator is that it reduces the number of external components required, thereby reducing the size of the PCB. However, the internal oscillator cannot drive the PLL [15, p. 33]. This makes the internal oscillator largely useless for my purposes as the different peripherals will be run at different clock frequencies. For instance, the Intan RHD2132 has a maximum operating frequency of 24 MHz, but the SD card controller can be run as fast as 50 MHz. Therefore, an external oscillator is required from which the remaining frequencies can be generated.
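To illustrate how several clock frequencies can be derived from one external reference, the sketch below computes the multiply/divide ratio a PLL would need for each target. It is a simplified model in Python: the 50 MHz reference frequency is an assumption chosen for the example and is not taken from the design, the function name `pll_ratio` is hypothetical, and real PLL constraints such as the allowed VCO range and counter limits are not checked.

```python
from fractions import Fraction


def pll_ratio(f_ref_hz: int, f_out_hz: int) -> tuple[int, int]:
    """Return (multiply, divide) in lowest terms so that
    f_ref_hz * multiply / divide == f_out_hz."""
    ratio = Fraction(f_out_hz, f_ref_hz)
    return ratio.numerator, ratio.denominator


F_REF_HZ = 50_000_000  # assumed external oscillator frequency (hypothetical)

targets = (
    ("RHD2132 interface", 24_000_000),   # maximum operating frequency from the text
    ("SD card controller", 50_000_000),  # maximum frequency from the text
)

for name, f_out_hz in targets:
    multiply, divide = pll_ratio(F_REF_HZ, f_out_hz)
    print(f"{name}: {f_out_hz / 1e6:.0f} MHz = "
          f"{F_REF_HZ / 1e6:.0f} MHz x {multiply}/{divide}")
```

Under this assumption the SD card controller could use the reference clock directly, while the 24 MHz clock needs a non-integer ratio and therefore a PLL.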

4.4 Potential Noise Sources

If we disregard the inherent noise in the amplifier, the primary noise culprit will be noise coupled from parts of the circuit operating at high switching frequencies. This means that the noise will likely originate in either the SD card, the FPGA itself or the switch mode power supplies. To minimize electromagnetically and capacitively coupled noise the simplest solution is to place the components far from each other with minimal signal overlap. This is easily achievable, although in the original design it does appear as if there should have been sufficient spacing to eliminate this issue.

In particular the SD card can be a source of high frequency switching noise when data is transmitted, therefore the SD card should be placed far away from the noise sensitive components. Additionally, since the Hafting-Fyhn group already uses the Open Ephys headstage, which uses the RHD2132 chip, it is simple to get a degree of separation by keeping my PCB as a backpack and connecting to the headstage via cable. There are several advantages to this approach. Firstly, physically separating the boards will prevent most of the noise that is coupled directly from noisy parts of the circuit to noise sensitive parts through capacitive and inductive coupling, leaving only noise that is coupled through the power supply rail. Some of the noise coupled this way should be mitigated by the use of additional power supply filtering for the headstage supply. Secondly, using the Open Ephys boards reduces the size and complexity of my board by offloading the essential parts surrounding the RHD2132. Lastly, it should result in a cheaper overall solution as it is possible to use hardware the group already has for new experiments.

The second kind of noise that is likely to occur is noise coupled to the amplifiers through the power supplies. This noise can be coupled to the power rail electromagnetically or capacitively, or it can come from the supply powering the voltage regulator. To get an estimate of the effect this type of noise may have on the measurements we can look at the Power Supply Rejection Ratio (PSRR) given in the datasheet of the device. The PSRR is a measure of the degree to which a variation in the power supply voltage will carry through to the output of the device. A higher PSRR indicates that less of the variation in the power supply is carried through. PSRR is usually given in decibels.

For the RHD2000 series this is given to be 75 dB at both 10 Hz and 1 kHz [2].

Unfortunately they do not specify the PSRR at higher frequencies. The noise from the rest of the circuit will mostly be at much higher frequencies than 1 kHz, which makes it difficult to say whether this will matter in this scenario.
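To give a feeling for what 75 dB means at the frequencies where it is specified, the sketch below converts the PSRR to a linear attenuation factor and applies it to a supply ripple. It is written in Python and assumes that the PSRR is expressed as a voltage ratio (20 log10); the 10 mV ripple amplitude is a hypothetical value chosen for illustration, not a measurement, and the result says nothing about higher frequencies where the PSRR is not specified.

```python
def ripple_after_psrr(ripple_v: float, psrr_db: float) -> float:
    """Equivalent ripple after attenuation by a PSRR given in dB (voltage ratio)."""
    attenuation = 10 ** (psrr_db / 20)
    return ripple_v / attenuation


supply_ripple_v = 10e-3  # hypothetical 10 mV supply ripple
residual_v = ripple_after_psrr(supply_ripple_v, 75.0)
print(f"Attenuation factor: {10 ** (75.0 / 20):.0f}")
print(f"Residual ripple:    {residual_v * 1e6:.2f} uV")
```

A PSRR of 75 dB corresponds to an attenuation of roughly 5600 times, so the assumed 10 mV ripple would be reduced to just under 2 µV.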

To be on the safe side it would be a good idea to include a fair amount of power supply decoupling to try to reduce the noise that can arise from problems with the power supply.


4.5 Printed Circuit Board

The schematic and PCB design have been done in CADStar, the software used by the ELAB group at the physics department. The reason for using CADStar is that I have experience with this software and ELAB provides very good support for this CAD package. Due to the noise issues in the previous version, some care has to be taken when designing the PCB.

An important consideration when designing a PCB that operates at high frequencies or requires low noise is the layer stackup, that is, the number of layers and what is put on each layer. When deciding on a stackup several things should be taken into consideration, of which the primary concern is the number of signals that need to be routed weighed against cost.

Operating at higher frequencies requires more specific considerations with regard to routing. However, the devices on this board will mostly operate at or below 50 MHz, which is still in a range where modern circuits are fairly tolerant. Because of the number of signals going to the SRAM and the goal of keeping the PCB as small as possible, it is desirable to spread the signals over more layers. The two most relevant stackups are the 4 and 6 layer boards with power planes.

When designing a 4-layer board the most common configuration is to use the top and bottom layers for routing all signals and the two inner layers for power planes. For many boards this is more than sufficient, but in my case the FPGA requires 4 different power planes (3.3V, 2.5V, 1.2V and GND) and the SRAM alone requires 48 pin connections, which would lead to a board that is very difficult to route and debug.

With a 6-layer board the most common stackup is the one shown in figure 4.2, which gives access to an additional two layers suitable for high speed signals. This configuration offers better EMC performance than a 4-layer board because the power planes act as shielding for the inner high speed layers [16, p. 642]. Although it doesn’t have dedicated layers for the additional power planes, most of the signals in use go to the SRAM, so there should be sufficient space on the two inner layers for several large copper filled areas for the remaining power rails. The complete schematic and layout can be found in Appendix A.

4.6 Miscellaneous Design Decisions

As this design is very much a prototype, all power supplies have been connected through a pin header shorted with a jumper as a precaution. The primary purpose of doing so is to provide a way of powering the system if one of the power supplies does not work. It also provides a simple way to perform current measurements for system benchmarking.

Figure 4.2: Layer Stackup for 6 Layer PCB [16]

The headstage uses a very specific connector made by Omnetics, a company specializing in exotic connectors. It is a surface mount component (Part Number: PZN-12-AA [5]), but it requires special care during assembly due to its low temperature tolerance [17]¹ and therefore has to be assembled by hand. The connector is a polarized reversed connector, so a different pinout is required for the master and slave boards [5]. The master side pinout can be seen in table 4.1.

When viewed from the front, pin B1 is the bottom left and T1 is the top left. The specific layout is shown in Figure 4.3.

Table 4.1: Master Side Pinout for Omnetics 12-Pin Nano Strip Connector [5]

Signal   Positive   Negative
CS       B1         T1
SCLK     B2         T2
MOSI     B3         T3
MISO1    B4         T4
MISO2    B5         T5
VCC      B6         N/A
GND      N/A        T6

¹ A version with higher temperature tolerance was released after production of the PCB and should be considered for future versions of the PCB to simplify assembly.


Figure 4.3: Omnetics PZN-12-AA Footprint

4.7 Assembly

Because the FPGA used is only available in a Ball Grid Array (BGA) package it is difficult to assemble by hand. A BGA package has a dense grid of pads with solder balls attached, see figure 4.4. For the MAX 10 the pitch is 1.0 mm. While it is possible to place these by hand, it is much better to use a pick-and-place machine. As some setup is required to use a pick-and-place machine, it was decided to assemble the entire PCB this way.

Figure 4.4: Ball Grid Array on an NVIDIA IC

All components used in the design have to be placed in trays beforehand in a specific order and orientation for the machine to recognize them. The machine then picks up the parts from the trays using a vacuum nozzle and compares each part against the data it has using machine vision. Sometimes it will drop the part, or the part will be in the wrong orientation. In these cases the machine requires manual intervention, so it is recommended to have an attendant ready to deal with issues as they arise.

Since this PCB has components on both the top and bottom sides of the board, the process has to be done in several steps. First the bottom side, with only small components, is populated and put through the solder oven. When that is complete the same procedure is done for the top side. It is done this way so that the larger components on the top won’t fall off in the process.

Part II

Firmware and Testing

5 Firmware

In this section the firmware implementation is described in further detail. A more detailed system overview than the one in figure 2.1 has been made. Figure 5.1 shows the subsystems required for the FPGA to interact with external devices on the PCB. Not pictured is the compatibility layer between the Wishbone bus and the Avalon bus described below.

Figure 5.1: Suggested System Overview

5.1 Platform Designer Project

Platform Designer (formerly Qsys) is a system integration tool provided with the Quartus suite of tools. Its purpose is to increase the reuse of IP cores and speed up system development. It provides access to a large range of existing IPs provided by Altera as well as allows integration of your own or other’s IPs to simplify reuse.

The Nios processor is also provided and set up through the Platform Designer environment. A minimal system only requires the Nios II core, RAM and a JTAG core for communication with the processor. The design for this project additionally requires a PLL and the Avalon to Wishbone bridge.

The complete setup used can be seen in figure 5.2. The following list gives the settings chosen for the IP cores where they differ from the default.

• Nios II
  – Core: Nios II/e
  – Reset Vector: RAM
  – Exception Vector: RAM

• RAM
  – Type: RAM (Writable)
  – Data Width: 32
  – Total size: 65536 (64kiB)

• JTAG
  – Write Buffer Depth: 16
  – IRQ Threshold: 8
  – Read Buffer Depth: 64
  – IRQ Threshold: 8

• PLL
  – Frequency: 50 MHz

Figure 5.2: Platform Designer Overview


5.2 Description of SRAM-controller

SRAM is by design an asynchronous technology, making the timing constraints more complicated than for a synchronous device. To use it within a synchronous design it is necessary to create a controller that takes into account the timing constraints of the device to ensure proper operation. The complete implementation can be found in Appendix D.1.

The switching sequence to properly read from the SRAM is shown in figure 5.3, with a detailed description below. Because the controller has to be designed around the timing constraints of the SRAM, the operating frequency of the system must be taken into account. In this case the system is designed around a 50 MHz clock provided through a PLL, giving a clock period of 20 ns. The data needs up to 55 ns to stabilize, so a read or write cycle to the SRAM takes three clock periods.
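The relation between access time and clock period can be sketched as a small helper. This is only an illustration of the arithmetic, not part of the controller in Appendix D.1. The function name and the choice of picoseconds as unit are assumptions made here to keep the calculation in integers.

```c
#include <assert.h>
#include <stdint.h>

/* Number of whole clock periods needed to cover an asynchronous access
 * time. Both arguments are in picoseconds so that no floating point is
 * needed. Hypothetical helper, not taken from the thesis sources. */
static uint32_t sram_wait_cycles(uint32_t access_time_ps, uint32_t clk_period_ps)
{
    assert(clk_period_ps > 0u);
    /* Round up: a partially used period still costs a full clock edge. */
    return (access_time_ps + clk_period_ps - 1u) / clk_period_ps;
}
```

For a 55 ns access time and a 20 ns clock period, `sram_wait_cycles(55000, 20000)` evaluates to 3, which matches the three clock periods used by the controller.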

When the SRAM controller receives the command to perform a memory operation these actions are performed:

Read Cycle

1. Switch control signals:
   • Put the address on the address bus
   • Tristate master to slave data bus
   • ready → 0
   • CE1 → 0
   • CE2 → 1
   • BHE/BLE → 0
   • OE → 0
   • WE → 1
2. Wait until data has stabilized (max 55 ns)
3. Return control signals to default value:
   • ready → 1
   • CE1 → 1
   • CE2 → 0
   • BHE/BLE → 1
   • OE → 1

Write Cycle

1. Switch control signals:
   • Put the address on the address bus
   • ready → 0
   • CE1 → 0
   • CE2 → 1
   • BHE/BLE → 0
   • OE → 1
   • WE → 0
2. Wait until data has stabilized (max 55 ns)
3. Return control signals to default value:
   • ready → 1
   • CE1 → 1
   • CE2 → 0
   • BHE/BLE → 1
   • WE → 1
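The signal levels in the two lists can be summarised in a small lookup function. The sketch below is a host-side C model written for illustration only. The type and function names are hypothetical, and the real controller is the HDL in Appendix D.1.

```c
#include <assert.h>
#include <stdbool.h>

/* Control signal levels exactly as listed in the text (true = '1'). */
typedef struct {
    bool ready, ce1, ce2, bhe_ble, oe, we;
} sram_ctrl_t;

typedef enum { SRAM_IDLE, SRAM_READ, SRAM_WRITE } sram_op_t;

/* Returns the levels driven while the given operation is active. */
static sram_ctrl_t sram_ctrl_levels(sram_op_t op)
{
    assert(op == SRAM_IDLE || op == SRAM_READ || op == SRAM_WRITE);

    /* Default values, i.e. step 3 of both cycles. */
    sram_ctrl_t s = { .ready = true, .ce1 = true, .ce2 = false,
                      .bhe_ble = true, .oe = true, .we = true };
    if (op == SRAM_IDLE)
        return s;

    /* Step 1, common to read and write. */
    s.ready = false;
    s.ce1 = false;
    s.ce2 = true;
    s.bhe_ble = false;
    s.oe = (op != SRAM_READ);   /* OE is low only while reading */
    s.we = (op != SRAM_WRITE);  /* WE is low only while writing */
    return s;
}
```

Written this way it is easy to see that the read and write cycles differ only in which of OE and WE is pulled low.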

Figure 5.3: OE Controlled Read Cycle

Figure 5.4: WE Controlled Write Cycle

5.3 The Wishbone Bus

The Wishbone bus is an open communication protocol for use in both FPGA and ASIC development. The bus is developed by the OpenCores project, an initiative for open source development of hardware IPs. It is a versatile bus supporting several different modes of operation and differing sets of signals [18]. A minimal bus for use with the standard mode of operation consists of the signals described in table 5.1. To initiate a bus read cycle the bus master puts the address on the address bus, then asserts stb_o and cyc_o high. It then waits for the acknowledge signal ack_i to be asserted by the slave. When it is asserted the master can read the data from the dat_i vector and return stb_o and cyc_o to low. An example bus transaction can be seen in figure 5.5.
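The read handshake described above can be modelled in a few lines of C. This is a simplified host-side sketch and not HDL. The slave that acknowledges on the first clock where it sees the strobe, the use of a word index as address and all names are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    /* Driven by the master */
    uint32_t adr;
    bool stb, cyc, we;
    /* Driven by the slave */
    uint32_t dat_s2m;
    bool ack;
} wb_bus_t;

/* Hypothetical slave: one call models one clock edge. It acknowledges a
 * read as soon as it sees stb and cyc asserted. */
static void wb_slave_clock(wb_bus_t *bus, const uint32_t *mem, uint32_t words)
{
    assert(words > 0u);
    bus->ack = false;
    if (bus->stb && bus->cyc && !bus->we) {
        bus->dat_s2m = mem[bus->adr % words];
        bus->ack = true;
    }
}

/* Master read: assert stb/cyc, wait for ack, sample data, release the bus.
 * Returns false if no acknowledge arrives within max_clocks. */
static bool wb_master_read(wb_bus_t *bus, const uint32_t *mem, uint32_t words,
                           uint32_t adr, uint32_t max_clocks, uint32_t *data)
{
    bool done = false;

    bus->adr = adr;
    bus->we = false;
    bus->stb = true;
    bus->cyc = true;
    for (uint32_t i = 0; i < max_clocks && !done; i++) {
        wb_slave_clock(bus, mem, words);
        if (bus->ack) {
            *data = bus->dat_s2m;
            done = true;
        }
    }
    bus->stb = false;
    bus->cyc = false;
    return done;
}
```

The `max_clocks` limit is not part of the Wishbone handshake itself. It is included here so that a missing acknowledge does not hang the model.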

Figure 5.5: Wishbone transaction with a synchronous slave

The Wishbone bus is very similar to the Avalon bus used by Altera, requiring only some combinatorial logic to convert the control signals to the ones used by the other bus [19]. This should make it fairly simple to interface Wishbone cores with the Nios CPU, in particular for the purpose of simplifying access to the SD card, as that process requires a fairly complicated order of operations.

Table 5.1: Minimal Required Signals for Wishbone Compliance

Signal name   Description

Common signals
clk_i         Common clock
rst_i         Global reset

Master signals
dat_i         Data from slave, variable width 0-64 bits
dat_o         Data to slave, variable width 0-64 bits
adr_o         Variable width binary address, optional in some cases, e.g. FIFO
stb_o         Strobe, indicates data transfer
cyc_o         Initiate bus cycle
we_o          Write enable
ack_i         Input acknowledge, when asserted indicates completed cycle

Slave signals
dat_i         Data from master, variable width 0-64 bits
dat_o         Data to master, variable width 0-64 bits
adr_i         Input address
stb_i         Strobe
cyc_i         Cycle
we_i          Write enable
ack_o         Output acknowledge, when asserted indicates completed cycle

5.4 FIFO Controller

The implementation of a synchronous FIFO can be fairly simple. In essence, it consists of two counters that keep track of the read location and the write location. When a read or write is performed the respective counter is incremented. Usually there are signals signifying whether the FIFO is empty or full. This can be done by making the counters N bits wide, 1 bit longer than required for the total number of addresses, and then comparing the two counters. If the N−1 least significant bits are the same but the most significant bits are different, the FIFO is full. If the most significant bits are also the same, the FIFO is empty.
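The counter comparison can be illustrated with a short C model. It is a sketch of the general technique and not the FIFO controller from Appendix D.2. The depth of 16 entries and all names are chosen only for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FIFO_ADDR_BITS 4u                        /* hypothetical: 16 entries */
#define FIFO_DEPTH     (1u << FIFO_ADDR_BITS)
#define FIFO_PTR_MASK  ((FIFO_DEPTH << 1) - 1u)  /* counters are one bit wider */

typedef struct {
    uint32_t wr;  /* write counter, FIFO_ADDR_BITS + 1 bits */
    uint32_t rd;  /* read counter,  FIFO_ADDR_BITS + 1 bits */
} fifo_ptrs_t;

/* Empty: all bits of the two counters are equal. */
static bool fifo_is_empty(const fifo_ptrs_t *f)
{
    return f->wr == f->rd;
}

/* Full: the address bits are equal and only the extra top bit differs. */
static bool fifo_is_full(const fifo_ptrs_t *f)
{
    assert((f->wr & ~FIFO_PTR_MASK) == 0u && (f->rd & ~FIFO_PTR_MASK) == 0u);
    return (f->wr ^ f->rd) == FIFO_DEPTH;
}

/* Advance the write counter; returns false if the FIFO is full. */
static bool fifo_push(fifo_ptrs_t *f)
{
    if (fifo_is_full(f))
        return false;
    f->wr = (f->wr + 1u) & FIFO_PTR_MASK;
    return true;
}

/* Advance the read counter; returns false if the FIFO is empty. */
static bool fifo_pop(fifo_ptrs_t *f)
{
    if (fifo_is_empty(f))
        return false;
    f->rd = (f->rd + 1u) & FIFO_PTR_MASK;
    return true;
}
```

The memory address for an access is the counter with the top bit masked off, so the extra bit only serves to tell a full FIFO from an empty one.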

Because of insufficient memory in the FPGA the FIFO controller relies on the SRAM as the actual memory location. The current version of the FIFO can be controlled through the wishbone bus. Currently it does not have complete support of wishbone functionality.

5.5 Avalon to Wishbone Bridge

To be able to use the Wishbone IP cores with the Nios II processor it is necessary to create a wrapper or a bridge to convert the bus command signals from one standard to the other. In the case of Wishbone to Avalon this only requires a small amount of combinatorial logic to create the required signals. Most of the signals have obvious counterparts in the two buses; the ones that don’t are shown in table 5.2 [19].

Table 5.2: Wishbone to Avalon signal translation

Wishbone   Avalon
stb        chipselect
ack        !waitrequest
we         !write_n AND !read_n
cyc        !write_n OR !read_n

When using this together with the Nios processor it was found that the easiest way to connect the systems is to use Altera’s Platform Designer system integration tool and add it as a new IP core, then exporting the wishbone signals to the top-level component instantiation of the Platform Designer system. The wrapper is now a separate IP in the local Platform Designer library allowing us to add more with very little effort.

Initially an attempt was made to write a custom bridge. However, due to problems in the software that are expanded upon in Section 6.4.1, that solution was abandoned and an existing bridge was used instead. The core used is available at https://github.com/openrisc/orpsoc-cores/tree/master/cores/wb_avalon_bridge.

When starting a software project with the Nios II Software Build Tools, they generate a Hardware Abstraction Layer and Board Support Package from your Qsys configuration files. This includes macros for writing to and reading from the Avalon bus. The same macros are used for the Wishbone bridge. The macros shown in Listing 5.1 take into account the data width of the access, which provides the byteenable functionality used by some modules. They take a base address which is specific to the module that you want to access, an offset specifying which register in the device you want to access, and the data you want to write. In this case the data width is 32 bits, so IOWR_32DIRECT is the macro to use.

    #define IOWR_32DIRECT(BASE, OFFSET, DATA) \
      __builtin_stwio (__IO_CALC_ADDRESS_DYNAMIC ((BASE), (OFFSET)), (DATA))
    #define IOWR_16DIRECT(BASE, OFFSET, DATA) \
      __builtin_sthio (__IO_CALC_ADDRESS_DYNAMIC ((BASE), (OFFSET)), (DATA))
    #define IOWR_8DIRECT(BASE, OFFSET, DATA) \
      __builtin_stbio (__IO_CALC_ADDRESS_DYNAMIC ((BASE), (OFFSET)), (DATA))

Listing 5.1: Avalon Bus Write Macros

    IOWR_32DIRECT(0x4000,1,0x1234);

Listing 5.2: Example Single Register Write

The read macros are identical to the write macros except that they don’t take a data argument.

The SD card controller IP has 19 internal registers, so I expected the first register to be at offset 0, the second at offset 1, and so on. Initially when writing to the registers it appeared that the same value was written to all of them. After a long debugging session it became apparent that the address offset was not working as I had assumed. The addresses are byte addressable: for a device with a base address of, for example, 0x4000 and a data width of 32 bits, the first register is found at 0x4000 and the second register 4 bytes higher at 0x4004. The write in listing 5.2 was intended to write to the register located at 0x4004, but when reading it back it seemed to fill more than one register. The reason is that the offset is given in bytes, so an offset of 1 addresses 0x4001. Any access to an address between 0x4000 and 0x4003 still only reaches the data in the first register, in this case the one at 0x4000.
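The conversion between register number and byte offset can be made explicit with two small helpers. The names below are hypothetical and the code is only a sketch of the address arithmetic for 32-bit registers. The stride is kept as a named constant because Section 6.4.1 reports that a different spacing was needed when accessing the core through the bridge.

```c
#include <assert.h>
#include <stdint.h>

#define SD_NUM_REGS         19u  /* number of registers stated in the text */
#define SD_REG_STRIDE_BYTES 4u   /* one 32-bit register every 4 bytes */

/* Byte offset to pass to a byte-addressed access for register reg_index. */
static uint32_t reg_byte_offset(uint32_t reg_index)
{
    assert(reg_index < SD_NUM_REGS);
    return reg_index * SD_REG_STRIDE_BYTES;
}

/* Register that a given byte offset actually falls into. */
static uint32_t reg_index_of(uint32_t byte_offset)
{
    return byte_offset / SD_REG_STRIDE_BYTES;
}
```

With these helpers, `reg_index_of(1)` is 0, which is why the write in listing 5.2 ended up in the first register, while `reg_byte_offset(1)` is 4, the offset that reaches the register at 0x4004.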

5.6 SD Card controller

To get the best performance when accessing an SD card a very complicated controller is required. Writing my own controller would take much too long, so an existing and proven design from OpenCores was chosen. The project is available at https://opencores.org/project/sd_card_controller and is a fork of an existing design. It has a large feature set, the most important for this project being that it supports 4-bit SD mode and direct memory access, allowing it to bypass the processor for memory operations. The maintainer has made an example project for the OpenRISC processor with example drivers available, which could possibly be ported to the Nios II instruction set. The dependencies used in the driver are mostly standard libraries, except for the specific implementation of how to access the bus.

The SD card controller has 19 internal registers used not only for configuration but also for executing SD commands and retrieving data. It does not automatically perform the setup procedure required for the SD card to be in a state where it’s ready to receive data; this has to be implemented in the driver. To use the module it is first necessary to configure it for the current system. This involves setting several configuration registers to values appropriate to the system. The actual setup is described in more detail in Section 5.7. A list of all available registers is found in table 5.3 [20].

To send a command, the specific command is first written to the command register. Most commands offer options which can be specified when preparing the command. These options are command dependent and can be found in the SD Association’s specification [21]; some of this information may also be found in the IP core documentation [20] in the section describing the command register. After the command has been written to the command register, the core starts transmitting it to the SD card when something is written to the argument register. Not all commands take an argument, and the argument may therefore be empty.

When a command is sent the command event register will report whether the command was successful or not, also giving some information as to the type of error encountered if unsuccessful. For example, in the case of a successful transaction bit 0 of the command event register will be ’1’ and the rest will be zero. If instead the transaction failed, bit 1 of the event register will be ’1’ and the error encountered will be given by bits 2 through 4. After a successful or failed operation the register should be overwritten with all zeros to prepare for the next command. This means that a command transaction will usually involve continuously reading from the command event register until bit 0 or bit 1 is ’1’, then resetting the register and, depending on whether it failed or not, continuing with the next operation. The data event register works in exactly the same way as the command event register except it only triggers when transferring data.
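The polling procedure can be sketched in C as follows. This is not the driver used in the project. The function names, the poll limit and the mock register are assumptions made so that the example can be run on a host, and only the bit positions and the register address 0x34 are taken from the text and table 5.3.

```c
#include <assert.h>
#include <stdint.h>

#define SD_CMD_EVENT_STATUS 0x34u      /* byte address from the register map */
#define SD_EVENT_DONE       (1u << 0)  /* bit 0: transaction successful */
#define SD_EVENT_ERROR      (1u << 1)  /* bit 1: transaction failed */
#define SD_EVENT_ERR_SHIFT  2u         /* bits 2..4: type of error */
#define SD_EVENT_ERR_MASK   0x7u

typedef enum { SD_CMD_OK, SD_CMD_FAILED, SD_CMD_TIMEOUT } sd_cmd_result_t;

/* Bus access is passed in as function pointers so the sketch does not
 * depend on any particular platform (hypothetical interface). */
typedef uint32_t (*reg_read_fn)(uint32_t offset);
typedef void (*reg_write_fn)(uint32_t offset, uint32_t value);

static sd_cmd_result_t sd_wait_cmd_event(reg_read_fn rd, reg_write_fn wr,
                                         uint32_t max_polls, uint32_t *err_bits)
{
    assert(rd && wr && err_bits);
    for (uint32_t i = 0; i < max_polls; i++) {
        uint32_t status = rd(SD_CMD_EVENT_STATUS);
        if ((status & (SD_EVENT_DONE | SD_EVENT_ERROR)) == 0u)
            continue;                          /* nothing reported yet */
        wr(SD_CMD_EVENT_STATUS, 0u);           /* clear for the next command */
        if (status & SD_EVENT_ERROR) {
            *err_bits = (status >> SD_EVENT_ERR_SHIFT) & SD_EVENT_ERR_MASK;
            return SD_CMD_FAILED;
        }
        *err_bits = 0u;
        return SD_CMD_OK;
    }
    return SD_CMD_TIMEOUT;
}

/* Host-side stand-in for the event register, for experimentation only. */
static uint32_t mock_status;
static uint32_t mock_read(uint32_t offset) { (void)offset; return mock_status; }
static void mock_write(uint32_t offset, uint32_t value) { (void)offset; mock_status = value; }
```

On the target the two function pointers would wrap the bus access macros for the controller's base address. The same function could be reused for the data event register by passing its address instead of the fixed constant.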

This controller uses the Wishbone bus, so it requires a wrapper to translate to the Avalon bus used by the Nios II processor, as described in Section 5.5.

Table 5.3: SD Card Controller Register Map[20]

Name                Address   Access   Description
argument            0x00      RW       Command Argument
command             0x04      RW       Command configuration
response0           0x08      R        Response bits 31-0
response1           0x0C      R        Response bits 63-32
data_timeout        0x18      RW       Data transfer timeout configuration
control             0x1C      RW       IP core control register
cmd_timeout         0x20      RW       Command transfer timeout configuration
clock_divider       0x24      RW       SD interface clock divider configuration
reset               0x28      RW       Software reset
voltage             0x2C      R        Power Control information
capabilities        0x30      R        SD feature support information
cmd_event_status    0x34      RW       Command transaction event status/clear
cmd_event_enable    0x38      RW       Command transaction events enable
data_event_status   0x3C      RW       Data transaction events status/clear
data_event_enable   0x40      RW       Data transaction events enable
block_size          0x44      RW       Block transfer size
block_count         0x48      RW       Transfer block count
dst_src_address     0x60      RW       DMA destination/source address

5.7 SD Card Driver

The SD card controller comes with an existing driver made for the OpenRISC processor. Some changes are required to port the driver to the Nios processor. First the read and write macros need to be changed to the Nios specific macros. They can be found in the file io.h, which is generated together with the rest of the software project by the Nios Software Development Kit. The specific changes can be seen in listing 5.3.

    // Original
    #define readl(addr) (*(volatile unsigned int *) (addr))
    #define writel(b, addr) ((*(volatile unsigned int *) (addr)) = (b))
    // Nios
    #define readl(addr) IORD_32DIRECT(addr,0)
    #define writel(data, addr) IOWR_32DIRECT(addr, 0, data)

Listing 5.3: Changes to Write Macros

Additionally, the function void flush_dcache_range() in the driver must be removed because the macros shown in listing 5.3 bypass the cache, removing the need for it entirely. These two changes are all that is necessary to have it function with the Nios processor.

The driver itself implements the required procedure for SD card setup as well as basic read and write procedures. The initialization and memory operation procedures are fairly complicated, depending on the capabilities of both the SD card and the controller. Figures 5.6 and 5.7, provided by the SD Association, show the procedures for initialization and data transfers respectively.

Figure 5.6: Card Identification and Initialization Sequence[21, p. 28]

First the driver initializes the core by putting it in reset mode, writing ’1’ to the reset register. Then the desired configuration is written to the registers. The data and command timeout registers control how long the controller will wait before generating a timeout error event; if set to ’0’ no event will be generated. The event enable registers select which events can generate a change in the status register [20]. Setting a bit to ’1’ enables that event.

The control register selects between 4-bit mode and 1-bit mode; to set 4-bit mode write ’1’ to bit 0. The final register that has to be set is the clock divider register, which sets the output clock frequency. After all the registers are set, the core is returned to an idle state by writing ’0’ to the reset register.
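The order of the register writes can be sketched as below. This is not the driver code. It is a host-side illustration in which the configuration struct, the function names and the mock register array are hypothetical, and only the register addresses (from table 5.3) and the order of operations are taken from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Byte addresses from the register map. */
enum {
    SD_DATA_TIMEOUT      = 0x18,
    SD_CONTROL           = 0x1C,
    SD_CMD_TIMEOUT       = 0x20,
    SD_CLOCK_DIVIDER     = 0x24,
    SD_RESET             = 0x28,
    SD_CMD_EVENT_ENABLE  = 0x38,
    SD_DATA_EVENT_ENABLE = 0x40
};

#define MOCK_REG_WORDS (0x64u / 4u)
#define MOCK_LOG_LEN   16u

/* Mock register file and a log of written addresses, for host testing. */
static uint32_t mock_regs[MOCK_REG_WORDS];
static uint32_t write_log[MOCK_LOG_LEN];
static uint32_t write_count;

static void reg_write(uint32_t offset, uint32_t value)
{
    assert(offset / 4u < MOCK_REG_WORDS && write_count < MOCK_LOG_LEN);
    mock_regs[offset / 4u] = value;
    write_log[write_count++] = offset;
}

typedef struct {
    uint32_t cmd_timeout, data_timeout;  /* 0 disables the timeout event */
    uint32_t cmd_events, data_events;    /* one enable bit per event */
    uint32_t clock_divider;
    int four_bit;                        /* nonzero selects 4-bit mode */
} sd_core_cfg_t;

static void sd_core_init(const sd_core_cfg_t *cfg)
{
    reg_write(SD_RESET, 1u);                        /* hold core in reset */
    reg_write(SD_CMD_TIMEOUT, cfg->cmd_timeout);
    reg_write(SD_DATA_TIMEOUT, cfg->data_timeout);
    reg_write(SD_CMD_EVENT_ENABLE, cfg->cmd_events);
    reg_write(SD_DATA_EVENT_ENABLE, cfg->data_events);
    reg_write(SD_CONTROL, cfg->four_bit ? 1u : 0u); /* bit 0: 4-bit mode */
    reg_write(SD_CLOCK_DIVIDER, cfg->clock_divider);
    reg_write(SD_RESET, 0u);                        /* release reset */
}
```

On the target, `reg_write` would be replaced by the `writel` macro from listing 5.3 with the controller base address added to the offset. The values placed in the configuration struct depend on the system and are not specified here.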

Next the driver starts the card identification process shown in the flow diagram in figure 5.6. When a modern SD card is used it will follow the centre branch. The process involves getting information about the SD card such as card capacity, supported supply voltage and a unique card identification number [21]. After executing CMD3, which asks the SD card for a relative address used when addressing that specific card, the card enters the stand-by state.

Figure 5.7: Transfer Mode[21, p. 33]


6 Verification and Testing

In this section some of the test and verification procedures are explained.

6.1 Initial Hardware Verification

Before powering the FPGA, the different power supplies were tested at no load by disconnecting the pin headers that connect them to the rest of the circuit. First the Buck-Boost converter was tested by itself, then each of the subsequent power rails was tested when powered from the Buck-Boost converter. This was simply a functional test to verify that there were no shorts and that the correct voltages were supplied. The testing showed that two out of three boards provided the correct voltages, but the board marked number 3 had a malfunctioning 2.5V regulator. This can be due to either a short circuit underneath the IC or simply a faulty device. Either issue requires removing the device to repair, so no further testing was performed on this board, as the two other boards were functioning as expected.

Then a simple hardware image was programmed, in which the LEDs on the board are turned on and off by the pushbuttons. This simply verifies that it is possible to program the device and that it is functional. The two functional boards both showed positive results.

A minimal Nios II system was then put together in Platform Designer for further testing. When the image was programmed, the CPU turned out to be unresponsive when attempting to load a simple “Hello World” program. A version of the image used in the initial test, where the LEDs were driven from a synchronous process instead of being directly connected to the input from the buttons, implied that the external clock was inactive or not functioning. After checking with an oscilloscope, it was discovered that the external clock IC was mounted rotated 90 degrees. Desoldering it and mounting it in the correct orientation fixed this issue and the clock worked as expected.

(a) PCB Top (b) PCB Bottom


6.2 Power Supply Benchmarks

Some rudimentary measurements of the power consumption and performance of the power supplies were also made. A conceptual illustration of the test setup is shown in Figure 6.2. As there were not enough multimeters to measure all channels simultaneously, each channel was measured by itself, and the device was turned off and on again before measuring the next supply rail. When measuring any rail, all others were left connected to the FPGA to ensure that the device was operational.

Figure 6.2: Current Measurement Setup

The current was measured in three different scenarios. First with no image programmed to the FPGA, that is, with the device completely idle. Second with the image described in Section 5.1, but no software programmed to the device. And finally with the same image, but running a simple program that continuously prints “Hello World!” to the terminal. The results can be seen in table 6.1. The current shown in the PSU row is the current measured at the power supply.

            Idle    With CPU   Programmed
PSU 3.7V    56mA    116mA      127mA*
3.6V        56mA    121mA      129mA*
3.3V        20mA    70mA       79mA*
2.5V        21mA    25mA       25mA
1.2V        10mA    46mA       50mA

Table 6.1: Current Draw from individual power rails
*average current

A simple test was performed to check the minimum input voltage required for the Buck-Boost converter to provide enough power for the system. The method was to measure the output voltage of the Buck-Boost converter with the system active while gradually lowering the input voltage until a significant voltage drop was observed on the output. The complete result can be seen in Table C.1 in Appendix C, but the interesting value is that the converter remained functional down to an input voltage of 1.73V. When the input voltage was dropped lower than this, the output cut off.

6.3 Testbenches

All the testbenches made for this project have the same basic structure. The scripts folder contains .tcl scripts for compiling the testbench, setting up the test environment and running the tests. The src folder contains all HDL files used by the Device Under Test (DUT), and the tb folder contains the testbench itself. Finally, the tb_libs folder contains the Universal VHDL Verification Methodology (UVVM) library used to simplify the testbench environments.

• component_name
  – scripts
  – src
  – tb
  – tb_libs

To run any of the testbenches, open ModelSim, navigate to the tb folder and run the following command: do ../scripts.do. The testbench will then be compiled and run. If there are any errors, they will show up in the terminal.

For some modules the compilation may take a significant amount of time, so if running the tests multiple times in a row it can be a good idea to comment out the files to which there will be no changes such as the UVVM library files to reduce the amount of time spent compiling. All testbenches are written in VHDL, though some DUTs are primarily written in Verilog.

6.3.1 SRAM Controller

The SRAM controller testbench relies on a model of the specific device used, provided by the manufacturer, which is instantiated as a component in the VHDL testbench. It is an extremely simple testbench that does not rely on any outside libraries. When the test is run, the wave diagram has to be compared manually to the expected output. The reason this testbench is so simple is that the SRAM controller is a fairly simple module to verify.

The verification simply consists of writing a value to the SRAM, reading it back and comparing it to the written value. If the controller has timing issues, the model has built-in checks for some common issues.

6.3.2 FIFO Controller

The testbench for the FIFO controller is what is called a self-checking testbench. This means that it applies a stimulus and tests the reaction to verify that it is correct. If the reaction or result is not as expected the testbench throws an error or a warning depending on the severity set.

The FIFO controller relies on the SRAM controller to work since the working memory of the FIFO is the SRAM.

In the file fifo_ctrl_pkg.vhd several procedures are defined to simplify the creation of a comprehensive test. There are currently three defined procedures: push(), pull() and check_data(). The push procedure puts data into the FIFO by first checking the full flag; if the FIFO is not full it puts the data on the bus and sets the control signals. It then waits until the ready signal is low before it resets the control signals. The complete procedure body can be found in Listing 6.1. The pull procedure is implemented similarly and can be seen in Listing 6.2.

    procedure push(
        -- Inputs
        data            : in  integer;
        signal full     : in  std_logic;
        signal ready    : in  std_logic;
        -- Outputs
        signal data_m2s : out std_logic_vector(15 downto 0);
        signal read     : out std_logic;
        signal write    : out std_logic;
        signal start    : out std_logic
    ) is
    begin
        if not full = '1' then
            if not ready = '1' then
                wait until ready = '1';
            end if;
            write    <= '1';
            read     <= '0';
            start    <= '1';
            data_m2s <= std_logic_vector(to_signed(data, 16));
            wait until ready = '0';
            start <= '0';
            write <= '0';
        end if;
    end procedure push;

Listing 6.1: Data Push Procedure

    procedure pull(
        -- Inputs
        signal data_s2m : in  std_logic_vector(15 downto 0);
        signal empty    : in  std_logic;
        signal ready    : in  std_logic;
        -- Outputs
        signal read     : out std_logic;
        signal write    : out std_logic;
        signal start    : out std_logic
    ) is
    begin
        if not empty = '1' then
            if not ready = '1' then
                wait until ready = '1';
            end if;
            write <= '0';
            read  <= '1';
            start <= '1';
            wait until ready = '0';
            start <= '0';
            read  <= '0';
        end if;
    end procedure pull;

Listing 6.2: Data Removal Procedure

The check_data procedure verifies the data on the bus against a specified value given as input. If the data is wrong it throws an error and provides the value it found and the value that was expected. The implementation is shown in Listing 6.3.

    procedure check_data(
        data            : in integer;
        signal data_s2m : in std_logic_vector(15 downto 0);
        signal ready    : in std_logic
    ) is
    begin
        if not ready = '1' then
            wait until ready = '1';
        end if;
        check_value(data_s2m = std_logic_vector(to_unsigned(data, 16)), error,
                    "Wrong Value. Expected " & to_string(data) & " got " &
                    to_string(to_integer(unsigned(data_s2m))), C_SCOPE);
    end procedure;

Listing 6.3: Data Validation Procedure

When a procedure is called, all signals in the procedure definition must be included in the procedure call. An example call for each procedure is shown in Listing 6.4.

    push(10, full, ready, data_m2s, read, write, start);
    pull(data_s2m, empty, ready, read, write, start);
    check_data(10, data_s2m, ready);

Listing 6.4: Example Procedure Calls

The tests currently implemented in the testbench first write data to the FIFO, then remove the data and verify that it is the same data. After this is complete, a test is done where data is written until the FIFO is full, and the FIFO is then emptied completely. There is significant room for improvement in this test, as it currently writes the same value each time. A better implementation would be to provide the testbench with enough random values in a file, which it could then write to the FIFO and verify against when removing values.

6.3.3 SD Card Controller

The SD card controller comes with its own testbench for verifying its functionality. The testbench is written in Verilog and comes with a Verilog model of an SD card in the file sDModel.v.

To learn how the SD card controller operates, and because of problems running the included testbench in a Windows environment, some time was spent implementing a partial testbench in VHDL. The VHDL testbench is a reimplementation of a small subset of the Verilog testbench. To avoid having to reimplement a working SD card model, the one used in the Verilog testbench is reused here.

For it to work with a VHDL testbench some changes had to be made to the top-level ports in the model. In the original the dat and cmd signals are both declared with the tri type, which in Verilog appears to implicitly make them bidirectional signals.

6.4 Software

Very little software was actually written for this project, but a large amount of time was spent debugging and unsuccessfully trying to get the SD card driver to work as intended.

6.4.1 Avalon to Wishbone Bridge

Initially the Avalon to Wishbone bridge that was written seemed to be nonfunctional. When written to or read from, all registers returned the same values. A premade bridge was then used which displayed the same issues, which implied that the problem was not necessarily in the bridge.

A small program was made that wrote the number of the register to each register and then read it back. This uncovered that the first 4 registers all contained the same value, the next 4 the next value, and so on all the way to the final register.

In the first program an offset of 4 for each register was used, as the offset is supposed to be the number of bytes in the data. When adjusting this offset to be 16 instead, the program worked as intended.
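The readback test can be reproduced on a host with a mock device. In the sketch below the mock decodes one register per 16 bytes, which is an assumption chosen to imitate the observed behaviour and not a description of how the bridge is actually implemented. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_DECODE_STRIDE 16u  /* assumption: models the observed spacing */
#define MOCK_NUM_REGS      19u

static uint32_t decoded_regs[MOCK_NUM_REGS];

static void bus_write(uint32_t byte_offset, uint32_t value)
{
    uint32_t i = byte_offset / MOCK_DECODE_STRIDE;
    if (i < MOCK_NUM_REGS)
        decoded_regs[i] = value;
}

static uint32_t bus_read(uint32_t byte_offset)
{
    uint32_t i = byte_offset / MOCK_DECODE_STRIDE;
    return (i < MOCK_NUM_REGS) ? decoded_regs[i] : 0u;
}

/* Write each register's number to it using the given offset stride, then
 * read everything back. Returns how many registers read back wrong. */
static uint32_t readback_mismatches(uint32_t stride, uint32_t num_regs)
{
    uint32_t bad = 0u;

    assert(stride > 0u);
    for (uint32_t i = 0; i < num_regs; i++)
        bus_write(i * stride, i);
    for (uint32_t i = 0; i < num_regs; i++)
        if (bus_read(i * stride) != i)
            bad++;
    return bad;
}
```

With this mock, a stride of 4 makes every group of four consecutive offsets land in the same register, so most registers read back the wrong value, while a stride of 16 gives no mismatches. This mirrors the behaviour described above.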
