
3.2 Digital implementation

3.2.6 Readout

The device supports two primary methods of data readout: a packet-based serialized readout and a parallel readout. The parallel readout bypasses all the data processing of the device and pushes the raw ADC data directly out on the data lines, while the packet-based method enables all the processing of the device and can send the data out on a programmable number of links, up to eleven. In the packet-based mode, the data can also be sent from one device to another in a daisy-chained fashion, to lower the number of upstream serial links that are required.

3.2.6.1 Serialized data readout

The serialization of data is sketched in figure 3.19. The drawing shows the readout for a single serial link of the available eleven. The serializer has no knowledge of the format or content of the data it serializes. It receives a data-ready signal from a channel, indicating that data is ready to be read out, and it sends a signal back to the module to tell it to advance the data it presents to the serializer. It will read from the selected channel for as long as the data-ready signal is high. The data-ready signal is masked to restrict the set of channels a serializer can read data from, depending on the number of serial links that have been enabled.

The channels are divided equally among the enabled links. Each serializer reads data from its channels in a round-robin fashion, and the serializers operate independently of each other. Heartbeat packets always have the highest priority: if one needs to be transmitted, it will be sent as soon as the current packet is completed.
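The scheduling described above can be illustrated with a minimal behavioural sketch. It is not the device's logic: the packet-granular time slots, the contiguous channel blocks, the restriction to link counts that divide 32, and all function names are assumptions made for illustration.

```python
from collections import deque

N_CHANNELS = 32  # channel count of the device


def channels_for_link(link, n_links):
    """Channels visible to one serializer (the data-ready mask).

    Assumption: contiguous, equal-sized blocks; the sketch only
    handles link counts that divide 32 evenly.
    """
    if N_CHANNELS % n_links:
        raise ValueError("sketch only handles link counts that divide 32")
    per_link = N_CHANNELS // n_links
    return list(range(link * per_link, (link + 1) * per_link))


def serialize(queues, channels, heartbeats, slots):
    """Emit `slots` packets on one link.

    queues     -- dict: channel -> deque of pending packets
    channels   -- channels this link is allowed to read
    heartbeats -- deque of pending heartbeat packets
    """
    out = []
    pos = 0  # round-robin pointer into `channels`
    for _ in range(slots):
        if heartbeats:  # heartbeat has the highest priority
            out.append(heartbeats.popleft())
            continue
        for step in range(len(channels)):  # next channel with data ready
            ch = channels[(pos + step) % len(channels)]
            if queues[ch]:
                out.append(queues[ch].popleft())
                pos = (pos + step + 1) % len(channels)
                break
        else:
            out.append("SYNC")  # nothing to send: keep the link in sync
    return out
```

With four enabled links, link 1 would serve channels 8 to 15 in this model; a pending heartbeat is sent first, the channels are then visited in turn, and sync packets fill the remaining slots.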

When the serializers have no data to send, i.e. there are no heartbeat packets or channel data, they will transmit sync packets to keep the receiving end in sync. The sync packets are the same size and format as a packet header, but the information in them is fixed. As this keeps the links toggling while the device is idle, it will consume some extra power. An alternative would have been to power off the serial links whenever there is no more data to send; when a new packet is ready to be transmitted, a sync packet would then be sent first to enable the receiving side to synchronize to the data stream. Currently, a serial link is only powered off when it is not enabled.

Figure 3.19: Conceptual overview of the serial readout implementation.

DC balancing of the output serial links is not implemented in the device, as it is not a requirement for interfacing with the GBTx. In other applications that do not use the GBTx, DC balancing might be of benefit to provide better signal integrity. As the native word format of the SAMPA is 10-bit based, the best option would be to use a 5b/6b encoding, which would increase the overall bandwidth usage by 20 %. The existing sync packets would no longer be necessary, as the DC-balancing code has unused code words that can be used for link synchronization.
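The 20 % figure follows directly from the code rate: every 5 data bits become 6 line bits, so a 10-bit word occupies 12 bits on the link. A small sketch of this arithmetic (the function names are illustrative only):

```python
def encoded_bits(word_bits=10, data_bits=5, code_bits=6):
    """Line bits needed for one word under a data_bits/code_bits block code."""
    if word_bits % data_bits:
        raise ValueError("word must be a whole number of code blocks")
    return word_bits // data_bits * code_bits


def bandwidth_overhead(data_bits=5, code_bits=6):
    """Relative increase in bandwidth usage caused by the encoding."""
    return code_bits / data_bits - 1
```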

To reduce cost and development time, it is common to produce only one, or maybe a couple, of different versions of a front-end card to mate with the detector pads. However, the detector might have several different mappings of physical pads to input channels on the front-end card, depending on the position in the detector. As the serial links are independent of each other, the data from one chip can potentially be sent to two or more upstream devices for further processing of the data, for instance for event reconstruction. If it is important to the detector to have specific input pads arrive at specific upstream devices, independent of the mapping, the SAMPA provides a way of remapping which input channels are presented to each serializer.

Because a fixed set of channels is available to each serializer, the SAMPA currently provides no way of load balancing between the serializers in case the set of channels for one serializer has a higher occupancy than the set for another. This could cause unnecessary overflow of buffers, even though there is available bandwidth on other serial links. If all serial links had access to all of the channels, the load would be spread more equally across the serial links. Each link would then iterate through its subset and, whenever all the channels in its subset are empty, pick a channel outside its subset that is non-empty and not already being read. In the current design the link would send sync packets in this case. This method would, however, cause the predictability of which channel arrives next on which link to be lost. It would also render the option to re-order which channels connect to which serial link mostly pointless.
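The load-balancing alternative discussed above is not implemented in the device. The selection rule could look roughly like the following sketch, where the function name, the set-based bookkeeping and the lowest-index tie-break are assumptions, and the round-robin pointer within the link's own subset is omitted for brevity.

```python
def pick_channel(own, ready, busy):
    """Choose the next channel for one link, or None to send a sync packet.

    own   -- ordered list of channels in this link's subset
    ready -- set of channels that currently have data
    busy  -- set of channels already being read by another link
    """
    for ch in own:  # prefer the link's own subset
        if ch in ready and ch not in busy:
            return ch
    for ch in sorted(ready - busy - set(own)):  # otherwise help another subset
        return ch
    return None  # nothing readable anywhere: sync packet
```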

3.2.6.2 Daisy chained readout

For detectors with very low data rates, specifically the MCH, a daisy-chaining option is implemented. This lets multiple devices share a single serial link, increasing the number of devices that can be read out by a single GBTx. Each device is provided with an extra data input port and two data-control signals for this purpose. A connection diagram can be seen in figure 3.20.

The upstream device is set up to run with one serial downlink and the busy out signal from the downstream device is connected to the busy in of the upstream device. When the busy signal is high, it tells the upstream device to halt its transmission of data after the current packet is done. While waiting to send data again the upstream device will send sync packets to keep the communication in sync.

The downstream device has the serial output of the upstream device connected to its daisy input port. Sync packets arriving on the link are filtered out and only heartbeat packets and data packets are forwarded. The accepted packets from the upstream device are added to a ring buffer in the same way as for the internal channels. A configuration register defines how many upstream devices there are from the current device. This value is used by the serialization module that serializes packets from the ring buffers to define the number of packets that should be forwarded from the upstream link before a packet from one of the internal channels can be forwarded. The handshaking between the serial link module and the module that handles the readout of the buffers is, however, constructed in such a way that it is not possible to read out two consecutive packets from the same buffer without a delay of a few cycles. This causes a sync packet to be sent in between two packets from the upstream device when this setting is enabled. In principle nothing prevents this from being fixed in the existing buffer readout module, but it would require adding an extra state machine to the buffer readout module that pre-reads the next header from memory so that it is ready to be transmitted. The primary issue, however, is that the data-ready signal from the buffer is also used to indicate that a stream of data is completed and that the serializer should select another buffer or a sync packet to send; this could be solved by separating the data-ready and packet-done signals. Since the MCH will only operate with two devices in a chain, fixing this was not prioritized for the SAMPA v3.
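The forwarding rule and the resulting sync gap can be modelled with a short behavioural sketch. It is only an illustration: time is divided into packet slots, all internal channels are collapsed into one queue, and a single sync slot stands in for the "few cycles" of delay; none of these simplifications come from the device itself.

```python
from collections import deque


def arbitrate(upstream, internal, n_upstream, slots):
    """Model the packet order on the output link of a chained device.

    upstream   -- deque of packets received on the daisy input
    internal   -- deque of packets from the device's own channels
    n_upstream -- configured number of upstream devices
    """
    out = []
    forwarded = 0          # upstream packets sent since the last internal one
    prev_upstream = False  # was the previous slot read from the upstream buffer?
    for _ in range(slots):
        take_up = bool(upstream) and (forwarded < n_upstream or not internal)
        if take_up and prev_upstream:
            out.append("SYNC")  # back-to-back read of the same buffer not possible
            prev_upstream = False
        elif take_up:
            out.append(upstream.popleft())
            forwarded += 1
            prev_upstream = True
        elif internal:
            out.append(internal.popleft())
            forwarded = 0
            prev_upstream = False
        else:
            out.append("SYNC")  # idle link
            prev_upstream = False
    return out
```

With two upstream devices configured, the model inserts a sync packet between the two consecutive upstream packets, which is the behaviour described above. With only one upstream device, as in a two-device chain, upstream and internal packets alternate and no gap appears.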

Figure 3.20: Connection setup for daisy chaining. [Diagram labels: SAMPA 1, SAMPA 2, SAMPA 3; signal NBflowstop_in.]

Commonly, all the devices in a chain will be clocked from the same source. As the data is sent and captured at the same clock speed, the data stream could be corrupted after exiting the synchronizer on the receiving end if the received data changes too close to the clock edge, causing metastability in the synchronizer. The time from data being clocked out of the upstream device until it is received by the downstream device depends on various factors, such as the distance between the devices, the properties of the Printed Circuit Board (PCB) material and the process variation of the device. A programmable delay chain has therefore been added before the signal enters the synchronizer. The incoming data can be delayed by up to 12.5 ns in steps of 0.2 ns. The maximum delay allows the signal to be delayed by a full clock cycle at 80 MHz, and the step size provides enough granularity for 15 delay steps within one clock cycle at 320 MHz.
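The two figures quoted for the delay chain can be checked with integer arithmetic in picoseconds. The constants are taken from the text; the helper names are illustrative.

```python
STEP_PS = 200          # 0.2 ns delay step
MAX_DELAY_PS = 12_500  # 12.5 ns maximum delay


def period_ps(f_mhz):
    """Clock period in picoseconds (exact for 80 MHz and 320 MHz)."""
    return 1_000_000 // f_mhz


def whole_steps_per_period(f_mhz):
    """Number of complete delay steps that fit in one clock period."""
    return period_ps(f_mhz) // STEP_PS


def covers_full_period(f_mhz):
    """True if the maximum delay spans at least one clock period."""
    return MAX_DELAY_PS >= period_ps(f_mhz)
```

At 80 MHz the period is 12.5 ns, equal to the maximum delay. At 320 MHz the period is 3.125 ns, which holds 15 whole steps of 0.2 ns.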

The busy signal is not provided with a delay chain, as the signal is level based. A delay of one or two cycles in the reception of the busy signal due to metastability in the synchronizer is not considered a problem.

To recover from communication errors during transmission between the devices, the receiving unit performs Hamming correction on the received headers. Errors are corrected before the data is stored in the ring buffer for retransmission. If a double error is detected, the receiving unit drops the data and waits for resynchronization. The receiving unit can also detect a data input that is stuck high or low.
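The header code used by the device is not specified in this section. As a generic illustration of the behaviour described (correct single errors, drop on double errors), the following sketch uses a textbook extended Hamming(8,4) code on a 4-bit value. The code size, bit layout and function names are assumptions and do not describe the actual header format.

```python
DATA_POS = (3, 5, 6, 7)  # data bit positions in a Hamming(7,4) word


def encode(nibble):
    """Encode 4 data bits; index 0 holds the overall parity bit."""
    bits = [0] * 8
    for k, pos in enumerate(DATA_POS):
        bits[pos] = (nibble >> k) & 1
    for p in (1, 2, 4):  # parity bits cover positions with bit p set
        bits[p] = sum(bits[i] for i in range(1, 8) if i & p) % 2
    bits[0] = sum(bits) % 2  # overall parity enables double-error detection
    return bits


def decode(bits):
    """Return (nibble, status); nibble is None when the word must be dropped."""
    bits = list(bits)
    syndrome = 0
    for i in range(1, 8):
        if bits[i]:
            syndrome ^= i
    parity_error = sum(bits) % 2 == 1
    if syndrome and not parity_error:
        return None, "double"  # uncorrectable: drop and wait for resync
    status = "ok"
    if parity_error:
        bits[syndrome] ^= 1  # syndrome 0 points at the overall parity bit
        status = "corrected"
    nibble = sum(bits[pos] << k for k, pos in enumerate(DATA_POS))
    return nibble, status
```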

When the daisy-chaining option is not in use, the units relating to daisy chaining are turned off through clock gating to save power.

A drawback of this readout method is that if one of the devices in a chain malfunctions and becomes inoperable, no data will be received from the previous devices in the chain either. The method is also vulnerable to channels or devices that are overproducing data, either through misconfiguration or from malfunctioning input channels. Options are available to turn off misbehaving channels, but monitoring must be in place upstream to detect the problem and reconfigure the device. If a device is partly operational, it is possible to configure it so that the complete device is bypassed by forwarding the control and data signals directly from input to output.

3.2.6.3 Direct readout - serialization

For detectors that would prefer not to use the data handling capabilities of the SAMPA, a mode is available where the raw ADC samples are directly serialized and the rest of the digital circuitry is powered down through clock gating. This mode operates with a serialization speed of 32 times the ADC sampling speed and can be configured in two modes, referred to as the normal mode and the split mode.

In the normal mode, the 10-bit data for channel 0 is output in parallel on ten of the serial links. In the following cycle, the data for channel 1 is output, and so on. Since the serialization speed is 32 times the ADC sampling speed, the current sample for all 32 channels can be transmitted in the time it takes to acquire the next sample.
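A receiver for the normal mode only has to collect ten bits per cycle and count cycles. The sketch below packs and unpacks one frame of 32 samples; the assignment of bit b to link b is an assumption, as the bit-to-link mapping is not given here.

```python
def serialize_normal(samples):
    """Turn 32 ten-bit samples into 32 cycles of 10 link bits each.

    Assumption: link b carries bit b of the sample (LSB on link 0).
    """
    return [[(s >> b) & 1 for b in range(10)] for s in samples]


def deserialize_normal(cycles):
    """Rebuild the samples; cycle n carries the sample of channel n."""
    return [sum(bit << b for b, bit in enumerate(links)) for links in cycles]
```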

In the split mode, the first five serial links are dedicated to channels 0-15 and the other five serial links are dedicated to channels 16-31. The serialization speed is still 32 times the ADC sampling speed, but it takes two serialization cycles to transmit a full sample. In the first cycle, the 5 lower bits for channel 0 and the 5 lower bits for channel 16 are transmitted. In the following cycle, the 5 upper bits for the same channels are transmitted. It then continues with the 5 lower bits for the next channel of each group, and so on. In this way, it is possible to direct the data for half the channels to one upstream receiver and the other half to another.
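The split-mode ordering can be sketched in the same way. Links 0-4 are taken to be the first group and links 5-9 the second, with the least significant bit of each 5-bit half on the lowest link of its group; this bit-to-link mapping is an assumption.

```python
def serialize_split(samples):
    """Turn 32 ten-bit samples into 32 cycles of 10 link bits (split mode)."""
    cycles = []
    for k in range(16):
        a, b = samples[k], samples[16 + k]
        for shift in (0, 5):  # lower five bits first, then upper five
            half_a = [(a >> (shift + i)) & 1 for i in range(5)]
            half_b = [(b >> (shift + i)) & 1 for i in range(5)]
            cycles.append(half_a + half_b)  # links 0-4, then links 5-9
    return cycles


def deserialize_split(cycles):
    """Rebuild the 32 samples from 32 split-mode cycles."""
    samples = [0] * 32
    for n, links in enumerate(cycles):
        k, upper = divmod(n, 2)  # channel pair index, 0 = lower half
        shift = 5 * upper
        for i in range(5):
            samples[k] |= links[i] << (shift + i)
            samples[16 + k] |= links[5 + i] << (shift + i)
    return samples
```

Because each group of five links is self-contained, a receiver connected to only one group can run the same loop over its five links and recover its 16 channels.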

Upon start-up, a 32-cycle sync pattern is transmitted so that the receiving end can synchronize to the stream. As the data transmission is cyclic, there is no need for a separate data bit to indicate the start of a new sample. If the receiver loses track, it can restart the transmission to get a new sync pattern without losing more than a couple of samples.

The control circuitry for the serialization is protected from Single Event Upsets (SEUs), as further discussed in section 3.4.1, but the internal generation of the ADC sampling clock is not. This can be problematic for detectors that need to have the sampling clock in phase across several devices. To mitigate this, the spare 11th serial link is used to transmit the ADC clock together with the data stream.

An SEU in one of the ADC clock derivation registers will present itself as a phase shift in the clock. By counting the number of high and low cycles on the receiving end of the 11th link, it is possible to determine whether an SEU has occurred, and from there it can be decided whether the device needs to be reset.
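A receiver-side check of this kind could be as simple as measuring run lengths on the 11th link. The sketch below assumes a 50 % duty cycle, i.e. 16 high and 16 low serial cycles per ADC clock period; the duty cycle and the function names are assumptions, not taken from the device documentation.

```python
from itertools import groupby


def run_lengths(bits):
    """List of (level, length) for each run of equal bits."""
    return [(level, len(list(group))) for level, group in groupby(bits)]


def phase_shift_detected(bits, half_period=16):
    """True if any complete run differs from the expected half period.

    The first and last runs are ignored because the capture window
    may start or end in the middle of a clock phase.
    """
    complete_runs = run_lengths(bits)[1:-1]
    return any(length != half_period for _, length in complete_runs)
```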

This serialization method has the benefit that it is simple to implement a receiver for it on the upstream device. It also opens the possibility of implementing other compression and filtering methods that are better suited to the detector application, with a trade-off in added development time. The filtering and compression can be done on the front-end card itself by use of an FPGA. This, however, has the drawback of increased power consumption of the front-end card and the added effort of radiation qualification of the FPGA device. Alternatively, it can be done off-board by sending the data through the GBTx to a readout device. The drawback then is an increase in cost, in the form of a need for extra GBTx devices and optical fibre links. Since there is no error correction added to the data itself and there is no synchronization during the data transmission, it is up to the upstream device to verify the integrity of the link during the initial 32-cycle sync pattern. As all serial links will toggle in a predetermined pattern, it can be verified that none of the links are stuck at a fixed value.

3.2.6.4 Direct readout - combinatorial

For applications where only a subset of the channels the device provides is needed, a mode is available where the data from the ADC channel inputs are multiplexed to the 10 serial link outputs. The channel to be used is selected through the configuration of five input pins. Data from several channels can be acquired by cycling through input pin configurations during a sampling cycle. As the configuration pins for setting the channel are single-ended CMOS, the cycling speed is limited to the single-ended switching speed of the driving device. Since the data rate using this method is much lower than with the other methods, it opens up the possibility of interfacing the device to a low-cost FPGA or a microcontroller solution.
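Five select pins are exactly enough to address 32 channels. A minimal model of the multiplexer is sketched below; the binary encoding with pin i carrying bit i of the channel number, and the mapping of sample bit b to output b, are assumptions.

```python
def select_pins(channel):
    """Pin levels that select `channel` (assumed: pin i = bit i)."""
    if not 0 <= channel < 32:
        raise ValueError("channel must be in 0..31")
    return [(channel >> i) & 1 for i in range(5)]


def mux_output(adc_samples, pins):
    """Levels on the 10 outputs for the channel selected by `pins`."""
    channel = sum(bit << i for i, bit in enumerate(pins))
    return [(adc_samples[channel] >> b) & 1 for b in range(10)]
```

Reading several channels within one sampling cycle then amounts to stepping `select_pins` through the wanted channel numbers and latching the outputs after each change.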