# Upgrade of the Data Read-Out and Control Systems of the HADES Detector

### Jan Michel

HADES Collab. Meeting, Sesimbra 2009

# HADES – High Acceptance DiElectron Spectrometer

- 8 distinct detector systems
- 80,000 data channels
- Many successful beamtimes in recent years



| Year | Collision system   |
|------|--------------------|
| 2002 | C + C 2 AGeV       |
| 2004 | C + C 1 AGeV       |
| 2004 | p + p 2.2 GeV      |
| 2005 | Ar + KCl 1.75 AGeV |
| 2006 | p + p 1.25 GeV     |
| 2007 | p + p 3.5 GeV      |
| 2007 | d + p 1.25 AGeV    |
| 2008 | p + Nb 3.5 GeV     |

- Medium sized systems @ 1-3 AGeV
  - Events written to storage: 10k/s
  - Size per event: 1-2 kbyte
  - Data rate: up to 15 MByte/s



- LVL 1 trigger
  - Hit in Start but no hit in VETO detector
  - Analog electronics count number of charged particles in TOF
- If positive, Central trigger system distributes information to all detectors
- Detector data is digitized / parts of it are read-out
- LVL2 trigger
  - Based on TOF, RICH and Pre-Shower detector data
- If positive, all data is read out and sent to processing nodes (VME crates)
- Data is sent via Ethernet to PC (EventBuilder)





#### Current DAQ system

|           | Particles / Event | LVL1 Trigger Rate | Data Rate  |
|-----------|-------------------|-------------------|------------|
| p + p     | 5                 | 10 – 20 kHz       | 10 MByte/s |
| Ca + Ca   | 40                | 3 – 4 kHz         | 10 MByte/s |
| Au + Au * | 200               | 0.7 kHz           | 10 MByte/s |

Improve data acquisition capabilities by a factor of 20

#### Upgraded DAQ system (design performance)

|       | Particles / Event | LVL1 Trigger Rate | Data Rate   |
|-------|-------------------|-------------------|-------------|
| p+p   | 5                 | 100 kHz           | 100 MByte/s |
| Au+Au | 200               | 20 kHz            | 200 MByte/s |
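The improvement factors implied by the two tables can be checked with a few lines of arithmetic. The sketch below is only an illustration: the dictionary and function names are made up here, and for p + p the upper end of the current 10 – 20 kHz range is used.

```python
# Rates copied from the two tables: (LVL1 trigger rate in kHz, data rate in MByte/s).
# ASSUMPTION: for p + p the upper bound of the quoted 10 - 20 kHz range is taken.
current = {"p+p": (20.0, 10.0), "Au+Au": (0.7, 10.0)}
upgraded = {"p+p": (100.0, 100.0), "Au+Au": (20.0, 200.0)}


def improvement(system):
    """Return (trigger-rate factor, data-rate factor) for one collision system."""
    cur_trig, cur_data = current[system]
    new_trig, new_data = upgraded[system]
    return new_trig / cur_trig, new_data / cur_data


for system in current:
    trig, data = improvement(system)
    print(f"{system}: trigger rate x{trig:.1f}, data rate x{data:.1f}")
```

For Au + Au the data rate grows by the quoted factor of 20, while the trigger rate grows by somewhat more.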



- Replace TOFino with Resistive Plate Chambers (RPC)
  - Higher granularity
  - Better time resolution
- Rebuild one layer of drift chambers (MDC 1)
  - Higher efficiency than old chambers



- New readout electronics for all detectors
- Integration of better control and configuration options (SlowControl)
- Development of a combined network transporting triggers, data and monitoring information





- Fast switching signals cause noise in analog electronics
- Electrical connection causes ground loops
- Limited distances
- ~ 0.5 Gbit/s under good conditions
- Comparatively high weight of cables
- Cable failures hard to detect
- Now available: Even smaller & cheaper transceivers and cables based on plastic fibres
  - Will be used for readout of MDC
  - Even cheaper than regular copper wires!





- Optical signals
- Sender and receiver electrically decoupled
- > 100 m length
- Up to 4 Gbit/s per fibre (SFP)
- Light-weight
- Small footprint on PCB





Output

+ fixed special blocks like memory, multipliers...

Programmable in VHDL / Verilog

x 1.000.000

- Typical numbers: 100 – 200 MHz, 500 I/O pins, equiv. 6 million logic gates
- Like in normal electronics, all the logic runs in parallel, not serialized as in a CPU
  - A large number of small tasks can be done at the same time





4-input Look-Up Table







- Virtex-4 FPGA
- TigerSharc DSP
- Etrax CPU (Linux & Ethernet)
- 2 GBit/s optical link
- 128 TDC channels (time resolution: 40 ps)
- On the back: Connector for AddOn-Boards
- Already used in several beamtimes
- ... and by other experiments, e.g. for detector tests





- RPC, Forward Wall, Start & Veto detectors: directly read out by TRB
- TOF: TRB with AddOn converting charge in detector to a pulse width
- Shower: AddOn with 96 ADC channels





- RICH ADCM
  - Stand-alone board
  - FPGA, Optical link
  - 2x 8 channel, 12 Bit, 40 MSPS ADC



- General-Purpose AddOn
  - Connection to VULOM, the central "brain" of the trigger system





- Driver Card (OEPB)
  - Only 4 x 5 cm in size (constrained by the given detector setup)
  - FPGA + 250 MBit/s optical link
  - Bootloader feature
  - Reads out front-end digitizer chips
  - 380 boards for MDC system
- AddOn
  - 32x 250 MBit/s + 2x 3.125 GBit/s optical links
  - Controlled by 3 FPGAs







- The main "workhorse" of the network
- 20x up to 3.125 GBit/s
- Capable of Gigabit Ethernet to send data to standard PCs
- Implements basic data processing features









- CTS-TRB
  - Connects to VULOM
  - Controls trigger and data channels
- SlowControl-TRB
- Cross connection between both boards



Status of (TrbNet)-Links between Boards





- Integration of trigger distribution, data read-out and slow control into one system
- Short latencies (3 µs between two endpoints)
  - Latency of trigger transport directly affects detector deadtime
- Minimized electromagnetic influence on analog electronics
  - Usage of optical fibers instead of copper wires
- Controlled transfers
  - no arbitrary data loss or corruption
- Individually accessible boards with feedback path for integrated monitoring and controlling
- Flexible to adapt to different FPGAs and media types
  - Modular design to adapt to different media (POF, glass fiber, LVDS) and different hardware



- Network is divided into several logical channels
  - Internal handling of each channel is completely decoupled
  - Different kinds of transfer are able to run in parallel





#### Low Latency transfer during transport of big data blocks

- All transfers are divided into small packets (80 bit)
  - Channel can be switched after each packet
  - Transfers can easily be interrupted for high priority signals
  - Different packet types e.g. for data, flow control, error detection
  - 64 bit payload per packet (4x 16 bit words)

| Name | Type | Description                     | H0 | F0             | F1               | F2            | F3            |
|------|------|---------------------------------|----|----------------|------------------|---------------|---------------|
| DAT  | 0x0  | Normal data word                |    | Data           | Data             | Data          | Data          |
| HDR  | 0x1  | Transfer start / source changed |    | Source address | Target address   |               | SEQNR / DTYPE |
| EOB  | 0x2  | End of Buffer                   |    | Checksum       |                  | Data count    | Buffer number |
| TRM  | 0x3  | Transfer Terminated             |    | Checksum       | Error pattern    | Error pattern | SEQNR / DTYPE |
| ACK  | 0x5  | Buffer acknowledge              |    |                | Length of buffer |               | Buffer number |
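To make the 80-bit packet layout concrete, the following sketch packs one header word plus four 16-bit payload words into 10 bytes and splits a longer word list into DAT packets. The type codes come from the table. The bit positions inside the header word, the big-endian byte order and all function names are assumptions made for illustration only; the slides do not specify them.

```python
import struct

# Packet type codes taken from the packet type table.
PACKET_TYPES = {"DAT": 0x0, "HDR": 0x1, "EOB": 0x2, "TRM": 0x3, "ACK": 0x5}


def pack_packet(ptype, channel, payload):
    """Pack the header word plus four 16-bit payload words into 10 bytes (80 bit).

    ASSUMPTION: the header word carries the type in bits 2..0 and the channel
    in bits 7..4; the real bit positions are not given in the slides.
    """
    if len(payload) != 4:
        raise ValueError("a packet carries exactly four 16-bit payload words")
    header = (PACKET_TYPES[ptype] & 0x7) | ((channel & 0xF) << 4)
    return struct.pack(">5H", header, *payload)


def unpack_packet(raw):
    """Inverse of pack_packet: return (type name, channel, payload words)."""
    header, *payload = struct.unpack(">5H", raw)
    names = {code: name for name, code in PACKET_TYPES.items()}
    return names[header & 0x7], (header >> 4) & 0xF, payload


def split_into_packets(words, channel):
    """Split 16-bit words into DAT packets, zero-padding the last packet."""
    padded = list(words) + [0] * (-len(words) % 4)
    return [pack_packet("DAT", channel, padded[i:i + 4])
            for i in range(0, len(padded), 4)]
```

Because every packet is this small, a multiplexer can change the channel after each 10-byte unit, which is what keeps latencies low during large transfers.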



#### Controlled transfers – no data loss

- Data Blocks with defined size
  - Receiver is able to store this in its buffers
  - Sender waits until receipt is acknowledged
  - Block size is negotiated between both sides
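The acknowledge-based flow control can be sketched as a small simulation. This is a simplified model under stated assumptions, not the TrbNet implementation: the class and function names are invented, and the number of receiver buffers is treated as a credit count that the sender tracks.

```python
from collections import deque


class Receiver:
    """Receiver with a fixed number of buffers; each buffer holds one block."""

    def __init__(self, n_buffers):
        self.free = n_buffers
        self.stored = deque()

    def accept(self, block):
        if self.free == 0:
            raise RuntimeError("sender violated flow control")
        self.free -= 1
        self.stored.append(block)

    def consume(self):
        """Process one stored block; freeing its buffer acts as the acknowledge."""
        block = self.stored.popleft()
        self.free += 1
        return block


def transfer(data, block_size, receiver):
    """Send data block by block, never exceeding the acknowledged buffer space."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    credits = receiver.free            # negotiated once before the transfer
    delivered = []
    for block in blocks:
        if credits == 0:               # all buffers in use: wait for an acknowledge
            delivered.append(receiver.consume())
            credits += 1
        receiver.accept(block)
        credits -= 1
    while receiver.stored:             # drain the remaining buffers
        delivered.append(receiver.consume())
    return delivered
```

Since the sender only transmits when it knows a buffer is free, the receiver never has to drop data.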





#### Check for all nodes being active

- Each session (one data transfer) consists of two parts:
  - The CTS sends a data transfer to the network (distributed to all endpoints)
  - All endpoints answer the transfer to acknowledge receipt
  - Only addressed endpoints send a longer answer (if necessary)
  - Before all endpoints have answered, no other transfer on this channel is allowed
    - One blocked channel doesn't influence other running transfers
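A minimal sketch of this session rule, assuming hypothetical names and a set-based model of which endpoints are alive: every endpoint must answer the init transfer, only addressed endpoints send a longer answer, and the channel stays blocked while any answer is missing.

```python
def run_session(endpoints, addressed, responding):
    """Model one session on a single channel.

    endpoints:  all endpoint addresses reached by the init transfer
    addressed:  endpoints that are asked for a longer answer
    responding: endpoints that are actually alive (illustrative input)
    Returns (channel_free, missing, replies).
    """
    replies = {}
    for ep in endpoints:
        if ep not in responding:
            continue                         # dead board: no answer at all
        replies[ep] = "data" if ep in addressed else "short"
    missing = sorted(set(endpoints) - set(replies))
    channel_free = not missing               # blocked until everyone has answered
    return channel_free, missing, replies
```

The list of missing endpoints is exactly the information needed to spot a board that has stopped reacting.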





## Network Control Layers

- Data Handler prepares provided data
  - Adds information needed to process data
- API adds header and termination to data
  - HDR: target address & type of data
  - TRM: end of each transfer & gives basic status information
- Link layer divides data into blocks
  - Receiver buffers must be able to store a whole block
  - Sender waits for acknowledge (buffer cleared) before sending the next block
- Blocks contain CRC error checking
- Media Interfaces control the physical connection between FPGAs



Flexibility – Modular Design



# Access to the Network: Data Handlers

- One data handler for each channel
  - LVL1 trigger ("trigger")
  - Data ("IPU-data")
  - Slow Control ("RegIO")
- The whole network endpoint is encapsulated in one entity
  - trb\_net16\_endpoint\_hades\_full.vhd
  - Only the media interface and own logic have to be connected





- Each board gets two trigger signals
  - Timing trigger signal using a dedicated copper wire
  - Trigger information packet via the main network:
    - Trigger number, Trigger Type
- Network protocol includes "busy-logic":
  - No more triggers may be sent before all FEE have acknowledged to be ready to take data again
- Data is stored in memory & marked with trigger information
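The busy logic can be illustrated with a small bookkeeping class on the trigger-system side. This is a sketch only: the class name and the identifiers used for the front-ends are hypothetical, and the 16-bit wrap-around of the trigger number is inferred from the width of the trigger number signal.

```python
class BusyLogic:
    """A new trigger is only sent when no front-end (FEE) is still busy."""

    def __init__(self, fee_ids):
        self.fee_ids = set(fee_ids)
        self.busy = set()
        self.trigger_number = 0

    def send_trigger(self):
        """Return the new trigger number, or None while the system is busy."""
        if self.busy:
            return None                      # dead time: no trigger is sent
        self.trigger_number = (self.trigger_number + 1) & 0xFFFF
        self.busy = set(self.fee_ids)        # every FEE is busy until it releases
        return self.trigger_number

    def release(self, fee_id):
        """A front-end signals that it is ready to take data again."""
        self.busy.discard(fee_id)
```

The slowest front-end therefore determines the dead time, which is why short trigger latencies matter.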





- LVL1\_TRG\_RECEIVED\_OUT
  - Goes high when all information is valid and goes low after busy release was set
- LVL1\_TRG\_TYPE\_OUT(3..0)
- LVL1\_TRG\_NUMBER\_OUT(15..0)
- LVL1\_TRG\_CODE\_OUT(7..0)
  - A random code to help prevent event mixing
- LVL1\_TRG\_INFORMATION\_OUT(7..0)
  - Additional information byte for future extensions
- LVL1\_ERROR\_PATTERN\_IN(31..0)
  - Basic status information
- LVL1\_TRG\_RELEASE\_IN
  - "busy release" strobe, 1 clock cycle





- Upon request the FEE sends event data to the next data concentrator (Hub)
- Data is preceded by an event information header
  - Contains event number & data size
- A concentrator merges data from several FEE to one stream and forwards it to the next concentrator or via Ethernet to the Eventbuilder
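The merging step of a concentrator might look like the sketch below. It assumes a simplified sub-event layout (a header with event number and size, followed by data words) and flattens all inputs into one payload; the field and function names are illustrative, and the real hub may preserve more structure.

```python
def fee_event(event_number, words):
    """Front-end output: event information header followed by the data words."""
    return {"event": event_number, "size": len(words), "data": list(words)}


def concentrate(sub_events):
    """Merge sub-events of the same event from several FEE into one stream."""
    numbers = {s["event"] for s in sub_events}
    if len(numbers) != 1:
        # Differing event numbers would mean that events got mixed up.
        raise ValueError(f"event number mismatch: {sorted(numbers)}")
    merged = [word for s in sub_events for word in s["data"]]
    return fee_event(numbers.pop(), merged)
```

Because the output has the same shape as the input, concentrators can be cascaded until the stream reaches the EventBuilder.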



CTS

EventBuilder (PC)



Event data layout in the figure: MODULE A, D0, D1, D2, Length





- Data Handler provides event number & additional information
- Application provides DHDR1 (Event Information)
- Handler generates DHDR2
- Application sends event data
- Finally, the application sets the finished signal



- IPU\_NUMBER\_OUT(15..0)
- IPU\_INFORMATION\_OUT(7..0)
- IPU\_START\_READOUT\_OUT
- IPU\_DATA\_IN(31..0)
- IPU\_DATAREADY\_IN
- IPU\_READOUT\_FINISHED\_IN
- IPU\_READ\_OUT
- IPU\_LENGTH\_IN(15..0)
- IPU\_ERROR\_PATTERN\_IN(31..0)





- IPU\_START goes high after request is received and goes low some clock cycles after IPU\_FINISHED was high
- IPU\_READ goes high only after IPU\_DATAVALID is high



# Slow Control – Main Features

- Aim: Provide a standardized interface for monitoring software
  - Easy-to-run during beamtime shifts
- Standardized registers give fast overview of current status of the whole detector system
- Board information memories give information about firmware version etc.
- PC software periodically polls certain registers to monitor the system's status
- Monitoring & Controlling algorithms are directly built into the FPGA design
- Additional features:
  - Advanced monitoring features

| Address Range | Description              |
|---------------|--------------------------|
| 0000 - 001F   | common status registers  |
| 0020 - 003F   | common control registers |
| 0040 - 0048   | board information ROM    |
| 0050 - 005F   | board information RAM    |
| 0080 - 00FF   | user defined registers   |
| 0100 - FFFF   | internal data port       |

Slow Control Address Map
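The address map translates directly into a lookup helper. The sketch below is illustrative (the function name is an assumption); it also makes visible that the ranges 0049 – 004F and 0060 – 007F are not assigned in the table.

```python
# Address ranges copied from the Slow Control address map (inclusive bounds).
ADDRESS_MAP = [
    (0x0000, 0x001F, "common status registers"),
    (0x0020, 0x003F, "common control registers"),
    (0x0040, 0x0048, "board information ROM"),
    (0x0050, 0x005F, "board information RAM"),
    (0x0080, 0x00FF, "user defined registers"),
    (0x0100, 0xFFFF, "internal data port"),
]


def region(address):
    """Return the region name for a 16-bit address, or None if unassigned."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit into 16 bit")
    for first, last, name in ADDRESS_MAP:
        if first <= address <= last:
            return name
    return None


for addr in (0x0000, 0x0048, 0x0049, 0x8000):
    print(f"0x{addr:04X}: {region(addr)}")
```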

Possible monitoring & controlling features:

- Deadtimes, data amounts of each FEE ...
- Board temperatures & over-heating prevention
- Blocking data from erroneous boards ...
- Methods to automatically monitor the change of a given value over time



- Accesses to the low addresses (00 – FF) are handled independently by RegIO
  - Address marks in the figure: 0x00, 0x20, 0x80, 0xA0
- RegIO provides control registers, user provides status registers
- Width is configurable via generics
  - REGIO\_COMMON\_STAT\_REG\_IN
  - REGIO\_COMMON\_CTRL\_REG\_OUT
  - REGIO\_REGISTERS\_IN
  - REGIO\_REGISTERS\_OUT
- Standard data/address port for r/w operations (addresses 0100 – FFFF)
  - REGIO\_ADDR\_OUT(15..0)
  - REGIO\_READ\_ENABLE\_OUT
  - REGIO\_WRITE\_ENABLE\_OUT
  - REGIO\_DATA\_OUT(31..0)
  - REGIO\_DATA\_IN(31..0)
- 5 control signals to identify different kinds of reactions from the user logic
  - REGIO\_DATAREADY\_IN
  - REGIO\_NO\_MORE\_DATA\_IN
  - REGIO\_WRITE\_ACK\_IN
  - REGIO\_UNKNOWN\_ADDR\_IN
  - REGIO\_TIMEOUT\_OUT



- DAT\_DATAREADY\_IN
  - DATA\_OUT is valid after read
  - High until READ\_OUT is high too
- DAT\_NO\_MORE\_DATA\_IN
  - Read: no more data is available from this address (FIFO)
  - Write: endpoint is not able to process more data now
- DAT\_WRITE\_ACK\_IN
  - Write access was successful
- DAT\_UNKNOWN\_ADDR\_IN
  - The given address does not exist
- DAT\_TIMEOUT\_OUT
  - Transfer terminated, since user did not respond in time
- Signals are strobes, one clock cycle long
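The possible reactions of the user logic can be modelled in software as follows. This is a behavioural sketch, not VHDL: each returned dictionary stands for the strobe that would be raised for one clock cycle. The class name and the treatment of FIFO addresses as read-only are assumptions made for illustration.

```python
class UserLogic:
    """Toy model of user logic behind the data port (addresses 0x0100 - 0xFFFF)."""

    def __init__(self, registers, fifos=None):
        self.registers = dict(registers)
        self.fifos = {addr: list(words) for addr, words in (fifos or {}).items()}

    def read(self, addr):
        if addr in self.fifos:
            if not self.fifos[addr]:
                return {"NO_MORE_DATA": True}      # FIFO at this address is empty
            return {"DATAREADY": True, "DATA": self.fifos[addr].pop(0)}
        if addr in self.registers:
            return {"DATAREADY": True, "DATA": self.registers[addr]}
        return {"UNKNOWN_ADDR": True}              # the address does not exist

    def write(self, addr, value):
        if addr not in self.registers:
            return {"UNKNOWN_ADDR": True}
        self.registers[addr] = value & 0xFFFFFFFF  # data port is 32 bit wide
        return {"WRITE_ACK": True}
```

A timeout is not modelled here, because it is raised by the network side when the user logic gives none of these answers in time.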



All boards have to be individually addressable

- DHCP-like assignment of addresses based on ID chips on each board

Address assignment procedure:

1. IDs of all boards are read out
2. Central address database is consulted
3. List of IDs and assigned addresses is sent
4. Receipt of address is acknowledged
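The assignment procedure can be sketched in a few lines. The database is modelled as a plain dictionary and the addresses in the usage example are invented; how the real system treats boards missing from the database is not stated in the slides, so they are simply reported here.

```python
def assign_addresses(board_ids, database):
    """Look up each unique ID read from the boards in the central database.

    Returns (assignment, unknown): the ID-to-address list to be sent out and
    the IDs without a database entry (ASSUMPTION: these are only reported).
    """
    assignment = {uid: database[uid] for uid in board_ids if uid in database}
    unknown = [uid for uid in board_ids if uid not in database]
    return assignment, unknown


def unacknowledged(assignment, acks):
    """Return the IDs whose address assignment was not acknowledged."""
    return sorted(uid for uid in assignment if uid not in acks)


# Usage with made-up addresses for two example unique IDs.
database = {0xEE000001E43C17C1: 0x0100}
assignment, unknown = assign_addresses(
    [0xEE000001E43C17C1, 0x8E000001FC533228], database)
print(assignment, unknown)
```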

#### Some boards are not accessible anymore after mounting

- On-board flash memories are programmable via TrbNet
- One flash contains a fixed boot-loader giving basic network functions ("Micro Kernel")
- Second flash contains updatable firmware



## Software: trbcmd

- Shell-based software running on Etrax
- Connects to a special TrbNet endpoint on the TRB
- Allows for all request types on TrbNet
- Split into TrbNet library, FPGA-connection library & high-level software
  - Easy to implement in own code

#### Commands:

- r <trbaddress> <register> -> read register
- w <trbaddress> <register> <data> -> write register
- rm <trbaddress> <register> <size> <mode> -> read register-memory
- wm <trbaddress> <register> <mode> <file> -> write to register-memory from ASCII-file
- i <trbaddress> -> read unique ID
- s <uid> <endpoint> <trbaddress> -> set trb-address
- T <input> <type> <random> <info> <number> -> trigger by slowcontrol
- I <type> <random> <info> <number> -> read IPU data slowcontrol
- f <channel> -> flush FIFO of channel
- R <register> -> read register of the FPGA
- W <register> <value> -> write to register of the FPGA

Example output of "trbcmd i ffff":

- 0xee000001e43c17c1 0x01
- 0x8e000001fc533228 0x01
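Output such as that of the "trbcmd i ffff" example can be post-processed in a script. The parser below is a sketch that assumes the output consists of whitespace-separated pairs of unique ID and endpoint number, as the example and the argument order of the s command suggest; the actual output format of the tool may differ.

```python
def parse_unique_ids(output):
    """Parse whitespace-separated '<uid> <endpoint>' pairs into (int, int) tuples."""
    tokens = output.split()
    if len(tokens) % 2:
        raise ValueError("expected an even number of tokens")
    return [(int(uid, 16), int(endpoint, 16))
            for uid, endpoint in zip(tokens[0::2], tokens[1::2])]


# The text of the example in the command list.
example = "0xee000001e43c17c1 0x01 0x8e000001fc533228 0x01"
for uid, endpoint in parse_unique_ids(example):
    print(f"uid=0x{uid:016x} endpoint={endpoint}")
```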



- Full Endpoint needs 2000 slices
  - 20% of available resources in smallest FPGA (OEPB)
  - 5% on biggest FPGA (RICH ADCM)
- Hub needs much more resources
  - Hubv1 can handle ~ 6 links with its 12k slices
  - Hubv2 is occupied by 50% when handling 17 links (16x optical + up-link)
  - Scales almost linearly with number of links (~ N \* 1500 + 1000 slices)
- Big FPGAs are much more complicated to handle than small ones
  - 20% occupancy in ECP2M20 : No problem at all (up to 150 MHz)
  - 20% of ECP2M100: Software shows very different results (70 – 115 MHz) with identical input VHDL code
  - (Xilinx Virtex4 shows similar maximum frequencies for both)
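The scaling rule for the hub can be checked against the quoted numbers. The sketch below uses only figures from the list; the function name is illustrative, and the device sizes it prints are merely what the quoted occupancies imply, not data-sheet values.

```python
ENDPOINT_SLICES = 2000


def hub_slices(n_links):
    """Estimated hub size from the rule of thumb N * 1500 + 1000 slices."""
    return n_links * 1500 + 1000


for n in (6, 17):
    print(f"{n} links: about {hub_slices(n)} slices")

# Device sizes implied by the quoted occupancies (integer arithmetic).
print("endpoint at 20% occupancy ->", ENDPOINT_SLICES * 100 // 20, "slices")
print("17-link hub at 50% occupancy ->", hub_slices(17) * 100 // 50, "slices")
```

Six links give an estimate of 10,000 slices, which is consistent with Hubv1 handling about 6 links with its 12k slices.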



- Packet: 80 bit of data, the smallest amount of data on any connection, organized in five 16-bit words. The first word is the Packet Start (H0), containing packet type and channel id. The next four words are the Payload (F0 – F3), the "useful" data transported. Internally, the Packet Number (H0, F0 – F3) must be transported along with each word.
- The Medium is divided into three channels with different priorities. The Media Interfaces provide access to the physical cable. Data then goes through the Multiplexer to the IOBuf, which provides the security and divides data streams into Blocks. The API provides the interface to the network. The user logic connects to data handlers (RegIO on the Slow Control channel, Trigger on the LVL1 channel, Ipudata on the data channel), which provide additional features.
- The packet types the user sees are HDR, DAT and TRM. The IOBuf additionally uses EOB (End of Block) and ACK.
- Trigger Data contains the trigger type, trigger number, trigger information and a random trigger code.
- The 16-bit **addresses** of each board are assigned based on the **unique ids** of its temperature sensors.
- A network **session** consists of two transfers: an **init transfer** and a **reply transfer**.



- Board is not reacting
  - Automatically detected, since all boards have to answer each transfer
  - Port / Endpoint can be switched off via slow control
- Bit errors in data words
  - I/O-Buffers have additional error detection:
    - A 16-bit CRC is calculated & checked for each buffer
    - Number of data words is checked
  - Apparently corrupted data is marked by setting a bit in the status pattern
- Word loss due to bit error in 8b/10b encoded stream
  - Damaged packets are detected and deleted
  - Automatic realignment to packet boundary
  - Optional: Error correction based on redundant encoding



- Current Status of Development
  - Online monitoring features both in hard- and software are being developed
  - Tests with productive usage of all channels are ongoing
  - Adaptation to different hardware requirements is in progress
  - Installation of new detector and electronics will be finished by the end of this year
- TrbNet is a versatile network protocol
  - Optimized for HADES, but the modular design makes adaptation to different requirements easy
- The HADES DAQ upgrade
  - Trigger- and data-rates achieved with the new network will greatly improve the performance of the HADES detector
  - Slow Control provides all tools for efficient monitoring during experimental runs



- 500 boards
- 10 board types
- FPGAs of very different sizes from two vendors
- Different media types
  - glass fibres, plastic optical fibres, AddOn-Connector, on-board communication
- Diverse transport requirements for triggers, data and monitoring information



Urgent need for one common, flexible, adaptable network protocol:

Trigger- and Readout Board Network - TrbNet