Difference: GeneralInformation (1 vs. 33)

Revision 33
31 Jan 2011 - Main.SergeyYurevich
Line: 1 to 1
 
META TOPICPARENT name="EventBuilder"
Table of contents:
Line: 96 to 96
 
  • Bought by Coimbra: 16
  • Installed: lxhadeb01, 02, 03, 04
Changed:
<
<

lxhadeb01(a/b)

>
>

lxhadeb01

 

lxhadeb01 is our new powerful server for parallel event building.
Line: 157 to 157
 

IPMI Module

Changed:
<
<
The IPMI module provides remote access to the machine. It is connected to the ITM 'yellow' network. Currently we have the hades27.gsi.de machine
>
>
The IPMI module provides remote access to the machine. It is connected to the ITM 'yellow' network. Currently we have the hades30.gsi.de machine
  in the 'yellow' network for access to the IPMI module.

How to access:
Changed:
<
<
  • Start browser on hades27.gsi.de
>
>
  • Start browser on hades30.gsi.de
 
Line: 173 to 173
 

  • lxhadeb01b : 1 Gbps NIC
  • lxhadeb01 : Intel Corporation (vendor: 8086), device: 82599EB 10 Gigabit Network Connection (device: 10fb)
Changed:
<
<
    • ixgbe driver version: 2.0.75.7-NAPI
>
>
    • ixgbe driver version: ixgbe-2.1.4
 
Changed:
<
<
    • source code: http://sourceforge.net/projects/e1000/files/ixgbe stable/
>
>
    • This might be old: source code: http://sourceforge.net/projects/e1000/files/ixgbe stable/
 
    • To start ixgbe at boot time with single queue setting: /etc/modprobe.d/ixgbe
                                options ixgbe MQ=0,0
                                
    • To load new driver: rmmod ixgbe; modprobe ixgbe MQ=0,0
Changed:
<
<

lxhadeb02(a/b)

>
>

lxhadeb02

 

kernel

Line: 198 to 198
 

IPMI Module

How to access:
Changed:
<
<
  • Start browser on hades27.gsi.de
>
>
  • Start browser on hades30.gsi.de
 

Network

Changed:
<
<
  • lxhadeb02a : 1 Gbps NIC
  • lxhadeb02b : Intel Corporation (vendor: 8086), 82598EB 10-Gigabit AF Dual Port Network Connection (device: 10f1)
>
>
  • lxhadeb02b : 1 Gbps NIC
  • lxhadeb02 : Intel Corporation (vendor: 8086), 82598EB 10-Gigabit AF Dual Port Network Connection (device: 10f1)
 
    • ixgbe driver version: 2.0.75.7-NAPI
Changed:
<
<

lxhadeb03(a/b)

>
>

lxhadeb03

 

kernel

Line: 225 to 225
 

IPMI Module

How to access:
Changed:
<
<
  • Start browser on hades27.gsi.de
>
>
  • Start browser on hades30.gsi.de
 

Network

Changed:
<
<
  • lxhadeb03a : 1 Gbps NIC
  • lxhadeb03b : Intel Corporation (vendor: 8086), 82598EB 10-Gigabit AT Dual Port Network Connection (device: 10f1)
    • ixgbe driver version: 2.0.75.7-NAPI
>
>
  • lxhadeb03b : 1 Gbps NIC
  • lxhadeb03 : Intel Corporation (vendor: 8086), 82598EB 10-Gigabit AT Dual Port Network Connection (device: 10f1)
    • ixgbe driver version: ixgbe-2.1.4
 
Changed:
<
<

lxhadeb04(a/b)

>
>

lxhadeb04

 

kernel

Line: 252 to 252
 

IPMI Module

How to access:
Changed:
<
<
  • Start browser on hades27.gsi.de
>
>
  • Start browser on hades30.gsi.de
 

Network

Changed:
<
<
  • lxhadeb04a : 1 Gbps NIC
  • lxhadeb04b : Intel Corporation (vendor: 8086), device: 82599EB 10 Gigabit Network Connection (device: 10fb)
    • ixgbe driver version: 2.0.75.7-NAPI
>
>
  • lxhadeb04b : 1 Gbps NIC
  • lxhadeb04 : Intel Corporation (vendor: 8086), device: 82599EB 10 Gigabit Network Connection (device: 10fb)
    • ixgbe driver version: ixgbe-2.1.4

lxhadeb05

Info:
  • 24 cores (AMD Opteron Processor 800 MHz)
  • 64 GB memory
  • 24 slots for hard disks:
    • RAID 1 for 2 system disks (slots 0-1)
    • Stand alone disks (slots 2-23)

kernel

Event Builders require the following settings:
  • kernel.sem="250 128000 32 512"
    • 250 :SEMMSL is the maximum number of semaphores per semaphore set (default)
    • 128000 : SEMMNS defines the total number of semaphores for the system (changed: 250x512)
    • 32 : SEMOPM defines the maximum number of semaphore operations per semaphore call (default)
    • 512 : SEMMNI defines the number of entire semaphore sets for the system (changed)
  • net.core.rmem_max=10485760 : Receive socket buffer size
  • net.core.wmem_max=10485760 : Send socket buffer size
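
The SEMMNS value above is the product of SEMMSL and SEMMNI (250 x 512 = 128000). The following is a minimal sketch of that arithmetic, which also prints the three settings in sysctl syntax. The function names are hypothetical, and storing the result in /etc/sysctl.conf is an assumption, not something this page states.

```shell
# Sketch only: helper names are illustrative, not part of the DAQ software.

# Total semaphores (SEMMNS) = semaphores per set (SEMMSL) x number of sets (SEMMNI).
sem_total() {
    echo $(( $1 * $2 ))
}

# Print sysctl settings for the given SEMMSL, SEMOPM and SEMMNI values.
# The socket buffer sizes are the ones listed on this page.
print_evtbuild_sysctl() {
    semmsl=$1; semopm=$2; semmni=$3
    echo "kernel.sem = $semmsl $(sem_total "$semmsl" "$semmni") $semopm $semmni"
    echo "net.core.rmem_max = 10485760"
    echo "net.core.wmem_max = 10485760"
}

print_evtbuild_sysctl 250 32 512
```

The printed lines could be appended to /etc/sysctl.conf and loaded with sysctl -p as root. The values currently in effect can be read with sysctl kernel.sem.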

IPMI Module

To access lxhadeb05 through the new firmware of the IPMI module, the Sun Java libraries must be installed.

How to access:

Network

  • lxhadeb05b : 1 Gbps NIC
  • lxhadeb05 : Intel Corporation (vendor: 8086), device: 82599EB 10 Gigabit Network Connection (device: 10fb)
    • ixgbe driver version: ixgbe-2.1.4
 
Changed:
<
<
-- SergeyYurevich - 01 Apr 2010
>
>
-- SergeyYurevich - 31 Jan 2011
 

lxhadesdaq

Revision 32
11 Oct 2010 - Main.SergeyYurevich
Line: 110 to 110
 

kernel

Event Builders require the following settings:
Changed:
<
<
  • kernel.sem="250 32000 32 256"
>
>
  • kernel.sem="250 128000 32 512"
 
    • 250 :SEMMSL is the maximum number of semaphores per semaphore set (default)
Changed:
<
<
    • 32000 : SEMMNS defines the total number of semaphores for the system (default)
>
>
    • 128000 : SEMMNS defines the total number of semaphores for the system (changed: 250x512)
 
    • 32 : SEMOPM defines the maximum number of semaphore operations per semaphore call (default)
Changed:
<
<
    • 256 : SEMMNI defines the number of entire semaphore sets for the system (changed)
>
>
    • 512 : SEMMNI defines the number of entire semaphore sets for the system (changed)
 
  • net.core.rmem_max=10485760 : Receive socket buffer size
  • net.core.wmem_max=10485760 : Send socket buffer size
Added:
>
>
Possible errors:
  • "No space left on device". This error occurs when the event builder application tries to open more than 128 sets of semaphores (the standard setting is kernel.sem="250 32000 32 128"). 128 sets mean 64 shared memory segments since two semaphore sets are required per memory segment. In this case, daq_evtbuild -m 65 will lead to an error.
  • "File exists". This error occurs when semaphores left over from a previous execution of daq_evtbuild were not cleaned up properly. Use ipcrm -s semid (or /home/hadaq/bin/ipcrm.pl).

Howto:
  • List open semaphores: ipcs -s
  • List open shared memory segments: ipcs -m
  • List all: ipcs -a
  • Remove semaphore: ipcrm -s semid
  • Remove all open semaphores: /home/hadaq/bin/ipcrm.pl
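
The cleanup described above can be sketched in shell as follows. This is not the content of /home/hadaq/bin/ipcrm.pl, which this page does not show. The function names and the DRY_RUN switch are hypothetical, and the parser assumes the usual Linux ipcs -s column order (key, semid, owner, perms, nsems).

```shell
# Sketch only: NOT the contents of /home/hadaq/bin/ipcrm.pl.

# Read `ipcs -s` output on stdin and print the semid of every set owned by user $1.
# Header lines are skipped because their second column is not numeric.
sem_ids_for_user() {
    awk -v user="$1" '$3 == user && $2 ~ /^[0-9]+$/ { print $2 }'
}

# Remove all semaphore sets of user $1. With DRY_RUN=1 (the default here)
# the ipcrm commands are only printed, not executed.
remove_user_semaphores() {
    ipcs -s | sem_ids_for_user "$1" | while read -r id; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ipcrm -s $id"
        else
            ipcrm -s "$id"
        fi
    done
}
```

Calling remove_user_semaphores hadaq previews the commands, and DRY_RUN=0 remove_user_semaphores hadaq executes them. This should only be done while no daq_evtbuild is running.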
 

RAID Array Controller

Adaptec RAID Controller has been exchanged on 04.06.2009.
Line: 176 to 187
 

kernel

Event Builders require the following settings:
Changed:
<
<
  • kernel.sem="250 32000 32 256"
>
>
  • kernel.sem="250 128000 32 512"
 
    • 250 :SEMMSL is the maximum number of semaphores per semaphore set (default)
Changed:
<
<
    • 32000 : SEMMNS defines the total number of semaphores for the system (default)
>
>
    • 128000 : SEMMNS defines the total number of semaphores for the system (changed: 250x512)
 
    • 32 : SEMOPM defines the maximum number of semaphore operations per semaphore call (default)
Changed:
<
<
    • 256 : SEMMNI defines the number of entire semaphore sets for the system (changed)
>
>
    • 512 : SEMMNI defines the number of entire semaphore sets for the system (changed)
 
  • net.core.rmem_max=10485760 : Receive socket buffer size
  • net.core.wmem_max=10485760 : Send socket buffer size
Line: 203 to 214
 

kernel

Event Builders require the following settings:
Changed:
<
<
  • kernel.sem="250 32000 32 256"
>
>
  • kernel.sem="250 128000 32 512"
 
    • 250 :SEMMSL is the maximum number of semaphores per semaphore set (default)
Changed:
<
<
    • 32000 : SEMMNS defines the total number of semaphores for the system (default)
>
>
    • 128000 : SEMMNS defines the total number of semaphores for the system (changed: 250x512)
 
    • 32 : SEMOPM defines the maximum number of semaphore operations per semaphore call (default)
Changed:
<
<
    • 256 : SEMMNI defines the number of entire semaphore sets for the system (changed)
>
>
    • 512 : SEMMNI defines the number of entire semaphore sets for the system (changed)
 
  • net.core.rmem_max=10485760 : Receive socket buffer size
  • net.core.wmem_max=10485760 : Send socket buffer size
Line: 230 to 241
 

kernel

Event Builders require the following settings:
Changed:
<
<
  • kernel.sem="250 32000 32 256"
>
>
  • kernel.sem="250 128000 32 512"
 
    • 250 :SEMMSL is the maximum number of semaphores per semaphore set (default)
Changed:
<
<
    • 32000 : SEMMNS defines the total number of semaphores for the system (default)
>
>
    • 128000 : SEMMNS defines the total number of semaphores for the system (changed: 250x512)
 
    • 32 : SEMOPM defines the maximum number of semaphore operations per semaphore call (default)
Changed:
<
<
    • 256 : SEMMNI defines the number of entire semaphore sets for the system (changed)
>
>
    • 512 : SEMMNI defines the number of entire semaphore sets for the system (changed)
 
  • net.core.rmem_max=10485760 : Receive socket buffer size
  • net.core.wmem_max=10485760 : Send socket buffer size
Revision 31
06 Jul 2010 - Main.SergeyYurevich
Line: 6 to 6
 


Deleted:
<
<

The following machines are dedicated DAQ-eventbuilders:

  • lxhadesdaq
  • lxhadeb01 (our future event builder)
 

hadeb05

Due to JUMBO frames hadeb05 can stop receiving DHCP discovers from TRBs. Solution:
Line: 98 to 92
 
  • Installed: lxhadeb01

  • Type: WD RE4-Green Power 2TB 2.5 SATA WD2002FYPS
Changed:
<
<
  • Bought by GSI: 70
>
>
  • Bought by GSI: 70 + 10 (which are not yet installed)
 
  • Bought by Coimbra: 16
  • Installed: lxhadeb01, 02, 03, 04
Line: 166 to 160
 

Network

Changed:
<
<
  • lxhadeb01a : 1 Gbps NIC
  • lxhadeb01b : Intel Corporation (vendor: 8086), device: 82599EB 10 Gigabit Network Connection (device: 10fb)
    • ixgbe driver version: 1.3.18-k2
>
>
  • lxhadeb01b : 1 Gbps NIC
  • lxhadeb01 : Intel Corporation (vendor: 8086), device: 82599EB 10 Gigabit Network Connection (device: 10fb)
    • ixgbe driver version: 2.0.75.7-NAPI
 
    • Vendor-device table to recognize devices.
    • source code: http://sourceforge.net/projects/e1000/files/ixgbe stable/
Added:
>
>
    • To start ixgbe at boot time with single queue setting: /etc/modprobe.d/ixgbe
                                options ixgbe MQ=0,0
                                
    • To load new driver: rmmod ixgbe; modprobe ixgbe MQ=0,0
 

lxhadeb02(a/b)

Line: 197 to 196
 

  • lxhadeb02a : 1 Gbps NIC
  • lxhadeb02b : Intel Corporation (vendor: 8086), 82598EB 10-Gigabit AF Dual Port Network Connection (device: 10f1)
Changed:
<
<
    • ixgbe driver version: 2.0.44.14-NAPI
>
>
    • ixgbe driver version: 2.0.75.7-NAPI
 

lxhadeb03(a/b)

Line: 224 to 223
 

  • lxhadeb03a : 1 Gbps NIC
  • lxhadeb03b : Intel Corporation (vendor: 8086), 82598EB 10-Gigabit AT Dual Port Network Connection (device: 10f1)
Changed:
<
<
    • ixgbe driver version: 1.3.18-k2
>
>
    • ixgbe driver version: 2.0.75.7-NAPI
 

lxhadeb04(a/b)

Line: 251 to 250
 

  • lxhadeb04a : 1 Gbps NIC
  • lxhadeb04b : Intel Corporation (vendor: 8086), device: 82599EB 10 Gigabit Network Connection (device: 10fb)
Changed:
<
<
    • ixgbe driver version: 1.3.18-k2
>
>
    • ixgbe driver version: 2.0.75.7-NAPI
 

-- SergeyYurevich - 01 Apr 2010
Revision 30
17 Jun 2010 - Main.SergeyYurevich
Line: 170 to 170
 
  • lxhadeb01b : Intel Corporation (vendor: 8086), device: 82599EB 10 Gigabit Network Connection (device: 10fb)
Added:
>
>
    • source code: http://sourceforge.net/projects/e1000/files/ixgbe stable/
 

lxhadeb02(a/b)

Revision 29
25 May 2010 - Main.JanMichel
Line: 16 to 16
 

Due to JUMBO frames hadeb05 can stop receiving DHCP discovers from TRBs. Solution:
  • ifconfig eth0 down
Changed:
<
<
  • modprobe sk98lin; modprobe -r sk98lin
>
>
  • modprobe -r sk98lin; modprobe sk98lin
 
  • ifconfig eth0 192.168.0.1 netmask 255.255.255.0 up
  • rcdhcpd restart
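
The recovery sequence above can be wrapped in a small shell function so that the steps always run in the same order. This is a sketch: the interface, address and driver are the ones listed above, while the function names and the DRY_RUN switch are hypothetical.

```shell
# Sketch only: function names and DRY_RUN are illustrative.

# Print (DRY_RUN=1, default) or execute (DRY_RUN=0, as root) one command.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Unload and reload the sk98lin driver, bring eth0 back up and restart
# the DHCP server. Stops at the first failing command.
recover_dhcp() {
    run ifconfig eth0 down &&
    run modprobe -r sk98lin &&
    run modprobe sk98lin &&
    run ifconfig eth0 192.168.0.1 netmask 255.255.255.0 up &&
    run rcdhcpd restart
}
```

Calling recover_dhcp only prints the five commands. DRY_RUN=0 recover_dhcp executes them and needs root.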
Revision 28
17 May 2010 - Main.SergeyYurevich
Line: 139 to 139
 
    • Wait a bit, the rebuild of the logical device should start automatically
Changed:
<
<
arcconf is a command line interface. To get information about RAID controller status, try the following:
>
>
arcconf is a command line interface.
  • To get information about RAID controller status, try the following:
 
  • For controller 1 and first RAID array: /root/bin/arcconf GETCONFIG 1 LD 0
  • For controller 1 and second RAID array: /root/bin/arcconf GETCONFIG 1 LD 1
Added:
>
>
  • To silence the alarm: arcconf setalarm 1 silence
 

Configuration

Revision 27
17 May 2010 - Main.SergeyYurevich
Line: 111 to 111
 
  • 32 GB memory (4x8 GB Kingston DDR2)
  • 24 slots for hard disks:
    • RAID 1 for 2 system disks (slots 0-1)
Changed:
<
<
    • RAID 5 for 9 data disks (slots 2-10)
    • slot 11 - hot spare
>
>
    • Stand alone disks (slots 2-23)
 

kernel

Revision 26
11 May 2010 - Main.SergeyYurevich
Line: 12 to 12
 
  • lxhadesdaq
  • lxhadeb01 (our future event builder)
Added:
>
>

hadeb05

Due to JUMBO frames hadeb05 can stop receiving DHCP discovers from TRBs. Solution:
  • ifconfig eth0 down
  • modprobe sk98lin; modprobe -r sk98lin
  • ifconfig eth0 192.168.0.1 netmask 255.255.255.0 up
  • rcdhcpd restart
 

hadeb07

hadeb07 parameters:
Revision 25
22 Apr 2010 - Main.SergeyYurevich
Line: 85 to 85
 

Hard disks for servers

Added:
>
>
  • Type: Seagate Barracuda ES.2 ST31000340NS 1TB
  • Bought by GSI: 12
  • Installed: lxhadeb01
 
  • Type: WD RE4-Green Power 2TB 2.5 SATA WD2002FYPS
  • Bought by GSI: 70
  • Bought by Coimbra: 16
Added:
>
>
  • Installed: lxhadeb01, 02, 03, 04
 

lxhadeb01(a/b)

Revision 24
22 Apr 2010 - Main.SergeyYurevich
Line: 81 to 81
 

-- SergeyYurevich - 04 Jun 2009
Added:
>
>

Miscellaneous

Hard disks for servers

  • Type: WD RE4-Green Power 2TB 2.5 SATA WD2002FYPS
  • Bought by GSI: 70
  • Bought by Coimbra: 16
 

lxhadeb01(a/b)

lxhadeb01 is our new powerful server for parallel event building.
Revision 23
01 Apr 2010 - Main.SergeyYurevich
Line: 145 to 145
 

Network

  • lxhadeb01a : 1 Gbps NIC
Changed:
<
<
  • lxhadeb01b : 10 Gbps NIC

  • Switch bonding on
    • ifdown eth0.1000
    • ifdown eth1.1000
    • ifdown eth2.1000
    • ifdown eth3.1000
    • ifdown eth0
    • ifup bond0
    • ifup vlan1000
  • Switch bonding off
    • ifdown vlan1000
    • ifdown bond0
    • ifup eth0
    • ifup eth0.1000
    • ifup eth1.1000
    • ifup eth2.1000
    • ifup eth3.1000
>
>
  • lxhadeb01b : Intel Corporation (vendor: 8086), device: 82599EB 10 Gigabit Network Connection (device: 10fb)
 

lxhadeb02(a/b)

Line: 188 to 173
 

Network

  • lxhadeb02a : 1 Gbps NIC
Changed:
<
<
  • lxhadeb02b : 10 Gbps NIC
>
>
  • lxhadeb02b : Intel Corporation (vendor: 8086), 82598EB 10-Gigabit AF Dual Port Network Connection (device: 10f1)
    • ixgbe driver version: 2.0.44.14-NAPI
 

lxhadeb03(a/b)

Line: 214 to 200
 

Network

  • lxhadeb03a : 1 Gbps NIC
Changed:
<
<
  • lxhadeb03b : 10 Gbps NIC
>
>
  • lxhadeb03b : Intel Corporation (vendor: 8086), 82598EB 10-Gigabit AT Dual Port Network Connection (device: 10f1)
    • ixgbe driver version: 1.3.18-k2
 

lxhadeb04(a/b)

Line: 240 to 227
 

Network

  • lxhadeb04a : 1 Gbps NIC
Changed:
<
<
  • lxhadeb04b : 10 Gbps NIC
>
>
  • lxhadeb04b : Intel Corporation (vendor: 8086), device: 82599EB 10 Gigabit Network Connection (device: 10fb)
    • ixgbe driver version: 1.3.18-k2
 
Changed:
<
<
-- SergeyYurevich - 05 Feb 2010
>
>
-- SergeyYurevich - 01 Apr 2010
 

lxhadesdaq

Revision 22
17 Mar 2010 - Main.SergeyYurevich
Line: 93 to 93
 
    • RAID 5 for 9 data disks (slots 2-10)
    • slot 11 - hot spare
Added:
>
>

kernel

Event Builders require the following settings:
  • kernel.sem="250 32000 32 256"
    • 250 :SEMMSL is the maximum number of semaphores per semaphore set (default)
    • 32000 : SEMMNS defines the total number of semaphores for the system (default)
    • 32 : SEMOPM defines the maximum number of semaphore operations per semaphore call (default)
    • 256 : SEMMNI defines the number of entire semaphore sets for the system (changed)
  • net.core.rmem_max=10485760 : Receive socket buffer size
  • net.core.wmem_max=10485760 : Send socket buffer size
 

RAID Array Controller

Adaptec RAID Controller has been exchanged on 04.06.2009.
Line: 155 to 166
 

lxhadeb02(a/b)

Added:
>
>

kernel

Event Builders require the following settings:
  • kernel.sem="250 32000 32 256"
    • 250 :SEMMSL is the maximum number of semaphores per semaphore set (default)
    • 32000 : SEMMNS defines the total number of semaphores for the system (default)
    • 32 : SEMOPM defines the maximum number of semaphore operations per semaphore call (default)
    • 256 : SEMMNI defines the number of entire semaphore sets for the system (changed)
  • net.core.rmem_max=10485760 : Receive socket buffer size
  • net.core.wmem_max=10485760 : Send socket buffer size
 

IPMI Module

How to access:
Line: 170 to 192
 

lxhadeb03(a/b)

Added:
>
>

kernel

Event Builders require the following settings:
  • kernel.sem="250 32000 32 256"
    • 250 :SEMMSL is the maximum number of semaphores per semaphore set (default)
    • 32000 : SEMMNS defines the total number of semaphores for the system (default)
    • 32 : SEMOPM defines the maximum number of semaphore operations per semaphore call (default)
    • 256 : SEMMNI defines the number of entire semaphore sets for the system (changed)
  • net.core.rmem_max=10485760 : Receive socket buffer size
  • net.core.wmem_max=10485760 : Send socket buffer size
 

IPMI Module

How to access:
Line: 185 to 218
 

lxhadeb04(a/b)

Added:
>
>

kernel

Event Builders require the following settings:
  • kernel.sem="250 32000 32 256"
    • 250 :SEMMSL is the maximum number of semaphores per semaphore set (default)
    • 32000 : SEMMNS defines the total number of semaphores for the system (default)
    • 32 : SEMOPM defines the maximum number of semaphore operations per semaphore call (default)
    • 256 : SEMMNI defines the number of entire semaphore sets for the system (changed)
  • net.core.rmem_max=10485760 : Receive socket buffer size
  • net.core.wmem_max=10485760 : Send socket buffer size
 

IPMI Module

How to access:
Revision 21
26 Feb 2010 - Main.SergeyYurevich
Line: 53 to 53
 
Added:
>
>

Note: since kp1pc098 has Ubuntu installed, 'marek' is a 'root'.
 

the backup of hades17: hades25/home/hadaq/backups/hades17_hadaq_home/
Revision 20
10 Feb 2010 - Main.SergeyYurevich
Line: 123 to 123
 
  • Login
  • Click 'Remote Control' -> 'KVM Console'
Added:
>
>
How to get MAC address of IPMI module:
  • Execute as 'root': ipmitool lan print
 

Network

  • lxhadeb01a : 1 Gbps NIC
Revision 19
05 Feb 2010 - Main.SergeyYurevich
Line: 76 to 76
 

-- SergeyYurevich - 04 Jun 2009
Changed:
<
<

lxhadeb01

>
>

lxhadeb01(a/b)

 

lxhadeb01 is our new powerful server for parallel event building.
Line: 125 to 125
 

Network

Added:
>
>
  • lxhadeb01a : 1 Gbps NIC
  • lxhadeb01b : 10 Gbps NIC
 
  • Switch bonding on
    • ifdown eth0.1000
    • ifdown eth1.1000
Line: 142 to 145
 
    • ifup eth2.1000
    • ifup eth3.1000
Changed:
<
<
-- SergeyYurevich - 16 Jul 2009
>
>

lxhadeb02(a/b)

IPMI Module

How to access:

Network

  • lxhadeb02a : 1 Gbps NIC
  • lxhadeb02b : 10 Gbps NIC

lxhadeb03(a/b)

IPMI Module

How to access:

Network

  • lxhadeb03a : 1 Gbps NIC
  • lxhadeb03b : 10 Gbps NIC

lxhadeb04(a/b)

IPMI Module

How to access:

Network

  • lxhadeb04a : 1 Gbps NIC
  • lxhadeb04b : 10 Gbps NIC

-- SergeyYurevich - 05 Feb 2010
 

lxhadesdaq

Revision 18
22 Jul 2009 - Main.SergeyYurevich
Line: 112 to 112
 
  • Many configuration files are overwritten by cfagent which is started as a cron job at reboot and once per day (/etc/cron.d/gsi). If you want to stop it, you should comment out a couple of lines in /etc/cron.d/gsi.
  • To enable remote logins for new users you should add the user to /etc/security/access.conf (access.conf is also overwritten!)
Added:
>
>

IPMI Module

The IPMI module provides remote access to the machine. It is connected to the ITM 'yellow' network. Currently we have the hades27.gsi.de machine in the 'yellow' network for access to the IPMI module.

How to access:
 

Network

  • Switch bonding on
Revision 17
16 Jul 2009 - Main.SergeyYurevich
Line: 112 to 112
 
  • Many configuration files are overwritten by cfagent which is started as a cron job at reboot and once per day (/etc/cron.d/gsi). If you want to stop it, you should comment out a couple of lines in /etc/cron.d/gsi.
  • To enable remote logins for new users you should add the user to /etc/security/access.conf (access.conf is also overwritten!)
Added:
>
>

Network

 
Changed:
<
<
-- SergeyYurevich - 04 Jun 2009
>
>
  • Switch bonding on
    • ifdown eth0.1000
    • ifdown eth1.1000
    • ifdown eth2.1000
    • ifdown eth3.1000
    • ifdown eth0
    • ifup bond0
    • ifup vlan1000
  • Switch bonding off
    • ifdown vlan1000
    • ifdown bond0
    • ifup eth0
    • ifup eth0.1000
    • ifup eth1.1000
    • ifup eth2.1000
    • ifup eth3.1000
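
The two sequences above mirror each other, which a short shell sketch makes explicit. The function name bonding_commands is hypothetical. It only prints the commands in the order listed above and does not touch any interface itself.

```shell
# Sketch only: prints the ifup/ifdown sequence for switching bonding on or off.
bonding_commands() {
    case "$1" in
        on)
            # Take down the four VLAN interfaces and eth0, then bring up the bond.
            for i in 0 1 2 3; do echo "ifdown eth$i.1000"; done
            echo "ifdown eth0"
            echo "ifup bond0"
            echo "ifup vlan1000"
            ;;
        off)
            # Reverse direction: remove the bond, then restore eth0 and the VLANs.
            echo "ifdown vlan1000"
            echo "ifdown bond0"
            echo "ifup eth0"
            for i in 0 1 2 3; do echo "ifup eth$i.1000"; done
            ;;
        *)
            echo "usage: bonding_commands on|off" >&2
            return 1
            ;;
    esac
}
```

To execute the printed sequence, the output can be piped into a shell as root, for example bonding_commands on | sh -e, which stops at the first failing command.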

-- SergeyYurevich - 16 Jul 2009
 

lxhadesdaq

Revision 16
04 Jun 2009 - Main.SergeyYurevich
Line: 12 to 12
 
  • lxhadesdaq
  • lxhadeb01 (our future event builder)
Changed:
<
<

hadeb07 (remote backup system)

>
>

hadeb07

 

hadeb07 parameters:
  • SuSE 10.2
Line: 20 to 20
 
  • Two 3GHz processors
  • 2GB memory
Changed:
<
<
After a power outage hadeb03 experienced a hardware failure. We have decided to use hadeb07 instead of hadeb03 for the remote backup.
>
>

hadeb04 (remote backup system)

 
Changed:
<
<
hadeb07 serves as a remote backup system.
  • Software: rsnapshot, executed 6 times a day (crontab -e)
>
>
hadeb04 serves as a remote backup system.
  • Software: rsnapshot, executed once a day (crontab -e)
 
  • Config file: /etc/rsnapshot.conf
Changed:
<
<
  • Backup disk: /data/backup
>
>
  • Backup disk: /data/hadeb04/backup
  • Test mode: rsnapshot -t hourly
 
Changed:
<
<
The backup is temporarily done on hadeb04:/data/hadeb04/backup since the hadeb07's disks failed. Fixes for rsnapshot on hadeb04:
>
>
Fixes for rsnapshot on hadeb04:
 
  • WARNING: Could not lchown() symlink
    • Reason: perl Lchown module is missing
    • Fix: perl -MCPAN -e 'install qw(Lchown)'
Line: 37 to 37
 
    • Fix: upgrade to rsync 3.0.0 or newer. This uses an incremental recursion mode to avoid the need to hold the entire file list in memory.

The following machines/directories are backed up:
Line: 48 to 49
 

the backup of hades17: hades25/home/hadaq/backups/hades17_hadaq_home/
Changed:
<
<
-- SergeyYurevich - 11 Mar 2009
>
>
hadeb04 is a 1.7TB fileserver with a 3200+ AMD 64 CPU. It is running a 64bit linux. It has no system disk. The boot server is lxhadesdaq. Some things about the Asus board:
  • The BIOS upgrade to version 1013 is not working.
  • BIOS upgrades only work by pressing "ALT-F2" during boot and providing the file on a floppy disk. All other methods failed with a checksum error!
  • One has to boot over the second Gigabit interface. The first (NVIDIA) is for some reason not recognized by the kernel.

UNAME_MACHINE="x86_64" make modules_install

Compile the kernel 2.6.12.5 for x86_64:
  1. one has to change directory to: lxhadesdaq:/var/diskless/hadeb04/usr/src/linux-2.6.12.5
  2. one has to use the cross-compiler. It is in the lxhadesdaq:/var/diskless/hadeb04/usr/src/linux-2.6.12.5/x86_64-unknown-linux-gnu directory.
  3. type: export PATH=$PATH:/var/diskless/hadeb04/usr/src/linux-2.6.12.5/x86_64-unknown-linux-gnu/gcc-3.4.0-glibc-2.3.2/bin
  4. make ARCH=x86_64 menuconfig
  5. make ARCH=x86_64 CROSS_COMPILE=x86_64-unknown-linux-gnu-
  6. INSTALL_MOD_PATH=/var/diskless/hadeb04 make ARCH=x86_64 CROSS_COMPILE=x86_64-unknown-linux-gnu- modules_install
  7. cp arch/x86_64/boot/bzImage /tftpboot/vmlinuz_2.6.12.5_64bit

-- SergeyYurevich - 04 Jun 2009
 

lxhadeb01

Line: 69 to 90
 

RAID Array Controller

Added:
>
>
Adaptec RAID Controller has been exchanged on 04.06.2009.
  Adaptec Storage Manager is a java tool to control RAID Arrays. You can start it under root by executing /usr/StorMan/StorMan.sh
  • How to rebuild degraded logical device with failed segment:
    • Click on lxhadeb01.gsi.de (Logical system) in Enterprise view,
Line: 90 to 113
 
  • To enable remote logins for new users you should add the user to /etc/security/access.conf (access.conf is also overwritten!)
Changed:
<
<
-- SergeyYurevich - 11 Mar 2009
>
>
-- SergeyYurevich - 04 Jun 2009
 

lxhadesdaq

Line: 192 to 215
 

-- SergeyYurevich - 16 Sep 2008
Deleted:
<
<

hadeb04

hadeb04 is a 1.7TB fileserver with a 3200+ AMD 64 CPU. It is running a 64bit linux. It has no system disk. The boot server is lxhadesdaq. Some things about the Asus board:
  • The BIOS upgrade to version 1013 is not working.
  • BIOS upgrades only work by pressing "ALT-F2" during boot and providing the file on a floppy disk. All other methods failed with a checksum error!
  • One has to boot over the second Gigabit interface. The first (NVIDIA) is for some reason not recognized by the kernel.

UNAME_MACHINE="x86_64" make modules_install

Compile the kernel 2.6.12.5 for x86_64:
  1. one has to change directory to: lxhadesdaq:/var/diskless/hadeb04/usr/src/linux-2.6.12.5
  2. one has to use the cross-compiler. It is in the lxhadesdaq:/var/diskless/hadeb04/usr/src/linux-2.6.12.5/x86_64-unknown-linux-gnu directory.
  3. type: export PATH=$PATH:/var/diskless/hadeb04/usr/src/linux-2.6.12.5/x86_64-unknown-linux-gnu/gcc-3.4.0-glibc-2.3.2/bin
  4. make ARCH=x86_64 menuconfig
  5. make ARCH=x86_64 CROSS_COMPILE=x86_64-unknown-linux-gnu-
  6. INSTALL_MOD_PATH=/var/diskless/hadeb04 make ARCH=x86_64 CROSS_COMPILE=x86_64-unknown-linux-gnu- modules_install
  7. cp arch/x86_64/boot/bzImage /tftpboot/vmlinuz_2.6.12.5_64bit
 

Data Sources

Data is transported via Ethernet and UDP to the eventbuilder. The sequence of sources is the following:
Revision 15
11 Mar 2009 - Main.SergeyYurevich
Line: 28 to 28
 
  • Config file: /etc/rsnapshot.conf
  • Backup disk: /data/backup
Changed:
<
<
The backup is temporarily done on hadeb04:/data/hadeb04/backup since the hadeb07's disks failed.
>
>
The backup is temporarily done on hadeb04:/data/hadeb04/backup since the hadeb07's disks failed. Fixes for rsnapshot on hadeb04:
  • WARNING: Could not lchown() symlink
    • Reason: perl Lchown module is missing
    • Fix: perl -MCPAN -e 'install qw(Lchown)'
  • ERROR: rsync returned error 12 in rsync_cleanup_after_native_cp_al()
    • Reason: old versions of rsync cannot hold the entire file list in memory at once when there are too many files to be rsynced.
    • Fix: upgrade to rsync 3.0.0 or newer. This uses an incremental recursion mode to avoid the need to hold the entire file list in memory.
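
Whether the installed rsync already meets the 3.0.0 requirement can be checked before running rsnapshot. The following is a minimal sketch with hypothetical function names. It assumes plain numeric major.minor.patch versions and that the first line of rsync --version has the form "rsync  version X.Y.Z  protocol version N".

```shell
# Sketch only: helper names are illustrative.

# Succeeds if dotted version $1 is >= dotted version $2.
# Only numeric major.minor.patch parts are supported; missing parts count as 0.
version_ge() {
    IFS=. read -r a1 a2 a3 <<EOF
$1
EOF
    IFS=. read -r b1 b2 b3 <<EOF
$2
EOF
    a2=${a2:-0}; a3=${a3:-0}; b2=${b2:-0}; b3=${b3:-0}
    [ "$a1" -gt "$b1" ] && return 0
    [ "$a1" -lt "$b1" ] && return 1
    [ "$a2" -gt "$b2" ] && return 0
    [ "$a2" -lt "$b2" ] && return 1
    [ "$a3" -ge "$b3" ]
}

# Read `rsync --version` output on stdin and print the version number
# (third field of the first line).
rsync_banner_version() {
    awk 'NR == 1 { print $3 }'
}
```

A possible use is version_ge "$(rsync --version | rsync_banner_version)" 3.0.0 || echo "rsync is older than 3.0.0".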
 

The following machines/directories are backed up:
Line: 42 to 48
 

the backup of hades17: hades25/home/hadaq/backups/hades17_hadaq_home/
Changed:
<
<
-- SergeyYurevich - 09 Mar 2009
>
>
-- SergeyYurevich - 11 Mar 2009
 

lxhadeb01

Line: 55 to 63
 
  • 2x4 cores (Dual Quad-Core AMD Opteron Processor 2.3 GHz)
  • 32 GB memory (4x8 GB Kingston DDR2)
  • 24 slots for hard disks:
Changed:
<
<
    • RAID 1 for 2 system disks
    • RAID 5 for 10 data disks
>
>
    • RAID 1 for 2 system disks (slots 0-1)
    • RAID 5 for 9 data disks (slots 2-10)
    • slot 11 - hot spare
 

RAID Array Controller

Line: 64 to 73
 
  • How to rebuild degraded logical device with failed segment:
    • Click on lxhadeb01.gsi.de (Logical system) in Enterprise view,
    • then click on Controller in Physical devices
Changed:
<
<
    • Goto actions -> Scan controllers
    • Wait until scan is finished. The failed disk should be taken out of the logical device.
    • Click on the failed disk -> Initialize
>
>
    • Goto Actions -> Rescan
    • Wait until rescan is finished. You will see that the failed disk is taken out of the logical device.
    • Replace the failed disk if needed
    • Click on the 'failed' disk -> Initialize
 
    • Wait a bit, the rebuild of the logical device should start automatically
Added:
>
>
  arcconf is a command line interface. To get information about RAID controller status, try the following:
  • For controller 1 and first RAID array: /root/bin/arcconf GETCONFIG 1 LD 0
  • For controller 1 and second RAID array: /root/bin/arcconf GETCONFIG 1 LD 1
Line: 79 to 90
 
  • To enable remote logins for new users you should add the user to /etc/security/access.conf (access.conf is also overwritten!)
Changed:
<
<
-- SergeyYurevich - 10 Mar 2009
>
>
-- SergeyYurevich - 11 Mar 2009
 

lxhadesdaq

Revision 14
10 Mar 2009 - Main.MichaelTraxler
Line: 97 to 97
 

Username: user
Changed:
<
<
Password: waVe**** (ask Michael)
>
>
Password: w**** (ask Michael)
 

Software

Revision 13
10 Mar 2009 - Main.SergeyYurevich
Line: 61 to 61
 

RAID Array Controller

Adaptec Storage Manager is a java tool to control RAID Arrays. You can start it under root by executing /usr/StorMan/StorMan.sh
Added:
>
>
  • How to rebuild degraded logical device with failed segment:
    • Click on lxhadeb01.gsi.de (Logical system) in Enterprise view,
    • then click on Controller in Physical devices
    • Goto actions -> Scan controllers
    • Wait until scan is finished. The failed disk should be taken out of the logical device.
    • Click on the failed disk -> Initialize
    • Wait a bit, the rebuild of the logical device should start automatically
 

arcconf is a command line interface. To get information about RAID controller status, try the following:
  • For controller 1 and first RAID array: /root/bin/arcconf GETCONFIG 1 LD 0
Line: 72 to 79
 
  • To enable remote logins for new users you should add the user to /etc/security/access.conf (access.conf is also overwritten!)
Changed:
<
<
-- SergeyYurevich - 02 Mar 2009
>
>
-- SergeyYurevich - 10 Mar 2009
 

lxhadesdaq

Revision 12
09 Mar 2009 - Main.SergeyYurevich
Line: 28 to 28
 
  • Config file: /etc/rsnapshot.conf
  • Backup disk: /data/backup
Added:
>
>
The backup is temporarily done on hadeb04:/data/hadeb04/backup since the hadeb07's disks failed.
  The following machines/directories are backed up:
Line: 43 to 45
 

the backup of hades17: hades25/home/hadaq/backups/hades17_hadaq_home/
Changed:
<
<
-- SergeyYurevich - 16 Feb 2009
>
>
-- SergeyYurevich - 09 Mar 2009
 

lxhadeb01

Revision 11
02 Mar 2009 - Main.SergeyYurevich
Line: 56 to 56
 
    • RAID 1 for 2 system disks
    • RAID 5 for 10 data disks
Changed:
<
<

Adaptec Storage Manager

>
>

RAID Array Controller

 
Changed:
<
<
Adaptec Storage Manager is a java tool to control RAID Arrays. You can start it under root by executing /usr/StorMan/StorMan.sh
>
>
Adaptec Storage Manager is a java tool to control RAID Arrays. You can start it under root by executing /usr/StorMan/StorMan.sh

arcconf is a command line interface. To get information about RAID controller status, try the following:
  • For controller 1 and first RAID array: /root/bin/arcconf GETCONFIG 1 LD 0
  • For controller 1 and second RAID array: /root/bin/arcconf GETCONFIG 1 LD 1
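The two arcconf calls above can be wrapped in a small loop. This is only a sketch: the arcconf path, controller number 1 and logical devices 0 and 1 are taken from the list above, while the "Status of logical device" label used for filtering is an assumption about the arcconf output and may differ between versions.

```shell
#!/bin/bash
# Sketch: print the status line of each logical device on controller 1.
# ARCCONF path and the controller/LD numbers come from the list above.
# Assumption: arcconf prints a line containing "Status of logical device".
ARCCONF="${ARCCONF:-/root/bin/arcconf}"

ld_status() {
    # $1 = controller number, $2 = logical device number
    "$ARCCONF" GETCONFIG "$1" LD "$2" | grep -i "Status of logical device"
}

check_arrays() {
    # Controller 1, logical devices 0 and 1
    local ld
    for ld in 0 1; do
        echo "LD $ld: $(ld_status 1 "$ld" || echo 'status line not found')"
    done
}

# Query the controller only when explicitly asked to.
if [ "${1:-}" = "--run" ]; then
    check_arrays
fi
```

Run the script as root with the --run argument. If the status line is not found, inspect the full GETCONFIG output by hand.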
 

Configuration

  • Many configuration files are overwritten by cfagent which is started as a cron job at reboot and once per day (/etc/cron.d/gsi). If you want to stop it, you should comment out a couple of lines in /etc/cron.d/gsi.
Changed:
<
<
  • To enable remote logins for new users you should add the user to /etc/security/access.conf.
>
>
  • To enable remote logins for new users you should add the user to /etc/security/access.conf (note that access.conf is also overwritten by cfagent!)
 

-- SergeyYurevich - 02 Mar 2009
Revision 10
02 Mar 2009 - Main.SergeyYurevich
Line: 1 to 1
 
META TOPICPARENT name="EventBuilder"
Table of contents:
Line: 53 to 53
 
  • 2x4 cores (Dual Quad-Core AMD Opteron Processor 2.3 GHz)
  • 32 GB memory (4x8 GB Kingston DDR2)
  • 24 slots for hard disks:
Changed:
<
<
    • RAID 0 for 2 system disks
>
>
    • RAID 1 for 2 system disks
 
    • RAID 5 for 10 data disks
Changed:
<
<
-- SergeyYurevich - 27 Jan 2009
>
>

Adaptec Storage Manager

Adaptec Storage Manager is a Java tool to control RAID arrays. You can start it as root by executing /usr/StorMan/StorMan.sh

Configuration

  • Many configuration files are overwritten by cfagent which is started as a cron job at reboot and once per day (/etc/cron.d/gsi). If you want to stop it, you should comment out a couple of lines in /etc/cron.d/gsi.
  • To enable remote logins for new users you should add the user to /etc/security/access.conf.
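Commenting out the cfagent entries could be scripted roughly as follows. This is a sketch under one assumption that the text does not confirm: the relevant lines in /etc/cron.d/gsi contain the word "cfagent". The function name disable_cfagent is hypothetical.

```shell
#!/bin/bash
# Sketch: comment out every active line that mentions cfagent in a cron file.
# Assumption: the entries to disable contain the word "cfagent".
disable_cfagent() {
    # $1 = cron file to edit in place (e.g. /etc/cron.d/gsi)
    local file=$1 tmp
    tmp=$(mktemp) || return 1
    # Skip lines that are already comments; prefix matching lines with '#'.
    sed '/^[[:space:]]*#/!s/^.*cfagent.*$/#&/' "$file" > "$tmp" \
        && cat "$tmp" > "$file"   # cat keeps owner and mode of the original
    rm -f "$tmp"
}
```

As root, this would be called as disable_cfagent /etc/cron.d/gsi. Keep a copy of the original file outside /etc/cron.d first, since some cron implementations also read backup files placed in that directory.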

-- SergeyYurevich - 02 Mar 2009
 

lxhadesdaq

Revision 9
24 Feb 2009 - Main.SergeyYurevich
Line: 1 to 1
 
META TOPICPARENT name="EventBuilder"
Table of contents:
Line: 49 to 49
 

lxhadeb01 is our new powerful server for parallel event building.
Changed:
<
<
More will be added soon.
>
>
Info:
  • 2x4 cores (Dual Quad-Core AMD Opteron Processor 2.3 GHz)
  • 32 GB memory (4x8 GB Kingston DDR2)
  • 24 slots for hard disks:
    • RAID 0 for 2 system disks
    • RAID 5 for 10 data disks
 

-- SergeyYurevich - 27 Jan 2009
Revision 8
16 Feb 2009 - Main.SergeyYurevich
Line: 1 to 1
 
META TOPICPARENT name="EventBuilder"
Table of contents:
Line: 28 to 28
 
  • Config file: /etc/rsnapshot.conf
  • Backup disk: /data/backup
Changed:
<
<
The following machines/directories are backuped:
>
>
The following machines/directories are backed up:
 
Line: 38 to 38
 
Added:
>
>
 

the backup of hades17: hades25/home/hadaq/backups/hades17_hadaq_home/
Revision 7
16 Feb 2009 - Main.SergeyYurevich
Line: 1 to 1
 
META TOPICPARENT name="EventBuilder"
Table of contents:
Line: 37 to 37
 
Added:
>
>
 

the backup of hades17: hades25/home/hadaq/backups/hades17_hadaq_home/
Changed:
<
<
-- SergeyYurevich - 27 Nov 2007
>
>
-- SergeyYurevich - 16 Feb 2009
 

lxhadeb01

Revision 6
27 Jan 2009 - Main.SergeyYurevich
Line: 1 to 1
 
META TOPICPARENT name="EventBuilder"
Table of contents:
Line: 9 to 9
 

The following machines are dedicated DAQ-eventbuilders:
Deleted:
<
<
  • hadeb02
  • hadeb03
  • hadeb07 (remote backup system)
 
  • lxhadesdaq
Changed:
<
<

hadeb02, hadeb03

These are identical machines with two 1 GHz processors and 1 GByte of memory. Currently (14.09.05), all HADES DAQ computers are connected to the Gbit/s Netgear switch.

hadeb03 is now (28.01.05) stable again; it has not crashed for many weeks. The NFS server is running again, so it can be used for writing HADES data from the event builder that currently serves as the main event builder.

hadeb02: redhat something
hadeb03: SuSE 9.2 with quite new security patches
>
>
  • lxhadeb01 (our future event builder)
 

hadeb07 (remote backup system)

hadeb07 parameters:
Changed:
<
<
>
>
  • SuSE 10.2
 
  • Hard disks: sda+sdb = 0.32TB, sdc+sdd = 1TB. The last two disks (sdc, sdd) were added later to serve as backup disks. They are not firmly mounted inside the machine, so take care when moving it. Nagios monitors the temperature of both disks.
  • Two 3GHz processors
  • 2GB memory
Line: 55 to 42
 

-- SergeyYurevich - 27 Nov 2007
Changed:
<
<

lxhadesdaq

>
>

lxhadeb01

 
Added:
>
>
lxhadeb01 is our new powerful server for parallel event building.

More will be added soon.

-- SergeyYurevich - 27 Jan 2009

lxhadesdaq

 

It is a "higher-availability computer", that means:
  • Raid 0 for the system disks
Line: 131 to 124
 

Lustre mount

Changed:
<
<
The new kernel 2.6.22-gsi-lustre was installed by Thomas Roth and the Lustre cluster is mounted as /lustre_alpha.
>
>
The new kernel 2.6.22-gsi-lustre was installed by Thomas Roth and the Lustre cluster is mounted as /lustre_alpha.
 

-- SergeyYurevich - 13 May 2008
Line: 190 to 182
 
MDC0 2227
MDC1 2228

Changed:
<
<

Meaning of EvtId in Eventbuilder-Output on text-console

>
>

Meaning of EvtId in Eventbuilder-Output on text-console

 

            evtId09: 4086             evtId11:   11k            evtId19:  894 
Revision 5
16 Sep 2008 - Main.SergeyYurevich
Line: 1 to 1
 
META TOPICPARENT name="EventBuilder"
Table of contents:
Line: 146 to 146
 

If after reboot the ram disk is not mounted automatically, you may try to mount it manually: /root/start_ramdisk
Changed:
<
<
-- SergeyYurevich - 25 Mar 2008
>
>
Steps to remount the ramdisk (as root):
  • lsof /ramdisk (list all processes that access the ramdisk)
  • /etc/start_res_services stop
  • /etc/init.d/xinetd stop
  • killall vsftpd
  • umount /ramdisk (try to unmount the disk; if it succeeds, go to the next step)
  • /root/start_ramdisk
  • /etc/init.d/xinetd start
  • /etc/init.d/start_res_services start
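The sequence above can be sketched as one shell function. The commands and paths are copied verbatim from the list, including the two different paths given for start_res_services. The run helper and the DRY_RUN switch are hypothetical additions that only print the commands by default.

```shell
#!/bin/bash
# Sketch of the remount sequence listed above; intended to be run as root.
# DRY_RUN=1 (the default here) only prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

remount_ramdisk() {
    run lsof /ramdisk                  # list processes that still use the ramdisk
    run /etc/start_res_services stop   # path as written in the list
    run /etc/init.d/xinetd stop
    run killall vsftpd
    # Continue only if the unmount succeeded.
    run umount /ramdisk || { echo "umount /ramdisk failed, aborting" >&2; return 1; }
    run /root/start_ramdisk
    run /etc/init.d/xinetd start
    run /etc/init.d/start_res_services start
}

remount_ramdisk
```

Running the script with DRY_RUN=0 executes the commands for real. If umount fails, the function stops before restarting the services, so the output of lsof /ramdisk can be checked for remaining users of the ramdisk.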

-- SergeyYurevich - 16 Sep 2008
 

hadeb04

Revision 4
13 May 2008 - Main.SergeyYurevich
Line: 1 to 1
 
META TOPICPARENT name="EventBuilder"
Table of contents:
Line: 129 to 129
 

to be continued
Changed:
<
<
-- SergeyYurevich - 03 Apr 2007
>
>

Lustre mount

The new kernel 2.6.22-gsi-lustre was installed by Thomas Roth and the Lustre cluster is mounted as /lustre_alpha.

-- SergeyYurevich - 13 May 2008
 

hadeb06a (hadeb06b)

Revision 3
07 Apr 2008 - Main.MichaelTraxler
Line: 1 to 1
 
META TOPICPARENT name="EventBuilder"
Table of contents:
Line: 49 to 49
 
Added:
>
>
 

the backup of hades17: hades25/home/hadaq/backups/hades17_hadaq_home/
Revision 2
25 Mar 2008 - Main.SergeyYurevich
Line: 1 to 1
 
META TOPICPARENT name="EventBuilder"
Table of contents:
Line: 130 to 130
 

-- SergeyYurevich - 03 Apr 2007
Added:
>
>

hadeb06a (hadeb06b)

  • 2 x 2GHz AMD CPUs
  • 16 GB RAM (12 GB available for use)
  • 130 GB hard disk.

hadeb06 runs 64-bit Linux. Currently it is used as a file server for QA. It can also be used as an event builder.

If after reboot the ram disk is not mounted automatically, you may try to mount it manually: /root/start_ramdisk

-- SergeyYurevich - 25 Mar 2008
 

hadeb04

hadeb04 is a 1.7 TB file server with an AMD 64 3200+ CPU. It runs 64-bit Linux. It has no system disk.
 