IOC for HADES SCS running on Linux

Introduction - HADES IOC

An EPICS IOC runs under Linux on the machine hadesdaq02.
At present it
  • controls the HV for everything except MDC,
  • controls the LV power supplies,
  • monitors temperatures (old and new),
  • connects to the hadcon boards and dreamplugs on the internal HADES VLAN,
  • provides the stats of all running IOCs,
  • acts as a kind of gateway for process variables made available to the GSI LAN,
  • ...

Sources and Maintenance

CVS

    CVS repository
    :ext:scs@lx-pool.gsi.de:/misc/hadesprojects/slowcontrol/cvsroot
    CVS module
    EPICS/apps/hades
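
A checkout from this repository could look like the following sketch. It assumes that the :ext: method goes over ssh (hence CVS_RSH). The function name checkout_hades_ioc and the CVS override variable are hypothetical and only exist to allow a dry run.

```shell
# Assumption: the :ext: access method is tunnelled through ssh.
export CVS_RSH=ssh
CVSROOT_HADES=':ext:scs@lx-pool.gsi.de:/misc/hadesprojects/slowcontrol/cvsroot'

# checkout_hades_ioc [TARGET-DIR]
# Checks out the module EPICS/apps/hades below TARGET-DIR (default: current directory).
# CVS is a hypothetical hook: set CVS=echo to print the arguments instead of running cvs.
checkout_hades_ioc() {
    target=${1:-.}
    ( cd "$target" && ${CVS:-cvs} -d "$CVSROOT_HADES" checkout EPICS/apps/hades )
}
```

For example, `checkout_hades_ioc /home/scs/playground` would create /home/scs/playground/EPICS/apps/hades, the playground path used further down.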

For small maintenance tasks:

  • Connect to hadesdaq02 as user scs, using the default scs password.
  • The configuration is stored in the production directory /home/scs/apps/hades. There you will find a normal IOC directory tree for development and for booting the IOC.
    • The IOC is named "hades".
  • To change the database files,
    • make your edits in hadesApp/Db
    • and afterwards run make.
    • Don't forget to commit the changes to CVS.
  • (Re-)start the IOC (see "Operation").
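
Because the commit is easy to forget, a quick check for uncommitted changes can help. The sketch below relies on the status letters that `cvs -n -q update` prints in front of each file name (M modified, A added, R removed). The function name list_uncommitted is made up for this example, and file names containing spaces are not handled.

```shell
# list_uncommitted: read the output of `cvs -n -q update` on stdin and print
# the files that are locally modified (M), added (A) or removed (R).
list_uncommitted() {
    awk '$1 == "M" || $1 == "A" || $1 == "R" { print $2 }'
}

# Possible use in the production tree (not executed here):
#   cd /home/scs/apps/hades && cvs -n -q update 2>/dev/null | list_uncommitted
```

An empty result means that nothing is waiting to be committed.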

For bigger maintenance or development, first use the playground:

  • Connect to hadesdaq02 as user scs, using the default scs password.
  • The configuration is stored in the development directory /home/scs/playground/EPICS/apps/hades. There you will find a normal IOC directory tree for development and for booting the IOC.
    • The IOC is named "hades".
  • To change the database files,
    • make your edits in hadesApp/Db
    • and afterwards type make.
  • Run a quick test of your database modifications using softIoc:
    softIoc -d <Your modified db Files>
  • On success
    • commit the changes to CVS
    • change to the production directory /home/scs/apps/hades
    • run cvs update
    • followed by make
    • (re-)start the IOC
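
The playground workflow above can be sketched as a single function. This is only an illustration: the helper run, the DRY_RUN switch and the name promote_db_change are hypothetical, and it assumes that make is called at the top of the IOC tree. With DRY_RUN=1 the commands are printed instead of executed.

```shell
PLAYGROUND=/home/scs/playground/EPICS/apps/hades
PRODUCTION=/home/scs/apps/hades

# run: execute a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# promote_db_change DBFILE MESSAGE
# DBFILE is given relative to the playground tree; MESSAGE is the CVS log message.
promote_db_change() {
    db=$1
    msg=$2
    run cd "$PLAYGROUND" &&
    run make &&
    # softIoc opens an interactive shell; leave it with "exit" once the
    # database has loaded without errors.
    run softIoc -d "$db" &&
    run cvs commit -m "$msg" &&
    run cd "$PRODUCTION" &&
    run cvs update &&
    run make
}
```

The chain stops at the first failing step, so a database that does not build is never committed. Restarting the IOC is left out on purpose, since it is an interactive procedure.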

Operation

(Re-) Starting the IOC

    To restart a server that is already running

      1. log in to hadesdaq02
      ssh scs@hadesdaq02
      2. check whether the procServ server, which itself controls the EPICS server, is running, by listing the processes:
      ps x
          PID   TTY      STAT   TIME COMMAND
          13061 ?        S      0:02 procServ -L ioc-cave-hadesdaq02.log 4813 startEpicsIoc_includingHVSemaphoreCheck.sh ../../bin/linux-x86/hades st.cmd  ### this is the process server
          13062 pts/4    Ssl+   1:51 startEpicsIoc_includingHVSemaphoreCheck.sh ../../bin/linux-x86/hades st.cmd ### this is the EPICS server
        • NO, it is not running:
          1. check for remaining semaphores (cf. "Known server problems")
          2. start the server from scratch (cf. "To start the server ((almost) from scratch)")
        • YES, it is running: continue
      Now you can log in to the server:
      1. telnet localhost 4813
      2. Switch off the auto-restart of procServ; press the key until the message reports OFF:
        <CTRL+T>
        @@@ Toggled auto restart to ON/OFF
      3. ... then hit
        <CR>
      4. You should see the epics prompt:
        epics>
        1. NO, you don't see it:
          1. quit the telnet session
            <CTRL+]>
            telnet> quit
          2. check for remaining semaphores (cf. "Known server problems")
          3. log in to the server again
            telnet localhost 4813
        2. YES, you have the prompt, i.e. EPICS is running: exit it.
          exit
      5. wait
        1. if the IOC has been running, you should first see it shutting down:
               DEBUG: shutting down crate 0
               Shutdown: successfully disconnected from crate x1
               [...]
               DEBUG: shutting down crate 6 
               Shutdown: successfully disconnected from crate x7
        2. then restart the IOC by using:
          <CTRL+R>
        3. then you should see it restarting:
               @@@ @@@ @@@ @@@ @@@
               @@@ Received a sigChild for process WXYZ. The process was killed by signal 11
               @@@ Current time: Sun Sep 12 19:29:51 2010
               @@@ Child process is shutting down, auto restart is disabled
               @@@ Use ^R to restart the child, ^Q to quit the server
               @@@ Restarting child "../../bin/linux-x86/hades"
               @@@ The PID of new child "../../bin/linux-x86/hades" is: abcdef
               @@@ @@@ @@@ @@@ @@@
               #!../../bin/linux-x86_64/hades
               ## You may have to change hades to something else
               ## everywhere it appears in this file
               < envPaths
               [...]
               ## Load record instances
               [...]
               iocInit
               Starting iocInit
               ############################################################################
               ## EPICS R3.14.10 $R3-14-10$ $2008/10/27 19:39:04$
               ## EPICS Base built Oct 22 2009
               ############################################################################
               Starting CAEN x527 driver
               pthread_attr_setstacksize error Invalid argument
               iocRun: All initialization complete
               dbl > /home/scs/apps/hades/iocBoot/ioccave/hadesdaq02.dbl
               [...]
      6. To leave the server console without stopping the server, use the telnet escape sequence (CTRL+]) and then quit:
        <CTRL+]>
        telnet> quit
      7. optionally continue with further checks, q.v. below
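
The check in step 2 can also be done by a small filter instead of by eye. The following is a sketch under the assumption that `ps x` prints the command line starting in the fifth column, as in the listing above. The function name procserv_on_port is hypothetical.

```shell
# procserv_on_port PORT: read the output of `ps x` on stdin and succeed if it
# shows a procServ process that has PORT as one of its arguments.
procserv_on_port() {
    awk -v port="$1" '
        $5 ~ /(^|\/)procServ$/ {
            for (i = 6; i <= NF; i++)
                if ($i == port) found = 1
        }
        END { exit !found }'
}

# Possible use (not executed here):
#   ps x | procserv_on_port 4813 && telnet localhost 4813
```

A non-zero exit status corresponds to the "NO" branch, i.e. check for remaining semaphores and start the server from scratch.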

    To start the server ((almost) from scratch)

      Follow the same procedure that crontab carries out on restart:
      1. log in to hadesdaq02
        ssh scs@hadesdaq02
      2. clean up
        1. make sure procServ is not running by checking for processes:
          ps x
                 PID   TTY      STAT   TIME COMMAND
                 13061 ?        S      0:02 procServ -L ioc-cave-hadesdaq02.log 4813 startEpicsIoc_includingHVSemaphoreCheck.sh ../../bin/linux-x86/hades st.cmd 
        2. if it is still there ...
          1. brute force: kill the remaining procServ process,
          2. or else: try to restart the running server as described above.
      3. change to scs' procServ directory and start the IOC:
        cd ~scs/procServ &&
        ./ioc-cave.sh
      4. optionally continue with further checks, q.v. below
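
The start from scratch can be guarded against an already running procServ, as in this sketch. It assumes that ~scs resolves to /home/scs and that pgrep is available. The function name and the two override variables are hypothetical.

```shell
# User and directory are taken from the text; both can be overridden.
IOC_USER=${IOC_USER:-scs}
PROCSERV_DIR=${PROCSERV_DIR:-/home/scs/procServ}

# start_ioc_from_scratch: run ioc-cave.sh unless a procServ of IOC_USER exists.
start_ioc_from_scratch() {
    # Refuse to start a second procServ next to a running one.
    if pgrep -u "$IOC_USER" -x procServ >/dev/null 2>&1; then
        echo "procServ is still running: restart via its console, or kill it first" >&2
        return 1
    fi
    ( cd "$PROCSERV_DIR" && ./ioc-cave.sh )
}
```

The subshell keeps the caller's working directory unchanged.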

    Further checks

    1. Check the log ~/apps/hades/iocBoot/ioccave/ioc-cave-hadesdaq02.log
      1. If it contains "Semaphore already present", see "Semaphore hanging".
        • Obsolete: this is now taken care of by the startup command:
          startEpicsIoc_includingHVSemaphoreCheck.sh ../../bin/linux-x86/hades st.cmd
      2. For all other error messages, notify the experts.
        • The system may be working, but not completely.
    2. Check whether the server is running,
      1. by checking for processes:
        ps x
            PID   TTY      STAT   TIME COMMAND
            13061 ?        S      0:02 procServ -L ioc-cave-hadesdaq02.log 4813 startEpicsIoc_includingHVSemaphoreCheck.sh ../../bin/linux-x86/hades st.cmd
            13062 pts/4    Ssl+   1:51 startEpicsIoc_includingHVSemaphoreCheck.sh ../../bin/linux-x86/hades st.cmd 
      2. by logging in to the server:
        1. telnet localhost 4813
        2. ... then hit "Enter", carriage return:
          <CR>
        3. You should see the epics prompt:
          epics>
        4. With the command
          dbl
          you will get a very long list of all process variables.
        5. To leave the server console without stopping the server, use the telnet escape sequence (CTRL+]) and then quit:
          <CTRL+]>
          telnet> quit
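
The log check from step 1 can be sketched as follows. Only the tail of the log is inspected so that messages from much older starts are ignored. The default of 200 lines is an arbitrary choice, and the function name and the IOC_LOG variable are hypothetical.

```shell
IOC_LOG=${IOC_LOG:-$HOME/apps/hades/iocBoot/ioccave/ioc-cave-hadesdaq02.log}

# check_ioc_log [LINES]: search the last LINES lines (default 200) of the log
# for the semaphore message. Returns 0 if it is absent, 1 if present,
# 2 if the log cannot be read.
check_ioc_log() {
    if [ ! -r "$IOC_LOG" ]; then
        echo "cannot read $IOC_LOG" >&2
        return 2
    fi
    if tail -n "${1:-200}" "$IOC_LOG" | grep -q 'Semaphore already present'; then
        echo 'semaphore problem, see "Semaphore hanging"'
        return 1
    fi
    echo 'no semaphore message found'
}
```

Other error messages are not classified by this sketch; they still need a look by eye and, if in doubt, a note to the experts.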

Known server problems

    If the server does not start properly, read the log file ~/apps/hades/iocBoot/ioccave/ioc-cave-hadesdaq02.log.

    Semaphore hanging

      Automatic procedure at startup

      This is taken care of by the startup command:
      startEpicsIoc_includingHVSemaphoreCheck.sh ../../bin/linux-x86/hades st.cmd
      In case this did not work, remove the semaphore by hand.

      Removing by hand

      • call the script ~/apps/hades/iocBoot/ioccave/removeDeadHVSemaphores.sh
      • or really by hand
        A semaphore may not have been cleaned up after a previous start. This is indicated by the message:
        Semaphore already present
         There is another process using the semaphore.
         Or a process using the semaphore exited abnormally.
         In That case try to manually release the semaphore with:
           ipcrm sem XXX.
        

        To cure this, do the following:
        • find the semaphore id
          ipcs -s
              ------ Semaphore Arrays --------
              key        semid     owner   perms   nsems      
              0x30222aea <Sem ID>  scs     666     1 
        • Then delete this semaphore by
          ipcrm sem <Sem ID>
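
Finding the semaphore id can be scripted with a small filter over the `ipcs -s` output. This is a sketch that assumes the column layout shown above (key, semid, owner, ...). The function name semids_of is hypothetical.

```shell
# semids_of OWNER: read the output of `ipcs -s` on stdin and print the semid
# of every semaphore array owned by OWNER. Data rows start with a hex key.
semids_of() {
    awk -v owner="$1" '$1 ~ /^0x[0-9a-fA-F]+$/ && $3 == owner { print $2 }'
}

# Possible use (not executed here); only do this while the IOC is stopped:
#   ipcs -s | semids_of scs
#   ipcrm sem <Sem ID>      # current util-linux also accepts: ipcrm -s <Sem ID>
```

The removal itself is deliberately left as a manual step, because the filter lists every semaphore of that owner, not only the stale one.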

    SY1527 hanging

      If you see in the log file that one of the HV crates does not connect, you should first check whether the crate is physically powered on and has an Ethernet connection.

      • Ping the crate:
          ping hadhvp05
          PING hadhvp05.gsi.de (192.168.100.69) 56(84) bytes of data.
          64 bytes from hadhvp05.gsi.de (192.168.100.69): icmp_seq=1 ttl=64 time=0.908 ms
          64 bytes from hadhvp05.gsi.de (192.168.100.69): icmp_seq=2 ttl=64 time=0.873 ms 
          ^C
        

      • Log in and check for hanging CMD sessions
          telnet hadhvp05 1527
          user admin
          password admin
          About menu-->Sessions  (left most menu)
        
        If you see a CMD TCP/IP connection there, you have to power cycle the crate.
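
When several crates have to be checked, the ping test can be wrapped in a loop. The sketch below assumes the Linux ping options -c (count) and -W (timeout in seconds). The function name check_crates is hypothetical, and only hadhvp05 is a crate name known from the text.

```shell
# check_crates HOST...: ping each crate once and report those that do not answer.
# Returns 1 if at least one crate did not answer.
check_crates() {
    rc=0
    for host in "$@"; do
        if ping -c 1 -W 2 "$host" >/dev/null 2>&1; then
            echo "$host: reachable"
        else
            echo "$host: NO ANSWER (check power and Ethernet)"
            rc=1
        fi
    done
    return $rc
}

# Possible use (not executed here):
#   check_crates hadhvp05
```

A crate that answers the ping can still have a hanging CMD session, so the telnet check above remains necessary.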

    -- PeterZumbruch - 30 May 2012
Topic revision: r20 - 2018-11-27, PeterZumbruch