OCuLUS/Problems/Parallel File System

From PC2 Doc


OCULUS Maintenance from 14.09.2020 until 17.09.2020

  • Downtime due to work on the power supply line.

09.07.20, 13:30 - 13.07.20, 14:45: Failure of one BeeGFS storage node

  • One of the seven BeeGFS storage nodes crashed. All data chunks stored on this node were inaccessible during the outage.
  • Applications accessing the affected data may have hung.
  • The cause was a crashed hardware RAID controller in storage06. We have replaced it with a spare part, and the parallel file system is working again.

23.06.20, 19:40 - 25.06.20, 15:00: Failure of one BeeGFS metadata server

  • The hardware RAID controller of one of the two metadata servers is defective. As a result, some directories under /scratch are not available.
    • If your directory is affected, you will see an error like: ls: cannot access /scratch/hpc-prf-xyz/...: Communication error on send
    • During our last BeeGFS update in February 2020, we enabled metadata mirroring.
    • Refer to "Migrating Existing Metadata" for more information on why your directory may be inaccessible.
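
To find out which of your directories are affected, you could try listing each one and recording the error, if any. The following is only a minimal sketch: the function name check_directories is hypothetical, and the example path is the elided one from the error message above, so substitute your own project directory. Note that a listing may block for a long time while the file system is degraded.

```python
import os


def check_directories(paths):
    """Try to list each directory; map path -> None if OK, else the error text."""
    results = {}
    for path in paths:
        try:
            os.listdir(path)  # may block if the file system hangs
            results[path] = None
        except OSError as exc:
            # Keep the OS error message, e.g. "Communication error on send"
            results[path] = exc.strerror or str(exc)
    return results


if __name__ == "__main__":
    # Assumption: placeholder path taken from the error example; use your own.
    for path, error in check_directories(["/scratch/hpc-prf-xyz"]).items():
        print(f"{path}: {'OK' if error is None else error}")
```

Directories reported with an error other than "No such file or directory" are likely the ones located on the failed metadata server.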

OCULUS Maintenance from 10.02.2020 until 17.02.2020

  • The system and all frontends will be offline during this period.