## Introduction
This is an emergency plan for severe operational degradation of the following services:

- Artemis (https://artemis.codeability.uibk.ac.at/) and
- Sharing Platform (https://sharing-codeability.uibk.ac.at/)

It gives initial guidelines on how to handle an emergency and helps to cope with its impact.

This plan is also mirrored at https://git.uibk.ac.at/informatik/qe/codeability/austauschplattform/austauschplattform/-/wikis/technical/Emergency-Plan

## Contact Persons
In case of a system emergency, the following persons should be contacted:

- **technical Applications**: Daniel Crazzolara: daniel.crazzolara @ uibk.ac.at (Skype "Daniel Crazzolara")
- **technical OS/VM**: Nicolas Stolz: nicolas.stolz @ uibk.ac.at
- **organisational**: Michael Breu: Michael.Breu @ uibk.ac.at (Skype "Michael Breu")
- **organisational**: Simon Priller: Simon.Priller @ uibk.ac.at (Skype "Simon Priller")
- **organisational**: Lukas Kaltenbrunner: Lukas.Kaltenbrunner @ uibk.ac.at (Skype "Lukas Kaltenbrunner")

## Identification of an Emergency Situation
This covers the following situations:

1. complete application breakdown: the application is completely unavailable or unusable
2. severe service degradation: responses from the application are reproducibly slow or flawed
3. severe security breach: there are strong indications that the integrity or confidentiality of the application data has been compromised

In any of these cases, at least one technical and one organisational contact person should be informed immediately.

Communication Channels:

- **[UIBK-Elements: Codeability Austausch](https://matrix.to/#/!YEcVAkOsngjOspMTkT:ifi-chat.uibk.ac.at?via=ifi-chat.uibk.ac.at&via=uibk.ac.at&via=matrix.org)**
- Skype
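Situations 1 and 2 above can be partly detected with a simple HTTP probe. The following is a minimal sketch, assuming `curl` and `awk` are available on the machine running the check. The function names and the 5-second threshold for "slow" are illustrative assumptions and not part of this plan. Flawed responses and security breaches (situation 3) cannot be detected this way.

```shell
#!/usr/bin/env bash
# Sketch: classify a service as ok / degraded / down from one HTTP probe.
# SLOW_SECONDS is an assumed threshold, not a value taken from this plan.
SLOW_SECONDS="${SLOW_SECONDS:-5}"

# classify_probe <http_code> <total_seconds>
# Prints "down", "degraded" or "ok". Missing arguments count as "down".
classify_probe() {
  local code="${1:-000}" seconds="${2:-0}"
  # curl reports 000 when no HTTP response was received at all.
  if [ "$code" = "000" ] || [ "$code" -ge 500 ]; then
    echo "down"
  elif awk -v t="$seconds" -v max="$SLOW_SECONDS" 'BEGIN { exit !(t > max) }'; then
    echo "degraded"
  else
    echo "ok"
  fi
}

# probe_url <url>: performs one request and prints "<url> <status>".
# Example: probe_url https://artemis.codeability.uibk.ac.at/
probe_url() {
  local url="$1" result
  # LC_ALL=C keeps the decimal separator of time_total a dot.
  result="$(LC_ALL=C curl --silent --output /dev/null --max-time 30 \
    --write-out '%{http_code} %{time_total}' "$url")" || true
  # Intentionally unquoted: $result holds "<http_code> <total_seconds>".
  echo "$url $(classify_probe $result)"
}
```

Running `probe_url` for both service URLs from a machine outside the affected VM gives a quick first indication of which situation applies before the contact persons are informed.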
## Procedure for Problem Rectification

### 1. Preservation of Evidence
If possible, the system state should be documented for later analysis, e.g.:

- make a screenshot of the current error situation
- save a complete Java VM stack trace (e.g. https://stackoverflow.com/questions/10756105/how-to-get-a-complete-stack-trace-of-a-running-java-program-that-is-taking-100)
- ensure that monitoring data (e.g. Grafana, a database snapshot, log files) is safely stored (and survives the following rectification actions)
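The evidence steps above might be scripted roughly as follows. This is a minimal sketch under several assumptions: the applications run as Docker containers (as the restart commands in this plan suggest), the JVM is process 1 inside the container, and the image ships the JDK tool `jcmd`. The function names and the directory layout are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: collect evidence before any restart. Paths and names are assumptions.

# evidence_dir <base_dir> <timestamp>: prints the directory name to use.
evidence_dir() {
  echo "$1/emergency-$2"
}

# collect_evidence <container_name> <base_dir>
# Prints the directory that now holds the collected files.
collect_evidence() {
  local container="$1" dir
  dir="$(evidence_dir "$2" "$(date +%Y%m%d-%H%M%S)")"
  mkdir -p "$dir" || return 1

  # Container logs (stdout and stderr of the application).
  sudo docker logs "$container" > "$dir/$container.log" 2>&1

  # JVM thread dump; assumes the JVM is PID 1 and jcmd exists in the image.
  sudo docker exec "$container" jcmd 1 Thread.print \
    > "$dir/$container.threads.txt" 2>&1

  # Container state and configuration at the time of the incident.
  sudo docker inspect "$container" > "$dir/$container.inspect.json" 2>&1

  echo "$dir"
}
```

A call such as `collect_evidence <container_name> <base_dir>` should point `<base_dir>` at a location outside the affected volumes, so that the files survive a later restore from a snapshot.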
### 2. Immediate Rectification Actions (by technical/organisational staff)
- Immediate possible actions (if potentially helpful)
    1. restart the respective Docker containers (`sudo docker restart <container_name>`)
    2. if no containers/images/volumes are displayed, restart the Docker daemon with `sudo service docker restart`
    3. try rebooting the complete VM (`sudo reboot`)
    4. if still not up and running, restore individual Docker applications/databases/volumes or the whole mounted file system from a stored snapshot (snapshot location: `/mnt/qt-<mnt_name>/.snapshots`) *(This might take several hours to complete because of the limited throughput of the NFS file system)*

- if service degradation is still ongoing, inform users
    - either set up a user message in the application: "We are currently observing a major service problem in our applications. We are investigating the issue and will keep you informed as soon as we have a solution. Please bear with us."
    - or (in case of complete service breakdown) set up an alternative simple information page on https://artemis.codeability.uibk.ac.at/ and/or https://sharing-codeability.uibk.ac.at/ with appropriate information about the current state.
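The escalation order of the immediate actions can be sketched as a small helper that only prints the command for a given step, so that the operator stays in control of what is executed. The step numbers follow the numbered actions above; the function name is hypothetical. Step 4 only lists the snapshot directory, because the actual restore depends on what has to be recovered.

```shell
#!/usr/bin/env bash
# Sketch: print the command for a given escalation step (1-4).
# Nothing is executed here; the operator runs the printed command deliberately.

# next_action <step> [container_name]
next_action() {
  local step="$1" container="$2"
  # Fall back to the placeholder used in this plan if no name is given.
  [ -n "$container" ] || container='<container_name>'
  case "$step" in
    1) echo "sudo docker restart $container" ;;
    2) echo "sudo service docker restart" ;;
    3) echo "sudo reboot" ;;
    4) echo "ls -1 /mnt/qt-<mnt_name>/.snapshots" ;;
    *) echo "unknown step: $step" >&2; return 1 ;;
  esac
}
```

For example, `next_action 1 <container_name>` prints the restart command for that container. Each step should only be tried after the previous one has not restored the service and after evidence has been preserved.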
### 3. Problem Analysis
- Set up a joint conference call with the respective persons.
- Identify current options and decide on the next steps.
- In case of a data protection breach, inform the data protection officer (see https://www.uibk.ac.at/datenschutz/).

### 4. Review of these Guidelines
These guidelines should be reviewed by all involved persons on a yearly basis.