.. _ref_filehooks:
FileHooks
=========
The fileHooks project forms the backend of the CodeAbility Sharing Platform.
It is responsible for data collection and preparation and relies on the backend services GitLab and Elasticsearch.
This section describes the fileHooks used in GitLab and the infrastructure setup.
Finally, some tips for handling errors are provided.
GitLab FileHooks
----------------
Currently, there is one fileHook for GitLab, which extends GitLab's functionality by performing both the health check and the indexing.
Details are provided below.
FileHook - trigger_project_update
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This file hook does two tasks:
1. Health check and validation: After a repository has been modified, it informs the user who made the change via email if the metadata is incomplete or invalid.
Validation happens on the ``master``-branch of all projects in the group ``sharing``.
It will mainly be triggered by push events, but also by moving or renaming a project.
The check proceeds as follows:
First, the root directory of the repository is checked for files named ``metadata.json``, ``metadata.yaml``, or ``metadata.yml``.
There must be exactly one such file, otherwise the check fails.
Subsequently, all metadata files are validated (including dependent metadata files if the project is a collection).
If an error occurs, an email is sent to the user who pushed the changes.
2. It keeps the Elasticsearch index up-to-date by adding/updating/deleting files according to the triggered GitLab event.
Only the ``main``-branch (or ``master`` if ``main`` does not exist) and the group ``sharing`` (including subgroups and all subprojects) are indexed in Elasticsearch.
Metadata files (``metadata.json``, ``metadata.yaml``, or ``metadata.yml``) at the project root are indexed in the alias ``metadata``.
.. warning::
Note that GitLab does not trigger an event if a group is transferred! Those changes remain unnoticed in Elasticsearch! To prevent those inconsistencies, users should not transfer groups!
Infrastructure Setup
--------------------
It is currently assumed that all services run on the same host as separate docker containers.
The setup of the containers and the host server is discussed in the following.
Lastly, the manual installation procedure for file hooks is given as a reference.
Container Setup
~~~~~~~~~~~~~~~
Subsequently, the setup for GitLab and Elasticsearch is shown.
The setup of the services GitLab search and MySQL is discussed in the section :ref:`ref_git_search`.
To create all containers for the backend in production, a script situated in ``setup/`` is provided.
It takes a configuration file as the only argument.
The configuration files can be found in ``setup/config/``.
For a local development setup, the file ``local`` can be used without further modification.
For deployment on a server, the files ``development_template`` and ``production_template`` are provided.
These configurations require secrets.
Do not put secrets into these files; create a copy and add the secrets to the copy instead.
Files in the ``setup/config/`` directory are ignored by git by default, so writing secrets into a copy
prevents accidentally committing them.
The secrets which need to be added to the copy of the configuration file are:
- MAIL_USERNAME: The email user name to authenticate with at ``smtp.uibk.ac.at`` (KeePass @ artemis-support MailBox).
- MAIL_PASSWORD: The password for the authentication at ``smtp.uibk.ac.at`` (KeePass @ artemis-support MailBox).
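Before running the setup script, it can help to verify that the copied configuration file really contains both secrets.
The following Python sketch does this under one loudly stated assumption: the configuration files use shell-style ``KEY=value`` lines, which this section does not confirm.
The helper names ``parse_config`` and ``missing_secrets`` are hypothetical.

```python
from typing import Dict, List

# Secrets named in the list above.
REQUIRED_SECRETS = ("MAIL_USERNAME", "MAIL_PASSWORD")


def parse_config(text: str) -> Dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments.

    ASSUMPTION: the real configuration files may use a different syntax.
    """
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        # Drop surrounding single or double quotes from the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_secrets(text: str) -> List[str]:
    """Return the required secrets that are absent or empty."""
    values = parse_config(text)
    return [name for name in REQUIRED_SECRETS if not values.get(name)]
```

Called with the contents of ``config/production``, ``missing_secrets`` returns an empty list when both secrets are filled in.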
The following code block shows how to deploy the project in production.
.. code-block::
cd setup
cp config/production_template config/production
$EDITOR config/production # add the secrets to the copy
./setup-infrastructure.sh config/production
Similarly, the containers for the development backend can be created with:
.. code-block::
cd setup
cp config/development_template config/development
$EDITOR config/development # add the secrets to the copy
./setup-infrastructure.sh config/development
The following environment variables are set within the config files.
No modification should be required for those if the correct config file is used.
- ``GITLAB_HOME``: Directory where data generated by GitLab is persisted
- ``EXTERNAL_URL``: External URL of the GitLab instance
- ``GITLAB_HOSTNAME``: Hostname of GitLab
- ``ES_HOME``: Directory where data generated by Elasticsearch is persisted
- ``FILEHOOKS_CONFIG_FILE``: Name of the file in ``filehooks/conf/`` used to configure the filehooks code
+-----------------------+----------------------------------------+------------------------------------------------+
| Environment variable  | Production                             | Development                                    |
+=======================+========================================+================================================+
| GITLAB_HOME           | /mnt/qt-sharing-codeability/srv/gitlab | /mnt/qt-codeability-austria/sharing/srv/gitlab |
+-----------------------+----------------------------------------+------------------------------------------------+
| ES_HOME               | /mnt/qt-sharing-codeability/es         | /mnt/qt-codeability-austria/sharing/es         |
+-----------------------+----------------------------------------+------------------------------------------------+
| EXTERNAL_URL          | https://sharing-codeability.uibk.ac.at | https://sharing.codeability-austria.uibk.ac.at |
+-----------------------+----------------------------------------+------------------------------------------------+
| GITLAB_HOSTNAME       | sharing-codeability.uibk.ac.at         | sharing.codeability-austria.uibk.ac.at         |
+-----------------------+----------------------------------------+------------------------------------------------+
| FILEHOOKS_CONFIG_FILE | conf/production.ini                    | conf/staging.ini                               |
+-----------------------+----------------------------------------+------------------------------------------------+
.. note::
If the container is set up from scratch (there are no persisted data available), a password for the user ``root`` has to be specified using the web interface.
For the development and production server, this password should be added to KeePass.
Alternatively, the password can also be set directly in the GitLab container:
``docker exec -it sharing_gitlab gitlab-rake 'gitlab:password:reset[root]'``
Installing the Filehooks package
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The previous section set up the container infrastructure.
Once this has completed successfully, the filehooks code needs to be installed in the GitLab container.
There is another script in the ``setup`` directory for this job:
.. code-block::
./install_filehooks_locally.sh --create-index
This script copies files from the repository into the GitLab container
and sets up the code such that it is run whenever GitLab emits an event.
The ``--create-index`` flag causes the initial index to be created in Elasticsearch.
Local Development Setup
~~~~~~~~~~~~~~~~~~~~~~~
For development, it can be beneficial to have the setup running
on the development machine.
In order to have access to the GitLab and Elasticsearch containers via HTTP,
modify ``setup/docker-compose.yml`` and enable the lines marked with the comment
.. code-block::
# add this for your local testing setup
It might also be useful to remove the lines saying ``restart: always``.
For local development, the ``local`` config file can be used directly;
it does not need to be copied and modified if the defaults are used.
Otherwise, the setup procedure is the same as for deployments:
.. code-block::
cd setup
./setup-infrastructure.sh config/local
When this completes, the filehooks code needs to be installed as described previously:
.. code-block::
./install_filehooks_locally.sh --create-index
At this point, GitLab should be reachable at http://localhost:10082 and Elasticsearch at http://localhost:9200.
To view the entire index, check http://localhost:9200/metadata/_search.
The index will be updated whenever a repository or group in the GitLab group "sharing" is updated.
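The index can also be inspected from a script instead of the browser.
The following Python sketch queries the ``_search`` endpoint of the ``metadata`` alias on the local setup described above, using only the standard library.
The helper names ``search_url`` and ``fetch_hits`` are hypothetical; the ``size`` query parameter and the ``hits.hits`` response structure are standard Elasticsearch search API features.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def search_url(
    base_url: str = "http://localhost:9200",
    index: str = "metadata",
    size: int = 10,
) -> str:
    """Build the _search URL for the given index or alias."""
    query = urlencode({"size": size})
    return f"{base_url.rstrip('/')}/{index}/_search?{query}"


def fetch_hits(url: str) -> list:
    """Run the search and return the list of matching documents."""
    with urlopen(url, timeout=10) as response:
        body = json.load(response)
    # Each hit carries the document id in "_id" and its content in "_source".
    return body["hits"]["hits"]
```

With the local containers running, ``fetch_hits(search_url(size=50))`` returns up to 50 indexed metadata documents; without the ``size`` parameter, Elasticsearch returns only the first 10 hits.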
Execution logs of the filehooks code can be found in the GitLab container at ``/var/log/gitlab/gitlab-rails/trigger_project_update.log``
for normal logs and ``/var/log/gitlab/gitlab-rails/file_hook.log`` for crashes due to uncaught exceptions.
A convenient way to observe these files is running
.. code-block::
docker exec -it sharing_gitlab tail -f /var/log/gitlab/gitlab-rails/file_hook.log
for errors and similarly for the logs.
To deploy modifications of the code to GitLab, the relevant files need to be copied to the mounted volume.
The script ``install_filehooks_locally.sh`` does this automatically when called without any arguments.
This makes it possible to install new code quickly without restarting the container.
The GitLab container can be accessed interactively by running
.. code-block::
docker exec -it sharing_gitlab /bin/bash
Server Setup
~~~~~~~~~~~~
To make GitLab reachable from outside, conduct the following steps after connecting to the server via ``ssh``:
1. Add the following snippet to ``/etc/apache2/sites-enabled/default-ssl.conf``:
.. code-block::
# Sharing
<VirtualHost *:443>
SSLProxyEngine On
AllowEncodedSlashes NoDecode
SSLProxyVerify none
SSLProxyCheckPeerCN off
SSLProxyCheckPeerName off
SSLProxyCheckPeerExpire off
##### Portainer #####
RewriteRule ^/portainer$ /portainer/ [R,L]
ProxyPass /portainer/ http://sharing-codeability.uibk.ac.at:9000/
ProxyPassReverse /portainer/ http://sharing-codeability.uibk.ac.at:9000/
ProxyPass /portainer/api/websocket/ http://sharing-codeability.uibk.ac.at:9000/api/websocket/
ProxyPassReverse /portainer/api/websocket/ http://sharing-codeability.uibk.ac.at:9000/api/websocket/
#####################
ProxyPass / https://sharing-codeability.uibk.ac.at:10083/ nocanon
ProxyPassReverse / https://sharing-codeability.uibk.ac.at:10083/
ErrorLog ${APACHE_LOG_DIR}/error.sharing-codeability.uibk.ac.at.log
CustomLog ${APACHE_LOG_DIR}/access.sharing-codeability.uibk.ac.at.log combined
# Michael further tools settings
Include /etc/apache2/codeAbility/sharing/*.conf
</VirtualHost>
.. note::
Please review the configuration above carefully. GitLab is very sensitive when run behind a reverse proxy!
2. Add the following snippet to ``/etc/apache2/sites-enabled/000-default.conf``:
.. code-block::
<VirtualHost *:80>
# ...
Redirect / https://sharing-codeability.uibk.ac.at
</VirtualHost>
3. ``sudo systemctl restart apache2``
Afterward, the containers can be started:
1. Navigate to ``/mnt/qt-sharing-codeability/file-hooks`` (prod).
2. Pull this repository for the latest updates
- It may be necessary to run ``git reset --hard`` on the repository beforehand, because some scripts replace confidential variables during installation.
3. Navigate to ``setup`` and set up the containers (see above)
Manual File Hook Setup
~~~~~~~~~~~~~~~~~~~~~~
.. note::
Just for reference.
As a reference on how to add other file hooks to GitLab, the steps to install the file hook ``trigger_project_update.py`` are given below:
1. Install python requirements:
.. code-block:: bash
pip3 install --upgrade setuptools
pip3 install -r requirements.txt
2. Create API-Token with the user ``root`` and the scopes ``api``, ``read_api``, ``read_repository``
3. Add the API-Token in ``conf/production.ini`` (section ``gitlab``, key ``token``)
4. Install the ``filehooks`` package
.. code-block:: bash
pip3 install .
5. Install java
.. code-block:: bash
apt-get install openjdk-8-jdk
6. Initialize the indices in Elasticsearch using the script ``create_index.py``. This script can be deleted once the indices have been created successfully.
.. code-block:: bash
python3 create_index.py
7. Add the file ``trigger_project_update.py`` to ``/opt/gitlab/embedded/service/gitlab-rails/file_hooks`` in the GitLab container
8. Ensure that the script ``trigger_project_update.py`` has the permissions ``755``
9. Validate the file hooks:
.. code-block:: bash
gitlab-rake file_hooks:validate
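Steps 7 and 8 above (copying the hook into the file hooks directory and giving it the permissions ``755``) can be sketched in Python as follows.
This is an illustration only, assumed to run inside the GitLab container where the target directory already exists; the helper names ``install_hook`` and ``has_mode_755`` are hypothetical and not part of the filehooks package.

```python
import os
import shutil
import stat
from pathlib import Path

# Target directory inside the GitLab container, as given in step 7.
FILE_HOOKS_DIR = Path("/opt/gitlab/embedded/service/gitlab-rails/file_hooks")


def install_hook(source: Path, target_dir: Path = FILE_HOOKS_DIR) -> Path:
    """Copy a hook script into target_dir and set its mode to 755."""
    target = target_dir / source.name
    shutil.copyfile(source, target)
    # rwx for the owner, r-x for group and others.
    os.chmod(target, 0o755)
    return target


def has_mode_755(path: Path) -> bool:
    """Check that the permission bits of path are exactly 755."""
    return stat.S_IMODE(path.stat().st_mode) == 0o755
```

After ``install_hook(Path("trigger_project_update.py"))``, running ``gitlab-rake file_hooks:validate`` as in step 9 remains the authoritative check.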
Infrastructure Update
---------------------
Subsequently, a guide for updating GitLab and the filehooks is provided.
Update Guide GitLab
~~~~~~~~~~~~~~~~~~~
1. Navigate to the root directory of the filehooks repository.
2. Create a backup of GitLab with the script ``setup/backup_sharing_gitlab.sh``
3. Navigate to the parent directory of ``GITLAB_HOME`` and copy the mounted volume, e.g.,
.. code-block::
cp -a srv srv_2021_01_31
4. Change the GitLab version in the file ``setup/sendmail/Dockerfile``.
5. Run the script ``setup/setup-infrastructure.sh``
6. Wait and check whether GitLab starts successfully. It usually takes about 10 minutes until GitLab reaches the status ``healthy``, and the status may be ``unhealthy`` for some time along the way.
7. Install the filehooks in GitLab using the script ``setup/install_filehooks_locally.sh``.
8. Check if the filehooks work properly.
.. note::
When upgrading the GitLab version, follow the `upgrade recommendations <https://docs.gitlab.com/ee/policy/maintenance.html#upgrade-recommendations>`_ from GitLab.
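The waiting in step 6 of the update guide can be automated by polling the container health status.
The following Python sketch relies on ``docker inspect --format '{{.State.Health.Status}}'``, which reports ``starting``, ``healthy``, or ``unhealthy`` for containers with a health check.
The function names, the 20-minute timeout, and the 15-second polling interval are assumptions chosen for illustration.

```python
import subprocess
import time
from typing import Callable


def container_health(name: str = "sharing_gitlab") -> str:
    """Ask Docker for the health status of a container."""
    result = subprocess.run(
        ["docker", "inspect", "--format", "{{.State.Health.Status}}", name],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def wait_until_healthy(
    get_status: Callable[[], str] = container_health,
    timeout_s: float = 1200.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the status is "healthy" or the timeout expires.

    An intermediate "unhealthy" status is tolerated, because GitLab may
    report it temporarily while starting up.
    """
    waited = 0.0
    while waited <= timeout_s:
        if get_status() == "healthy":
            return True
        sleep(interval_s)
        waited += interval_s
    return False
```

``wait_until_healthy()`` returns ``True`` as soon as the container is healthy and ``False`` if the timeout is reached first; the status getter and the sleep function are injectable so the loop can be exercised without Docker.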
Update Guide Filehooks
~~~~~~~~~~~~~~~~~~~~~~
1. Check out the version of the code which should be deployed somewhere in the file system.
2. Run ``setup/install_filehooks_locally.sh``.
Errors
------
If a container crashes, it should restart automatically.
Consequently, it should not be necessary to start any container manually after the setup has been executed successfully.
.. warning::
If the GitLab container crashes, the python-package ``filehooks`` is not re-installed automatically. (TODO check whether this is still true)
Hence, new or changed files will not be added to Elasticsearch.
In that case, install the filehooks again (see the update guide for filehooks) and perform a complete reindexing to ensure a consistent index.
Subsequently, the logging systems for GitLab and FileHooks are discussed.
GitLab
~~~~~~
GitLab has an advanced logging system distributed over many log files. Details can be found in the `GitLab documentation <https://docs.gitlab.com/ee/administration/logs.html>`_.
For example, the command ``docker logs -f -n 10 sharing_gitlab`` can be used to inspect the logs.
FileHooks
~~~~~~~~~
- ``/var/log/gitlab/gitlab-rails/file_hook.log``: Fatal errors (e.g., unexpected exceptions) are logged in this file.
- ``/var/log/gitlab/gitlab-rails/trigger_project_update.log``: General logging information for the fileHook ``trigger_project_update.py`` is logged in this file.
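To fetch the most recent entries of either log file from the host, the ``docker exec`` call can be wrapped in a small helper.
The following Python sketch assumes the container name ``sharing_gitlab`` used elsewhere in this section; the names ``tail_command`` and ``show_log`` and the labels ``errors`` and ``general`` are hypothetical.

```python
import subprocess
from typing import List

LOG_DIR = "/var/log/gitlab/gitlab-rails"
# The two log files listed above, keyed by a short label.
LOG_FILES = {
    "errors": "file_hook.log",
    "general": "trigger_project_update.log",
}


def tail_command(kind: str, lines: int = 50, container: str = "sharing_gitlab") -> List[str]:
    """Build the docker exec command that prints the last lines of a log."""
    path = f"{LOG_DIR}/{LOG_FILES[kind]}"
    return ["docker", "exec", container, "tail", "-n", str(lines), path]


def show_log(kind: str, lines: int = 50) -> str:
    """Run the command and return the log excerpt as text."""
    result = subprocess.run(
        tail_command(kind, lines), capture_output=True, text=True, check=True
    )
    return result.stdout
```

For instance, ``print(show_log("errors", 20))`` prints the last 20 lines of ``file_hook.log``.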