Task #2210

closed

monitoring notifications

Added by Florian Effenberger over 7 years ago. Updated over 5 years ago.

Status: Closed
Priority: High
Category: -
Target version: Team - Q2/2019
Start date:
Due date:
% Done: 0%
Tags:

Description

The monitoring system doesn't send notifications properly, and not everyone uses Telegram.
We should switch to e-mail, which would then also let us use SMS via a gateway, and PushOver or NotifyMyAndroid for those who use them.


Related issues

Related to Infrastructure - Task #2211: greylog/log parsing (New, Guilhem Moulin)
Related to Infrastructure - Task #2208: add missing hosts to monitoring (Closed, Guilhem Moulin)
Related to Infrastructure - Bug #1079: Status page (Closed, Guilhem Moulin)

Actions #1

Updated by Florian Effenberger over 7 years ago

  • Priority changed from Normal to High
  • Target version changed from Q2/2017 to Q3/2017

For my private boxes I've switched to PushOver (which needs some small investment as we need to buy licenses)
TDF also has an SMS gateway we can use
Both can be fed via e-mail if required, with PushOver giving some more flexibility

First of all we should ensure proper notification via e-mail, as this can then easily be expanded to one of the other services
Assigning it a High priority for Q3

Actions #2

Updated by Florian Effenberger about 7 years ago

  • Target version changed from Q3/2017 to Q4/2017

Shifting to Q4 for further work on it, but I'd be interested if there was any progress in Q3

Actions #3

Updated by Guilhem Moulin about 7 years ago

Unfortunately not :-/ I'm still using grafana for monitoring, which of course is highly suboptimal as it lacks notifications.

Actions #4

Updated by Florian Effenberger about 7 years ago

Unfortunately not :-/ I'm still using grafana for monitoring, which of
course is highly suboptimal as it lacks notifications.

Then let's address that early in Q4

Actions #5

Updated by Florian Effenberger about 7 years ago

A topic for the next admin call would be to agree on a method to deliver notifications
E-mail for sure, but a direct push message to the mobile as well would be very much desirable

I see four choices:

  • use Gmail's push feature -> drawback: everyone needs a @gmail.com address and I'm not sure if certain mails can create a direct alert
  • use SMS notification -> drawback: small fee, limited amount of text; advantages: works everywhere even w/o data
  • use PushOver -> drawback: license costs; advantages: works on Android, iOS and browser push notification
  • use a Telegram bot -> no insight into that, but could be the least expensive solution; works on desktops too
  • add your proposals here

I tried NotifyMyAndroid, which seems to work as well (but also has some license fee); however, it was impossible for me to reach the vendor by any means, so I'd not pursue that further.

Actions #6

Updated by Guilhem Moulin almost 7 years ago

From the Nov. 21 infra call minutes:

  • SMS is the way to go, but we need the ability to specify schedules
    so not everyone is woken up in the middle of the night
  • cloph: want to revive the telegram notifications that we had in the
    past (TDF Monitoring bot)
  • Norbert: need the ability to temporarily disable the rules when doing
    manual maintenance
  • Norbert: it's crucial to avoid false positives (cf. infra ML…)
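
The schedule and maintenance requirements from the minutes could be sketched roughly as below. This is only an illustration in Python, not the actual TDF setup: the names `Recipient`, `Maintenance` and `recipients_for`, and the idea of a per-person on-call window, are assumptions made up for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, time


@dataclass
class Recipient:
    name: str
    # Hypothetical on-call window (local time); it may wrap past midnight.
    on_call_start: time
    on_call_end: time

    def is_on_call(self, now: datetime) -> bool:
        t = now.time()
        if self.on_call_start <= self.on_call_end:
            return self.on_call_start <= t < self.on_call_end
        # Window wraps around midnight, e.g. 20:00-08:00.
        return t >= self.on_call_start or t < self.on_call_end


@dataclass
class Maintenance:
    # Maps a host name to the time until which its alerts are muted.
    muted_until: dict = field(default_factory=dict)

    def mute(self, host: str, until: datetime) -> None:
        self.muted_until[host] = until

    def is_muted(self, host: str, now: datetime) -> bool:
        until = self.muted_until.get(host)
        return until is not None and now < until


def recipients_for(host, now, recipients, maintenance):
    """Return the names of people to notify for an alert on `host` at `now`."""
    if maintenance.is_muted(host, now):
        return []
    return [r.name for r in recipients if r.is_on_call(now)]
```

With something like this, an alert raised at 03:00 would only reach whoever has a night window, and muting a host for the duration of manual maintenance suppresses its alerts without touching the rules themselves. Avoiding false positives is a matter of the alerting rules and is not covered by this sketch.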
Actions #7

Updated by Florian Effenberger over 6 years ago

  • Target version changed from Q4/2017 to Q2/2018

I'd like to put the monitoring bits into Q2, I think it is really critical to have something up to date here
Where do we stand? I heard about some experiments by Brett?

Actions #8

Updated by Guilhem Moulin over 6 years ago

  • Related to Task #2211: greylog/log parsing added
Actions #9

Updated by Guilhem Moulin over 6 years ago

  • Status changed from New to In Progress

It's the hot topic in Q1 (and probably also early Q2). See minutes from the Feb 20 infra call:
https://listarchives.libreoffice.org/global/website/msg15038.html .

We've got a working test VM, but we need to bring that offsite. We agreed to reuse the monitoring.tdf host, and take the opportunity to switch OS from Ubuntu 14.04.5 LTS to Debian 9. I've yet to contact Filoo about that, though.

Actions #10

Updated by Guilhem Moulin over 6 years ago

  • Related to Task #2208: add missing hosts to monitoring added
Actions #11

Updated by Florian Effenberger over 6 years ago

Cool, thanks for your work on this!

Actions #12

Updated by Florian Effenberger over 6 years ago

Actions #13

Updated by Christian Lohmaier over 6 years ago

Brett volunteered to start working on the alerting system on the recently set up Prometheus-based monitoring - so once we have some reliable/workable alerting rules in place, we can hook those up to email and IRC as a first step, and then expand to the sipgate SMS gateway.

Actions #14

Updated by Florian Effenberger over 6 years ago

Brett volunteered to start working on the alerting system on the
recently set up Prometheus-based monitoring - so once we have some
reliable/workable alerting rules in place, we can hook those up to email
and IRC as a first step, and then expand to the sipgate SMS gateway.

Sounds great! Both PushOver and the SMS gateway accept e-mail and HTTPS,
with the latter probably being preferred.
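
For the HTTPS route, a PushOver notification could be built along these lines. This is a rough sketch only: the endpoint and the `token`, `user`, `title` and `message` form fields are those of PushOver's public message API, while the function name `build_pushover_request` and the example values are hypothetical. The request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# PushOver's message endpoint; credentials are passed as form fields.
PUSHOVER_URL = "https://api.pushover.net/1/messages.json"


def build_pushover_request(app_token, user_key, title, message):
    """Build (but do not send) a POST request for one alert notification."""
    body = urlencode(
        {
            "token": app_token,  # application API token (placeholder in examples)
            "user": user_key,    # recipient's user or group key (placeholder)
            "title": title,
            "message": message,
        }
    ).encode("utf-8")
    return Request(PUSHOVER_URL, data=body, method="POST")
```

Passing the returned object to `urllib.request.urlopen` would actually deliver the message; keeping construction and sending separate makes it easy to check the payload without touching the network.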

Actions #15

Updated by Florian Effenberger about 6 years ago

Is there any ETA?
I really would like to get the cronjob mails cleaned up, and have some very basic notification system in place

Actions #16

Updated by Guilhem Moulin about 6 years ago

Florian Effenberger wrote:

Is there any ETA?

Was part of Brett's ‘prometheus’ branch, it's being tested since early- to mid-September. We only have email notifications for now.

Actions #17

Updated by Florian Effenberger about 6 years ago

Was part of Brett's ‘prometheus’ branch, it's being tested since early-
to mid-September. We only have email notifications for now.

Where do these notifications get sent? :)

Actions #18

Updated by Florian Effenberger over 5 years ago

  • Target version changed from Q2/2018 to Q2/2019

Florian Effenberger wrote:

Was part of Brett's ‘prometheus’ branch, it's being tested since early-
to mid-September. We only have email notifications for now.

Where do these notifications get sent? :)

Ping? :)

Actions #19

Updated by Guilhem Moulin over 5 years ago

Florian Effenberger wrote:

Florian Effenberger wrote:

Was part of Brett's ‘prometheus’ branch, it's being tested since early-
to mid-September. We only have email notifications for now.

Where do these notifications get sent? :)

Ping? :)

They reach cloph and me at the moment. They're intentionally not being sent to the tdf-admin list as we want to keep monitoring and infra as separate as possible.

Actions #20

Updated by Florian Effenberger over 5 years ago

They reach cloph and me at the moment. They're intentionally not being
sent to the tdf-admin list as we want to keep monitoring and infra as
separate as possible.

Does it make sense to ask other volunteer admins if they want to be
connected to the notification system?

Actions #21

Updated by Florian Effenberger over 5 years ago

  • Status changed from In Progress to Closed