Task #2210

monitoring notifications

Added by Florian Effenberger over 1 year ago. Updated 22 days ago.

Status: In Progress
Priority: High
Category: -
Target version: Team - Q2/2018
Start date:
Due date:
% Done: 0%
Estimated time:
Tags:
URL:

Description

The monitoring system doesn't send notifications reliably, and not everyone uses Telegram.
We should switch to e-mail, which would also let us use SMS via a gateway, PushOver for those who use it, or NotifyMyAndroid.


Related issues

Related to Infrastructure - Task #2211: greylog/log parsing (New)

Related to Infrastructure - Task #2208: add missing hosts to monitoring (In Progress)

Related to Infrastructure - Bug #1079: Status page (New)

History

#1 Updated by Florian Effenberger about 1 year ago

  • Priority changed from Normal to High
  • Target version changed from Q2/2017 to Q3/2017

For my private boxes I've switched to PushOver (which needs a small investment, as we need to buy licenses).
TDF also has an SMS gateway we can use.
Both can be fed via e-mail if required, with PushOver giving some more flexibility.

First of all, we should ensure proper notification via e-mail, as this can then easily be expanded to one of the other services.
Assigning it a High priority for Q3.
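
To illustrate the "e-mail first, then expand" idea, here is a minimal Python sketch that builds one alert mail for several recipients. The function name, the subject format and all addresses are hypothetical; in particular, the gateway address is only a placeholder and not the real address of the TDF SMS gateway or of PushOver.

```python
from email.message import EmailMessage


def build_alert_mail(sender, recipients, host, summary):
    """Build (but do not send) a plain-text alert mail.

    Adding a service that accepts e-mail only means adding
    another address to `recipients`.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = f"[monitoring] {host}: {summary}"
    msg.set_content(f"Host: {host}\nProblem: {summary}\n")
    return msg


# Hypothetical addresses, for illustration only.
mail = build_alert_mail(
    "monitoring@example.org",
    ["admin@example.org", "sms-gateway@example.org"],
    "web1",
    "disk full",
)
```

Such a message could then be handed to `smtplib.SMTP(...).send_message(mail)`, assuming a reachable mail server.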

#2 Updated by Florian Effenberger 10 months ago

  • Target version changed from Q3/2017 to Q4/2017

Shifting to Q4 for further work on it, but I'd be interested if there was any progress in Q3

#3 Updated by Guilhem Moulin 10 months ago

Unfortunately not :-/ I'm still using grafana for monitoring, which of course is highly suboptimal as it lacks notifications.

#4 Updated by Florian Effenberger 10 months ago

> Unfortunately not :-/ I'm still using grafana for monitoring, which of
> course is highly suboptimal as it lacks notifications.

Then let's address that early in Q4

#5 Updated by Florian Effenberger 9 months ago

A topic for the next admin call would be to agree on a method to deliver notifications
E-mail for sure, but a direct push message to the mobile as well would be very much desirable

I see four choices:

  • use Gmail's push feature -> drawback: everyone needs a @gmail.com address and I'm not sure if certain mails can create a direct alert
  • use SMS notification -> drawback: small fee, limited amount of text; advantages: works everywhere even w/o data
  • use PushOver -> drawback: license costs; advantages: works on Android, iOS and browser push notification
  • use a Telegram bot -> no insight into that, but could be the least expensive solution; works on desktops too
  • add your proposals here

I tried NotifyMyAndroid, which seems to work as well (but also has a license fee). However, it was impossible for me to reach the vendor by any means, so I'd not pursue that further.

#6 Updated by Guilhem Moulin 8 months ago

From the Nov. 21 infra call minutes:

  • SMS is the way to go, but we need the ability to specify schedules
    so not everyone is woken up in the middle of the night
  • cloph: want to revive the telegram notifications that we had in the
    past (TDF Monitoring bot)
  • Norbert: need the ability to temporarily disable the rules when doing
    manual maintenance
  • Norbert: it's crucial to avoid false positives (cf. infra ML…)
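
The requirements from the minutes (schedules, a maintenance switch, fewer false positives) can be sketched as a single decision function. This is only an illustration in Python: the class names, the function name and the "three failed checks in a row" threshold are assumptions, not something agreed on in the call.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass
class OnCallSlot:
    """Hypothetical schedule entry: who may receive SMS, and when."""
    person: str
    start: time
    end: time

    def covers(self, moment: time) -> bool:
        if self.start <= self.end:
            return self.start <= moment < self.end
        # Slot wraps around midnight, e.g. 20:00 to 08:00.
        return moment >= self.start or moment < self.end


@dataclass
class MaintenanceWindow:
    """Hypothetical silence: alerts for `host` are muted in [start, end)."""
    host: str
    start: datetime
    end: datetime


def sms_recipients(host, now, slots, windows, consecutive_failures, threshold=3):
    """Return the people to notify by SMS, or an empty list."""
    # Rules are temporarily disabled during manual maintenance.
    if any(w.host == host and w.start <= now < w.end for w in windows):
        return []
    # Require several failed checks in a row to reduce false positives.
    if consecutive_failures < threshold:
        return []
    # Only people whose slot covers the current time are notified.
    return [s.person for s in slots if s.covers(now.time())]
```

With a day slot for one admin and a night slot for another, only the person currently on call would be notified, and nobody would be while a maintenance window is active for the affected host.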

#7 Updated by Florian Effenberger 4 months ago

  • Target version changed from Q4/2017 to Q2/2018

I'd like to put the monitoring bits into Q2; I think it is really critical to have something up to date here.
Where do we stand? I heard about some experiments by Brett?

#8 Updated by Guilhem Moulin 4 months ago

  • Related to Task #2211: greylog/log parsing added

#9 Updated by Guilhem Moulin 4 months ago

  • Status changed from New to In Progress

It's the hot topic in Q1 (and probably also early Q2). See minutes from the Feb 20 infra call:
https://listarchives.libreoffice.org/global/website/msg15038.html

We've got a working test VM, but we need to bring that offsite. We agreed to reuse the monitoring.tdf host, and take the opportunity to switch OS from Ubuntu 14.04.5 LTS to Debian 9. I've yet to contact Filoo about that, though.

#10 Updated by Guilhem Moulin 4 months ago

  • Related to Task #2208: add missing hosts to monitoring added

#11 Updated by Florian Effenberger 4 months ago

Cool, thanks for your work on this!

#12 Updated by Florian Effenberger 2 months ago

#13 Updated by Christian Lohmaier 23 days ago

Brett volunteered to start working on the alerting system on the recently set up Prometheus-based monitoring - so once we have some reliable/workable alerting rules in place, we can hook those up to email and IRC as a first step, and then expand to the sipgate SMS gateway.

#14 Updated by Florian Effenberger 22 days ago

> Brett volunteered to start working on the alerting system on the
> recently set up Prometheus-based monitoring - so once we have some
> reliable/workable alerting rules in place, we can hook those up to email
> and IRC as a first step, and then expand to the sipgate SMS gateway.

Sounds great! Both PushOver and the SMS gateway accept e-mail and HTTPS;
the latter should probably be preferred.
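
For the HTTPS route, a minimal Python sketch of a PushOver notification could look like the following. The endpoint and the `token`, `user`, `message` and `title` fields follow PushOver's public message API as I understand it and should be checked against the current documentation. The function name and the credentials are placeholders. The SMS gateway's HTTPS interface is not covered here, since its details are not given in this ticket.

```python
import urllib.parse
import urllib.request

PUSHOVER_URL = "https://api.pushover.net/1/messages.json"


def build_pushover_request(token, user, message, title=None):
    """Build (but do not send) a form-encoded POST request for PushOver."""
    fields = {"token": token, "user": user, "message": message}
    if title is not None:
        fields["title"] = title
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(PUSHOVER_URL, data=body, method="POST")


# Placeholder credentials, for illustration only.
request = build_pushover_request("APP_TOKEN", "USER_KEY", "disk full", title="web1")
```

Sending would then be a matter of `urllib.request.urlopen(request)`, with real credentials and error handling added.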
