Task #2210

monitoring notifications

Added by Florian Effenberger over 1 year ago. Updated 1 day ago.

Status: In Progress
Priority: High
Category: -
Target version: Team - Q2/2018
% Done: 0%

Description

The monitoring system doesn't send notifications reliably, and not everyone uses Telegram.
We should switch to e-mail, which would then also let us reach SMS via a gateway, and PushOver or NotifyMyAndroid for those who use them.


Related issues

Related to Infrastructure - Task #2211: greylog/log parsing (New)

Related to Infrastructure - Task #2208: add missing hosts to monitoring (In Progress)

Related to Infrastructure - Bug #1079: Status page (Closed)

History

#1 Updated by Florian Effenberger over 1 year ago

  • Priority changed from Normal to High
  • Target version changed from Q2/2017 to Q3/2017

For my private boxes I've switched to PushOver (which needs a small investment, as we would need to buy licenses)
TDF also has an SMS gateway we can use
Both can be fed via e-mail if required, with PushOver giving some more flexibility

First of all we should ensure proper notification via e-mail, as this can then easily be extended to one of the other services
Assigning it a High priority for Q3
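
To illustrate the "e-mail first, then expand" idea, here is a minimal sketch of composing one alert mail that goes to several recipients at once, so that an e-mail-fed SMS or push gateway would simply be one more address. The function name, the subject prefix and all addresses are hypothetical and not taken from the actual setup.

```python
from email.message import EmailMessage


def build_alert_mail(host, summary, sender, recipients):
    """Build a plain-text alert mail (hypothetical helper, illustration only)."""
    msg = EmailMessage()
    msg["From"] = sender
    # A gateway that accepts e-mail is just another recipient address.
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = f"[monitoring] {host}: {summary}"
    msg.set_content(f"Host: {host}\nProblem: {summary}\n")
    return msg


if __name__ == "__main__":
    # All addresses below are placeholders.
    mail = build_alert_mail(
        "vm1.example.org",
        "disk almost full",
        "monitoring@example.org",
        ["admin@example.org", "sms-gateway@example.org"],
    )
    print(mail)
```

Actually sending it could then be done with the standard library's smtplib, e.g. `smtplib.SMTP("localhost").send_message(mail)`, assuming a local mail relay is available.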

#2 Updated by Florian Effenberger about 1 year ago

  • Target version changed from Q3/2017 to Q4/2017

Shifting to Q4 for further work on it, but I'd be interested if there was any progress in Q3

#3 Updated by Guilhem Moulin about 1 year ago

Unfortunately not :-/ I'm still using Grafana for monitoring, which of course is highly suboptimal as it lacks notifications.

#4 Updated by Florian Effenberger about 1 year ago

Guilhem Moulin wrote:

Unfortunately not :-/ I'm still using Grafana for monitoring, which of course is highly suboptimal as it lacks notifications.

Then let's address that early in Q4

#5 Updated by Florian Effenberger 12 months ago

A topic for the next admin call would be to agree on a method to deliver notifications
E-mail for sure, but a direct push message to mobile phones as well would be very desirable

I see four choices so far:

  • use Gmail's push feature -> drawback: everyone needs a @gmail.com address and I'm not sure if certain mails can create a direct alert
  • use SMS notification -> drawback: small fee, limited amount of text; advantages: works everywhere even w/o data
  • use PushOver -> drawback: license costs; advantages: works on Android, iOS and browser push notification
  • use a Telegram bot -> no insight into that, but could be the least expensive solution; works on desktops too
  • add your proposals here

I tried NotifyMyAndroid, which seems to work as well (but also has a license fee). However, it was impossible for me to reach the vendor by any means, so I'd not pursue that further.

#6 Updated by Guilhem Moulin 11 months ago

From the Nov. 21 infra call minutes:

  • SMS is the way to go, but we need the ability to specify schedules
    so not everyone is woken up in the middle of the night
  • cloph: want to revive the telegram notifications that we had in the
    past (TDF Monitoring bot)
  • Norbert: need the ability to temporarily disable the rules when doing
    manual maintenance
  • Norbert: it's crucial to avoid false positives (cf. infra ML…)
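
As a rough illustration of the first and third points above (per-person schedules and temporarily disabling alerts during manual maintenance), here is a minimal sketch. All class, function, person and host names are hypothetical; this is not how the real monitoring setup is or will be implemented.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass
class Recipient:
    name: str
    start: time  # start of the daily window in which SMS is allowed
    end: time    # end of that window (may wrap past midnight)

    def available(self, now: datetime) -> bool:
        t = now.time()
        if self.start <= self.end:
            return self.start <= t < self.end
        # Window wraps around midnight, e.g. 20:00-08:00.
        return t >= self.start or t < self.end


@dataclass
class Silence:
    host: str
    begins: datetime
    ends: datetime

    def covers(self, host: str, now: datetime) -> bool:
        return host == self.host and self.begins <= now < self.ends


def sms_targets(host, now, recipients, silences):
    """Return the names to notify by SMS for an alert on `host` at `now`."""
    if any(s.covers(host, now) for s in silences):
        return []  # host is under manual maintenance, stay quiet
    return [r.name for r in recipients if r.available(now)]
```

With one person covering 08:00-20:00 and another covering 20:00-08:00, an alert at 03:00 would only reach the second person, and nobody at all while a silence for that host is active.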

#7 Updated by Florian Effenberger 7 months ago

  • Target version changed from Q4/2017 to Q2/2018

I'd like to put the monitoring bits into Q2, I think it is really critical to have something up to date here
Where do we stand? I heard about some experiments by Brett?

#8 Updated by Guilhem Moulin 7 months ago

  • Related to Task #2211: greylog/log parsing added

#9 Updated by Guilhem Moulin 7 months ago

  • Status changed from New to In Progress

It's the hot topic in Q1 (and probably also early Q2). See the minutes from the Feb 20 infra call:
https://listarchives.libreoffice.org/global/website/msg15038.html

We've got a working test VM, but we need to move it offsite. We agreed to reuse the monitoring.tdf host, and take the opportunity to switch the OS from Ubuntu 14.04.5 LTS to Debian 9. I've yet to contact Filoo about that, though.

#10 Updated by Guilhem Moulin 7 months ago

  • Related to Task #2208: add missing hosts to monitoring added

#11 Updated by Florian Effenberger 7 months ago

Cool, thanks for your work on this!

#12 Updated by Florian Effenberger 5 months ago

#13 Updated by Christian Lohmaier 4 months ago

Brett volunteered to start working on the alerting system for the recently set up Prometheus-based monitoring. Once we have some reliable/workable alerting rules in place, we can hook those up to email and IRC as a first step, and then expand to the sipgate SMS gateway.

#14 Updated by Florian Effenberger 4 months ago

Christian Lohmaier wrote:

Brett volunteered to start working on the alerting system for the recently set up Prometheus-based monitoring. Once we have some reliable/workable alerting rules in place, we can hook those up to email and IRC as a first step, and then expand to the sipgate SMS gateway.

Sounds great! Both PushOver and the SMS gateway accept e-mail and HTTPS, the latter probably being preferable.

#15 Updated by Florian Effenberger about 2 months ago

Is there any ETA?
I really would like to get the cronjob mails cleaned up, and have some very basic notification system in place

#16 Updated by Guilhem Moulin 2 days ago

Florian Effenberger wrote:

Is there any ETA?

It was part of Brett's ‘prometheus’ branch and has been in testing since early to mid-September. We only have email notifications for now.

#17 Updated by Florian Effenberger 1 day ago

Guilhem Moulin wrote:

It was part of Brett's ‘prometheus’ branch and has been in testing since early to mid-September. We only have email notifications for now.

Where do these notifications get sent? :)
