Task #1096

review infra blogpost

Added by Florian Effenberger almost 3 years ago. Updated over 1 year ago.

Status: Closed
Priority: Normal
Assignee:
Target version: Team - Pool
Start date:
Due date:
% Done: 0%
Estimated time:

Description

Please draft a text about your achievements during the first and second quarter, to be published in TDF's official blog

infra_blogpost_mike_fixed.odt (25.4 KB) Mike Saunders, 2016-03-02 15:39

History

#1 Updated by Alexander Werner over 2 years ago

  • Priority changed from Normal to Urgent

#2 Updated by Florian Effenberger over 2 years ago

  • Subject changed from blog about Q1/2015 achievements to blog about Q1+Q2/2015 achievements

#3 Updated by Alexander Werner over 2 years ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 90

Only the finishing touches are still needed; I will add the finished post here this evening.

#4 Updated by Alexander Werner over 2 years ago

The year 2015 brought some challenging and thrilling developments in the ongoing restructuring of our infrastructure.
At the beginning of the year, the migration of our existing virtual machines and bare-metal machines was under way, following an extensive test phase of the new virtualization platform.
The virtualization platform consists of three servers, each with 256 GB RAM, 64 CPU cores and quite a lot of HDD space. One of the machines is meant to be used exclusively by developers for crash testing. These machines are all hosted at manitu in St. Wendel, Germany. The personal and responsive support of manitu, together with the flexibility of their offerings, allowed us to set up a private network between these machines and others that manitu offered to house for us.
After some problems with the software for the new virtualization platform, which were already covered extensively, much work went into setting up more and more virtual machines in which services run isolated from each other. This has already led to the move of the hosted blog to one of our own machines, which gives us more control over the installed plugins and over the WordPress setup in use.
During the Hackfest at the University of Gran Canaria, work went into making our Salt states easier to hack on for people who want to get involved in our infra. This also resulted in a tutorial video on how to set up a development environment for our infrastructure.
Monthly infra calls were also set up, taking place on the last Wednesday of every month at 1700 UTC. They resulted in the creation of a weekly maintenance window for server upgrades, reboots and major configuration changes. The maintenance window is every Monday, 0300-0500 UTC.
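The schedule above can be expressed in a few lines of Python. This is only an illustrative sketch: the function names `last_wednesday`, `infra_call_start` and `in_maintenance_window` are made up for this example and are not part of any TDF tooling. The only facts taken from the text are the last-Wednesday rule, the 1700 UTC start and the Monday 0300-0500 UTC window.

```python
import calendar
from datetime import date, datetime, time, timezone


def last_wednesday(year: int, month: int) -> date:
    """Return the date of the last Wednesday of the given month."""
    # monthcalendar() returns one list per week (Monday first by default);
    # days that belong to a neighbouring month are represented as 0.
    wednesdays = [
        week[calendar.WEDNESDAY]
        for week in calendar.monthcalendar(year, month)
        if week[calendar.WEDNESDAY] != 0
    ]
    return date(year, month, wednesdays[-1])


def infra_call_start(year: int, month: int) -> datetime:
    """Return the start of the monthly infra call (1700 UTC) as an aware datetime."""
    return datetime.combine(last_wednesday(year, month), time(17, 0), tzinfo=timezone.utc)


def in_maintenance_window(moment: datetime) -> bool:
    """Check whether a timezone-aware datetime falls on a Monday, 0300-0500 UTC."""
    utc = moment.astimezone(timezone.utc)
    return utc.weekday() == calendar.MONDAY and time(3, 0) <= utc.time() < time(5, 0)
```

A script that applies upgrades could, for example, call `in_maintenance_window(datetime.now(timezone.utc))` and refuse to reboot a host outside the window. The end of the window is treated as exclusive here, which is a design choice of this sketch rather than something the text specifies.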
During the calls, the community decided to upgrade the base operating system to Debian 8 over the next few months. This has already been carried out for one of our virtualization hosts during the newly established maintenance window, to check for any problems that may occur during the update. During the upgrade, some obstacles were identified, and workarounds that allow smooth upgrades were put in place.
The friendly and generous offer by Thomas Krenn, a major server supplier in southern Germany, will allow for the setup of two additional Windows buildbots with powerful dual CPUs and high-speed SSDs, and two more Linux buildbots with the same specs. These buildbots will also be housed at Manitu and connect to our growing intranet there. Two more servers from Thomas Krenn will be housed at Manitu and at Adfinis SyGroup in Switzerland and provide external backup space. It is planned to connect all TDF-owned hardware with a VPN, forming a worldwide intranet.
In the second half of the year, more machines were migrated to Debian 8, including the two hypervisors still running Wheezy. Due to the huge success of the Thomas Krenn buildbots, two more were ordered and now extend the intranet at Manitu, with a high-performance Cloud Core Router from MikroTik becoming the central connection point of our intranet. The Cloud Core Router also serves as a VPN provider for TDF members in areas with restricted internet access, such as the LibreOffice Conference in Aarhus, Denmark, in October.
As the number of servers from Thomas Krenn grew, we decided to migrate our monitoring platform to TKmon, running on a highly available virtual machine at filoo that is separated from the rest of our infrastructure. TKmon integrates with the support from Thomas Krenn and notifies them of hardware failures automatically. TKmon is open source software and uses tools such as Icinga and PNP4Nagios.
To be more flexible with the monitoring notifications, I wrote a tool called TMB that provides a bot for the Telegram chat service and sends notifications to admins. This happened with the help of PyCharm, a great Python IDE from JetBrains, who provide free licenses for open source projects.
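
For readers curious what sending such a notification involves, here is a minimal sketch. It is not the TMB code; it only assumes the public Telegram Bot API `sendMessage` method. The names `format_alert` and `build_notification`, the token and the chat ID are placeholders invented for illustration.

```python
import json
import urllib.request

# Base URL of the public Telegram Bot API.
TELEGRAM_API = "https://api.telegram.org"


def format_alert(host: str, service: str, state: str, output: str) -> str:
    """Render a monitoring event as a single chat line (hypothetical format)."""
    return f"[{state}] {host}/{service}: {output}"


def build_notification(token: str, chat_id: int, text: str) -> urllib.request.Request:
    """Build (but do not send) a sendMessage request for the given bot token."""
    payload = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{TELEGRAM_API}/bot{token}/sendMessage",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen()` would actually deliver the message. Building and sending are kept separate here so that the request can be inspected without network access or a real bot token.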

The infrastructure currently consists of 3 rented hypervisors at Manitu, each with 4 CPUs, 256 GB RAM and 8 HDDs and, in part, SSDs. Additional rented servers are one backup server and one website stand-in host that was needed after the virtualization problems occurred at the beginning of the year and that will be decommissioned soon. 9 housed servers from Thomas Krenn with Intel SSDs and powerful dual CPUs are only reachable in the intranet, with access to it being controlled by the core router. On the hypervisors, there are currently 31 VMs, providing services such as AskBot, WordPress, Gerrit, Bugzilla, Jenkins, MozTrap and many more. At Hetzner there are currently 4 servers: one that contains the wiki, MirrorBrain and our public mailing lists, one for internal services, and 2 backup hosts, one of which provides storage capacity of over 17 TB and is currently being set up.

[http://salt-states-base.readthedocs.org/en/latest/_images/graphviz-e3bf01d100863ffbfeac9749c2e64792b346879d.png]

Much of our documentation and many of our Salt states are now published at https://github.com/tdf/salt-states-base; the compiled documentation can be found at http://salt-states-base.readthedocs.org/en/latest/. The Salt states are now tested with Travis CI, and the build results are at https://travis-ci.org/tdf/salt-states-base. It is therefore now very easy to contribute to development and improve the documentation: just fork the repository and create a pull request, and the results will automatically be tested in Travis CI. If you want to contribute to the infrastructure of our projects, you are invited to join our monthly infra calls, the next taking place on January 20th, 1800 UTC, or to introduce yourself to the infra team in #tdf-infra on freenode.

#5 Updated by Alexander Werner over 2 years ago

  • Status changed from In Progress to Closed

#6 Updated by Florian Effenberger over 2 years ago

  • Status changed from Closed to Resolved

Setting to Resolved instead of Closed so I don't lose it from my radar :-)

#7 Updated by Florian Effenberger over 2 years ago

  • Target version set to Q3/2015

#8 Updated by Florian Effenberger over 2 years ago

  • Description updated (diff)
  • Due date deleted (2015-04-01)
  • Priority changed from Urgent to Normal
  • % Done changed from 90 to 100

#10 Updated by Florian Effenberger about 2 years ago

  • Subject changed from blog about Q1+Q2/2015 achievements to blog about Q1+Q2+Q3/2015 achievements
  • Status changed from Resolved to In Progress
  • Target version changed from Q3/2015 to Q4/2015
  • % Done changed from 100 to 0

Merging this with #1381

#11 Updated by Alexander Werner about 2 years ago

Updated the text.

#12 Updated by Florian Effenberger about 2 years ago

  • Status changed from In Progress to Resolved

Let's defer publication to Q1; Florian will share details via private mail.

#13 Updated by Florian Effenberger almost 2 years ago

  • Subject changed from blog about Q1+Q2+Q3/2015 achievements to review and publish Alex' blogpost
  • Status changed from Resolved to In Progress
  • Assignee changed from Alexander Werner to Florian Effenberger
  • Target version changed from Q4/2015 to Q1/2016

#14 Updated by Florian Effenberger almost 2 years ago

  • Subject changed from review and publish Alex' blogpost to update infra blogpost
  • Assignee changed from Florian Effenberger to Alexander Werner

As discussed on the phone, please update with the latest status and then we (finally) publish

#15 Updated by Florian Effenberger almost 2 years ago

Please keep in mind that I need this ideally by tomorrow, or by Friday at the latest, so we can proceed.

#16 Updated by Florian Effenberger almost 2 years ago

  • Subject changed from update infra blogpost to review infra blogpost
  • Assignee changed from Alexander Werner to Florian Effenberger

#17 Updated by Florian Effenberger almost 2 years ago

  • Assignee changed from Florian Effenberger to Mike Saunders

My redacted version is attached. Mike, can you put that into some nice English and add the edited version to this ticket? :)
Thanks!

The year 2015 brought some challenging and thrilling developments in the ongoing restructuring of our infrastructure.
At the beginning of the year, the migration of our existing virtual machines and bare-metal machines was under way, following an extensive test phase of the new virtualization platform.
The virtualization platform consists of three servers, each with 256 GB RAM, 64 CPU cores and quite a lot of HDD space. One of the machines is meant to be used exclusively by developers for crash testing. These machines are all hosted at manitu in St. Wendel, Germany, and are currently being migrated to our own dedicated 42U rack, which also gives us the flexibility to set up a private network between these machines and others that we house there.
After some problems with the software for the new virtualization platform, much work went into setting up more and more virtual machines in which services run isolated from each other. This has already led to the move of the hosted blog to one of our own machines, which gives us more control over the installed plugins and over the WordPress setup in use.
During the Hackfest at the University of Gran Canaria, work went into making our Salt states easier to hack on for people who want to get involved in our infra. This also resulted in a tutorial video on how to set up a development environment for our infrastructure.
Monthly infra calls were also set up, taking place on the last Wednesday of every month at 1700 UTC. They resulted in the creation of a weekly maintenance window for server upgrades, reboots and major configuration changes. The maintenance window is every Monday, 0300-0500 UTC.
During the calls, the community decided to upgrade the base operating system to Debian 8 over the next few months. This has already been carried out for one of our virtualization hosts during the newly established maintenance window, to check for any problems that may occur during the update. During the upgrade, some obstacles were identified, and workarounds that allow smooth upgrades were put in place.
We have also invested in hardware from the vendor Thomas Krenn, which will allow for the setup of two additional Windows buildbots with powerful dual CPUs and high-speed SSDs, and two more Linux buildbots with the same specs. These buildbots will also be housed at Manitu and connect to our growing intranet there. Two more Krenn servers will be used for backup space. It is planned to connect all TDF-owned hardware with a VPN, forming a worldwide intranet.
In the second half of the year, more machines were migrated to Debian 8, including the two hypervisors still running Wheezy. Due to the huge success of the new buildbots, two more were ordered and now extend the intranet, with a high-performance Cloud Core Router from MikroTik becoming the central connection point of our intranet. The Cloud Core Router also serves as a VPN provider for TDF members in areas with restricted internet access, such as the LibreOffice Conference in Aarhus, Denmark, in October.
As the number of new Krenn servers grew, we decided to migrate our monitoring platform to TKmon, running on a highly available virtual machine at filoo that is separated from the rest of our infrastructure. TKmon integrates with the hardware vendor's support and notifies them of hardware failures automatically. TKmon is open source software and uses tools such as Icinga and PNP4Nagios.
To be more flexible with the monitoring notifications, I wrote a tool called TMB that provides a bot for the Telegram chat service and sends notifications to admins. This happened with the help of PyCharm, a great Python IDE from JetBrains, who provide free licenses for open source projects.

The infrastructure currently consists of 3 rented hypervisors at manitu, each with 4 CPUs, 256 GB RAM and 8 HDDs and, in part, SSDs. Additional rented servers are one backup server and one website stand-in host that was needed after the virtualization problems occurred at the beginning of the year and that will be decommissioned soon. 9 housed servers with Intel SSDs and powerful dual CPUs are only reachable in the intranet, with access to it being controlled by the core router. On the hypervisors, there are currently 31 VMs, providing services such as AskBot, WordPress, Gerrit, Bugzilla, Jenkins, MozTrap and many more. At Hetzner there are currently 4 servers: one that contains the wiki, MirrorBrain and our public mailing lists, one for internal services, and 2 backup hosts, one of which provides storage capacity of over 17 TB and is currently being set up.

[http://salt-states-base.readthedocs.org/en/latest/_images/graphviz-e3bf01d100863ffbfeac9749c2e64792b346879d.png]

Much of our documentation and many of our Salt states are now published at https://github.com/tdf/salt-states-base; the compiled documentation can be found at http://salt-states-base.readthedocs.org/en/latest/. The Salt states are now tested with Travis CI, and the build results are at https://travis-ci.org/tdf/salt-states-base. It is therefore now very easy to contribute to development and improve the documentation: just fork the repository and create a pull request, and the results will automatically be tested in Travis CI. If you want to contribute to the infrastructure of our projects, you are invited to join our monthly infra calls, the next taking place on ... or to introduce yourself to the infra team in #tdf-infra on freenode.

#18 Updated by Florian Effenberger almost 2 years ago

  • Project changed from Infrastructure to Marketing

#19 Updated by Florian Effenberger almost 2 years ago

  • Target version changed from Q1/2016 to Pool

#20 Updated by Mike Saunders almost 2 years ago

Attached the updated version of the blog post. Just fixed some small things to do with wording and word order -- nothing major. I also added a couple of headings to break up the text as it's quite long, but feel free to remove them!

#21 Updated by Florian Effenberger almost 2 years ago

Just to be on the safe side:
Before publishing, please get my approval - I'll have a look ~next week
I need to coordinate a few things before we go live :)

#22 Updated by Mike Saunders over 1 year ago

  • Status changed from In Progress to Closed

Closing this as it was assigned to me and has been published now.
