Task #1587

Task #1147: Clone gerrit VM to provide staging gerrit installation

Bump Gerrit version to 2.11.7

Added by David Ostrovsky about 2 years ago. Updated almost 2 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
Start date:
Due date:
% Done:

0%

Estimated time:
Tags:
Documentation, Salt
URL:

Description

The new Gerrit version fixes several critical bugs and introduces new features
such as inline edit. It also removes the old change screen; only very few people
seem to be affected by this removal.

Prerequisites:

o Synchronize production configuration with gerrit site in Git: https://gerrit.libreoffice.org/gitweb?p=dev-tools.git;a=tree;f=gerrit/gerrit_site
o Open source Gerrit infra scripts (installation, backup, recovery, salt): add them to the same repository
o Use these scripts to set up staging Gerrit instance, see Task 1147

Upgrade plan for production and staging Gerrit (p = production, s = staging)

0. announce Gerrit outage [p]
1. shut down Gerrit
2. backup database
3. backup Git
4. upgrade Gerrit including all installed plugins
5. reindex Gerrit (estimated time needed for reindex: 45 min.)
6. start Gerrit
7. verify that everything still works (including staging Jenkins Gerrit trigger plugin on staging Jenkins instance) [s]
8. announce Gerrit availability [p]
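
A minimal dry-run sketch of steps 1 to 6, not the actual infra scripts. The commands and /srv paths are the ones quoted in the test procedure further down in this ticket; BACKUP_DIR, the backup layout and the step helper are hypothetical. By default it only prints the commands; steps 0, 7 and 8 stay manual.

```shell
#!/usr/bin/env bash
# Dry-run sketch of upgrade steps 1-6 (assumptions: paths as on the test VM,
# BACKUP_DIR and its layout are made up for illustration).
set -eu

GERRIT_SITE="${GERRIT_SITE:-/srv/gerrit}"        # Gerrit site directory
REPO_DIR="${REPO_DIR:-/srv/repositories}"        # Git repositories
BACKUP_DIR="${BACKUP_DIR:-/srv/backup}"          # hypothetical backup location
DRY_RUN="${DRY_RUN:-1}"                          # 1 = print only, 0 = execute

# Print a command line; execute it only when DRY_RUN=0.
step() {
    echo "+ $1"
    if [ "$DRY_RUN" = "0" ]; then
        sh -c "$1"
    fi
}

upgrade_plan() {
    step "sudo -u gerrit $GERRIT_SITE/bin/gerrit.sh stop"                                    # 1
    step "sudo -u postgres pg_dump -c gerrit | gzip > $BACKUP_DIR/gerrit_dump.sql.gz"        # 2
    step "rsync -a --delete $REPO_DIR/ $BACKUP_DIR/repositories/"                            # 3
    step "sudo -u gerrit java -jar $GERRIT_SITE/bin/gerrit.war init --batch -d $GERRIT_SITE" # 4
    step "sudo -u gerrit java -jar $GERRIT_SITE/bin/gerrit.war reindex -d $GERRIT_SITE"      # 5
    step "sudo -u gerrit $GERRIT_SITE/bin/gerrit.sh start"                                   # 6
}

upgrade_plan
```

Running it with DRY_RUN=0 would execute each line through sh -c, so the printed plan should be reviewed on staging first.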


Related issues

Blocked by Infrastructure - Task #1625: Gerrit: Install and configure gerrit-oauth-plugin to enable GitHub- and Google-OAuth2 providers (Closed)

History

#1 Updated by David Ostrovsky about 2 years ago

Gerrit 2.11.5 was released yesterday: [1].

#2 Updated by Florian Effenberger about 2 years ago

Cloph wanted to have a look at the staging VM, so you and Norbert can
try out the upgrade path then - I'll check with Cloph during today's
team call on this item

#3 Updated by David Ostrovsky about 2 years ago

There is one show stopper for the upgrade for OpenStack project: [1].

#4 Updated by Christian Lohmaier about 2 years ago

some test-runs on testing VM:

#### update test procedure ####

# stop gerrit
sudo -u gerrit /srv/gerrit/bin/gerrit.sh stop
# clean cache and index dirs
# this might not be needed for production upgrade, but with that we're on the safe side
# and also have the worst-case (time-wise) scenario, since a reindex run is necessary...
sudo -u gerrit rm -r /srv/gerrit/index/*
sudo -u gerrit rm /srv/gerrit/cache/*

# repositories copied from production to test-vm, copy to gerrit's dir
sudo -u gerrit rsync -ahvP --delete repositories/ /srv/repositories/

# read-in database dump from production matching the repositories' state
# sudo -u postgres pg_dump -c gerrit |gzip > ~/gerrit_dump.sql.gz
# → dump created with -c, so it clears existing data before import
zcat gerrit_dump.sql.gz |sudo -u postgres psql gerrit_test

# run new gerrit's init to update database schemes
sudo -u gerrit java -jar /srv/gerrit/bin/gerrit.war init --batch -d /srv/gerrit

# recreate index
# needs database.poolLimit set manually in gerrit.config - docs seem to be
# wrong about the default?
# threads don't seem to help much - it is working on repositories in parallel
# it seems, but almost all LO stuff is in the core-repo, load peaks at 3...
sudo -u gerrit java -jar /srv/gerrit/bin/gerrit.war reindex --threads 8 -d /srv/gerrit

Reindexing changes: projects: 100% (26/26), 94% (18702/19792), done    
Reindexed 18702 changes in 6242.0s (3.0/s)
6910.75user 290.54system 1:44:14elapsed 115%CPU (0avgtext+0avgdata 2956548maxresident)k
653872inputs+23538768outputs (0major+2260479minor)pagefaults 0swaps

# start gerrit
sudo -u gerrit /srv/gerrit/bin/gerrit.sh start

→ seems to work OK

→ ETA for downtime: 2h, + whatever it takes to create a backup :-)
  (enough diskspace available on production gerrit to duplicate the /opt/gerrit tree, so no problem)
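
As a rough cross-check of that ETA, the figures from the reindex log above can be turned into a rate and an extrapolated run time. This is only a sketch: it assumes all 19792 changes would be processed at the same average rate, which the log does not confirm, and the function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sanity-check the reindex figures from the log (extrapolation is an assumption).
set -eu

REINDEXED=18702        # changes reindexed, from the log
TOTAL=19792            # changes listed in the progress line
ELAPSED_S=6242.0       # reindex duration in seconds, from the log

# Average changes per second, one decimal place.
reindex_rate() {
    awk -v n="$1" -v s="$2" 'BEGIN { printf "%.1f\n", n / s }'
}

# Minutes needed for "total" changes at the observed average rate.
full_run_minutes() {
    awk -v total="$1" -v n="$2" -v s="$3" 'BEGIN { printf "%.0f\n", total * s / n / 60 }'
}

echo "rate: $(reindex_rate "$REINDEXED" "$ELAPSED_S") changes/s"
echo "all $TOTAL changes: ~$(full_run_minutes "$TOTAL" "$REINDEXED" "$ELAPSED_S") min"
```

This gives 3.0 changes/s, matching the log, and roughly 110 minutes for the full set, which is in line with the 2h estimate before backups.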

#5 Updated by Christian Lohmaier about 2 years ago

  • Blocked by Task #1625: Gerrit: Install and configure gerrit-oauth-plugin to enable GitHub- and Google-OAuth2 providers added

#6 Updated by David Ostrovsky almost 2 years ago

  • Subject changed from Bump Gerrit version to 2.11.5 to Bump Gerrit version to 2.11.6

Gerrit 2.11.6 was released today: [1]. It includes fixes for two problems reported by cloph and shm_get:

  1. Don’t create new account when claimed OAuth identity is unknown. The Claimed Identity feature was enabled to support old Google OpenID accounts, that cannot be activated anymore. In some corner cases, when for example the URL is not from the production Gerrit site, for example on a staging instance, the OpenID identity may deviate from the original one. In case of mismatch, the lookup of the user for the claimed identity would fail, causing a new account to be created.
  2. Suggest to upgrade installed plugins per default during site initialization to new Gerrit version.

We should be able to upgrade now. As agreed in infra team, this upgrade should also include
installation and configuration of gerrit-oauth-plugin: [2]. The latest binary can be grabbed from
Gerrit CI: [3] or alternatively from gerrit-test instance where the plugin is up and running.

cloph must still provide plugin configuration for both providers: GitHub and Google. They only exist
for gerrit test instance for now.

It was also discussed during FOSDEM that reviewers-plugin [4] would significantly simplify the review
workflow, as it can add reviewers automagically during push operation. The configuration can be done
in UI and per project (saved in git on refs/meta branch). Configuration uses the same predicate logic as
in watch machinery. This plugin is used by: Ericsson, Sony and Google. The binary is on Gerrit CI: [5].

#7 Updated by David Ostrovsky almost 2 years ago

  • Subject changed from Bump Gerrit version to 2.11.6 to Bump Gerrit version to 2.11.7

Gerrit 2.11.7 was released today: [1], [2], so we should use it for our upcoming Gerrit upgrade.

#8 Updated by Norbert Thiebaud almost 2 years ago

dry testing on vm171 on excelsior
what I got so far:

new vm, one 200GB disk, 2 network (br0 and br1)
disk partitioned using lvm with
root
/var
/srv
swap

initial boot

forced by the nanny-installer to create a dummy 'user'

finished the install normally.. selected only sudo and default utility tools (iow no graphical env)

rebooted

ssh as the dummy user, su to root
install minion client
edit /etc/salt/minion.d/9999users.conf as usual
restart the minion
accept it on salt server

highstate on vm171
ran for a while.. after 30 minutes or so.. investigated.. nothing moving, yet the command is still hanging... usual salt behavior
kill the command on the server (it told us it was going to run in the background and gave the id of a magical job file that we can query 'later')
of course no such luck
rebooted vm171
re-ran highstate.. this time it runs and reports a result.. mostly green except a weird message about fqdn 'not defined'
this turned out to be a pillar issue... for now that was only impacting the anti-virus email scanner.. so ignore

ssh to vm171 things are usable...
delete the dummy user that the installer forced us to create

note: highstate at this point only had the basic stuff like users and sudo....

complete the pillar data using vm178 as an example
add gerrit as a state then highstate

get the irker state files from vm162, add irkerd state to vm171
highstate failed with a weird message... after hours wasted, figured out the game with saltenv=develop

highstate now works

sudo to gerrit, create an ssh key (with passwd) for 'migration', add an entry in .ssh/config for gerrit_prod to that effect.
ssh to gerrit_prod, install that key as an authorized key and put a real shell in /etc/passwd for gerrit's user
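
For illustration, a .ssh/config entry of that kind could look like the stanza printed below. This is a sketch, not the entry actually used: the key file name id_migration, the User value and the IdentitiesOnly line are assumptions; only the gerrit_prod alias and the production host name come from this ticket.

```shell
#!/usr/bin/env bash
# Print an ssh_config stanza for the gerrit_prod alias.
# Assumptions: key file name, remote user "gerrit", IdentitiesOnly.
set -eu

ssh_stanza() {
    default_key='~/.ssh/id_migration'      # hypothetical key file name
    key_file="${1:-$default_key}"
    cat <<EOF
Host gerrit_prod
    HostName gerrit.libreoffice.org
    User gerrit
    IdentityFile $key_file
    IdentitiesOnly yes
EOF
}

# The passphrase-protected key itself would be created separately, e.g.:
#   ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_migration
ssh_stanza
```

Appending the output to ~/.ssh/config of the gerrit user on vm171 lets the later rsync commands refer to the host simply as gerrit_prod.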

I tried to get git-daemon under systemd with no success so I used git-daemon-run package
with a tweak in /etc/sv/git-daemon/run to reflect user and repo's home

on vm171 rsync the git repos from gerrit prod gerrit_site/git to /srv/repositories

cd /srv/repositories
rsync -avp gerrit_prod:gerrit_site/git/ .

fyi took: 36 minutes

restore the gerrit db
log on to vm171 as nthiebaud with -A
assign nthiebaud gerrit role and in general admin for gerrit db

ssh gerrit.libreoffice.org "pg_dump -Fc gerrit" | sudo -u postgres pg_restore -c -d gerrit

restore gerrit_site from prod (excluding git repos already synced)
on vm171 as gerrit

cd /srv/gerrit
rsync -avpP --exclude 'git' gerrit_prod:gerrit_site .

that takes about 4 minutes

update gerrit.config to reflect the new position of the git repos

diff --git a/gerrit.config b/gerrit.config
index af37556..4c9f2e9 100644
--- a/gerrit.config
+++ b/gerrit.config
@@ -1,5 +1,5 @@
[gerrit]
- basePath = git
+ basePath = /srv/repositories
canonicalWebUrl = https://gerrit.libreoffice.org/
canonicalGitUrl = git://gerrit.libreoffice.org/
changeScreen = CHANGE_SCREEN2
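
Since gerrit.config uses git-config syntax, the same change can also be scripted with git config instead of editing the file by hand. A minimal sketch follows; the set_base_path helper is hypothetical, and the demo works on a scratch file rather than the live config.

```shell
#!/usr/bin/env bash
# Set gerrit.basePath with "git config -f" (helper name is made up).
set -eu

# $1 = path to a gerrit.config file, $2 = new repository base path
set_base_path() {
    git config -f "$1" gerrit.basePath "$2"
}

# Demonstrate on a scratch copy, not on the real gerrit.config.
cfg="$(mktemp)"
printf '[gerrit]\n\tbasePath = git\n\tcanonicalWebUrl = https://gerrit.libreoffice.org/\n' > "$cfg"

set_base_path "$cfg" /srv/repositories
git config -f "$cfg" --get gerrit.basePath
rm -f "$cfg"
```

The other keys in the [gerrit] section are left untouched, so the resulting diff should match the one above.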

wrt dns and ip... I'm thinking the best would be to:
make sure we have the ssh_host keys transferred
and then just swap the ip addresses.. iow change the ip of gerrit prod, give the ip to the new box when we are ready to go live. that way we do not even have to do the dns dance
and there is no risk of a known_hosts issue...

#9 Updated by Norbert Thiebaud almost 2 years ago

  • Status changed from New to Closed

Done
