Task #1976

berta & antares: shares from local gluster server not mounted at boot (mount attempted before glusterfs-server is fully started)

Added by Christian Lohmaier over 8 years ago. Updated about 6 years ago.

Status:
Closed
Priority:
Low
Category:
Backups
Target version:
Team - Pool
Start date:
Due date:
% Done:

0%

Tags:

Description

This affects only berta (and antares), as those are the only hosts that have a gluster volume that is only provided by the local machine. For antares it doesn't matter if volumes aren't mounted at reboot, but for berta it is critical that the backup-berta volume is mounted for rsnapshot.

I tried to order the mounts by adding corresponding systemd unit files to declare dependencies, but unfortunately this is not enough; the mount is still attempted too early.

What is needed nevertheless is to have glusterfs-server depend not only on network (the systemd default target), but also on networking (the init.d job/service), as otherwise DHCP is not done yet and gluster might fail to resolve berta.tdf/the other gluster peers:

  • /etc/systemd/system/glusterfs-server.service.d/require_nw-online.conf
    [Unit]
    Wants=networking.service
    After=networking.service
    

This ensures that gluster is started only after the internal network is brought up.
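
To double-check that the drop-in is picked up, standard systemctl commands should do (nothing here is specific to this setup):

    # reload unit definitions so the drop-in takes effect
    systemctl daemon-reload
    # both properties should now list networking.service
    systemctl show glusterfs-server.service -p Wants -p After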

Failed attempts to ensure the volume mounts succeed:

  • adding x-systemd.requires=glusterfs-server.service to the options in /etc/fstab; this seems to be ignored completely by Debian 8 (see the example entry sketched after this list)
  • adding /etc/systemd/system/srv-fileshare-mnt.mount with the following content:
    [Unit]
    Description=Loads the local fileshare volume
    Wants=glusterfs-server.service
    After=glusterfs-server.service basic.target
    
    [Mount]
    What=antares.tdf:fileshare-antares
    Where=/srv/fileshare/mnt
    Type=glusterfs
    
    [Install]
    WantedBy=multi-user.target
    

    While enabling that unit correctly defers the mount until after the glusterfs-server job is started, it is still too early: gluster isn't done initializing and the mount fails. The problem might be that glusterfs-server is not a native systemd unit, but only a sysvinit one...
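
For reference, a sketch of what that fstab entry from the first attempt would look like (volume and mountpoint taken from the unit above; the other options are assumptions):

    # /etc/fstab: the x-systemd.requires option seemed to be ignored entirely on Debian 8
    antares.tdf:fileshare-antares /srv/fileshare/mnt glusterfs defaults,_netdev,x-systemd.requires=glusterfs-server.service 0 0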

So an easy workaround would be to add a @reboot cronjob with a sleep... but of course there still remains the chance that an rsnapshot job is triggered before the backup-berta volume is mounted (I didn't try what happens when the /srv/rsnapshot symlink points to a non-available dir and rsnapshot is run).
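
A minimal sketch of such a cron workaround (the file name and sleep duration are arbitrary, and it assumes an fstab entry exists for the mountpoint):

    # /etc/cron.d/mount-gluster (hypothetical): wait for gluster to settle, then mount if needed
    @reboot root sleep 60; mountpoint -q /srv/fileshare/mnt || mount /srv/fileshare/mnt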

Actions #1

Updated by Alexander Werner over 8 years ago

  • Status changed from New to Feedback

Another workaround would be to mount using another gluster server; the connection afterwards would still be local, only the initial management connection would then go through the different server.
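
A sketch of what that could look like in /etc/fstab, assuming berta.tdf is another peer in the same trust pool (only the initial volfile fetch goes through it; the data connection still ends up on the local brick):

    # fetch the volume description from a remote peer instead of the not-yet-ready local daemon;
    # backup-volfile-servers (a mount.glusterfs option) provides fallbacks for that fetch
    berta.tdf:fileshare-antares /srv/fileshare/mnt glusterfs defaults,_netdev,backup-volfile-servers=antares.tdf 0 0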

Actions #2

Updated by Florian Effenberger over 8 years ago

  • Due date set to 2016-07-31
  • Priority changed from Normal to High
  • Target version set to Q3/2016

This should be fixed before end-July - any status update already?

Actions #3

Updated by Florian Effenberger over 8 years ago

  • Status changed from Feedback to In Progress
  • Assignee changed from Alexander Werner to Christian Lohmaier

Actions #4

Updated by Florian Effenberger over 8 years ago

  • Priority changed from High to Normal

Actions #5

Updated by Florian Effenberger about 8 years ago

Florian Effenberger wrote:

This should be fixed before end-July - any status update already?

Ping? ;-)

Actions #6

Updated by Florian Effenberger about 8 years ago

  • Assignee changed from Christian Lohmaier to Guilhem Moulin
  • Target version changed from Q3/2016 to Pool

Assigning to Guilhem.
From what I recall, it's not that urgent - maybe you can quickly sync with Cloph about the impacts/caveats so we can prioritize and plan accordingly?
Adding to pool for the moment, as it doesn't look that super-urgent (anymore)

Actions #7

Updated by Florian Effenberger about 8 years ago

  • Due date deleted (2016-07-31)

Actions #8

Updated by Florian Effenberger almost 8 years ago

Is that still relevant?

Actions #9

Updated by Florian Effenberger over 7 years ago

Florian Effenberger wrote:

Is that still relevant?

Ping?

Actions #10

Updated by Guilhem Moulin over 7 years ago

Dunno, didn't reboot berta recently :-/ I'll come back to it next time we reboot the box

Actions #11

Updated by Florian Effenberger over 7 years ago

Guilhem Moulin wrote:

Dunno, didn't reboot berta recently :-/ I'll come back to it next time we reboot the box

Any updates? IMHO there were no reboots yet, but asking so we can update the ticket accordingly, as it's been open for quite a while

Actions #12

Updated by Guilhem Moulin over 7 years ago

  • Priority changed from Normal to Low

Last time we rebooted berta (June 18) we indeed had to mount the gluster fileshare manually. IMHO not a big deal as long as it doesn't interrupt the boot process and drop to a rescue shell. rsnapshot will complain loudly enough (through cron mails) that the mountpoints don't exist ;-)

I'll see if I can adapt cloph's suggested systemd.mount(5) unit file next time we reboot the box, but I'm lowering the priority in the meantime.
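
One possible adaptation (untested; the unit name gluster-volume-ready.service is hypothetical) would be a oneshot unit that polls the gluster daemon until the volume actually answers, with the mount ordered after it:

    # /etc/systemd/system/gluster-volume-ready.service (hypothetical)
    [Unit]
    Description=Wait until the local gluster volume answers
    Requires=glusterfs-server.service
    After=glusterfs-server.service

    [Service]
    Type=oneshot
    RemainAfterExit=yes
    # poll 'gluster volume status' until it succeeds; give up after two minutes
    ExecStart=/bin/sh -c 'until gluster volume status fileshare-antares >/dev/null 2>&1; do sleep 2; done'
    TimeoutStartSec=120

    [Install]
    WantedBy=multi-user.target

The mount unit from the description would then declare Wants=/After=gluster-volume-ready.service instead of glusterfs-server.service.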

Actions #13

Updated by Florian Effenberger almost 7 years ago

  • Parent task deleted (#1316)

Any update on this one? It's a 1.5-year-old ticket, so I'm interested in its priority...

Actions #14

Updated by Guilhem Moulin almost 7 years ago

Still (very) low priority, and I'm tempted to close it. It doesn't disrupt the boot process anymore, and we've got a safety net in place to ensure we don't forget to mount what needs to be mounted.

Actions #15

Updated by Florian Effenberger almost 7 years ago

Would like to hear Cloph's opinion on that as well, and then leave it to your trusted hands.

Actions #16

Updated by Florian Effenberger over 6 years ago

Florian Effenberger wrote:

Would like to hear Cloph's opinion on that as well, and then leave it to your trusted hands.

Did you talk to Cloph and is there a resolution?

Actions #17

Updated by Guilhem Moulin about 6 years ago

  • Status changed from In Progress to Closed

Yeah, that was brought up during an infra call, closing:

Guilhem Moulin wrote:

It doesn't disrupt the boot process anymore, and we've got a safety net in place to ensure we don't forget to mount what needs to be mounted.
