vm138 is unusably slow
There's something strange going on with vm138's disk(s) / IO:
$ dd if=/dev/zero of=output.tmp conv=fdatasync bs=384k count=1k
1024+0 records in
1024+0 records out
402653184 bytes (403 MB) copied, 507,908 s, 793 kB/s
This makes the machine totally unusable for any purpose. As TDF is paying quite a lot for the machine (IIRC), this issue is quite serious.
Can anyone debug what's up there? Or can the disk be switched to an SSD? Or anything else that would help make it usable?
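For anyone re-running the test, the rate dd prints can be rechecked from the byte count and the elapsed time. Below is a minimal sketch, not an existing tool: `dd_rate_kbs` is a hypothetical helper name, and it assumes the seconds value may use a decimal comma, as in the output above.

```shell
# Hypothetical helper: turn a dd summary (bytes copied, elapsed seconds)
# into a rate in kB/s (1 kB = 1000 bytes, the unit dd uses for "kB/s").
dd_rate_kbs() {
    bytes=$1
    # Some locales print the elapsed time with a decimal comma (507,908);
    # normalise it to a decimal point before doing arithmetic.
    secs=$(printf '%s' "$2" | tr ',' '.')
    LC_ALL=C awk -v b="$bytes" -v s="$secs" \
        'BEGIN { printf "%.0f\n", b / s / 1000 }'
}

# Values taken from the dd output above.
dd_rate_kbs 402653184 507,908   # prints 793
```

The result matches the 793 kB/s that dd reported, so the number is not a display glitch: the write really took over eight minutes for roughly 403 MB.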
#1 Updated by Florian Effenberger almost 3 years ago
- Assignee set to Alexander Werner
I assume this is the crashtest VM?
Do you know since when we have those problems?
Was it usable before?
Assigning to Alex, but I guess the above questions can help him debug a bit
#2 Updated by Florian Effenberger almost 3 years ago
We have an infra network issue, more VMs are affected - will follow up
with details later
#3 Updated by Jan Holesovsky almost 3 years ago
There's no need for this machine to depend on network backup; some weekly backups would be enough.
Cloph also said yesterday on IRC that it has a rotating disk - so I guess the best option here would be to get an SSD for it & unplug it from any network filesystem or whatever it has; I am sure this would make the crashtesting even faster.
#4 Updated by Florian Effenberger almost 3 years ago
It's not a backup issue - we suddenly seem to have general, intermittent
network issues between the machines
When ordering the machines we explicitly asked for SSDs and were told
they make no sense at all, so we didn't buy them. Can you check back
with Moggi on that maybe?
#5 Updated by Florian Effenberger almost 3 years ago
In other words: Sure, we can look into that - but I'd like to be sure
that makes sense, as thoughts on that differ ;-)
#6 Updated by Jan Holesovsky almost 3 years ago
I've talked to Moggi now; wrt. the SSD, that was a lack of info on my side - sorry about that!
But we'd really need this machine not to be bound to any networking trouble - i.e. it should work regardless of what's up with the network :-) - it should only be crashtesting over and over again, even if the net is down... Moggi said that he has been experiencing the slow I/O issue for something like the last 2.5 weeks.
#7 Updated by Florian Effenberger almost 3 years ago
I agree that the crashtesting users have been suffering quite heavily,
and I can only apologize for that
The initial idea - having the VM in our infra to stabilize all systems
and share resources - is still charming, but we hit a couple of unexpected
issues, which mainly the crashtesting VM was suffering from
I will discuss with Alex during FOSDEM how to proceed, whether we're
safe now, or should untie it from the other machines
Sorry again for this, I understand you're not the happiest at the moment :-)
#8 Updated by Bjoern Michaelsen almost 3 years ago
currently both bugs.documentfoundation.org and gerrit.documentfoundation.org are down -- is this related?
(note there likely will be a major release tomorrow --- extended parts of our infra being down would be a Bad Thing)
#9 Updated by Florian Effenberger almost 3 years ago
That's indeed related to the same issue - following up with details on
the public lists as soon as possible