Task #2136

closed

update a hack fest vm to run 24/7 and be available to new contributors.

Added by Jan Iversen almost 8 years ago. Updated over 7 years ago.

Status:
Closed
Priority:
Normal
Category:
Virtualization
Target version:
Team - Q2/2017
Start date:
Due date:
2017-05-02
% Done:

30%

Tags:
Salt

Description

As decided in the ESC meeting on 5th January 2017, the ESC wants to give new contributors access to a VM while they submit their first patches, eliminating the need for them to install the complicated Windows development environment.

The details about the request are with Bjoern, who originally suggested using e.g. CloudShare (the hack fest VM suggestion comes from me, in order to get a faster result).

The VM should auto-update and prebuild master for at most 5 users. The usernames/passwords should be published on our getInvolved page (something I can do). The users should be as restricted as possible:
- edit files in the repo,
- submit patches to gerrit,
- run gdb.
In short, the usual setup for a hack fest VM.

Since this is an experiment, it would be good to know how often and for how long the VM is being used. Therefore we should keep a login audit log.
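
As a rough illustration of what such an audit log could answer, here is a minimal Python sketch that counts sessions and adds up session time per user. The record layout (user, login time, logout time) and the sample data are assumptions made for this example; a real log would have to be parsed from whatever the VM's login accounting actually records.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def summarize_sessions(records):
    """Return {user: (session_count, total_duration)}.

    `records` is an iterable of (user, login, logout) tuples, an assumed
    layout for this sketch rather than a real log format.
    """
    summary = defaultdict(lambda: [0, timedelta()])
    for user, login, logout in records:
        entry = summary[user]
        entry[0] += 1                # one more session for this user
        entry[1] += logout - login   # accumulate time spent logged in
    return {user: (count, total) for user, (count, total) in summary.items()}


# Hypothetical sample data, for illustration only.
sample = [
    ("user1", datetime(2017, 5, 2, 9, 0), datetime(2017, 5, 2, 11, 30)),
    ("user1", datetime(2017, 5, 3, 14, 0), datetime(2017, 5, 3, 15, 0)),
    ("user2", datetime(2017, 5, 2, 10, 0), datetime(2017, 5, 2, 10, 45)),
]

for user, (count, total) in sorted(summarize_sessions(sample).items()):
    print(user, count, total)
```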

I do not have the karma to initiate/stop/start VMs, hence this ticket.

Actions #1

Updated by Christian Lohmaier almost 8 years ago

Keeping a prebuilt LO around is the tricky part of this proposal. For a Hackfest, LO is built once when creating/setting up the VM and then left for the users.

For a constantly available VM, this of course won't work, and just having it build like tinderboxes do will conflict with users trying to use it.

So suggestions on how to solve this welcome. One idea is to build daily/regularly using a dedicated user, and copying/linking files to the user's home when the user logs in, maybe using a filesystem with snapshots that can be mounted separately.
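
To make the "copy on login" idea a bit more concrete, a minimal Python sketch follows. The function name, the directory name `core`, and both paths are hypothetical, and it uses a plain recursive copy rather than filesystem snapshots. Whether a copied build tree actually works from a different path is not something this sketch answers.

```python
import shutil
from pathlib import Path


def seed_user_tree(prebuilt: Path, user_home: Path, name: str = "core") -> Path:
    """Copy the prebuilt tree into the user's home unless a copy already exists.

    `prebuilt` stands for the tree maintained by the dedicated build user;
    all names here are placeholders for illustration.
    """
    target = user_home / name
    if target.exists():
        # Never overwrite a contributor's work in progress.
        return target
    # symlinks=True copies symbolic links as links instead of following them.
    shutil.copytree(prebuilt, target, symlinks=True)
    return target
```

A login hook could call `seed_user_tree(Path("/home/builder/core"), Path.home())`, where the builder path is again only an example. A snapshot-capable filesystem would replace the copy step with something much cheaper.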

Another problem to solve is who gets/grants access. How to reduce the risk of two users logging in as user1?
And how to clean up?

So bottom line: the proposal needs more refining to get a working 24/7 solution. A one-off is easier, although still tricky to get right in terms of user management/assignment. Just providing a password (eek, password-based login) or an ssh key on a website is not something I'd like to do for a permanent service (or put otherwise: I couldn't sleep well, always having to think about whether the users might do stupid stuff with the VM, abuse it, or make it part of a botnet...).

So while it reads OK to have it "restricted as possible", this is not an area I have much experience with. Either you have a user account and default permissions (user/group), or you don't have access; that is all I'm used to. Quota maybe in addition, but nothing that really comes close to locking down/preventing abuse.

Network abuse could be somewhat limited by only allowing outgoing connections to gerrit and dev-www, and only accepting incoming connections on ssh, but that is still not bulletproof...
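
For illustration, such a restriction could look roughly like the following Python sketch, which only builds the iptables command lines and does not run them. The hostnames and ports in the usage note are placeholders, and the rule set is an assumption about one possible setup, not a tested or complete lockdown (for example, DNS lookups would need their own rule).

```python
def build_firewall_rules(allowed_hosts, ssh_port=22):
    """Return iptables argument lists for a default-deny setup.

    `allowed_hosts` is a list of (host, tcp_port) pairs that users may
    connect to. All values are placeholders chosen by the caller.
    """
    rules = [
        # Loopback traffic is always allowed.
        ["iptables", "-A", "INPUT", "-i", "lo", "-j", "ACCEPT"],
        ["iptables", "-A", "OUTPUT", "-o", "lo", "-j", "ACCEPT"],
        # Replies belonging to already accepted connections.
        ["iptables", "-A", "INPUT", "-m", "conntrack",
         "--ctstate", "ESTABLISHED,RELATED", "-j", "ACCEPT"],
        ["iptables", "-A", "OUTPUT", "-m", "conntrack",
         "--ctstate", "ESTABLISHED,RELATED", "-j", "ACCEPT"],
        # Incoming: ssh only.
        ["iptables", "-A", "INPUT", "-p", "tcp",
         "--dport", str(ssh_port), "-j", "ACCEPT"],
    ]
    # Outgoing: only the explicitly listed services.
    for host, port in allowed_hosts:
        rules.append(["iptables", "-A", "OUTPUT", "-p", "tcp",
                      "-d", host, "--dport", str(port), "-j", "ACCEPT"])
    # Set the default-deny policies last so the accept rules already exist.
    rules.append(["iptables", "-P", "INPUT", "DROP"])
    rules.append(["iptables", "-P", "OUTPUT", "DROP"])
    return rules
```

Calling `build_firewall_rules([("gerrit.example.org", 29418), ("dev-www.example.org", 443)])` would yield the commands for the two services mentioned above, with the example hostnames replaced by the real ones. As noted, this limits the obvious abuse paths but is not bulletproof.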

So Q1 is likely too ambitious for this.

Actions #2

Updated by Bjoern Michaelsen almost 8 years ago

Christian Lohmaier wrote:

Keeping a prebuilt LO around is the tricky part of this proposal. For a Hackfest, LO is built once when creating/setting up the VM and then left for the users.
[...]
So suggestions on how to solve this welcome.

I think prebuilding LO once per week should be enough for a start. And I wouldn't force users to bump along: let them stay on the VM checkout they started from and manually update as they go (yes, that suggests rebuilding themselves).

Alternatively they can freshly start from the new virgin VM image of the week and pull back their work from gerrit.

The risky part of the onboarding is the period until the first patch is pushed, so having a prebuilt product there is essential.

Once the first patch is pushed, the contributor is already invested and having them do rebases/rebuilds isn't too bad: they'll need to at some point anyway.

Another problem to solve is who gets/grants access. How to reduce the risk of two users logging in as user1?

Someone at TDF needs to be the keykeeper. Dev mentor is the natural candidate.

And how to clean up?

I would suggest not "cleaning up" these images, but running them as long as they work for a contributor. When they don't, throw them away and start from a new virgin prebuilt image of the week.

Actions #3

Updated by Bjoern Michaelsen almost 8 years ago

@JanIv: Can you scout out the pricing at CloudShare? E.g. for our use case, what are the expensive parts (updating an image, running an image, having an image lying around idle etc.)?

Actions #4

Updated by Bjoern Michaelsen almost 8 years ago

Christian Lohmaier wrote:

So while it reads OK to have it "restricted as possible", this is not an area I have much experience with. [...]

But CloudShare should have, as what we are aiming for is pretty much their advertised scenario. I would be surprised if they hadn't considered this and didn't have workable solutions for it (e.g. monitoring network volume and warning if it burns through some limits). Please contact CloudShare (and other providers) and ask them for their thoughts on this.

Actions #5

Updated by Jan Iversen almost 8 years ago

Someone at TDF needs to be the keykeeper. Dev mentor is the natural candidate.

If a keykeeper is wanted, I would of course happily do that as long as I am dev mentor.

The getinvolved page would simply ask people to mail mentoring@ to get the details.

I will take a look at CloudShare as requested.

Actions #6

Updated by Jan Iversen almost 8 years ago

Bjoern Michaelsen wrote:

@JanIv: Can you scout out the pricing at CloudShare? E.g. for our use case, what are the expensive parts (updating an image, running an image, having an image lying around idle etc.)?

The team option seems to have what we want: 166 USD/month, plus 66 USD/month for each extra user. Especially the role-based accounts would be good.
https://www.cloudshare.com/pricing

Actions #7

Updated by Florian Effenberger over 7 years ago

  • Due date set to 2017-05-02
  • Assignee set to Christian Lohmaier
  • Target version changed from Q1/2017 to Q2/2017

Some notes from the last team call:

  • deadline in May
  • Cloph to take the lead and onboard Xisco, so he can take over? Otherwise, split into operating systems?
    • Linux: can be done with X2go (https://wiki.documentfoundation.org/Hackfests/VMs/Using_a_VM)
    • Windows: missing, testing CloudShare?
    • also need mentor users of the boxes in May
    • as well as syncing with Mike Saunders to communicate it (he might be a good candidate for testing things out as well)
    • questions on user management and number of concurrent users
  • AI Cloph: talk to Björn about open questions, so we can move this forward
Actions #9

Updated by Christian Lohmaier over 7 years ago

Setting up servers to use with manually triggered cleanup can be done in our infra, assuming people are OK with using an ssh tunnel to connect to Windows RDP (the remmina client can do this quite nicely).
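
For those not using remmina, the tunnel part could be sketched like this in Python. The gateway and host names are placeholders, and the sketch only assembles the `ssh` command line (local port forwarding with `-L`, no remote command with `-N`) without running it.

```python
import shlex


def rdp_tunnel_command(gateway, windows_host, local_port=3389, rdp_port=3389):
    """Build an ssh command that forwards a local port to the RDP port.

    `gateway` is the ssh login (user@host) and `windows_host` the Windows
    VM as seen from the gateway. Both are hypothetical names here.
    """
    forward = f"{local_port}:{windows_host}:{rdp_port}"
    return ["ssh", "-N", "-L", forward, gateway]


# Example with placeholder names; print the command instead of running it.
print(shlex.join(rdp_tunnel_command("user@gateway.example.org", "winvm", 13389)))
```

With the tunnel up, an RDP client would connect to `localhost` on the chosen local port (13389 in the example) instead of reaching the Windows VM directly.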

So if someone manages the access to the hosts and can signal: everyone is done with their stuff, OK to prune and prepare next round, then no problem with setting it up.

(The cleanup can be accelerated by having a "shadow" build that can just be copied over, but determining when the current working tree is no longer used/needed will require some "manager" to oversee, at least at the beginning/unless someone comes up with a more concrete way to solve this programmatically :-))

Re Microsoft VMs: these would not solve the "start immediately" problem. With enough bandwidth the user might be able to skip a few hours, but that would rely on having identical paths, etc. And installing the development toolchain would still take quite some time.

Actions #10

Updated by Florian Effenberger over 7 years ago

So, what are the next steps, and do you need anything like budget approval beforehand? :-)

Actions #11

Updated by Florian Effenberger over 7 years ago

Last week's status:

   -> Windows part is basically ready, we just might need some more VM disk space
      -> 5-7 concurrent users is possible, but rdesktop limitation
      -> AI: Cloph to look into the Cloud-something service Björn mentioned
      -> user management to be handled by Cloph and Xisco
      -> build tree update still needs automating
   -> Linux is easier, but not done yet
   => deadline for both is May, so ~one month to go

Actions #12

Updated by Christian Lohmaier over 7 years ago

  • % Done changed from 0 to 30

Linux VM created (vm195): it is TDF baseline (CentOS 6) and has 150GB of disk space for user builds.
Automated builds will be done daily (using tinderbox scripts and the trigger method), so the keykeeper would have to clone the directory (i.e. use the template user as skel-dir with useradd) for each interested user and add their ssh key for access.
The Windows VM will be set up in a similar fashion; we just have to fill the directory manually before creating the user (net user ....)
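
As a rough sketch of the Linux side of that procedure, the Python below builds the `useradd` call with the template user's home as skeleton directory and appends a public key to the new user's `authorized_keys`. The function names, the user name, and the paths are hypothetical, and the sketch does not run `useradd` itself or fix file ownership, which a real setup would need to do as root.

```python
from pathlib import Path


def useradd_command(username, template_home):
    """Build a useradd call that seeds the new home from the template user.

    --create-home creates the home directory and --skel names the
    directory whose contents are copied into it.
    """
    return ["useradd", "--create-home", "--skel", str(template_home), username]


def install_ssh_key(home: Path, public_key: str) -> Path:
    """Append a public key to <home>/.ssh/authorized_keys.

    Ownership is not changed here; when run as root, the files would
    still have to be chown'ed to the new user.
    """
    ssh_dir = home / ".ssh"
    ssh_dir.mkdir(mode=0o700, exist_ok=True)
    auth = ssh_dir / "authorized_keys"
    with auth.open("a") as fh:
        fh.write(public_key.strip() + "\n")
    auth.chmod(0o600)
    return auth
```

For example, `useradd_command("newdev", "/home/template")` returns the command for a hypothetical user `newdev`, and `install_ssh_key(Path("/home/newdev"), key_text)` would then add the key the contributor sent in.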

Actions #13

Updated by Florian Effenberger over 7 years ago

Quoting the last team call:

 * PENDING: hack fest vm to run 24/7 and be available to new contributors (#2136)
   -> Linux side is ready, Windows side needs VM cloning, will be done this week 
Actions #14

Updated by Florian Effenberger over 7 years ago

With the deadline approaching, I wanted to follow-up on this.

Cloph, has everything been set up properly?
If not yet, what's the ETA?

Adding Mike to this ticket, as I'd like him to come up with some blogposting/video/howto on that thingie. IMHO Mike has no experience yet with the build environment, which makes him the ideal candidate to get credentials and have a look to see if things work like a charm as planned or if there's a need to improve stuff. ;-)

Adding Sophie to this ticket, as she will lead the next team call and this should be talked through, given the deadline is approaching.

Cloph, in case you're not finished yet, please prioritize this. Thanks!

Actions #15

Updated by Mike Saunders over 7 years ago

Just talked about this in the team call - Cloph is going to send me some instructions, which I can then test and identify any gaps. In addition, I could try making a video to show how it works.

Actions #16

Updated by Florian Effenberger over 7 years ago

Sounds great!

Actions #17

Updated by Florian Effenberger over 7 years ago

  • Status changed from New to Closed

Closing this now
From what I've heard, it worked quite well and can easily be re-purposed for the next Hackfest
Cloph, can you share some insight on the use of the VM in this ticket, as asked for by Björn?
