Task #2952

open

Evaluate Discourse

Added by Sophie Gautier over 5 years ago. Updated over 3 years ago.

Status:
Resolved
Priority:
Normal
Category:
-
Target version:
Team - Q3/2021
Start date:
2019-09-23
Due date:
% Done:

90%

Tags:

Description

We want to evaluate Discourse as a possible replacement for AskBot.


Files

01-askapi_comments.patch (3.57 KB): AskBot API: Expose comments (Guilhem Moulin, 2021-03-30 13:09)
02-askapi_text.patch (697 Bytes): AskBot API: Expose raw (non-truncated) text (Guilhem Moulin, 2021-03-30 13:09)
03-askapi_closed.patch (722 Bytes): AskBot API: Expose closed status (Guilhem Moulin, 2021-03-30 13:38)
migrate.pl (45 KB): migration script (Guilhem Moulin, 2021-07-31 01:46)
categories.json (1.53 KB) (Guilhem Moulin, 2021-07-31 01:46)
migrate.conf (476 Bytes) (Guilhem Moulin, 2021-07-31 01:46)

Subtasks 2 (2 open, 0 closed)

Task #3246: Discourse to mailing list bridge (New)
Task #3567: Bridging mailing lists to Discourse (New, Sophie Gautier)

Actions #1

Updated by Florian Effenberger over 5 years ago

  • Target version set to Qlater
Actions #3

Updated by Sophie Gautier over 5 years ago

I had a look at the documentation and one instance, I think it could be a good replacement for us. To go further, I would like Guilhem to install a test instance and I would like to test it with Oliver (as he has a good knowledge of all Ask functionalities).
I'll write a short report on my findings and will attach it here.

Actions #4

Updated by Sophie Gautier over 5 years ago

So here is a short summary, out of the usual Q&A:
- 5 trust levels from 0 to 4
- levels are changed by reading and acting on the site
- level 3 means regular user and level 4 means leader
- badges and emoji available
- possibility to summarize topics
- reply via mail or mobile apps
- 72 languages in transifex, various completions
- SSO

Actions #5

Updated by Sophie Gautier over 5 years ago

  • Status changed from New to In Progress
Actions #6

Updated by Aron Budea over 5 years ago

I'd also suggest considering Discourse as a contributor discussion board. I'm familiar with the dev/QA mailing list and IRC channels, and I think a discussion board would be much more convenient to use for new or less frequent contributors, while it could also likely provide all the functionalities of a mailing list.

Actions #7

Updated by Franklin Weng almost 5 years ago

Here is a fresh install: https://discourse.slat.org/

I've sent a moderator invitation to Sophie.

Actions #8

Updated by Sophie Gautier almost 5 years ago

  • Assignee changed from Sophie Gautier to Guilhem Moulin
  • Target version changed from Qlater to Q2/2020

Reassign to Guilhem and set target

Actions #9

Updated by William Gathoye over 4 years ago

Franklin Weng wrote:

Here is a fresh install: https://discourse.slat.org/

I've sent a moderator invitation to Sophie.

Would it be possible to get admin-level access as well? I want to check out #3246 wrt. integration with mailing lists. Thanks, Franklin!

Actions #10

Updated by Florian Effenberger over 4 years ago

  • Target version changed from Q2/2020 to Q3/2020
Actions #11

Updated by Guilhem Moulin over 4 years ago

  • Assignee changed from Guilhem Moulin to Sophie Gautier

Sorry for procrastinating this forever Sophie, kept being distracted and had to restart several times from scratch :-( Had trouble getting integration with our Single Sign-On portal, but got that to work finally (with a deployment in-line with the current best practices from the infra team): https://vm222.documentfoundation.org (login is open to all with an SSO account).

Actions #12

Updated by Guilhem Moulin over 4 years ago

Note that there are probably still rough edges on the instance (branding and footer come to mind of course, but there might be glitches with the backend too), and email replies/new posts are currently not enabled (unlike for Redmine): we need to configure the MTA for that, and it's not a blocker for the evaluation (nor a regression, as our AskBot instance doesn't support this).

Actions #13

Updated by Sophie Gautier over 4 years ago

Guilhem Moulin: It seems not possible to upload documents to the instance, could you have a look at it? On meta-discourse I found this answer: "For security reasons, attachments/documents are not allowed by default. You have to whitelist the extensions using the “authorized extensions” site setting."

Actions #14

Updated by Guilhem Moulin over 4 years ago

Tweaked the list to match AskBot's.

Actions #15

Updated by Sophie Gautier over 4 years ago

Guilhem Moulin wrote:

Tweaked the list to match AskBot's.

Thanks a lot!

Actions #16

Updated by Beluga Beluga over 4 years ago

A random thing I ran into: Manjaro Linux recently suffered a messed-up forum migration to a new server. They had to start from scratch and abandon all old data. EndeavourOS maintainers say they also ran into the issue: https://forum.endeavouros.com/t/manjaro-to-endeavouros-experiences/6398/132

"We had the exact same problem when moving to the new server @Alpix found the solution, it is a big flaw in Discourse, the images are saved in another file and aren’t automatically put back on the backup, it has to be done manually, but no Discourse manual is telling you that."

Actions #17

Updated by Sophie Gautier over 4 years ago

It seems display of online image is broken, see https://vm222.documentfoundation.org/t/please-allow-document-upload/34/7?u=sophi. Clicking on it allows its display though.

Actions #18

Updated by Guilhem Moulin over 4 years ago

Sophie Gautier wrote:

It seems display of online image is broken

Should be better now

Actions #19

Updated by Sophie Gautier over 4 years ago

It seems it's not possible to post more than 3 topics per user on their first contribution, they have to wait for 19 hours to be able to post again. Could this be changed to more or reduce the delay? I'm aware it could be to avoid spam but I find it too restrictive.

Actions #20

Updated by Sophie Gautier over 4 years ago

Could this plugin https://meta.discourse.org/t/how-to-mark-a-topic-as-resolved/81793 be added? thanks in advance!

Actions #21

Updated by Guilhem Moulin over 4 years ago

Sophie Gautier wrote:

Could this be changed to more or reduce the delay?

Raised to 10 from 3. (AFAICT it's not possible to change the delay, there are only two settings to play with: “max topics per day” and “max topics in first day”.)

Actions #22

Updated by Guilhem Moulin over 4 years ago

Sophie Gautier wrote:

Could this plugin https://meta.discourse.org/t/how-to-mark-a-topic-as-resolved/81793 be added?

Done. “Allow topic owner and staff to mark a reply as the solution” can be done globally or per-category; I flipped the global switch for now, let me know if you want to only do it for some but not all categories.

Actions #23

Updated by Sophie Gautier over 4 years ago

I just realized that tags are not allowed by default. Could you activate them, I found https://meta.discourse.org/t/tags-category-restrictions-tag-groups-relationships/48260
In the min trust to create tag, I would set it to 2 (if I understand well it's for group levels) and 0 otherwise. Thanks a lot :)

Actions #24

Updated by Guilhem Moulin over 4 years ago

Sophie Gautier wrote:

I just realized that tags are not allowed by default. Could you activate them
In the min trust to create tag, I would set it to 2

Done

Actions #25

Updated by Sophie Gautier over 4 years ago

I will add the 16 language categories in alphabetic order, could you activate the Fixed category position parameter so they are not sorted by topic number? thanks in advance

Actions #26

Updated by Sophie Gautier over 4 years ago

As a memo for the final instance, it will be better to give all new users level 1 otherwise they won't be able to provide a document demonstrating their problem, which is usual on Ask. The test instance gives level 1 to new users until there are 50 registered members.

Actions #27

Updated by Sophie Gautier over 4 years ago

Posts must be at least 20 chars, would it be possible to reduce that to 5? sometimes you only want to say: thanks!
And it seems that the warning message that is displayed stays even if you try to close it.

Actions #28

Updated by Guilhem Moulin over 4 years ago

Sophie Gautier wrote:

I will add the 16 language categories in alphabetic order, could you activate the Fixed category position parameter so they are not sorted by topic number?

Done

Sophie Gautier wrote:

As a memo for the final instance, it will be better to give all new users level 1 otherwise they won't be able to provide a document demonstrating their problem, which is usual on Ask.

Ah? AFAICT the minimum trust level to provide attachments is 0. I guess we'll see soon enough if that works :-)

Sophie Gautier wrote:

Posts must be at least 20 chars, would it be possible to reduce that to 5?

Done, there are 3 settings for this:

  • Minimum allowed post length in characters: 5 (default 20)
  • Minimum allowed first post (topic body) length in characters: 20 (default)
  • Minimum allowed post length in characters for messages: 10 (default)
Actions #29

Updated by Sophie Gautier over 4 years ago

Guilhem Moulin wrote:

Sophie Gautier wrote:

I will add the 16 language categories in alphabetic order, could you activate the Fixed category position parameter so they are not sorted by topic number?

Done

Thanks!

Sophie Gautier wrote:

As a memo for the final instance, it will be better to give all new users level 1 otherwise they won't be able to provide a document demonstrating their problem, which is usual on Ask.

Ah? AFAICT the minimum trust level to provide attachments is 0. I guess we'll see soon enough if that works :-)

ok, I read this in some articles on meta discourse, but the default may have changed since the article was published :-)

Sophie Gautier wrote:

Posts must be at least 20 chars, would it be possible to reduce that to 5?

Done, there are 3 settings for this:

  • Minimum allowed post length in characters: 5 (default 20)
  • Minimum allowed first post (topic body) length in characters: 20 (default)
  • Minimum allowed post length in characters for messages: 10 (default)

Great, thanks! I don't think the other settings need to be changed.

Actions #30

Updated by Sophie Gautier over 4 years ago

On this page https://meta.discourse.org/t/discourse-solved-accepted-answer-plugin/30155 on the Filter area, I see that you can set a filter to sort Solved or Unsolved answers. Would it be possible to add it? thanks in advance

Actions #31

Updated by Guilhem Moulin over 4 years ago

Sophie Gautier wrote:

I see that you can set a filter to sort Solved or Unsolved answers. Would it be possible to add it?

Yup done

Actions #32

Updated by Sophie Gautier over 4 years ago

Noting here some plug-ins that I would like to evaluate to see if they can help Q&A management:
- Follow
- Multilingual
- Knowledge Explorer
- Canned Replies
- Tool Tips
Guilhem Moulin, it's not a request to install them, I want to read more and see them in action to evaluate if we need them :-)

Actions #33

Updated by Sophie Gautier over 4 years ago

Could you check if filtering by tags is activated? I've created some tags but I can't see the "all tags" filter at the top like here https://ask.fedoraproject.org/categories. Thanks in advance!

Actions #34

Updated by Guilhem Moulin over 4 years ago

Sophie Gautier wrote:

Could you check if filtering by tags is activated?

It wasn't (this is the default behavior), toggled the switch now.

Actions #35

Updated by Sophie Gautier over 4 years ago

In the 'Summarize' setting, could you set the 'Summary posts required' to 10 (see the screenshot here https://meta.discourse.org/t/ranking-answers/108259) and 'Summary score threshold' to 5 so we can play with it?

Actions #36

Updated by Guilhem Moulin over 4 years ago

Sophie Gautier wrote:

In the 'Summarize' setting, could you set the 'Summary posts required' to 10 and 'Summary score threshold' to 5

Done

Actions #37

Updated by Sophie Gautier over 4 years ago

Guilhem Moulin wrote:

Sophie Gautier wrote:

In the 'Summarize' setting, could you set the 'Summary posts required' to 10 and 'Summary score threshold' to 5

Done

Thanks!

Actions #38

Updated by Sophie Gautier over 4 years ago

Would it be possible that you install this plug-in https://github.com/paviliondev/discourse-question-answer
I hope it will solve the forum appearance for those used to the Ask indentation of the comments and answers.
It seems that it installs at category level (but I may be completely wrong) so could you activate it only for the English category at the moment. I'll try to figure out how it works :-) Thanks a lot in advance.

Actions #39

Updated by Guilhem Moulin over 4 years ago

Sophie Gautier wrote:

Would it be possible that you install this plug-in https://github.com/paviliondev/discourse-question-answer

Sure, done :-)

It seems that it installs at category level (but I may be completely wrong) so could you activate it only for the English category at the moment.

That's correct but it took me a while to find where to define it. It's not in the admin settings but in the category settings (Edit » Settings » Scroll to the bottom). There are also toggles to disable likes on questions (resp. answers, comments), all left to the default setting for now: likes are enabled everywhere.

Actions #40

Updated by Sophie Gautier about 4 years ago

  • Target version changed from Q3/2020 to Q4/2020

Just an update:
- project accepted by the BoD
- report sent to project list
- will define a migration roadmap for next year with Guilhem.
- will open a new issue "Discourse migration" when done.

Actions #41

Updated by Sophie Gautier almost 4 years ago

  • Target version changed from Q4/2020 to Q2/2021
Actions #42

Updated by Beluga Beluga almost 4 years ago

I have been experimenting with the migration and Guilhem patched AskBot so far to add an API endpoint for fetching comments and a new field for displaying the full post content rather than the silly truncated summary.

A thing I am currently interested in solving and hope that Guilhem can help with:
Closed status should be returned by API, but isn't https://github.com/ASKBOT/askbot-devel/blob/master/askbot/views/api_v1.py#L69
Closed thread example: https://ask.libreoffice.org/en/api/v1/questions/238673/

Actions #43

Updated by Guilhem Moulin almost 4 years ago

Beluga Beluga wrote in #note-42:

A thing I am currently interested in solving and hope that Guilhem can help with:
Closed status should be returned by API, but isn't https://github.com/ASKBOT/askbot-devel/blob/master/askbot/views/api_v1.py#L69
Closed thread example: https://ask.libreoffice.org/en/api/v1/questions/238673/

Added closed, closed_at, closed_by.id and closed_by.username. I also attach API patches for posterity.

Actions #44

Updated by Guilhem Moulin almost 4 years ago

Actions #45

Updated by Guilhem Moulin almost 4 years ago

I was unable to find attachments in the db schemas. Links appear to be hardcoded in the post source: https://github.com/ASKBOT/askbot-devel/blob/master/askbot/media/wmd/wmd.js#L1831

See for instance https://ask.libreoffice.org/en/api/v1/answers/129593/ for a post containing both an inlined image and an attached file.

So I guess importing images resp. attachments means parsing link markup matching /!\[[^\]]+\]\(\/upfiles\/[^\)]+\)/ resp. /(?<!!)\[[^\]]+\]\(\/upfiles\/[^\)]+\)/ and uploading https://ask.libreoffice.org/upfiles/… to Discourse (using the link text as alt for images). In addition, link names need to be updated: AFAICT Discourse uses upload://BASE62(SHA1($DATA)) in the markdown source.
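A minimal sketch of the link classification described above, using the same two patterns translated to Python's re syntax. The sample text and file names are made up for illustration, and the upload and upload:// rewriting steps are deliberately left out.

```python
import re

# Same patterns as in the comment above: image links start with "!",
# attachment links must not be preceded by "!".
IMAGE_RE = re.compile(r'!\[([^\]]+)\]\((/upfiles/[^)]+)\)')
ATTACHMENT_RE = re.compile(r'(?<!!)\[([^\]]+)\]\((/upfiles/[^)]+)\)')

def find_uploads(markdown):
    """Return (images, attachments) as lists of (link_text, path) tuples."""
    return IMAGE_RE.findall(markdown), ATTACHMENT_RE.findall(markdown)

# Hypothetical post source containing one inlined image and one attached file.
sample = 'See ![screenshot](/upfiles/shot.png) and [report](/upfiles/report.odt).'
images, attachments = find_uploads(sample)
```

The link text captured in the first group is what would be reused as alt text for images.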

Actions #46

Updated by Beluga Beluga almost 4 years ago

Documenting my work so far and hope that others can pick this up now:

The basics:

AskBot API docs
As seen in the above comments, Guilhem added a 'comments' endpoint to the API, added a 'text' field for posts to provide the full contents of a post and made the 'closed' topic status work.

Using paging with the API:
https://ask.libreoffice.org/en/api/v1/questions/?page=1
https://ask.libreoffice.org/en/api/v1/users/?page=1

After we lock down AskBot, we will iterate through all the pages (the total is seen in the 'pages' field of users and questions endpoints).
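A rough sketch of that iteration, assuming each page carries its items under a key named after the endpoint ('questions' is what the script below reads; 'users' is an assumption) and the page total in 'pages'. The function names are hypothetical.

```python
import json
import urllib.request

BASE = 'https://ask.libreoffice.org/en/api/v1/'

def fetch_page(endpoint, page):
    """Fetch one page of an AskBot API listing and decode the JSON body."""
    with urllib.request.urlopen(f'{BASE}{endpoint}/?page={page}') as res:
        return json.loads(res.read().decode('utf-8'))

def iter_items(endpoint, fetch=fetch_page):
    """Yield every item of a paged endpoint, e.g. 'questions' or 'users'."""
    page = 1
    while True:
        data = fetch(endpoint, page)
        yield from data[endpoint]
        # Stop once the page count reported by the API has been reached.
        if page >= int(data['pages']):
            break
        page += 1
```

Passing the fetch function as a parameter makes it possible to try the loop against canned pages before pointing it at the live site.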

Discourse API docs

For locally setting up Discourse, I used the dev setup instructions

git clone https://github.com/discourse/discourse.git
cd discourse
sudo d/boot_dev --init

Asks for admin email and password.

Posting is rate-limited for ordinary users, so edit config/site_settings.yml and change the topic and post creation limits in the rate_limits section to 0

Also in the posting section of config/site_settings.yml change min_post_length, min_first_post_length and min_topic_title_length to 5

Start Rails server with Sidekiq:

sudo d/unicorn

Navigate to http://localhost:9292/

Log in with your admin account.

Navigate to http://localhost:9292/admin/api/keys and create a new API key for all users (global key)

Navigate to http://localhost:9292/admin/site_settings/category/tags
Check "Enable tags on topics?"

To delete multiple topics, click the icon to the left of the "Topic" column title in the topics list.

Open questions:

  • How to match categories? Probably a static dictionary of language categories after they have been created in Discourse. Category for English ones will be "Question"?
  • Topics can be closed, but Discourse API docs don't tell how to specify closing date or closing user. Maybe possible through reverse engineering the API
  • For answers that are accepted, append a unique string into the content like is_accepted_answer (matches the field name of the Discourse plugin). Then, run a find & replace operation in Discourse database that adds value for the custom field is_accepted_answer as used by the Solved plugin that we use. Alternative would be to add API support for the Solved plugin, example for another plugin
  • Migrating the scores for https://github.com/paviliondev/discourse-question-answer would again require some kind of database work

Python script used for the experiments is included below. It uses pydiscourse and waiting. Feel free to reimplement it without these libs, if you like. It does the hack of appending is_accepted_answer into the content of accepted answers.

For users, employ Pydiscourse's create_user() function.

#!/usr/bin/env python3
# https://pypi.org/project/pydiscourse/
# https://pypi.org/project/waiting/
import time
import datetime
import urllib.request
import json
from pydiscourse import DiscourseClient
from waiting import wait

# in the final script we need to create a new client for each topic, answer and comment and point api_username to the correct username
client = DiscourseClient(
        'http://localhost:9292',
        api_username='username',
        api_key='key')

def askbot_p(post_type, post_id):
    res = urllib.request.urlopen('https://ask.libreoffice.org/en/api/v1/' + post_type + '/' + str(post_id))
    body = res.read()

    return json.loads(body.decode('utf-8'))

def is_post_ready(post):
    if post['id']:
        return True
    return False

#client.create_post(content, category_id=None, topic_id=None, title=None, tags=[])

page = askbot_p('questions', '?page=1')

for question in page['questions']:
    q = askbot_p('questions', question['id'])
    added_at = datetime.datetime.utcfromtimestamp(int(q['added_at'])).strftime('%Y-%m-%dT%H:%M:%SZ')
    topic = client.create_post(q['text'], 1, None, q['title'], q['tags'], created_at = added_at)
    wait(lambda: is_post_ready(topic), timeout_seconds=10, waiting_for="Question to be ready")
    discourse_t = topic['topic_id']
    accepted_answer_id = q['accepted_answer_id']
    for comment in q['comment_ids']:
        c = askbot_p('comments', comment)
        added_at = datetime.datetime.utcfromtimestamp(int(c['added_at'])).strftime('%Y-%m-%dT%H:%M:%SZ')
        post = client.create_post(c['text'], None, discourse_t, None, [], created_at = added_at, reply_to_post_number = topic['post_number'])
        wait(lambda: is_post_ready(post), timeout_seconds=10, waiting_for="Question comment to be ready")
    for answer in q['answer_ids']:
        a = askbot_p('answers', answer)
        added_at = datetime.datetime.utcfromtimestamp(int(a['added_at'])).strftime('%Y-%m-%dT%H:%M:%SZ')
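        text = a['text']
        # Tag only the answer that was actually accepted; testing accepted_answer_id
        # alone would tag every answer of any question that has an accepted one.
        if answer == accepted_answer_id:
            text += 'is_accepted_answer'
        post = client.create_post(text, None, discourse_t, None, [], created_at = added_at)
        wait(lambda: is_post_ready(post), timeout_seconds=10, waiting_for="Answer to be ready")
        discourse_a = post['post_number']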
        for comment in a['comment_ids']:
            c = askbot_p('comments', comment)
            added_at = datetime.datetime.utcfromtimestamp(int(c['added_at'])).strftime('%Y-%m-%dT%H:%M:%SZ')
            post = client.create_post(c['text'], None, discourse_t, None, [], created_at = added_at, reply_to_post_number = discourse_a)
            wait(lambda: is_post_ready(post), timeout_seconds=10, waiting_for="Answer comment to be ready")
    if q['closed']:
        client.update_topic_status(discourse_t, "closed", 1)
Actions #47

Updated by Guilhem Moulin almost 4 years ago

Some crude metrics to guesstimate the time required for migration (during which we'll make AskBot read-only): we have just shy of 100k accounts, 62k questions, 67k answers and 125k comments. That gives a total of 350k URLs to query.

I was able to locally extract a random 10% sample in about 45min (using a single connection to avoid TCP and TLS handshake overheads). This doesn't account for the JSON parsing, markup massaging, and discourse posts, but these can be done in parallel and even sequentially I'm now confident we can complete the migration under 24h without other (potentially brittle) optimization.
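The arithmetic behind these figures can be checked with a few lines. This is only a back-of-the-envelope sketch: the counts are the rounded numbers quoted above (with "just shy of 100k" taken as 100k), and it assumes extraction time scales linearly with the number of URLs.

```python
# Rounded counts quoted above.
counts = {'accounts': 100_000, 'questions': 62_000,
          'answers': 67_000, 'comments': 125_000}
total_urls = sum(counts.values())   # roughly the 350k URLs mentioned above

sample_minutes = 45                 # measured time for the random sample
samples_per_full_run = 10           # the sample covered 10% of the URLs

# Linear extrapolation of the fetch time for the full set, in hours.
full_fetch_hours = sample_minutes * samples_per_full_run / 60
```

Under that assumption, fetching alone takes about 7.5 hours over a single connection, which leaves room for parsing and posting within the 24h budget.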

Actions #48

Updated by Guilhem Moulin almost 4 years ago

Guilhem Moulin wrote in #note-12:

and email replies/newposts are currently not enabled

This is now configured: users can send follow-up by replying to an email notification, and new topics can be created by dropping a message to the category address (right now only the English category has such an address, question+english@).

One thing that comes to mind: what should the canonical name be? ask.libreoffice.org is fine as far as the Q&A platform is concerned, but if we're going to use it as replacement for (some) mailing lists too I think a subdomain of .documentfoundation.org would be more appropriate. What's your take, Sophie Gautier?

Actions #49

Updated by Guilhem Moulin almost 4 years ago

Beluga Beluga wrote in #note-46:

  • How to match categories? Probably a static dictionary of language categories after they have been created in Discourse. Category for English ones will be "Question"?

We already have that set up on vm222. There is an arbitrary list of languages on both sides and indeed the migration script needs to remap.

  • Topics can be closed, but Discourse API docs don't tell how to specify closing date or closing user

The user is specified via Api-Username header value, for the closing date I had to do it via database surgery :-)

  • For answers that are accepted, append a unique string into the content like is_accepted_answer (matches the field name of the Discourse plugin). Then, run a find & replace operation in Discourse database that adds value for the custom field is_accepted_answer as used by the Solved plugin that we use. Alternative would be to add API support for the Solved plugin, example for another plugin

Mangling the text directly in the DB afterwards is a big no (it messes with search vectors and maybe also rendering). Fortunately the API supports that already via POST to /solution/accept.
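As an illustration only, such a call could be prepared as below. The Api-Key and Api-Username headers are the standard Discourse API authentication headers; the JSON body with an id field holding the post id is an assumption about the Solved plugin's endpoint, and the function name is hypothetical. The request is built but not sent.

```python
import json
import urllib.request

def accept_solution_request(base_url, api_key, username, post_id):
    """Build (without sending) a POST to /solution/accept for one post."""
    # Assumption: the endpoint expects the post id in a field named "id".
    body = json.dumps({'id': post_id}).encode('utf-8')
    return urllib.request.Request(
        base_url.rstrip('/') + '/solution/accept',
        data=body,
        method='POST',
        headers={
            'Api-Key': api_key,
            'Api-Username': username,  # the action is attributed to this user
            'Content-Type': 'application/json',
        },
    )

req = accept_solution_request('http://localhost:9292/', 'key', 'username', 42)
```

Sending it is then a matter of urllib.request.urlopen(req); the same Api-Username mechanism is what attributes topic closing to the right user.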

The vote count is only for answers, Discourse's score on posts and topics is autogenerated <https://meta.discourse.org/t/score-score-for-a-post-how-is-this-calculated/40295> and AskBot's score doesn't map to any of these so I guess it'll be lost in translation. We'll preserve the view count though, as well as upvotes on answers (but not the users who upvoted).

Python script used for the experiments is included below. It uses pydiscourse and waiting.

Thanks. I'm more fluent in Perl than Python so I wrote it again from scratch. No need to use a dedicated library in my view, native API calls work fine too :-) Got a working prototype (incl. closed date and status, view and upvote counts, accepted answer, and re-import of attachments), what's left now is attribution and mangling for internal links to other questions or user profiles.

Actions #50

Updated by Sophie Gautier almost 4 years ago

Guilhem Moulin wrote in #note-48:

Guilhem Moulin wrote in #note-12:

and email replies/newposts are currently not enabled

This is now configured: users can send follow-up by replying to an email notification, and new topics can be created by dropping a message to the category address (right now only the English category has such an address, question+english@).

One thing that comes to mind: what should the canonical name be? ask.libreoffice.org is fine as far as the Q&A platform is concerned, but if we're going to use it as replacement for (some) mailing lists too I think a subdomain of .documentfoundation.org would be more appropriate. What's your take, Sophie Gautier?

Thanks for all your work! Agreed on that, let's not limit to Q&A if we further extend Discourse usage in the future :)

Actions #51

Updated by Guilhem Moulin almost 4 years ago

Sophie Gautier wrote in #note-50:

Agreed on that, let's not limit to Q&A if we further extend Discourse usage in the future :)

Then we need to carefully choose a hostname that's generic enough and more specific question categories no? I suppose a category “English” in ask.libreoffice.org is fine to ask English questions about LibreOffice, but it's weird to use ask.libreoffice.org to talk about Foundation stuff, and we need separate categories for Q&A and “normal” discussion. We could remap later once the migration is complete, but the earlier the better :-)

I now imported the latest 10 pages of each language. Feel free to poke around, but note that it's automatically generated from a blank state and I'll likely reset it many times in the next days. It's important that AskBot powerusers check that everything looks right: it's still possible to fix post/profile mangling and/or fix counters/timestamps before the migration, but it'll be near impossible once it goes into production.

Internal link to questions/answers/comments are not replaced yet. Some notes about user attribution:

  • Avatars are preserved, as well as created and last seen dates, and number of upvotes.
  • Reputation (karma) and badges are lost in translation. Users might dislike losing their badges and karma, but I'm not sure how to map these.
  • For profiles found in LDAP, I used the SSO username instead of AskBot's. Users not in SSO yet will regain access to their account when they create an SSO profile with the same email address (the username doesn't need to match). New accounts are automatically provisioned from SSO on first login; the Discourse username is taken from SSO if there is no conflict, otherwise (if there was an old AskBot profile with the same username) the system resolves the conflict by appending a number.
  • Full names are taken from SSO (when defined).
  • Discourse has stricter requirements on usernames than AskBot. For instance Discourse usernames may not end with a dot or an underscore. I tried to map them sanely, and replaced @-addressing accordingly to target the right profile.
  • Both AskBot and Discourse (optionally) allow unicode in usernames, but our SSO (intentionally) doesn't. Existing non-ASCII usernames are preserved (for accounts not in SSO), but no new one will be created.
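To make the username points concrete, here is a small sketch covering only the two rules mentioned in this list: no trailing dot or underscore, and conflicts resolved by appending a number. It is not the logic of migrate.pl; the function name, the 'user' fallback and the case-insensitive comparison are assumptions.

```python
def map_username(askbot_name, taken):
    """Return a username without trailing '.' or '_' that is not in `taken`.

    `taken` is a set of lowercased usernames already in use; it is updated
    in place with the name that gets assigned.
    """
    # Hypothetical fallback for names consisting only of dots/underscores.
    base = askbot_name.rstrip('._') or 'user'
    candidate, n = base, 1
    while candidate.lower() in taken:
        candidate = f'{base}{n}'   # resolve the conflict by appending a number
        n += 1
    taken.add(candidate.lower())
    return candidate

taken = set()
first = map_username('john_', taken)
second = map_username('john.', taken)
```

Any @-mention in post bodies would then need rewriting with the same old-to-new mapping so that it still targets the right profile.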

In order to bootstrap the site, I used the following made-up heuristic for trust levels:

  • TL4 for users with karma ≥1000 who were last seen in the past year;
  • TL3 for users with karma ≥250 who posted at least 100 answers and were last seen in the past year;
  • TL2 for users who posted at least 100 answers/comments;
  • TL1 for users who have at least 5 posts (questions/answers/comments) and were last seen at least 10 days after their joined date; and
  • TL0 for everyone else.
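Spelled out as code, the heuristic might look like the sketch below. It reads "past year" as 365 days and "answers/comments" as the combined count, both of which are interpretations rather than something stated above; the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta

def trust_level(karma, questions, answers, comments, joined, last_seen, now):
    """Map AskBot activity to a Discourse trust level (0 to 4)."""
    seen_recently = now - last_seen <= timedelta(days=365)
    if karma >= 1000 and seen_recently:
        return 4
    if karma >= 250 and answers >= 100 and seen_recently:
        return 3
    if answers + comments >= 100:
        return 2
    posts = questions + answers + comments
    if posts >= 5 and last_seen - joined >= timedelta(days=10):
        return 1
    return 0

now = datetime(2021, 4, 1)
tl = trust_level(300, 0, 120, 0, datetime(2015, 1, 1), datetime(2021, 3, 1), now)
```

The checks run from the highest level down, so a user gets the highest level whose conditions they meet.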

I'm now confident we can complete the migration under 24h without other (potentially brittle) optimization.

I have to retract this, the import is actually slower. But it's CPU-bound so in theory it can be improved by processing multiple pages in parallel.

Actions #52

Updated by Sophie Gautier almost 4 years ago

Guilhem Moulin wrote in #note-51:

Sophie Gautier wrote in #note-50:

Agreed on that, let's not limit to Q&A if we further extend Discourse usage in the future :)

Then we need to carefully choose a hostname that's generic enough and more specific question categories no? I suppose a category “English” in ask.libreoffice.org is fine to ask English questions about LibreOffice, but it's weird to use ask.libreoffice.org to talk about Foundation stuff, and we need separate categories for Q&A and “normal” discussion. We could remap later once the migration is complete, but the earlier the better :-)

Yes of course, let's discuss with the team first and see what it gives :)

I now imported the latest 10 pages of each language. Feel free to poke around, but note that it's automatically generated from a blank state and I'll likely reset it many times in the next days. It's important that AskBot powerusers check that everything looks right: it's still possible to fix post/profile mangling and/or fix counters/timestamps before the migration, but it'll be near impossible once it goes into production.

ok, I'll poke them and ask for their detailed feedback asap.

Internal link to questions/answers/comments are not replaced yet. Some notes about user attribution:

  • Avatars are preserved, as well as created and last seen dates, and number of upvotes.
  • Reputation (karma) and badges are lost in translation. Users might dislike losing their badges and karma, but I'm not sure how to map these.
  • For profiles found in LDAP, I used the SSO username instead of AskBot's. Users not in SSO yet will regain access to their account when they create an SSO profile with the same email address (the username doesn't need to match). New accounts are automatically provisioned from SSO on first login; the Discourse username is taken from SSO if there is no conflict, otherwise (if there was an old AskBot profile with the same username) the system resolves the conflict by appending a number.
  • Full names are taken from SSO (when defined).
  • Discourse has stricter requirements on usernames than AskBot. For instance Discourse usernames may not end with a dot or an underscore. I tried to map them sanely, and replaced @-addressing accordingly to target the right profile.
  • Both AskBot and Discourse (optionally) allow unicode in usernames, but our SSO (intentionally) doesn't. Existing non-ASCII usernames are preserved (for accounts not in SSO), but no new one will be created.

ok, thanks a lot for all that. I hope that TL will compensate the loss of Karma. I'll discuss in my mail to power users

In order to bootstrap the site, I used the following made-up heuristic for trust levels:

  • TL4 for users with karma ≥1000 who were last seen in the past year;
  • TL3 for users with karma ≥250 who posted at least 100 answers and were last seen in the past year;
  • TL2 for users who posted at least 100 answers/comments;
  • TL1 for users who have at least 5 posts (questions/answers/comments) and were last seen at least 10 days after their joined date; and
  • TL0 for everyone else.

Great!

I'm now confident we can complete the migration under 24h without other (potentially brittle) optimization.

I have to retract this, the import is actually slower. But it's CPU-bound so in theory it can be improved by processing multiple pages in parallel.

Ok. Again, thank you so much for your work.

Actions #53

Updated by Guilhem Moulin almost 4 years ago

The partial import now contains the last 150 question pages of each language. That represents only 10% of English questions but all other languages are fully imported. The operation took 10 hours using 12 threads and breaks down to 20k questions (32% of all AskBot questions), 22k answers (32%), and 39k comments (31%). Blindly interpolating we should be able to import the full forum in about 30h; it'd likely be faster using native bindings but I don't know Ruby well enough and I suppose turning AskBot read-only for 1.5 days is acceptable as a one-off thing.
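
The "about 30h" figure is a straight linear extrapolation from the numbers above. As a sanity check, here is the arithmetic as a small Python sketch; the variable and function names are made up for illustration, and it assumes throughput stays constant over the whole data set, which the English backlog may not honour.

```python
# Figures quoted above: a 10-hour partial import covered roughly a third
# of each content type.
PARTIAL_IMPORT_HOURS = 10
IMPORTED_SHARE = {"questions": 0.32, "answers": 0.32, "comments": 0.31}

def estimate_full_import_hours(partial_hours, share):
    """Linear extrapolation: time for 100% given the time taken for `share`."""
    return partial_hours / share

estimates = {
    kind: estimate_full_import_hours(PARTIAL_IMPORT_HOURS, share)
    for kind, share in IMPORTED_SHARE.items()
}

for kind, hours in estimates.items():
    print(f"{kind}: ~{hours:.0f}h")
```

All three estimates land slightly above 30 hours, consistent with planning for a read-only window of about 1.5 days.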

We have some rare cases of data loss, typically when one uploaded a corrupted picture to AskBot or renamed a PDF to .jpg to work around the extension restriction. Discourse refuses to import pictures it can't decode, so I treated these as deleted/missing attachments, see for instance https://vm222.documentfoundation.org/t/nog-steeds-problemen-met-het-invoegen-van-tekstvakken-en-pijltjes-zie-bijlage . These broken/corrupted uploads represent only 12 of the 10k uploads which IMHO is acceptable as “lost in translation”. Another thing that's lost is the edit history: only final (non-drafted and non-deleted) questions/answers/comments are imported. Also users who never posted anything (or all of whose posts have been deleted) are not imported; they'll get a fresh account next time they log in.

Another thing, I see uses of tdf#XYZ in some posts and AskBot linkifies these, see for instance https://ask.libreoffice.org/es/question/268771/writer-reemplazo-el-texto-de-una-tabla-con-0/?comment=279459#post-id-279459 . We can do the same in Discourse using https://meta.discourse.org/t/linkify-words-in-post . I tried it and it works, however it seems Discourse wants to make it a core feature so this is not configured in the current snapshot. If we can't wait for the new release and tdf#XYZ support is a requirement, then we can use that theme component in the meantime.
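
To make the intended effect concrete, here is a rough Python sketch of the tdf#XYZ rewriting. It is not how the Discourse theme component is implemented. The target URL pattern (TDF's Bugzilla `show_bug.cgi`) and the Markdown link output are assumptions made for illustration.

```python
import re

# Assumption: bug references should point at TDF's Bugzilla.
BUG_URL = "https://bugs.documentfoundation.org/show_bug.cgi?id={}"

# "tdf#" followed by digits, not glued to a preceding word character.
TDF_REF = re.compile(r"\btdf#(\d+)\b")

def linkify_tdf(text):
    """Replace each tdf#NNN reference with a Markdown link to the bug."""
    def to_link(match):
        return f"[{match.group(0)}]({BUG_URL.format(match.group(1))})"
    return TDF_REF.sub(to_link, text)
```

The function is meant to run once over raw post text; applying it to already linkified text would wrap the link label again.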

Last but not least, internal cross-linked URLs have been rewritten (so Discourse can better track cross-linked questions), but the old URLs are preserved too; the above post can for instance be reached at https://vm222.documentfoundation.org/es/question/268771/writer-reemplazo-el-texto-de-una-tabla-con-0/?comment=279459 :-)

Actions #55

Updated by Guilhem Moulin over 3 years ago

Thanks for checking :-) The first of the missing 3 was published on Tuesday at 09:49:09 CEST, so shortly before the import was finished but after it was started. We can't reliably import posts that are published after the migration starts (Tue Apr 19 at 03:00 CEST for the current snapshot), that's why we need to turn AskBot read-only during the migration. What this snapshot contains is: all non-deleted English questions from #267198 to #304945, all non-deleted questions from all other languages up to Apr 19 03:00 UTC, and all non-deleted answers/comments to these questions up to that point. Some posts published between 03:00 and 13:00 CEST are included as well but that depends on when the parent question was processed so for these the import is by design unreliable. Missing items within that set are bugs but not outside :-)

Actions #56

Updated by Guilhem Moulin over 3 years ago

Noticed something odd with raal's account: he has TL0 “new user” which is definitely wrong :-) On closer look this is because https://ask.libreoffice.org/en/users/37866/raal/ was merged into https://ask.libreoffice.org/en/users/16507/zcr/ not the other way around. There are a few account duplicates on AskBot; in most cases the duplicate was created by mistake and the users will go back to use the other account (see for instance Olivier's https://ask.libreoffice.org/en/users/7788/ohallot/ which was merged into https://ask.libreoffice.org/en/users/11/olivier/ ) so I blindly chose the account with lower ID as the one to retain. In raal's case the newer account is the one that's used, that's why he has TL0 on Discourse. I suggest not to bother for now, there are only a handful of duplicate accounts and manual TL adjustments will likely be needed anyway (the heuristic from #note-51 is only meant to bootstrap the site and not have everyone at TL0).

Actions #57

Updated by Sophie Gautier over 3 years ago

Guilhem Moulin wrote in #note-55:

Thanks for checking :-)

It was Pierre-Yves feedback in fact :)

The first of the missing 3 was published on Tuesday at 09:49:09 CEST, so shortly before the import was finished but after it was started. We can't reliably import posts that are published after the migration starts (Tue Apr 19 at 03:00 CEST for the current snapshot), that's why we need to turn AskBot read-only during the migration. What this snapshot contains is: all non-deleted English questions from #267198 to #304945, all non-deleted questions from all other languages up to Apr 19 03:00 UTC, and all non-deleted answers/comments to these questions up to that point. Some posts published between 03:00 and 13:00 CEST are included as well but that depends on when the parent question was processed so for these the import is by design unreliable. Missing items within that set are bugs but not outside :-)

ok, I'll check the migration date and time if I have this feedback again. Thank you and sorry for the noise then.

Actions #58

Updated by Jun Nogata over 3 years ago

Hi,

The Japanese community reported that there is no link to the TDF bug number (like tdf#138749).

Other than that, there were no reports and there seems to be no problem. 🙂

Actions #59

Updated by Guilhem Moulin over 3 years ago

Jun Nogata wrote in #note-58:

The Japanese community reported that there is no link to the TDF bug number (like tdf#138749).

See #note-53 §3.

Other than that, there were no reports and there seems to be no problem. 🙂

Cool :-)

Actions #60

Updated by Sophie Gautier over 3 years ago

Posting feedback I received:
From Miguel Angel
Well, in 'all categories' (main window) I can't see what the sort order is, FMPOV should be at top languages with more questions, if that's not possible then alphabetically but with the language name in their language, as they are shown now.

Actions #61

Updated by Sophie Gautier over 3 years ago

From Daniel A. Rodriguez
About karma & badges, is there any chance to at least reflect actual status
in the wiki? To avoid completely losing that information

Actions #62

Updated by Sophie Gautier over 3 years ago

From Hagar Delest
Tried Move to answers function to change a comment in answer. I get the error message: You are not permitted to view the requested resource

Actions #63

Updated by Sophie Gautier over 3 years ago

From Alex Kemp
Browsing vm222 as anonymous user (Chromium Version 89.0.4389.114 (Developer Build) built on Debian 10.9, running on Debian 10.0 (64-bit)):

1. Viewed my latest post.
Clicking up-vote link next to Reply threw me to login (up/down-vote
controls should be inoperative).
Back-button then gave problematic page (cannot tell what was happening).
Attempted to get to /vm222/ again but my latest post now missing.
Whoops.
2. I viewed
https://vm222.documentfoundation.org/t/is-there-a-shortcut-for-switching-groups-in-calc/29
The page is most bland & uninspiring
Almost all text is same-size
Poor page layout & is missing many sections:-

  • Vote-count + Vote-controls are missing next to Question
    (obviously Vote-controls should be inoperative, but needs to exist)
  • Tag display is poor (plain-text separated by commas).
    (Askbot display is far better)
  • Information link is useful, but perhaps only if you are logged in
  • Questioner-Info block almost entirely missing
    (Askbot display is far better)
  • 'Solved by' addition is useful
  • Answer Downvote-Link button missing
  • Answerer-Info block almost entirely missing
    (Askbot display is far better)
  • No '/Propose Your Solution/' section.
  • Related questions missing

Actions #64

Updated by Sophie Gautier over 3 years ago

From Pierre-Yves Samyn
Messages are marked as not read per default. During the "real" migration, it'd be good if messages are in the same state as in Ask.

Actions #65

Updated by Guilhem Moulin over 3 years ago

Woo thanks!

Sophie Gautier wrote in #note-60:

Well, in 'all categories' (main window) I can't see what the sort order is, FMPOV should be at top languages with more questions, if that's not possible then alphabetically but with the language name in their language, as they are shown now.

I don't know if Miguel is referring to https://vm222.documentfoundation.org or to https://vm222.documentfoundation.org/categories . In the former, topics are ordered by decreasing activity timestamp, with the pinned topic(s) on top. In the latter, there are two panes: category ordering is fixed (see #note-25) and currently lexicographical by slug (like the language dropdown in Wikipedia AFAIK), with special categories at the top (uncategorized) or bottom (site feedback, lounge, etc); the left pane shows topics ordered by decreasing activity timestamp.

Sophie Gautier wrote in #note-61:

About karma & badges, is there any chance to at least reflect actual status in the wiki? To avoid completely losing that information

We probably don't want a wiki page with 100k entries, so it boils down to copying the first few pages at
https://ask.libreoffice.org/users/ . Not something that belongs to this migration IMHO.

Sophie Gautier wrote in #note-62:

Tried Move to answers function to change a comment in answer. I get the error message: You are not permitted to view the requested resource

Good catch, that's a recent feature https://github.com/paviliondev/discourse-question-answer/commit/165ca0334aeef19f10acc02bbcfd96a0dc30c2cb and currently not configurable. Arguably a bug that users who are lacking the rights to perform the operation are seeing the button.

Sophie Gautier wrote in #note-63:

1. Viewed my latest post.
Clicking up-vote link next to Reply threw me to login (up/down-vote controls should be inoperative).

I disagree about that one, it also matches the behavior of the “Like” and “Reply” button. All are visible for anonymous users, but they'll need to authenticate to be able to perform the operation. Either way this is not configurable.

Back-button then gave problematic page (cannot tell what was happening).
Attempted to get to /vm222/ again but my latest post now missing.

Unable to reproduce this.

2. I viewed
https://vm222.documentfoundation.org/t/is-there-a-shortcut-for-switching-groups-in-calc/29
The page is most bland & uninspiring
Almost all text is same-size
Poor page layout & is missing many sections:-

We're using the default theme. Using a custom theme is of course an option. But I'd argue that aside from basic settings like changing colors and font family, we're better off sticking to upstream defaults when it comes to font size and element position, because these have been tested with screens of different size etc. Moreover while UI is clearly subjective, the default theme seems to serve the global discourse community well.

  • Vote-count + Vote-controls are missing next to Question (obviously Vote-controls should be inoperative, but needs to exist)

As said in #note-49 it's not possible to upvote a question (or comment). Only answers can be upvoted, topic “scores” are calculated.

  • Tag display is poor (plain-text separated by commas). (Askbot display is far better)

“Far better” is subjective, I for one find it ugly :-P Anyway that one is configurable and if people think boxes are nicer we can show that rather than the comma-separated list.

  • Information link is useful, but perhaps only if you are logged in

Unclear to me which link Alex is talking about.

  • Questioner-Info block almost entirely missing (Askbot display is far better)

Sadly not configurable.

  • Answer Downvote-Link button missing

As noted during the previous test window last summer it's currently not possible to downvote an answer, see
https://meta.discourse.org/t/question-answer-plugin/56032 .

  • Answerer-Info block almost entirely missing (Askbot display is far better)
  • No '/Propose Your Solution/' section.

Sadly not configurable.

  • Related questions missing

Where? AFAIK it's not possible to add a list of similar topics based on title/tags only, but Discourse adds a related section when it finds crosslinks, see for instance https://vm222.documentfoundation.org/t/save-a-single-slide-from-impress-as-a-jpg . (The number of incoming links also raises the topic's overall score.) Furthermore click counts (from authenticated users) are tracked and the most popular links should be shown below the “Frequent Posters” section, see for instance https://vm222.documentfoundation.org/t/drop-caps-not-working .

Sophie Gautier wrote in #note-64:

Messages are marked as not read per default. During the "real" migration, it'd be good if messages are in the same state as in Ask.

That's part of what's lost in translation unfortunately, as well as who upvoted what. See #note-49: total view and vote counts are preserved but individual details are not.

Actions #66

Updated by Sophie Gautier over 3 years ago

Updating actual state:
- Guilhem has tested multisite instance: looks promising but needs more investigation
- migration will take place first week of August
- mail sent to mailing lists and members
- banner added on top of Ask page

Actions #67

Updated by Guilhem Moulin over 3 years ago

Sophie Gautier wrote in #note-66:

- Guilhem has tested multisite instance: looks promising but needs more investigation

Ran into a few issues regarding the SAML plugin but was able to fix them, I think. That makes the concerns from #note-48 moot :-)

While playing around I noticed some settings which could be helpful (can set them later, no need to do it now):

  • Ability to auto-tag a topic based on its first post: we can for instance automatically add the tag ‘writer’ when the question contains that word. (Regular expressions are not supported though.)
  • Restrict tags by category: some tags are localized and only make sense in particular language(s). We can for instance hide tags ‘標點符號’ or ‘tabela’ from the drop-down menu on the English category. Speaking about tags, might be worth merging some of them, we have for instance both ‘colors’ (358 topics) and ‘color’ (1 topic).
  • Ability to automatically close topics after a configurable period of time after the last post or after the question was asked. This would avoid users bumping years-old posts. I noticed Alex Kemp does that manually(?) on AskBot, but we can now do that automatically on a per-category basis. Note also that unlike AskBot Discourse won't let <TL4 users close their own question (this is currently not configurable).
  • Should we forbid uncategorized posts? I fear that if we don't do that then most questions will end up uncategorized and people will need to manually re-categorize them.
  • Mandatory tags: like for AskBot each topic must be tagged with one (or more) among ‘common’, ‘writer’, ‘calc’, ‘impress’, ‘base’, ‘draw’, ‘math’, or ‘meta’. This is a per-category setting so the tags can be localized if desired. We might also want to drop ‘meta’ and use the Uncategorized and/or Site Feedback categories?

Regarding localization, I'm probably missing something since it didn't bother the folks you asked feedback from, but I for one think it's a bit strange to see questions in Japanese or Spanish when I visit the site with Accept-Language: en. Is it meant to be that way? With AskBot I'm redirected to https://ask.libreoffice.org/en/questions , and I have to change the URL to visit the localized content of the site. I stumbled upon https://meta.discourse.org/t/multilingual-plugin/142740 . Sophie Gautier, is this something you're aware of but ruled out? (I find the content-language feature interesting: https://thepavilion.io/t/content-languages/2545 . However if we want to try it out it needs to be done now and not after the migration, because that's a pretty intrusive change which changes the layout to use tags instead of categories.)

Also, nobody reported that so far but for the record: post revisions are lost in translation (only the most recent revision is imported, and deleted posts are weeded out), as well as one's own notification settings, watch list and saved searches. Everyone starts with “fresh” defaults in that regard: 1/ no weekly/daily digest, 2/ notify by email when away and someone uses @-addressing or replies to one's own post, and 3/ auto-watch a topic when replying or after spending ≥4 minutes reading.

Actions #68

Updated by Sophie Gautier over 3 years ago

Guilhem Moulin wrote in #note-67:

Sophie Gautier wrote in #note-66:

- Guilhem has tested multisite instance: looks promising but needs more investigation

Ran into a few issues regarding the SAML plugin but was able to fix them, I think. That makes the concerns from #note-48 moot :-)

great!

While playing around I noticed some settings which could be helpful (can set them later, no need to do it now):

  • Ability to auto-tag a topic based on its first post: we can for instance automatically add the tag ‘writer’ when the question contains that word. (Regular expressions are not supported though.)

+1

  • Restrict tags by category: some tags are localized and only make sense in particular language(s). We can for instance hide tags ‘標點符號’ or ‘tabela’ from the drop-down menu on the English category. Speaking about tags, might be worth merging some of them, we have for instance both ‘colors’ (358 topics) and ‘color’ (1 topic).

+1

  • Ability to automatically close topics after a configurable period of time after the last post or after the question was asked. This would avoid users bumping years-old posts. I noticed Alex Kemp does that manually(?) on AskBot, but we can now do that automatically on a per-category basis. Note also that unlike AskBot Discourse won't let <TL4 users close their own question (this is currently not configurable).

+1

  • Should we forbid uncategorized posts? I fear that if we don't do that then most questions will end up uncategorized and people will need to manually re-categorize them.

+1

  • Mandatory tags: like for AskBot each topic must be tagged with one (or more) among ‘common’, ‘writer’, ‘calc’, ‘impress’, ‘base’, ‘draw’, ‘math’, or ‘meta’. This is a per-category setting so the tags can be localized if desired. We might also want to drop ‘meta’ and use the Uncategorized and/or Site Feedback categories?

Agreed also to drop meta in favor of Uncategorized/Site Feedback.

Regarding localization, I'm probably missing something since it didn't bother the folks you asked feedback from, but I for one think it's a bit strange to see questions in Japanese or Spanish when I visit the site with Accept-Language: en. Is it meant to be that way?

When you deal with several languages it's something you'd like, you'd go directly to the last posts in your different languages. But if you only work with one language, then seeing posts mixed in multiple languages is boring. Personally I like to see all the language categories displayed because with Ask you have to guess if a language exists or not, but I agree the mix of languages on recent list is not inviting.

With AskBot I'm redirected to https://ask.libreoffice.org/en/questions , and I have to change the URL to visit the localized content of the site. I stumbled upon https://meta.discourse.org/t/multilingual-plugin/142740 . Sophie Gautier, is this something you're aware of but ruled out? (I find the content-language feature interesting: https://thepavilion.io/t/content-languages/2545 . However if we want to try it out it needs to be done now and not after the migration, because that's a pretty intrusive change which changes the layout to use tags instead of categories.)

I didn't rule it out, I was not sure at the beginning because we have one language per category and not different language contents per category. Then I forgot about it. What I was searching for to solve the topics appearing in multiple languages is a presentation like this one https://discuter.spip.net/ where recent topics are in front of each category. I've searched on meta discourse and found https://meta.discourse.org/t/category-home-boxes-with-recent-category-posts/114826/4 so it seems a setting we have to tweak. Do you think it would be a more comfortable display?

Also, nobody reported that so far but for the record: post revisions are lost in translation (only the most recent revision is imported, and deleted posts are weeded out), as well as one's own notification settings, watch list and saved searches. Everyone starts with “fresh” defaults in that regard: 1/ no weekly/daily digest, 2/ notify by email when away and someone uses @-addressing or replies to one's own post, and 3/ auto-watch a topic when replying or after spending ≥4 minutes reading.

ok, thanks for the feedback, I fear it's inevitable to lose some settings

Actions #69

Updated by Guilhem Moulin over 3 years ago

Sophie Gautier wrote in #note-68:

Regarding localization, I'm probably missing something since it didn't bother the folks you asked feedback from, but I for one think it's a bit strange to see questions in Japanese or Spanish when I visit the site with Accept-Language: en. Is it meant to be that way?

When you deal with several languages it's something you'd like, you'd go directly to the last posts in your different languages. But if you only work with one language, then seeing posts mixed in multiple languages is boring. Personally I like to see all the language categories displayed because with Ask you have to guess if a language exists or not, but I agree the mix of languages on recent list is not inviting.

I see, then I guess the best would be to present users content in their own language (based on the Accept-Language header) only by default, but also have a (multiselect) drop-down menu so they can add content in other languages too? And for logged-in users, the ability to pin the languages they want to show?

With the current organization filtering is possible but only for logged in users and works the other way: if I don't speak Japanese I can mute the category so questions in Japanese are hidden when I'm logged in. I haven't found how to achieve the above without a plugin.

What I was searching for to solve the topics appearing in multiple languages is a presentation like this one https://discuter.spip.net/ where recent topics are in front of each category. I've searched on meta discourse and found https://meta.discourse.org/t/category-home-boxes-with-recent-category-posts/114826/4 so it seems a setting we have to tweak. Do you think it would be a more comfortable display?

Ack, did the same for our instance by making the category page the homepage and by setting desktop_category_page_style = categories_with_featured_topics. I find it better indeed, but IMHO having only English (or whatever Accept-Language specifies) content by default would be even better — I'm not the targeted audience though so my opinion is irrelevant :-)

Actions #70

Updated by Sophie Gautier over 3 years ago

Guilhem Moulin wrote in #note-69:

Sophie Gautier wrote in #note-68:

Regarding localization, I'm probably missing something since it didn't bother the folks you asked feedback from, but I for one think it's a bit strange to see questions in Japanese or Spanish when I visit the site with Accept-Language: en. Is it meant to be that way?

When you deal with several languages it's something you'd like, you'd go directly to the last posts in your different languages. But if you only work with one language, then seeing posts mixed in multiple languages is boring. Personally I like to see all the language categories displayed because with Ask you have to guess if a language exists or not, but I agree the mix of languages on recent list is not inviting.

I see, then I guess the best would be to present users content in their own language (based on the Accept-Language header) only by default, but also have a (multiselect) drop-down menu so they can add content in other languages too? And for logged-in users, the ability to pin the languages they want to show?

That would be great, indeed

With the current organization filtering is possible but only for logged in users and works the other way: if I don't speak Japanese I can mute the category so questions in Japanese are hidden when I'm logged in. I haven't found how to achieve the above without a plugin.

You mean we could use some of the features of the multilingual plugin for that?

What I was searching for to solve the topics appearing in multiple languages is a presentation like this one https://discuter.spip.net/ where recent topics are in front of each category. I've searched on meta discourse and found https://meta.discourse.org/t/category-home-boxes-with-recent-category-posts/114826/4 so it seems a setting we have to tweak. Do you think it would be a more comfortable display?

Ack, did the same for our instance by making the category page the homepage and by setting desktop_category_page_style = categories_with_featured_topics. I find it better indeed, but IMHO having only English (or whatever Accept-Language specifies) content by default would be even better — I'm not the targeted audience though so my opinion is irrelevant :-)

ha good to know you already did it :) and your opinion is relevant because you have an overview of what is feasible, so thanks for giving it :-)

Actions #71

Updated by Guilhem Moulin over 3 years ago

Sophie Gautier wrote in #note-70:

With the current organization filtering is possible but only for logged in users and works the other way: if I don't speak Japanese I can mute the category so questions in Japanese are hidden when I'm logged in. I haven't found how to achieve the above without a plugin.

You mean we could use some of the features of the multilingual plugin for that?

I thought so, but after trying it I take it back. You can access the test at https://ask.libreoffice.org after configuring your local resolver to point the hostname to 89.238.68.222. (On Linux tee -a /etc/hosts <<<"89.238.68.222 ask.libreoffice.org" should do the trick, but don't forget to remove the line afterwards ;-).)

The category-based approach has some advantages:

  • categories are more lightweight (each topic belongs to a single category)
  • we grant rights (incl. moderation) on a per-category basis
  • each category uses its own email address for new topics so categorizing new questions by email is trivial — OTOH it's unclear how to tag them

The multilingual plugin uses content language tags instead of categories, and provides a somewhat nicer (subjective) interface

  • authenticated users can list the languages they want to see on their preference page, rather than the categories they don't want to see (personally I like the whitelist better)
  • tags become (manually) translatable, so AFAICT when someone lists topics tagged “writer” or “cell” they'll also get posts tagged with their localized version; that way a trilingual person has fewer tags to add to their watchlist. Merging our 10k tags would be a huge amount of (one-off) work but past that it's pretty nice.

On the other hand, I was unable to get the plugin to help anonymous users. There is a drop down menu with languages but AFAICT it only affects searches and does not filter the list.

Also, neither approach gives a nice interface for x-lingual people to list or search through multiple languages at once. It's either all languages at once, or one language at a time.

All in all I'd say your initial suggestion to use categories is better :-) Authenticated users are covered (they can mute the languages they don't want to hear), and for anonymous ones the simplest might be to write a custom piece of JavaScript to redirect the homepage to the category of their Accept-Language (in order to emulate a kind of “default category”).
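
The selection logic behind such a redirect could look roughly like the Python sketch below. The real thing would be browser-side script and would more likely read the browser's language list than the raw header. The function names, the lowercase category slugs and the fallback to `en` are assumptions, not existing site configuration.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language value, best first."""
    prefs = []
    for index, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # default weight when no q= parameter is given
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                continue  # skip malformed entries
        if quality <= 0:
            continue  # q=0 means "not acceptable"
        # Sort by descending quality, then by original position.
        prefs.append((-quality, index, tag.strip().lower()))
    return [tag for _, _, tag in sorted(prefs)]

def pick_category(header, categories, default="en"):
    """Pick the category slug to use as an anonymous visitor's landing page.

    `categories` is a set of lowercase slugs (hypothetical, e.g. "en", "es").
    """
    for tag in parse_accept_language(header):
        if tag in categories:
            return tag
        primary = tag.split("-")[0]  # "es-ar" falls back to "es"
        if primary in categories:
            return primary
    return default
```

For instance, `pick_category("ja,en-US;q=0.7,en;q=0.3", {"en", "es", "ja"})` returns `ja`, while a visitor whose languages have no matching category lands on the assumed default.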

There was also this thread with the pros and cons of the different approaches: https://meta.discourse.org/t/how-to-structure-a-multilingual-community/73225

Actions #72

Updated by Sophie Gautier over 3 years ago

Guilhem Moulin wrote in #note-71:

Sophie Gautier wrote in #note-70:

With the current organization filtering is possible but only for logged in users and works the other way: if I don't speak Japanese I can mute the category so questions in Japanese are hidden when I'm logged in. I haven't found how to achieve the above without a plugin.

You mean we could use some of the features of the multilingual plugin for that?

I thought so, but after trying it I take it back. You can access the test at https://ask.libreoffice.org after configuring your local resolver to point the hostname to 89.238.68.222. (On Linux tee -a /etc/hosts <<<"89.238.68.222 ask.libreoffice.org" should do the trick, but don't forget to remove the line afterwards ;-).)

Done, thanks for the info

The category-based approach has some advantages:

  • categories are more lightweight (each topic belongs to a single category)
  • we grant rights (incl. moderation) on a per-category basis
  • each category uses its own email address for new topics so categorizing new questions by email is trivial — OTOH it's unclear how to tag them

The multilingual plugin uses content language tags instead of categories, and provides a somewhat nicer (subjective) interface

  • authenticated users can list the languages they want to see on their preference page, rather than the categories they don't want to see (personally I like the whitelist better)
  • tags become (manually) translatable, so AFAICT when someone lists topics tagged “writer” or “cell” they'll also get posts tagged with their localized version; that way a trilingual person has fewer tags to add to their watchlist. Merging our 10k tags would be a huge amount of (one-off) work but past that it's pretty nice.

On the other hand, I was unable to get the plugin to help anonymous users. There is a drop down menu with languages but AFAICT it only affects searches and does not filter the list.

yes, and I find it confusing. The list also contains many languages

Also, neither approach gives a nice interface for x-lingual people to list or search through multiple languages at once. It's either all languages at once, or one language at a time.

All in all I'd say your initial suggestion to use categories is better :-) Authenticated users are covered (they can mute the languages they don't want to hear), and for anonymous ones the simplest might be to write a custom piece of JavaScript to redirect the homepage to the category of their Accept-Language (in order to emulate a kind of “default category”).

ok, I agree with you, seeing the plugin in action, I think it should be useful when languages are mixed in one category, which is not our case. Thanks a lot for your feedback

There was also this thread with the pros and cons of the different approaches: https://meta.discourse.org/t/how-to-structure-a-multilingual-community/73225

thanks, we can say that we have monolingual communities per category ;-)
In my comment #32 I listed some plugins I wanted to have a look at, I'll revisit that today and let you know if one of them could be useful.

Actions #73

Updated by Sophie Gautier over 3 years ago

Comparing with Ask functionalities, it seems the Follow plugin would be something to have:
https://meta.discourse.org/t/follow-plugin/110579
I can see Mike or Pierre-Yves having many followers
Other plugins on my list are not really useful for us (I think :)
Would it be possible that you install this Follow plugin on the final instance?

Actions #74

Updated by Guilhem Moulin over 3 years ago

Sophie Gautier wrote in #note-73:

I can see Mike or Pierre-Yves having many followers

:-)

Other plugins on my list are not really useful for us (I think :)
Would it be possible that you install this Follow plugin on the final instance?

Sure! Just tried it and it seems to work fine with the other plugins: I can follow others and be notified when they post something. Followers and followed profiles are listed on each profile (we can make it private but AFAICT that's a global switch).

I propose to delay the installation until a few days after the migration: not sure what overhead the plugin has but I'd rather have the bare minimum during the full import. Unlike the multilingual or QnA plugins it doesn't seem disruptive to add it later.

Actions #75

Updated by Sophie Gautier over 3 years ago

  • Target version changed from Q2/2021 to Q3/2021

Actions #76

Updated by Guilhem Moulin over 3 years ago

The migration is now ongoing: it's no longer possible to post or authenticate to AskLibO (all sessions have been terminated) but existing posts can still be consulted (under heavy rate-limiting). Attaching the migration script for posterity and see you on the other side :-)

Actions #77

Updated by Guilhem Moulin over 3 years ago

  • Status changed from In Progress to Feedback
  • % Done changed from 0 to 90

AskLibO is now Discourse! I'll install the plugin mentioned in #note-74 along with the next upgrade.

Regarding auto-closing topics, we have the option to automatically close topics for which no one has posted for a while (the delay is configurable per category), or to do it only for solved questions. The latter is better, no? What would be a reasonable delay?

Actions #78

Updated by Sophie Gautier over 3 years ago

Guilhem Moulin wrote in #note-77:

AskLibO is now Discourse! I'll install the plugin mentioned in #note-74 along with the next upgrade.

\o/ :-) Should I open a Discourse category here in Redmine or do you prefer to keep AskLibO for this site?

Regarding auto-closing topics, we have the option to automatically close topics for which no one has posted for a while (the delay is configurable per category), or to do it only for solved questions. The latter is better, no? What would be a reasonable delay?

Yes, I think the latter is better too and let's say a month to close it.

Actions #79

Updated by Guilhem Moulin over 3 years ago

Sophie Gautier wrote in #note-78:

Guilhem Moulin wrote in #note-77:

AskLibO is now Discourse! I'll install the plugin mentioned in #note-74 along with the next upgrade.

\o/ :-) Should I open a Discourse category here in Redmine or do you prefer to keep AskLibO for this site?

We can keep AskLibO (it was called AskBot before, renamed it last night so it's not tied to a particular software).

Regarding auto-closing topics, we have the option to automatically close topics for which no one has posted for a while (the delay is configurable per category), or to do it only for solved questions. The latter is better, no? What would be a reasonable delay?

Yes, I think the latter is better too and let's say a month to close it.

Done, set solved topics to auto-close 720h (30 days) after the last reply. That probably means a lot of topics will be auto-closed later today.
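
For the record, the rule described above (only solved topics, 720h after the last reply) can be sketched as follows. This is an illustration in TypeScript only, not Discourse's actual code or configuration; the `TopicState` type and the `isDueForAutoClose` function are hypothetical names made up for the example.

```typescript
// 30 days expressed in hours, matching the 720h delay mentioned above.
const HOURS_PER_DAY = 24;
const AUTO_CLOSE_DELAY_HOURS = 30 * HOURS_PER_DAY;

// Hypothetical, simplified view of a topic (not a real Discourse type).
interface TopicState {
  solved: boolean;
  closed: boolean;
  lastReplyAt: Date;
}

// Returns true when a topic is solved, still open, and its last reply
// is at least `delayHours` old at time `now`.
function isDueForAutoClose(
  topic: TopicState,
  now: Date,
  delayHours: number = AUTO_CLOSE_DELAY_HOURS,
): boolean {
  if (!topic.solved || topic.closed) {
    return false;
  }
  const millisecondsPerHour = 60 * 60 * 1000;
  const elapsedHours =
    (now.getTime() - topic.lastReplyAt.getTime()) / millisecondsPerHour;
  return elapsedHours >= delayHours;
}
```

With this rule, every solved topic imported from the old site whose last reply is older than 30 days becomes due at once, which is why many topics are expected to close shortly after the setting is enabled.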

Actions #80

Updated by Guilhem Moulin over 3 years ago

Sophie Gautier wrote in #note-68:

Guilhem Moulin wrote in #note-67:

Also, nobody reported that so far but for the record: post revisions are lost in translation (only the most recent revision is imported, and deleted posts are weeded out), as well as one's own notification settings, watch list and saved searches. Everyone starts with “fresh” defaults in that regard: 1/ no weekly/daily digest, 2/ notify by email when away and someone uses @-address or replies to one's own post, and 3/ auto-watch a topic when replying or after spending ≥4 minutes reading.

ok, thanks for the feedback, I fear it's inevitable to lose some settings

Do you think it'd be helpful to have a pinned post suggesting that users update their notification preferences and other things that might have been lost in translation?

Actions #81

Updated by ste ve over 3 years ago

Thanks for moving this forward. I think there is huge potential and this is the right thing to do. Some feedback:

  • new default to follow system color scheme is currently still disabled. Could this be enabled? https://meta.discourse.org/t/automatic-dark-mode-color-scheme-switching/161593/66
  • improve structure of default view by showing English and most popular subcategories at the top along with the most visited languages (maybe all languages with +1k entries or so?). Currently the list is alphabetical for all languages which is not a good default view

Actions #82

Updated by Guilhem Moulin over 3 years ago

ste ve wrote in #note-81:

  • new default to follow system color scheme is currently still disabled. Could this be enabled?

Is it really the new default? AFAICT upstream is currently considering making it the default for the upcoming 2.8, but the default in the dev version hasn't been changed yet. IMHO we should stick to the upstream default for now, especially while the discussion is ongoing. 2.8 will be released soon enough anyway :-)

  • improve structure of default view by showing English and most popular subcategories at the top along with the most visited languages (maybe all languages with +1k entries or so?). Currently the list is alphabetical for all languages which is not a good default view

Agreed the default view could be improved, see the discussion between Sophie Gautier and me above. Unfortunately we were not able to find a suitable configuration setting or plugin, however routing users to the category of their browser language should be easy to do in a custom JavaScript overlay. Perhaps you're comfortable with JS and could help with that? :-)
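
To make the idea a bit more concrete, here is a minimal sketch of the language-matching part, written in TypeScript. It is only an illustration under assumptions: the function name `pickCategorySlug`, the example slugs and the fallback are hypothetical, and it presumes each language category has a slug that is a language tag (such as `fr` or `pt-br`), which may not match the real setup. Client-side code cannot read the Accept-Language header itself, so the ordered list of preferred languages would have to come from the browser, for instance `navigator.languages`.

```typescript
// Hypothetical helper: pick the category slug that best matches the
// visitor's preferred languages, or a fallback when nothing matches.
function pickCategorySlug(
  preferred: readonly string[],
  available: readonly string[],
  fallback: string,
): string {
  // Index the available slugs case-insensitively.
  const byLowerCase = new Map<string, string>();
  for (const slug of available) {
    byLowerCase.set(slug.toLowerCase(), slug);
  }
  // Walk the preferences in order; for each one try the full tag first
  // (e.g. "pt-br"), then its primary subtag (e.g. "pt").
  for (const tag of preferred) {
    const lower = tag.toLowerCase();
    const primary = lower.split("-")[0] ?? lower;
    const match = byLowerCase.get(lower) ?? byLowerCase.get(primary);
    if (match !== undefined) {
      return match;
    }
  }
  return fallback;
}
```

For example, `pickCategorySlug(["fr-FR", "en"], ["en", "fr", "de"], "en")` selects `fr`. The redirect itself (and making sure it only triggers for anonymous visitors landing on the homepage) depends on how the category URLs and the theme are set up, so it is left out of the sketch.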

Actions #83

Updated by Sophie Gautier over 3 years ago

  • Status changed from Feedback to Resolved

Closing this one as the migration is done.
