
Wikipedia:Bots/Requests for approval

From Wikipedia, the free encyclopedia

BAG member instructions

If you want to run a bot on the English Wikipedia, you must first get it approved. To do so, follow the instructions below to add a request. If you are not familiar with programming, it may be a good idea to ask someone else to run a bot for you rather than running your own.

 Instructions for bot operators

Current requests for approval

RonBot 14

Operator: Ronhjones (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 20:29, Wednesday, November 14, 2018 (UTC)

Function overview: Upscales the nominal size of a non-free SVG (without exceeding the NFC guideline) following a manual request.

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python

Source code available: Based on RonBot 4 - User:RonBot/4/Source1 (used to change the nominal size of oversized NFC SVG files so that the resultant PNG files are below the NFC guideline); it will be changed to use a different category as input, to be set by a new template.

Links to relevant discussions (where appropriate): (Copied from my Talk Page)...

Hi Ron, I had an idea for new functionality in RonBot. Your bot already fixes images that are too large to meet the NFC guidelines, but another frequent problem with SVG files is that they'll be too small. What often happens is, someone will extract a logo from a PDF where it appeared very small, and as a result, their SVG will have a tiny nominal size. Someone reading an article will click on the logo, expecting to see it bigger, and instead they'll see a much smaller version of it. An example of this right now is the logo on Charlotte Independence (used to affect maybe half of the teams in that league, but I've manually fixed most of them).

Would you consider developing new functionality for RonBot to raise SVGs to the maximum allowed resolution for NFC? Obviously, you wouldn't want to automatically scale up every SVG, since some are presumably intended to be so small. Instead, an editor would have to manually tag the image to be upsized. The current process of manually fixing this is quite tedious, so being able to simply click a button in Twinkle and add a template would make it much easier to eradicate these unnecessarily-tiny logos. Let me know what you think of my idea. Thanks, IagoQnsi (talk) 19:13, 14 November 2018 (UTC)

Edit period(s): Daily

Estimated number of pages affected: Low number of images, just those that are tagged manually

Namespace(s): Files

Exclusion compliant (Yes/No): Yes

Function details: Based on a manually added template (with an associated category), the bot will adjust the width and height parameters in the "<svg" tag of the image, so that the resultant PNG is rendered at a more readable size while of course remaining below the NFC guideline.
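The width/height adjustment described above can be sketched roughly as follows. This is an illustration only, not the bot's actual source: the 0.1-megapixel cap and the attribute-matching regexes are simplifying assumptions (real SVGs may use unitless, percentage, or viewBox-only sizing).

```python
import re

# NFC guideline caps non-free images at ~0.1 megapixels; the exact
# threshold used here is an assumption, not the bot's actual value.
MAX_PIXELS = 100_000

def upscale_svg(svg_text):
    """Raise the nominal width/height of the root <svg> tag to the
    largest uniform scale that stays below MAX_PIXELS."""
    tag = re.search(r"<svg\b[^>]*>", svg_text)
    if not tag:
        return svg_text
    w = re.search(r'width="([\d.]+)', tag.group(0))
    h = re.search(r'height="([\d.]+)', tag.group(0))
    if not (w and h):
        return svg_text  # unitless/percentage sizes need more care
    width, height = float(w.group(1)), float(h.group(1))
    factor = (MAX_PIXELS / (width * height)) ** 0.5
    if factor <= 1:
        return svg_text  # already at or above the cap: leave alone
    new_tag = tag.group(0)
    new_tag = re.sub(r'width="[\d.]+', 'width="%d' % int(width * factor), new_tag)
    new_tag = re.sub(r'height="[\d.]+', 'height="%d' % int(height * factor), new_tag)
    return svg_text.replace(tag.group(0), new_tag, 1)
```

Scaling both dimensions by the same factor preserves the aspect ratio, and flooring the results keeps the product safely under the cap.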

Discussion

With all due respect, is this *really* necessary, given that MediaWiki already allows SVGs to be scaled arbitrarily? -FASTILY 01:37, 15 November 2018 (UTC)

Well the requester thought it would be useful, and since there is virtually zero coding to do, I did not see an issue. It does stop items such as File:Gun Owners of America Logo.svg looking rather silly. Ronhjones  (Talk) 21:59, 15 November 2018 (UTC)

ProgrammingBot 2

Operator: ProgrammingGeek (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 00:41, Wednesday, November 14, 2018 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): JavaScript (nodejs)

Source code available: GitHub

Function overview: Adds {{WikiProject Protected areas}} to talk pages in categories:

That do not already have the template.

Links to relevant discussions (where appropriate): WP:Bot requests#Add a wikiproject template to New York City parks articles

Edit period(s): Daily

Estimated number of pages affected: ~650

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): No

Function details: Adds the template to talk pages of articles in the categories above, provided they do not already have the template. Will fill in the class= field if there is another template with it filled out.
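The tagging logic above can be sketched as follows. This is a Python illustration only (the bot itself is written in Node.js), and the banner-detection regex and class-copying heuristic are simplifying assumptions rather than the bot's actual code.

```python
import re

BANNER = "WikiProject Protected areas"

def tag_talk_page(wikitext):
    """Add the banner unless it is already present, copying the class=
    assessment from any other WikiProject banner on the page."""
    if re.search(r"\{\{\s*" + re.escape(BANNER), wikitext, re.IGNORECASE):
        return wikitext  # already tagged; leave the page alone
    # borrow an existing assessment so class= is filled in when possible
    m = re.search(r"\{\{WikiProject [^{}]*?\|\s*class\s*=\s*([^|}\s]+)", wikitext)
    rating = "|class=" + m.group(1) if m else ""
    return "{{" + BANNER + rating + "}}\n" + wikitext
```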

Discussion

PrimeBOT 30

Operator: Primefac (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 22:16, Monday, November 12, 2018 (UTC)

Function overview: Remove deprecated/invalid parameters from templates as well as "infobox genfixes"

Automatic, Supervised, or Manual: Automatic

Programming language(s): AWB

Source code available: WP:AWB

Links to relevant discussions (where appropriate): Previous bot tasks 7, 8, 10, 18, 20, 23, 26, 28, and 29, as well as current discussion regarding {{infobox UK school}}

Edit period(s): As needed/requested, but each request will be an OTR

Estimated number of pages affected: 1000-8000 pages, depending on transclusion count and size of category

Namespace(s): Main

Exclusion compliant (Yes/No): Yes

Function details: As seen in my previous bot runs, I have a lot of past experience with removing deprecated/unnecessary/etc parameters from templates. With this number seemingly increasing (4 of the last 5 BRFAs were for this purpose) I thought I would get an open-ended approval for this task. The general criteria for running this bot would be something along the lines of the following:

  1. Template must have 1000+ pages in a "bad parameters" category and/or be a template merger with 1000+ transclusions (anything smaller can really be cleaned up manually)
  2. Discussion must be present and consensus agrees that a bot run will be necessary to remove the bad params (e.g. see discussions for bot runs 28 and 29)
  3. A list (ideally created by the requesters and not myself, but I will do it if necessary) will be created for the parameters to be removed/changed/updated/etc
  4. The specific task will be added to a "records page" (likely User:PrimeBOT/Parameter removal records or some such location) fully documenting the previous three steps (i.e. CYA)
  5. The parameters will be updated as dictated in the discussion.

As a result of the last two bot runs (28/29) I also worked out most of the kinks in Headbomb's proposed logic for infobox cleanup, which basically involves one-param-per-line and piping fixes. Specifically, 1.a, b, c, and 2.d - I never did get around to investigating the others. This logic makes the actual task of tweaking params a lot easier, hence my interest in including it.
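The core operation, stripping deprecated parameters from template transclusions, is done with AWB find-and-replace rules rather than code; a minimal Python sketch of the kind of removal involved is below. The template and parameter names are illustrative, and the one-parameter-per-line assumption mirrors the infobox-cleanup logic mentioned above.

```python
import re

def remove_params(wikitext, template, bad_params):
    """Strip deprecated |param=value pairs from calls to `template`.
    Simplified: assumes one parameter per line and no nested templates
    inside the removed values."""
    def clean(match):
        body = match.group(0)
        for p in bad_params:
            body = re.sub(r"\n?\s*\|\s*" + re.escape(p) + r"\s*=[^|}\n]*", "", body)
        return body
    pattern = r"\{\{\s*" + re.escape(template) + r"\b[^{}]*\}\}"
    return re.sub(pattern, clean, wikitext)
```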

Discussion

As a note/thought, this would also be used/useful for when TFD mergers are implemented, which is really what motivated me to file this BRFA as I might have been stretching Task 24 a little farther than it should have been stretched on a couple of occasions. Primefac (talk) 22:18, 12 November 2018 (UTC)

Support your methodology. Getting a list of parameters from the editor(s) leading the merge effort will be useful to show that the bot is performing a consensus-based task. With regard to your first criterion, it might need to be tweaked a bit to be applicable to the Infobox UK school to Infobox school merger: the bot's work will migrate current parameters in a TfD'd infobox to supported parameters in the merge target, thereby avoiding the creation of 1000+ unsupported parameters when the TfD'd infobox is turned into a redirect. – Jonesey95 (talk) 05:30, 13 November 2018 (UTC)

Good point. Updated. Primefac (talk) 11:15, 13 November 2018 (UTC)
Well done. – Jonesey95 (talk) 15:11, 13 November 2018 (UTC)

BsherrAWBBOT 2

Operator: Bsherr (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 22:50, Monday, October 29, 2018 (UTC)

Function overview: Ad hoc template orphaning and transclusion replacement, including pursuant to WP:TFD.

Automatic, Supervised, or Manual: Supervised

Programming language(s): AutoWikiBrowser

Source code available: AutoWikiBrowser

Links to relevant discussions (where appropriate):

Edit period(s): One time runs as needed by particular task

Estimated number of pages affected: Usually between 100 and 1000 pages.

Namespace(s): All

Exclusion compliant (Yes/No): No

Function details: This is a reapproval after the bot was deflagged for inactivity only. Following its original approval, no runs were needed. The first task is replacing Template:User x/doc with Template:User x in pages having that as the first parameter of Template:Documentation; the purpose is to merge the redundant pages Template:User x/doc and Template:User x and allow Template:User x a normal documentation page.
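The replacement itself amounts to a single pattern. A hedged sketch (assuming the {{Documentation}} call passes the /doc page as its first parameter, as the task describes; the actual run uses AWB rules):

```python
import re

def fix_doc_param(wikitext):
    """Point {{Documentation}} at Template:User x instead of the /doc
    fork when the /doc page is passed as the first parameter."""
    return re.sub(
        r"(\{\{\s*Documentation\s*\|\s*Template:User x)/doc",
        r"\1",
        wikitext,
    )
```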

Discussion

  • Hi The Earwig. Ordinarily, yes, implementation of TfD results. This particular unusual situation seemed to present an exception. It's not unusual that templates are functionally divided between a base page and subpages. When functions are moved between the base page and subpages, it's not usually by a TfD unless the subpage is being proposed for deletion. Here, there is clear accidental forking between the two pages. The subpage is being retained for the purpose of being the documentation subpage, and retaining the page history is desirable. Given that, with a bit of IAR in mind, I didn't think it needed. If BAG thinks it's still important to put it through TfD, happy to do it. Or, alternatively, if BAG wants to stipulate that any future runs not pursuant to TfD be approved here, happy to accept that too. Bsherr (talk) 04:16, 1 November 2018 (UTC)
  • Okay, that's reasonable. I agree you don't need a full TfD for this fairly low-risk change. I did notice a recent discussion about the situation, so unless you've already done so, I think it would be a good idea to let Trappist and Hyacinth know of your proposal to make sure we're all in agreement. Once that's settled, I'm comfortable with a (speedy) approval for that replacement and future TfD-consensus-derived replacements (as in your prior approval), with the condition that runs not arising from TfD consensus require explicit future BRFAs, as you suggested. — Earwig talk 05:51, 1 November 2018 (UTC)
  • Nice sleuthing. I'll leave a note and drop a link to this page. --Bsherr (talk) 15:43, 1 November 2018 (UTC)

Good, I can strike this from my to-do list. But, as I suggested at that other conversation, I would choose different template names. I believe that templates, as much as possible, should be named to reflect the work that they are actually doing. The names of {{user x}} and {{user x/doc}} are mighty vague about what it is that they actually do or what their names mean. I proposed in that other conversation that the single documentation template should be called {{user x doc}} which name sort of describes what it is that the template does. It then gets its own documentation page, Template:user x doc/doc. These two pages can be created and tested without disrupting the existing template structures. When tests with templates that use {{user x}} and {{user x/doc}} show that the new template and its documentation works as expected, then all instances of {{user x}} and {{user x/doc}} can be replaced with {{user x doc}}.

What I think might be a better long-term solution would be to create a single parameterized template, perhaps {{user lang box}}, then we have no need to create and maintain a mess of however many individual templates we have now. Were we starting afresh today, this is likely how we would handle this task: use language codes from Module:Lang or Module:ISO 639 name, create another module that keeps a large data table of all of the appropriate non-English text that is rendered when the user box is rendered. This solution is really beyond the scope of this brfa, but it is offered as a suggestion for later consideration.

Trappist the monk (talk) 16:54, 1 November 2018 (UTC)

Makes sense to me. Why not boldly move the template now? Then when I run the bot, I can use the new name. --Bsherr (talk) 18:35, 1 November 2018 (UTC)
Barring that, I'd plan to proceed, as I wouldn't want to leave this fork outstanding for long. We should certainly continue this on the template talk page. May I have approval? Bsherr (talk) 19:52, 9 November 2018 (UTC)

WOSlinkerBot

Operator: WOSlinker (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 08:16, Sunday, October 28, 2018 (UTC)

Function overview: Updating talk pages to fix HTML tag issues, such as font tags, to make the pages readable and properly formatted. Also reducing "Multiple unclosed formatting tags" lint errors. Will initially start with user talk archive pages before moving on to other talk pages and Wikipedia pages.

Automatic, Supervised, or Manual: Manual

Programming language(s): Javascript

Source code available: At User:WOSlinkerBot/lint.js

Links to relevant discussions (where appropriate):

Edit period(s): Adhoc

Estimated number of pages affected: Search queries listed at User:WOSlinker/user lint. There is some overlap between some of the queries.

Namespace(s): User talk pages initially, then other user talk pages, and lastly other talk pages and Wikipedia pages.

Exclusion compliant (Yes/No): No

Function details: Updating talk pages to fix HTML tag issues, such as font tags, to make the pages readable and properly formatted. Also reducing "Multiple unclosed formatting tags" lint errors. Will initially start with user talk archive pages before moving on to other talk pages and Wikipedia pages. Each edit is previewed to check that the end of the page looks OK before saving.
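A minimal sketch of the kind of tag balancing involved, assuming the simplest case of unclosed font tags confined to one line (real lint fixes, including the actual User:WOSlinkerBot/lint.js, must also handle mis-nested and misordered tags):

```python
import re

def close_font_tags(line):
    """Append the </font> tags a line is missing, so an unclosed
    signature stops leaking formatting into the rest of the page."""
    opens = len(re.findall(r"<font\b[^>]*>", line, re.IGNORECASE))
    closes = len(re.findall(r"</font\s*>", line, re.IGNORECASE))
    return line + "</font>" * max(0, opens - closes)
```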

Discussion

  • I like the idea, and since it's manual the risk is fairly low. My question is, do we have clearly established consensus to change peoples' signatures on talk pages en-masse? The changes are mundane, but it seems prior approvals (Ahechtbot and Galobot) may have excluded user talk pages, and this tends to be the kind of thing that bothers people. — Earwig talk 02:50, 1 November 2018 (UTC)
    I'm not changing how they look, but the tags so that they are properly closed or ordered correctly. If you scroll to the bottom of a few example pages such as example1, example2, example3, example4, example5, example6 you'll see that the sigs are messing up the rest of the text which is why they need fixing. A few of the ones I want to update do not mess up the talk page as much, but closing some tags on a sig isn't really changing them much. -- WOSlinker (talk) 08:11, 1 November 2018 (UTC)
    Both Galobot and Ahechtbot ran on user talk pages. Only excluded user talks on initial trial without bot flag so as to not cause "new message" notifications. Galobtter (pingó mió) 10:58, 1 November 2018 (UTC)

PkbwcgsBot

Operator: Pkbwcgs (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 13:55, Saturday, October 27, 2018 (UTC)

Function overview: The bot will make fixes to some WP:WCW errors using WPCleaner.

Automatic, Supervised, or Manual: Automatic, but I can do supervised on request (for some errors)

Programming language(s): WPCleaner

Source code available: WPCleaner bot tools

Links to relevant discussions (where appropriate):

Edit period(s): Each error, three times a week as clarified below

Estimated number of pages affected: Around 900 pages per one-hour editing session at 15 epm; three one-hour sessions a week makes approximately 2,700 pages fixed per week. Approximately six to seven minutes will be spent on each error in an editing session at 15 epm.

Namespace(s): Mainspace/Articles

Exclusion compliant (Yes/No): Yes

Function details: I am going to run WPCleaner using the bot account and it will make WP:WCW fixes for errors: 1 (Template contains useless word Template:), 2 (Tag with incorrect syntax), 6 (DEFAULTSORT with special characters), 9 (Multiple categories on one line), 16 (Unicode control characters), 17 (Category duplication), 20 (Symbol for dead), 37 (DEFAULTSORT missing for titles with special letters), 54 (Break in list), 64 (Link equal to linktext), 85 (Tags without content), 88 (DEFAULTSORT with a blank at first position), 90 (Internal link after external link), 91 (Interwiki link written as external link or used as a reference) and 524 (Duplicate arguments in template calls). The bot will not do 45 (Interwiki duplication) because automatic fixing for interwiki duplication is causing errors. Most of the errors are already being handled by other bots. The bot is going to use the bot tools provided by WPCleaner. Each error will be run three times a week, on Monday, Thursday and Sunday, with a one-hour editing session on each of the three days, aiming to fix approximately 900 pages per session at a rate of 15 epm. There are nine errors, so the bot will stick to a maximum of 100 fixes per error (900/9 = 100) in a single editing session. With over 6,000 pages reported for error 90, more time will need to be spent on it; that will come in a future BRFA, but for the time being I will stick to a maximum of 100 fixes per editing session for this error. That makes a maximum of 100 fixes per error per one-hour editing session, or 300 fixes per error per week.

Discussion

  • You will need to register this account, also please make a userpage for your bot, you may want to redirect its talk page to yours. — xaosflux Talk 14:55, 27 October 2018 (UTC)
  • List of "errors": 1,2,6,9,16,17,20,37,54,64,85,88,90,91,524. — xaosflux Talk 14:55, 27 October 2018 (UTC)
    A large portion of these are marked as "cosmetic only" - making only cosmetic updates with a bot is generally not supported, can you talk some about your strategy here? — xaosflux Talk 14:58, 27 October 2018 (UTC)
    @Xaosflux: Some of the errors listed are not cosmetic. Yes, I agree that fixing error 64 is cosmetic as it has no visible change. However, error 90 and 91 are not cosmetic because it changes the internal/interwiki from an external link to an internal link. Error 524, error 2 and error 16 are also not cosmetic. I will strike error 64 as I know that is definitely cosmetic. Pkbwcgs (talk) 15:07, 27 October 2018 (UTC)
    Bot account has now been created. Pkbwcgs (talk) 15:12, 27 October 2018 (UTC)
    @Xaosflux: I have struck off some more errors that I felt will be cosmetic. Out of the errors I have said, 2 (Tag with incorrect syntax), 90 (Internal link after external link) and 91 (Interwiki link written as external link or used as a reference) are either high priority or middle priority. Error 524 is also important as well as error 16 because it strips out unicode control characters which will reduce the bytes of the page. I can do error 20 manually on my account as there rarely is a backlog for error 20. I don't know what you think about the other errors. Pkbwcgs (talk) 18:06, 27 October 2018 (UTC)
    Wikipedia:WikiProject_Check_Wikipedia/List_of_errors doesn't have error numbers above 113, please point to a current documentation of error numbers you are dealing with. — xaosflux Talk 18:12, 27 October 2018 (UTC)
    The errors at Wikipedia:WikiProject Check Wikipedia/Translation has error 524 and WPCleaner is also configured to fix error 524. Pkbwcgs (talk) 18:15, 27 October 2018 (UTC)
  • You requested an edit rate of 50epm, will you be configuring MAXLAG? — xaosflux Talk 18:11, 27 October 2018 (UTC)
    @Xaosflux: Yes, as that is a requirement when editing a high level of pages per minute. However, how to enable MAXLAG? Pkbwcgs (talk) 18:13, 27 October 2018 (UTC)
    mw:Manual:Maxlag parameter , however if you don't know how to do this, you will need to just throttle down to a slower level like 10epm. — xaosflux Talk 18:17, 27 October 2018 (UTC)
    I can come down to 10epm to 20epm and make the editing time longer. I will amend this in "Estimated number of pages affected:". Also, based on what I can see at WP:WCW and the amount of pages that need fixing, I plan to run this bot two to three times a week. One hour for each editing session and that way, I can make it 10epm to 20epm and have things fixed quickly. Pkbwcgs (talk) 18:22, 27 October 2018 (UTC)
    @Xaosflux: I have amended the data above. I plan on doing 15epm without going over. Pkbwcgs (talk) 18:25, 27 October 2018 (UTC)
    15epm is OK. — xaosflux Talk 18:30, 27 October 2018 (UTC)
    Okay. The bot will stick to 15epm. Pkbwcgs (talk) 18:31, 27 October 2018 (UTC)
  • Comment: Fixing duplicate arguments in template calls requires some tricky logic, and (IIRC) Sporkbot is already handling this fix for bot-fixable instances. Do you propose an improvement on what Sporkbot is doing? – Jonesey95 (talk) 20:18, 3 November 2018 (UTC)
    • @Jonesey95: It will handle ones which can be fixed automatically by the bot. Pkbwcgs (talk) 20:44, 3 November 2018 (UTC)
      • More detail is needed. "Handle" does not describe what the bot will do. Please see the discussion at Wikipedia:Bots/Requests for approval/SporkBot 5. – Jonesey95 (talk) 21:23, 3 November 2018 (UTC)
        • @Jonesey95: The ones that can be fixed by WPCleaner. For example, if there are two blank parameters in an infobox then it will eliminate one of them. E.g. In this diff (on my account), there were two duplicate arguments in the infobox, the parameter |membership = was duplicated twice in this instance so it will eliminate one. Another instance of WPCleaner fixing duplicate arguments, is this diff where there are two duplicate parameters and both of them have the same value (| name = Augusto Heleno Ribeiro Pereira) so WPCleaner will eliminate one of them and it did in that diff. It also fixed link equal to linktext but when I am running the bot, I will not allow that to happen. I hope that helps. Thanks. Pkbwcgs (talk) 09:14, 4 November 2018 (UTC)
          • Thanks. I believe that the key to bot eliminations of duplicate parameters is that the edit must not have an effect on the rendered page, except for the elimination of the hidden duplicate parameters category. As long as the bot adheres to this condition, it should be fine. – Jonesey95 (talk) 09:57, 4 November 2018 (UTC)
            • Is this bot ready for trial? Pkbwcgs (talk) 18:47, 11 November 2018 (UTC)
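The bot-safe condition discussed above (duplicate removal must not affect the rendered page) can be sketched as a pure decision function. The (name, value) representation is an assumption for illustration, not WPCleaner's actual data model.

```python
def dedupe_params(params):
    """params: ordered (name, value) pairs from one template call.
    Returns a deduplicated list, or None when removal could change the
    rendered page (the bot should then skip rather than save)."""
    last, counts = {}, {}
    for name, value in params:
        last[name] = value          # MediaWiki renders the last duplicate
        counts[name] = counts.get(name, 0) + 1
    for name, value in params:
        # an earlier non-blank value differing from the surviving one
        # means removal would be visible: not bot-safe
        if counts[name] > 1 and value.strip() and value != last[name]:
            return None
    result, emitted = [], set()
    for name, value in params:
        if counts[name] == 1:
            result.append((name, value))
        elif name not in emitted:
            result.append((name, last[name]))
            emitted.add(name)
    return result
```

This covers both cases mentioned in the discussion: a blank duplicate next to a filled one, and two copies with identical values; conflicting non-blank values are left for manual review.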

ZackBot 11

Operator: Zackmann08 (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 20:19, Sunday, October 21, 2018 (UTC)

Function overview: Clean out Category:Music infoboxes with deprecated parameters (27)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Ruby

Source code available: User:ZackBot/Albums

Links to relevant discussions (where appropriate): N/A

Edit period(s): one time run

Estimated number of pages affected: ~90,000

Namespace(s): Mainspace

Exclusion compliant (Yes/No): yes

Function details: I've been having a pretty good run at doing this as a semi-automated process: I basically copy and paste the source code into a script which converts the page, then I manually preview it and click save. But it is taking too much time, so I want to do a fully automated run.

Bottom line, what this does is parse the existing template into the necessary format and then substitute the infobox code in. The way the template has been written, it can be substituted for proper formatting. The issue with just doing a straight substitution is that the regular expressions in the template do not cover all cases. My code covers a much higher percentage (about 99% in my testing) and, more importantly, when it hits one of those 1% cases, it skips over the page and doesn't make the edit rather than introducing errors.
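The skip-rather-than-break behaviour described above can be sketched as follows. This is an illustration only (the bot is written in Ruby); the rewrite rule and parameter names are hypothetical, not the actual conversion logic.

```python
import re

# hypothetical deprecated chronology parameters, for illustration
DEPRECATED = ["Last album", "This album", "Next album"]

def convert_or_skip(wikitext):
    """Apply the known rewrite rules; if any deprecated parameter is
    still present afterwards, return None so the page is skipped and
    left for manual cleanup instead of saved with errors."""
    # illustrative rule: drop deprecated params that are empty
    for p in DEPRECATED:
        wikitext = re.sub(
            r"\n\s*\|\s*" + re.escape(p) + r"\s*=\s*(?=\n|\}\})", "", wikitext)
    for p in DEPRECATED:
        if re.search(r"\|\s*" + re.escape(p) + r"\s*=", wikitext):
            return None  # non-empty deprecated param: needs manual attention
    return wikitext
```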

Discussion

Can you be a bit more specific about the kinds of transformations you're doing? If I understand the task correctly from your edit history, the bot would completely reformat the infobox—is this something we're okay with on 90,000 pages? — Earwig talk 22:29, 21 October 2018 (UTC)

@The Earwig: thanks for the message! So the real focus here is on removing the deprecated parameters. An added bonus of the way the subst template has been set up is that it re-formats the source code to be nicely tabbed and spaced. On 99% of the pages, there will be no noticeable change on the front end. There are some pages that currently are not properly using the next album/previous album parameters. Those will see minor changes to conform with the template's documentation. --Zackmann (Talk to me/What I been doing) 22:46, 21 October 2018 (UTC)
Does your code avoid or fix Category:Music infoboxes with Module:String errors errors? – Jonesey95 (talk) 04:23, 22 October 2018 (UTC)
@Jonesey95: both. As described above it either fixes them, or if it cannot, it simply skips the page leaving it to be done manually. --Zackmann (Talk to me/What I been doing) 04:45, 22 October 2018 (UTC)
@Jonesey95: just to expand on that... The template currently has an insanely complicated substitution method in it that was masterfully written by Jc86035, but really is crazy complicated (to be clear, Jc86035, you did a great job! That isn't a dig at you.). The program that I wrote is able to use more advanced parsing techniques than are available in WikiMarkup. The code I'm using involves multiple different regular expressions, so the issues that are present in that group are almost entirely resolved. In the very limited number of cases where it cannot be resolved, the page is just skipped. Of the ~15,000 pages I've done manually, with the exception of some of the first pages when I was still debugging the process, none of them have introduced errors into the page. Let me know if you have any more questions. --Zackmann (Talk to me/What I been doing) 17:55, 22 October 2018 (UTC)
  • My bot, DeprecatedFixerBot, already does this, see Wikipedia:Bots/Requests for approval/DeprecatedFixerBot 3. I was just planning to finish that this week. --TheSandDoctor Talk 15:32, 23 October 2018 (UTC)
    @Zackmann08: Though I do skip the string errors category pages. --TheSandDoctor Talk 15:34, 23 October 2018 (UTC)
    @Jonesey95, The Earwig, and Zackmann08: Only reason it isn't down to 30k pages right now is that at some point into the run on my server yesterday it ran into an error (new C++ code, old python...something I plan to resolve tonight, aka git pull the missing file). It just takes a bit of time to parse/edit, so I tend to set it up on the server and grind away 50k pages at a time as compared to running locally on my laptop. --TheSandDoctor Talk 15:46, 23 October 2018 (UTC)
    Maybe Zackmann08 can focus on the Module:String errors, since there are about 5,000 of them. – Jonesey95 (talk) 16:01, 23 October 2018 (UTC)
@Jonesey95, The Earwig, and TheSandDoctor: happy to work with you to capture the pages you don't resolve. Up to you. Certainly don't want to compete. :-) --Zackmann (Talk to me/What I been doing) 18:10, 23 October 2018 (UTC)
@TheSandDoctor: Yaarige Saluthe Sambala is an example of a page where I was able to handle a case your bot skipped, so I think both bots could be useful. --Zackmann (Talk to me/What I been doing) 20:50, 23 October 2018 (UTC)
@Zackmann08: I will make sure that my bot is done running by the weekend; how about you focus on Module:String errors in the meantime? --TheSandDoctor Talk 21:15, 23 October 2018 (UTC)

───────────────────────── Sure thing! Could I get approval for a trial run and then I will unleash the beast this weekend once you are done? --Zackmann (Talk to me/What I been doing) 21:33, 23 October 2018 (UTC)

RonBot 12

Operator: Ronhjones (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 23:23, Monday, October 15, 2018 (UTC)

Function overview: Tags pages that have broken images, and sends a neutral message to the last editor.

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python

Source code available: User:RonBot/12/Source1

Links to relevant discussions (where appropriate): Request at Wikipedia:Bot_requests#CAT:MISSFILE_bot by KatnissEverdeen

Edit period(s): Twice daily.

Estimated number of pages affected: on average, we estimate 70 articles a day are affected, so that will be 70 articles and 70 talk pages.

Namespace(s): Articles, User Talk space

Exclusion compliant (Yes/No): Yes

Function details:

Step 1 - Bot will get the list of articles at Category:Articles with missing files. It will check for the presence of {{BrokenImage}}. If not present, then it will (a) add that template, and (b) add {{Broken image link found}} to the talk page of the last editor. NB: This message will be adjusted for the first runs, as the time from the broken image to the last edit might be a while - it will be better when up to date.
Step 2 - Bot will get the list of articles at Category:Wikipedia articles with bad file links (i.e. pages containing {{Broken image}}) with {{BrokenImage}}. It will check that the page is still in Category:Articles with missing files - if not, it will remove the template - this allows for cases where some other action (e.g. image restored) has fixed the problem, without the need to edit the article.
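The two steps above reduce to a small per-article decision table; a sketch (in practice the bot drives this from pywikibot category listings and the templates API, and the action strings here are only descriptive):

```python
def next_action(has_template, in_missing_category):
    """Decision table for one article, combining steps 1 and 2:
    tag untagged articles in Category:Articles with missing files, and
    untag articles that have since left it (e.g. the image was restored)."""
    if in_missing_category and not has_template:
        return "add {{BrokenImage}} and notify last editor"
    if has_template and not in_missing_category:
        return "remove {{BrokenImage}}"
    return "no edit"
```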

Discussion

  • I'm not sure leaving a TP message with the last editor is a good idea. I can think of several scenarios where the last editor might not have had anything to do with the image link breaking. I'd really like to hear other opinions on this. SQLQuery me! 04:31, 16 October 2018 (UTC)
I'm actually Coding... to message the user who broke the link Galobtter (pingó mió) 06:23, 16 October 2018 (UTC)
{{BOTREQ}} ping Ronhjones Galobtter (pingó mió) 10:08, 16 October 2018 (UTC)
  • If a file is deleted (here or on Commons), then it may take several months until a page using the file shows up in Category:Articles with missing files. Whenever someone edits the page the next time, it immediately shows up in the category. Deleted files are typically removed from articles by bots, but they sometimes fail.
As a first step, I propose that you generate a database report of broken images (use Wikipedia's imagelinks table to find file use and then Wikipedia's+Commons's image tables to see if the file exists) and then purge the cache of those pages so that the category is updated. Also consider purging the cache of all pages in Category:Articles with missing files as files might not otherwise disappear from the category if a file is undeleted.
If the file is missing because it was deleted, then the latest editor to the article presumably doesn't have anything to do with this. I think that {{Broken image link found}} risks confusing editors in this situation. Consider reformulating the template.
Category:Wikipedia articles with bad file links seems to duplicate Category:Articles with missing files so I suggest that we delete Category:Wikipedia articles with bad file links and change {{Broken image}} so that the template doesn't add any category.
This is bad code:
if "{{Broken image}}" not in pagetext:
Someone might add the template manually as {{broken&#x20;image}} or some other variant and then you would add the template a second time. Consider asking the API if the template appears on the page instead of searching for specific wikicode. If the bot is unable to remove the template because of unusual syntax, then it may be a good idea if the bot notifies you in one way or another. --Stefan2 (talk) 10:23, 16 October 2018 (UTC)
Using mw:API:Templates for the existence of {{Broken image}} would seem the better way of doing it.
use Wikipedia's imagelinks table to find file use and then Wikipedia's+Commons's image tables to see if the file exists With 10s millions of files in each table I wonder how feasible doing that would be. Galobtter (pingó mió) 11:02, 16 October 2018 (UTC)
Also, the issue with deleted files would seem resolved once Wikipedia:Bots/Requests_for_approval/Filedelinkerbot_3 goes through Galobtter (pingó mió) 12:46, 16 October 2018 (UTC)
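The API-based template check suggested above amounts to parsing an action=query&prop=templates response. A sketch, using the default JSON shape of the MediaWiki API (the sample page ID is hypothetical):

```python
import json

def has_template(api_response, template="Template:Broken image"):
    """Report whether a prop=templates query response shows the page
    transcluding `template`. Unlike substring matching on wikitext,
    this catches case/whitespace variants like {{broken image}}."""
    pages = json.loads(api_response)["query"]["pages"]
    return any(
        t["title"] == template
        for page in pages.values()
        for t in page.get("templates", [])
    )
```

Passing tltemplates=Template:Broken image in the query keeps the response small, since only that template's presence is reported.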

{{BotWithdrawn}} In view of the better system by Wikipedia:Bots/Requests_for_approval#Galobot_2 - I'll delete the unneeded cats and templates. Ronhjones  (Talk) 22:39, 16 October 2018 (UTC)

Restart

I have undone the withdrawal at the request of Galobtter, but I have cut down the actions to simple tagging (and de-tagging) of images based on Category:Articles with missing files, as requested. User pages will not be edited. The {{BrokenImage}} template no longer generates a categorisation - instead I have used mw:API:Templates to find the list of transclusions (and removed the space in the template name to make life easier). I have also replaced the if "X" not in Y check with better matching code. Ronhjones  (Talk) 22:53, 17 October 2018 (UTC)

If {{BrokenImage}} no longer categorizes an article, then why add it? Also, pages which already have many maintenance templates will suffer from increased instruction/template creep. -FASTILY 07:39, 22 October 2018 (UTC)
@Fastily: Galobtter suggested it will still be useful - Special:Diff/864517326. It will highlight the fact that there is a broken image link. Passing editors may not realise there is a broken link otherwise - not all editors will be showing hidden categories (where Category:Articles with missing files is located), and how many editors will check the categories anyway? Ronhjones  (Talk) 21:40, 22 October 2018 (UTC)
With all due respect, {{BrokenImage}} states an obvious fact and does not even add a dated maintenance category. IMO, this does not improve the editor/reader experience. Given that maintenance tags are frequently ignored and/or annoying to editors (evidenced in discussions such as this), mass-tagging articles with yet another maintenance tag isn't a good task for a bot. -FASTILY 01:41, 15 November 2018 (UTC)

TheSandBot

Operator: TheSandDoctor (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 00:47, Wednesday, October 17, 2018 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): Python

Source code available: https://github.com/TheSandDoctor/election_converter

Function overview: Looks through the linked csv file, converting from the old title format to the new one.

Links to relevant discussions (where appropriate): [1], RfC on election/referendum article naming format

Edit period(s): Run until done

Estimated number of pages affected: Approximately 35,227

Exclusion compliant (Yes/No): No

Already has a bot flag (Yes/No): No

Function details: The bot goes through the compiled csv file (in the GitHub repo; a more easily read form is the Excel document, which is also included). The bot then pulls up the individual page objects and double-checks that they exist and are not themselves redirects (i.e. that they haven't already been moved). If both conditions are satisfied, the bot moves the page (leaving behind a redirect) to the corresponding title in column B (.xlsx doc). This corresponds with the latest RfC on election/referendum article naming format and was created at the request of Number 57.

The code itself is relatively straightforward, with most of the heavy lifting handled by the mwclient Python library's move function, which is part of the page object.

Due to the large number of page moves required, I would also request that the bot flag be assigned should this request be approved. The bot is not exclusion compliant, as exclusion is not applicable in this context.
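For readers unfamiliar with mwclient, the move loop described above might look roughly like the following sketch (assumptions: mwclient is installed and logged in, the csv has old titles in column A and new titles in column B, and the edit summary is illustrative only):

```python
import csv
import time

def load_moves(csv_path):
    """Yield (old_title, new_title) pairs from the compiled csv.
    Titles contain commas, so the csv must quote its fields."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f):
            if len(row) >= 2 and row[0] and row[1]:
                yield row[0], row[1]

def run_moves(csv_path, delay=4):
    """Move each page, skipping redirects and missing pages, with a
    throttle between moves."""
    import mwclient  # third-party; deferred so load_moves stays importable
    site = mwclient.Site("en.wikipedia.org")
    for old, new in load_moves(csv_path):
        page = site.pages[old]
        if not page.exists or page.redirect:
            continue  # never existed, or already moved
        # leave a redirect behind (no_redirect=False is the default)
        page.move(new, reason="Per election/referendum naming RfC",
                  no_redirect=False)
        time.sleep(delay)  # throttle between moves
```

The redirect/existence checks mirror the two conditions in the function details; everything else is handled by mwclient's Page.move.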

Discussion

  • Would it be possible for you to generate a randomized list of 100-500 articles to be moved, and the titles that they would be moved to in your userspace please? SQLQuery me! 01:11, 17 October 2018 (UTC)
    @SQL: Certainly! It can be viewed here: User:TheSandDoctor/sandbox2. The first 150 (of 151) were generated at complete random (though I did ensure no duplicates were chosen) by a Python script. Since there are over 35 thousand articles and only 150 were selected at random (as near as computers can get; ~0.43% of the articles, assuming my math is correct), I included a variant not shown by the randomly generated list (#151). --TheSandDoctor Talk 01:49, 17 October 2018 (UTC)
    Here is the random generation code, if you are interested. In case it wasn't clear before, this is the script itself. I should really clean up the repo... --TheSandDoctor Talk 01:54, 17 October 2018 (UTC)
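The duplicate-free sampling described above can be done in a few lines; this is an editorial sketch (not the linked script) operating on already-parsed (old, new) rows, using random.sample, which never repeats an element:

```python
import random

def sample_rows(rows, k=150, seed=None):
    """Pick k distinct rows from the move list for manual spot-checking.
    random.sample draws without replacement, so no duplicates appear."""
    rng = random.Random(seed)  # optional seed for a reproducible sample
    return rng.sample(rows, min(k, len(rows)))
```

With ~35,000 rows, a sample of 150 is indeed roughly 0.43% of the list, as noted above.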
    This is a very large request. I'd love some more opinions from other recently active BAG members. Xaosflux, MusikAnimal, Anomie, The Earwig, Headbomb, Cyberpower678. SQLQuery me! 02:12, 17 October 2018 (UTC)
    If it is determined that throttling would be required, the addition of a wait timer would be relatively trivial. All I would need is the time you want it to wait between moves and that could be added rather painlessly. --TheSandDoctor Talk 02:04, 17 October 2018 (UTC)
    @SQL: my concern is determining if there was sufficient input to that RfC, it closed with ~69% support over only 16 editors despite a claim that it would be advertised to "as many relevant WPs as I can find". At the very least notice of this at WP:VPR would be a good "last chance" notification to editors at large. — xaosflux Talk 03:17, 17 October 2018 (UTC)
    I would support that. The absolute number of pages involved isn't very large for a bot, but it's the kind of change that can be very contentious, especially if people feel they weren't notified. I generally support the proposal, for what it's worth, but consensus isn't as clear as I would've liked. — Earwig talk 03:22, 17 October 2018 (UTC)
    It seems like a WP:VPR discussion and people reviewing this list would both be helpful here. Anomie 12:34, 17 October 2018 (UTC)
    Number 57 would you like to do the honours, or would you like me to? I agree that bringing attention to this BRFA at WP:VPR might be a good idea. --TheSandDoctor Talk 20:34, 17 October 2018 (UTC)
    I'll give it a go. Please amend what I've done if it's not right (never started a discussion there before). Cheers, Number 57 21:06, 17 October 2018 (UTC)
    I'll keep an eye out for it, Number 57 --TheSandDoctor Talk 21:14, 17 October 2018 (UTC)
  • No objections here at running it at full speed. Getting it done as fast as possible would present minimal disruption to people's watchlists. We just need to make sure the renames work correctly. We wouldn't want a bunch of moves going to "Test move" or "Oopsie".—CYBERPOWER (Trick or Treat) 02:37, 17 October 2018 (UTC)
    "Test move" and "oopsie" are not in the list, I can assure you of that. --TheSandDoctor Talk 02:41, 17 October 2018 (UTC)
    Conventional wisdom is that you want to go slowly enough to allow for manual review and to minimize disruption if something goes wrong, especially for a task that is not time-sensitive like this one. Unless the bot flag doesn't work for moves (?), watchlist disruption shouldn't be an issue. I would recommend at least a few seconds between edits. — Earwig talk 03:16, 17 October 2018 (UTC)
    @Earwig: Easily done. The script is currently configured with a 4 second delay, but that could be changed in less time than it took to write this sentence. --TheSandDoctor Talk 04:02, 17 October 2018 (UTC)
  • I'm not convinced all of the proposed wordings are correct. For example, Oregon Ballot Measure 58 (2008) should probably not be moved. I see a lot of similar issues in the .csv with other propositions/ballot measures; e.g. California Proposition 10 (1998) going to 1998 California Proposition 10 seems wrong. There are some other strange wordings: should Polish presidential election, 1922 (special) go to 1922 Polish presidential election (special) as suggested or 1922 Polish presidential special election or similar? Perhaps we can come up with a tighter definition of the grammar for acceptable renames, like leaving titles with parentheses for manual review. — Earwig talk 02:54, 17 October 2018 (UTC)
    @The Earwig: If a title contains parenthesis anywhere, it could certainly be compiled into its own list, recorded, and skipped over in the move. Would only add a couple of lines. The thing is, for that sort of thing, I need well defined and clear rules in order to write the regex to test for. --TheSandDoctor Talk 02:57, 17 October 2018 (UTC)
    Right, and I'm not sure what that would look like yet. I noticed another strange phrasing, which would currently move Ohio's 13th congressional district election, 2006, to 2006 Ohio's 13th congressional district election. So at the very least, we can probably have extra eyes on titles with parentheses or apostrophes, and titles without "election" or "referendum"? — Earwig talk 03:16, 17 October 2018 (UTC)
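One possible "tighter grammar" along the lines The Earwig suggests (a sketch only; the exact rules were never settled in the discussion) would accept only titles of the plain "<Subject> election(s)/referendum(s), [Month] <year>" shape and flag everything else, including titles with parentheses or apostrophes, for manual review:

```python
import re

# Safe pattern: no parentheses or apostrophes, ends in
# "election(s)/referendum(s), [optional month] <4-digit year>".
SAFE = re.compile(r"^[^()']+ (?:elections?|referendums?), (?:\w+ )?\d{4}$")

def needs_review(title):
    """True if the title falls outside the safe rename grammar."""
    return not SAFE.match(title)
```

Under this rule, "Polish presidential election, 1922 (special)" and "Ohio's 13th congressional district election, 2006" both get set aside, while plain titles like "Irish general election, 2016" pass through.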
  • The change file includes moving non-year values to the start of the title; the MOS doesn't appear to address this, nor did the RfC. e.g. French constitutional referendum, October 1946 (Guinea) --> October 1946 French constitutional referendum (Guinea). Do you mean to sort these with "O"? — xaosflux Talk 03:21, 17 October 2018 (UTC)
    @Xaosflux: The list was compiled by Number 57, so they would probably be the best to ask. That said, it does appear to be the case and does make logical sense, given the RfC and its approaches with the postfix years. The whole purpose of the RfC appears to be moving like this. Moving otherwise would not make sense. This is not to comment on the above point or "(Guinea)" not being moved, just "October 1946". --TheSandDoctor Talk 04:01, 17 October 2018 (UTC)
    @TheSandDoctor: so in this example, the SORTKEY is currently under "French con..", now it will be under "October" (any very specifically NOT under "1946") - unless additional sortkey adjustments are made. What is the category sorting goal? — xaosflux Talk 04:04, 17 October 2018 (UTC)
    @Xaosflux: Number 57 is going to have to answer that one. I just saw the WP:BOTREQ and made the bot. I will happily answer or give my assessment on the technical side of things (related to the script), but the excel document and the RfC was Number's brainchild. --TheSandDoctor Talk 04:09, 17 October 2018 (UTC); expanded for clarity 04:22, 17 October 2018 (UTC)
en dash issue is OK, covered with another bot. — xaosflux Talk 19:26, 17 October 2018 (UTC)
The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

  • For new pages with "–" in the titles, now at the start for ranges, are you going to make redirects from the common redirect "-"? — xaosflux Talk 03:25, 17 October 2018 (UTC)
    Unless I'm missing something, or don't understand - I don't see any titles matching " - " in the docx. SQLQuery me! 03:52, 17 October 2018 (UTC)
    @SQL: For example new page will be July–August 1990 Bulgarian presidential election, may need a redirect from July-August 1990 Bulgarian presidential election created. — xaosflux Talk 04:02, 17 October 2018 (UTC)
    @Xaosflux: The word "may" could be problematic here, as scripts can't exercise human judgement. That would probably mean posting output in userspace for human editors to look over and make the call, if we go down that route (definitely possible; we would just need to decide between the bot posting it, or it generating a file that I periodically post). Aside from that (ignoring it momentarily, if you will), a regex could be crafted to scan a title for '–'s and then a method launched to create a redirect without them (assuming '–'s are present). --TheSandDoctor Talk 04:14, 17 October 2018 (UTC)
    I'm only calling it out for commentary here, I'm not that current on MOS about dashes and don't want to fall asleep on my keyboard reading the MOS right now! — xaosflux Talk 04:20, 17 October 2018 (UTC)
    @Xaosflux: Not a problem and I hear you. I am happy to work with the community on this and share the technical knowledge I have. I am hoping that we can iron out the details regarding this. I still believe that some sort of a bot is needed for this, should the RfC stand, since 35k+ articles is a tad too much to do by hand very easily. --TheSandDoctor Talk 04:27, 17 October 2018 (UTC)
    Isn't this already covered by Anomie Bot? ~ Amory (utc) 15:22, 17 October 2018 (UTC)
    I think you're right; as an example, when 2018–19 Southern Football League was created, Anomie Bot created 2018-19 Southern Football League a few hours afterwards. Number 57 16:11, 17 October 2018 (UTC)
    I looked through the bot's user page and did not see anything covering this BRFA. That said, after reading Number's response (which occurred while I was looking), I realize that that appears to have not been what you meant. In that case, then there probably wouldn't be any issues whatsoever on this particular point/thread. --TheSandDoctor Talk 16:22, 17 October 2018 (UTC)
    Yes, sorry — this was threaded to be in reply to Xaosflux. It's AnomieBOT 74, and works like a charm (too well, really; I see plenty of these at G8 patrolling). ~ Amory (utc) 19:24, 17 October 2018 (UTC)

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
  • To respond to some of the various points above:
  1. The RfC was advertised to the following WikiProjects: Elections and Referendums, Politics, Politics of the United Kingdom, U.S. Congress, Pakistani politics, New Zealand/Politics, Indian politics, Chinese politics and Australian politics; these are all the politics-related WikiProjects that I could find. On a vote-counting basis it is 69% in favour, but a couple of the oppose !votes are on dubious grounds (one being because an editor didn't believe that redirects appear in search, and another one who claimed they had never seen a year at the start of an article title), so I think the consensus (in terms of what a closing admin would determine) from the discussion is pretty undeniable.
  2. The proposed moves to 1998 California Proposition 10 etc are in line with the naming guideline (see the last bullet at WP:NC-GAL#Elections and referendums). The Oregon ones are currently incorrectly titled, so the move is to bring them in line with the guideline. We could have a conversation at a later date about whether the year is required at all for these types of articles (I'm not convinced it works), but currently they are in the guideline as such. If it's really a problem, perhaps we could drop them from the run?
  3. Polish presidential election, 1922 (special) is at the wrong title (one a few days before is at Polish presidential election, 9 December 1922, so the other one should be at Polish presidential election, 20 December 1922). This should therefore probably be moved to 20 December 1922 Polish presidential election;
  4. 2006 Ohio's 13th congressional district election is again a correct move in terms of the guideline (sixth bullet of WP:NC-GAL#Elections and referendums). The issue here is more around the awkward naming of the districts (e.g. Ohio's 13th congressional district) and is perhaps something that should be raised separately;
  5. The RfC did include discussion about titles that would start with a month (see Impru20's comments). The sortkey for articles like this would still be the year, followed by a numeral representing the month (e.g. "1946 1", "1946 2" for elections held in two separate months in 1946)
Cheers, Number 57 09:02, 17 October 2018 (UTC)
Looks like some bot coding is going to be needed to add/alter sort keys following the moves for new titles not starting with years. — xaosflux Talk 10:56, 17 October 2018 (UTC)
I'm not sure this is really needed; the current format means that articles don't automatically sort by year, so in many cases a sortkey has already been added. For instance, French constitutional referendum, October 1946 (Guinea) mentioned above is sorted in Category:Referendums in Guinea by the key 1946. This might be something better to do manually for the 380 articles with months in the year if not already in place. Number 57 13:00, 17 October 2018 (UTC)
For the other 4 categories it is in it is sorted only by page title such as Category:October 1946 events. I'm not sure what the 'best' answer for this is, but if doing it manually is the way that could be done prior to the page title moves to prevent issues in category view. — xaosflux Talk 13:08, 17 October 2018 (UTC)
I think the question there is what is the best category sortkey for the article in Category:October 1946 events. Number 57 13:10, 17 October 2018 (UTC)
Pages can be sorted differently in each category with a directive, but they only get one "default sort" for all undirected categories, so if sorting these under the month name is generally undesirable, a default sort should be defined for the best general ordering. — xaosflux Talk 20:43, 17 October 2018 (UTC)
There doesn't seem to be any firm conventions around this – some articles starting with a year are sorted by the year in the events categories, and others by the first word after the year. However, in election categories sorting by year would definitely be desirable, so I guess the year followed by the month would be the best sorting (e.g. 1946 01, 1946 02 to 1946 12). Happy to add a DEFAULTSORT manually to these articles if it will resolve this concern for you? Cheers, Number 57 21:06, 17 October 2018 (UTC)
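The "year followed by the month" sortkey scheme Number 57 proposes (e.g. "1946 01" through "1946 12") is mechanical enough to sketch; this is an illustration, not code from any of the bots discussed, and for month ranges like "July–August" it assumes sorting by the first month:

```python
import re

MONTHS = {m: i for i, m in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}

def default_sort_key(title):
    """Return a DEFAULTSORT value ("<year> <2-digit month>") for a
    month-prefixed election title, or None if the title doesn't start
    with a month (or month range) and a year."""
    m = re.match(r"^(?:(\w+)[–-])?(\w+) (\d{4}) ", title)
    if not m:
        return None
    first, month, year = m.groups()
    month = first or month  # "July–August 1990 ..." sorts by July
    if month not in MONTHS:
        return None
    return "%s %02d" % (year, MONTHS[month])
```

So "October 1946 French constitutional referendum (Guinea)" would get the key "1946 10", keeping election categories in chronological order.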
WP:NCGAL is a guideline, which like other guidelines says prominently at the top It is a generally accepted standard that editors should attempt to follow, though it is best treated with common sense, and occasional exceptions may apply. The use of a bot on such a huge scale precludes the consideration of exceptions. --BrownHairedGirl (talk) • (contribs) 03:03, 19 October 2018 (UTC)
    • PS The discussion above about some exceptions such as the naming of some Polish elections should not be conducted on this page. I am sure that @Number 57 is making recommendations on a well-reasoned basis, but decisions on the titles of individual articles should not be buried in a technical page such as this; they belong at WP:RM. --BrownHairedGirl (talk) • (contribs) 03:08, 19 October 2018 (UTC)
      • This isn't RFA. Bolded "votes" aren't needed, or helpful. I do believe that this idea needs more discussion, and I'm surprised (in a good way!) at how much attention, and the overall quality of comments that this request is getting. I'm not sure about listing ~35,000 pages at WP:RM. I could see some parties seeing that as disruptive. SQLQuery me! 03:28, 19 October 2018 (UTC)
        • @SQL: no matter how many good comments there are here, this remains a technical page whose remit is to decide whether and how to use bots to implement a consensus. It is not a suitable place to form a consensus on whether to bypass WP:RM on this scale; that is right outside the remit of WP:BAG. --BrownHairedGirl (talk) • (contribs) 03:42, 19 October 2018 (UTC)
          • Right. I have an idea of how this works. I've been around a BRFA or two. You can see comments above questioning the RFC by other BAG members. I'm not sure where you get the idea that this is just going to pass without addressing that. SQLQuery me! 04:15, 19 October 2018 (UTC)
            • My point is simply that BRFA is not the place to address those issues. BRFA's role starts when they have been resolved. --BrownHairedGirl (talk) • (contribs) 04:50, 19 October 2018 (UTC)
  • I have raised this at WP:Village pump (policy)#Mass_renaming_of_election_articles,_bypassing_WP:Requested_moves. --BrownHairedGirl (talk) • (contribs) 03:36, 19 October 2018 (UTC)
  • Using some regex magic, I have split the original csv (which is still in the repository) into two camps. format1.xlsx (and .csv) contain the "odd ball" formats, which could conceivably be the more contentious of the two groups, given the above discussion. format2.xlsx (and .csv) contain the "election(s), year" titles (where "year" is the end of the title), which appear to be less contentious per the above. While that doesn't necessarily lessen the problems above, they are now in two distinct datasets that are more easily analyzed. It appears that 21,972 are in the latter, with 13,253 in format1. It should, however, be noted that the format1 dataset could be trimmed down by several thousand more if the words "referendum", "measures", and each of the state names were considered in the format (instead of just "election(s)"). --TheSandDoctor Talk 05:06, 19 October 2018 (UTC)
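The actual regex used for the split is not quoted on this page, but the description above (format2 = titles ending in "...election(s), <year>", format1 = everything else) suggests something like this sketch:

```python
import re

# format2: the uncontroversial "<Subject> election(s), <year>" shape,
# where the 4-digit year ends the title.
FORMAT2 = re.compile(r"elections?, \d{4}$")

def split_titles(titles):
    """Partition titles into (format1, format2) per the description above."""
    format1, format2 = [], []
    for t in titles:
        (format2 if FORMAT2.search(t) else format1).append(t)
    return format1, format2
```

Referendums, month-prefixed titles, and parenthesized titles all land in format1 under this rule, matching the "odd ball" characterization.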
    If anyone wants it, I will split based on those words in 2 other files as well. A bot could conceivably run through format2 and leave the first for human intervention as it appears few (if any) concerns were raised about that case directly. That said, I made this bot based on the original bot request above, which was simple to implement, and submitted it accordingly. If another RfC is desired to further solidify the consensus and address concerns raised, I am all for it and do not wish to rush anything. Pinging all active participants: Number 57, Xaosflux, MusikAnimal, Anomie, The Earwig, Headbomb, Cyberpower678, SmokeyJoe, BrownHairedGirl, SQL. (hope that's everyone, if I missed anyone I apologize). --TheSandDoctor Talk 05:40, 19 October 2018 (UTC)
    Thanks, @TheSandDoctor. I am sure that you acted in good faith in response to the bot request. However, I do think that this request should be placed on hold pending a fresh RFC, and that the bot should not be run unless and until it is clear that there is a very broad consensus on a) the guidelines, and b) the use of a bot to bypass RM for >35k articles. I really don't see any basis for asserting that there is a consensus to rename e.g. the 860 relevant articles under WikiProject Ireland's scope with zero notification to WP:IRE or on any one of the 860 article pages. --BrownHairedGirl (talk) • (contribs) 05:55, 19 October 2018 (UTC)
    User:TheSandDoctor, I suggest putting a random selection, maybe ten, through the standard RM process. This will draw in critical comments from RM regulars. --SmokeyJoe (talk) 06:48, 19 October 2018 (UTC)
    @SmokeyJoe: I think that's a good idea, but wildly insufficient. This needs to go way beyond the RM regulars, who are few in number. And Joe, you are rightly critical of how CFD tends to be dominated by regulars. Same goes here.
    This needs to draw in editors who have sufficient experience of each sub-topic (e.g. Spanish local elections, or Kenyan parliamentary elections) to assess how the broad principle works in their field and hopefully to look for any exception. --BrownHairedGirl (talk) • (contribs) 07:40, 19 October 2018 (UTC)
    Put ten through RM this week, then go back to RFC next week. The previous RFC was pretty sad in drawing attention, despite the attempt at publicising. —SmokeyJoe (talk) 08:14, 19 October 2018 (UTC)
  • The RfC supposedly supporting these mass moves looks very very dubious on a quick glance. The closing statement is terribly inadequate. This looks a tad overenthusiastic. 35,226 page moves with nothing mentioned at WP:RM? --SmokeyJoe (talk) 05:14, 19 October 2018 (UTC)
  • I will echo what Joe already said. That discussion, coupled with the bland closure, is clearly insufficient for a change of this magnitude. To avert an unnecessary crisis, I suggest a new, well-advertised RFC at the Village pump with a detailed rationale. –Ammarpad (talk) 08:31, 19 October 2018 (UTC)
  • I knew nothing about this proposed task until today when it was mentioned at WT:RM. Given the number of pages involved it would be best to advertise this in all the relevant places where editors of such articles would be watching and get further input. Thanks. — Frayæ (Talk/Spjall) 08:39, 19 October 2018 (UTC)
  • Number 57 and others, for the record I have now changed my vote from oppose to support. However, I don't agree with my oppose vote being characterised in that way. I would generally advise against editors attempting to explain the reasons for other people's votes. Onetwothreeip (talk) 11:23, 19 October 2018 (UTC)
  • Strong oppose. This proposal does not have an adequate consensus given the large number of pages concerned, long-standing titles, and the high profile nature of the pages. The proposal should be tested in a few RMs first to see if it really has consensus.  — Amakuru (talk) 22:52, 19 October 2018 (UTC)
  • Number 57, I'm sorry, but I feel a bit misled that you cite WP:NCGAL as justification for moving California Proposition 10 (1998) to 1998 California Proposition 10, when it was you who changed the guideline in response to the RFC four days ago. I don't see any discussion in the RFC that supports this unnatural wording. Perhaps I am being pedantic, but I think this is an important distinction because "California Proposition 10" is a proper noun that external sources use directly, while "1998 California Proposition 10" really isn't. If there was only one Proposition 10 in California, I see a strong argument for excluding the year (c.f. California Proposition 46, though there aren't many examples), further supporting that the year acts as disambiguation and not as part of the proper name. I'm open to discussing this point further, but I don't feel it's clear-cut enough for the bot. — Earwig talk 02:36, 20 October 2018 (UTC)
    • Hi Earwig. I'm not really sure what the issue is here. The guideline previously stated that propositions should be of the title format "California Proposition 10, 1998" (the article itself is not named correctly according to the guideline by using parentheses). The RfC proposal was to move the year from the end to the start of the title, so California Proposition 10, 1998 would therefore become 1998 California Proposition 10. Number 57 10:40, 20 October 2018 (UTC)
      • Number 57, understood, but when I look through the category series starting at Category:California ballot propositions, 1994, I see almost no titles using the previously guideline-supported format. This leads me to believe the guideline was never really correct or widely applied in this area (it's a guideline, after all), so I have trouble using the guideline now as the sole justification for moving these pages automatically. — Earwig talk 17:06, 20 October 2018 (UTC)
        • Actually, let me move this discussion to the RfC; it's a better place. — Earwig talk 17:14, 20 October 2018 (UTC)
  • @TheSandDoctor, SQL, Xaosflux, Cyberpower678, Amorymeltzer, The Earwig, Ammarpad, and Frayae: The original RfC has been reopened for further input: Wikipedia talk:Naming conventions (government and legislation)#Proposed change to election/referendum naming format. Number 57 15:08, 20 October 2018 (UTC)
    @Number 57: Thanks for the note. The BRFA process does include many components, and this page is best for going over the technical issues of the execution, but not the best for determining community consensus as to the overall scope of edits. Article titles have proven to be touchy subjects in the past. I brought up some points above about possible unintended consequences for category SORTKEYS; feel free to include that in the RfC if you want a definitive answer. — xaosflux Talk 15:12, 20 October 2018 (UTC)

Galobot 2

Operator: Galobtter (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 10:06, Tuesday, October 16, 2018 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python/Pywikibot

Source code available: here

Function overview: Message users who add broken file links to articles

Links to relevant discussions (where appropriate): Wikipedia:Bot_requests#CAT:MISSFILE_bot; Wikipedia:Bots/Requests_for_approval/RonBot_12

Edit period(s): Daily

Estimated number of pages affected: ~10-20 a day

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: Sends a talk page message to auto-confirmed users who add broken (red linked) file links to mainspace pages, by scanning CAT:MISSFILE. Mechanism is similar to Wikipedia:Bots/Requests for approval/DPL bot 2. Runs daily, seeing what new red linked files have been added, and messages the user who added them if they are auto-confirmed; doesn't message non-autoconfirmed users as they are likely vandals/wouldn't know how to fix the link. Most people who break file links are IPs/non-autoconfirmed so of the 70 or so broken links added each day I estimate only ~10 people will be messaged per day.

The bot figures out which image is broken and who added it: it uses mw:API:Images and mw:API:Parse to get file links, then finds the revision in which the broken file link was added.
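As an editorial sketch of the first half of that lookup (stdlib only; not Galobot's actual code), mw:API:Images can be used as a generator with prop=imageinfo so each file's existence is checked in one request. The filtering assumption, hedged here, is that a file hosted on Commons still returns imageinfo from the shared repo even though the local page is "missing", so a truly red-linked file is one that is missing and has no imageinfo:

```python
import json
import urllib.parse
import urllib.request

API = "https://en.wikipedia.org/w/api.php"

def redlinked(pages):
    """Keep only file entries that exist neither locally nor on Commons:
    flagged missing and returning no imageinfo from any repo."""
    return [p["title"] for p in pages
            if p.get("missing") and not p.get("imageinfo")]

def missing_files(page_title):
    """Return the red-linked files used on `page_title`."""
    params = urllib.parse.urlencode({
        "action": "query", "generator": "images", "titles": page_title,
        "gimlimit": "max", "prop": "imageinfo",
        "format": "json", "formatversion": "2",
    })
    with urllib.request.urlopen(API + "?" + params) as r:
        data = json.load(r)
    return redlinked(data.get("query", {}).get("pages", []))
```

The second half (walking revisions to find who introduced the red link) would then diff each revision's file links until the broken title first appears.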

Message sent will be something like:

Hello. Thank you for your recent edits. An automated process has found that you have added a link to a non-existent file File:Hydril Compact BOP Patent.jpg to the page Blowout preventer in this diff. If you can, please remove or fix the file link.

You may remove this message. To stop receiving these messages, see the opt-out instructions. Galobtter (pingó mió) 10:06, 16 October 2018 (UTC)

Discussion

  • Consider this scenario: User A uploads a file and adds it to an article. A vandal (User B) blanks the page and User C reverts. Later, User D deletes the file. Who would be notified?
Note that it may take forever before pages with recently deleted files show up in Category:Articles with missing files so consider obtaining a list of articles from a database report and purging those so that the category is updated before you start notifying users. --Stefan2 (talk) 10:38, 16 October 2018 (UTC)
Thanks for the comment! In this case, nobody, because it skips cases where the file has been added, then removed, then added again, i.e. where the file has been added more than once. However, if User A adds a file and the file is later deleted, it'll notify User A, but only if that revision occurred within 24 hours before being listed in CAT:MISSFILE, as it only checks the revisions that have occurred since the last run 24 hours ago. I was wondering whether it should skip cases where the file has been deleted after a user adds it (it can check the deletion logs). Galobtter (pingó mió) 11:27, 16 October 2018 (UTC)
Actually, checking the deletion logs seems pretty necessary since the bot probably shouldn't spam people if FileDelinkerBot/CommonsDelinkerBot goes down. Will add Galobtter (pingó mió) 11:51, 16 October 2018 (UTC)
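The deletion-log check mentioned above maps onto the Action API's list=logevents query; this is a sketch of the idea (not the bot's code), with the HTTP fetch injectable so the logic can be tested offline. With ledir=older (newest first), leend bounds the listing at the edit's timestamp, so any entry returned was logged at or after the edit:

```python
import json
import urllib.parse
import urllib.request

API = "https://en.wikipedia.org/w/api.php"

def _fetch(params):
    """Default fetcher: GET the Action API and decode JSON."""
    url = API + "?" + urllib.parse.urlencode(dict(params, format="json"))
    with urllib.request.urlopen(url) as r:
        return json.load(r)

def was_deleted_since(file_title, iso_timestamp, fetch=_fetch):
    """True if `file_title` has a deletion log entry at or after
    `iso_timestamp` (the time of the edit that added the link) -
    in which case the red link is not the editor's mistake."""
    data = fetch({
        "action": "query", "list": "logevents", "letype": "delete",
        "letitle": file_title, "ledir": "older", "leend": iso_timestamp,
        "lelimit": "1",
    })
    return bool(data["query"]["logevents"])
```

A user would then only be messaged when was_deleted_since returns False, i.e. when the red link really came from their edit.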

Not a good task for a bot. This is effectively equivalent to messaging someone every single time they make a typo and will likely be perceived as spam and/or be irritating to established editors. At 10-20 edits/day, this is pretty low impact, and comes off as a solution in search of a problem. -FASTILY 07:24, 22 October 2018 (UTC)

As it runs daily, it'll only message people who leave the broken file link for at least a few hours. I wouldn't want to be messaged every time I made a typo, but I certainly would if I broke a link to a file and so caused an easily fixed problem in an article. And there is a definite problem it is trying to help solve: CAT:MISSFILE is steadily rising, and people spend quite a bit of time every day getting it down (because someone has to eventually fix each file link). That it'd only message 10-20 people a day shows that the number of people who break file links is quite low, so people are unlikely to be messaged so repeatedly that it becomes an irritant. Galobtter (pingó mió) 07:45, 22 October 2018 (UTC)
I'll split my response for clarity:
"CAT:MISSFILE steadily rising and people spending quite a bit of time every day getting it down (because someone has to eventually fix the file link)."
Unless CAT:MISSFILE is primarily populated by editors making typos, this is not a legitimate reason to run this task.
"it'd only message 10-20 people a day shows that the number of people who break file links is quite low"
Sounds like we don't need this task then.
"and so people are unlikely to messaged repeatedly that it becomes an irritant"
It's irritating to people that do get messaged, especially if you're bothering them over minor things. In fact, this is one of the reasons I am opposed to this task. -FASTILY 03:55, 23 October 2018 (UTC)
I think the number here is somewhat underestimated - Wikipedia:Bot_requests#CAT:MISSFILE_bot says a 10-day trial generated 681 pages with broken file links. It is hardly a minor thing if someone has broken a file link in an article; I think they would want to know. Some of these errors are definitely known to be due to a poor search and replace with AWB; if the editor is not aware, there is a strong possibility that the editor will use the same setup and create even more broken links. Ronhjones  (Talk) 19:50, 25 October 2018 (UTC)
The reason for that number is that it is mostly IPs or non-autoconfirmed users breaking links and many errors are from failures of the delinker bots upon deletion of files. Galobtter (pingó mió) 20:00, 25 October 2018 (UTC)
As a regular patroller of CAT:MISSFILE, I can say definitively that many red-linked files are due to a poor search and replace with AWB or other script-assisted editors. See these two edit histories (1 and 2) for recent examples of red-linked images caused by script-assisted editing. I'm a less active patroller now than I used to be, but I'm sure KatnissEverdeen and Sam Sailor can provide other examples. - tucoxn\talk 07:09, 27 October 2018 (UTC)

─────────────── I definitely would agree with Ronhjones and Tucoxn. However, while Tucoxn is definitely right that a lot of red-linked files are because of 'find and replace' AWB/script edits, I would also add that people (especially new editors) often don't realize that editing a filename breaks the image. I would argue that a message would be helpful, as I have received many confused messages on my talk page legitimately asking why I reverted them and what they did wrong. Here are a few other examples to illustrate this point (all of these people messaged on my talk page later saying they didn't know they had done something wrong). 1 2 3. Happy to provide other examples if you like. Cheers, Katniss May the odds be ever in your favor 16:17, 27 October 2018 (UTC)

Ronhjones and Tucoxn are both right here. Seasoned editors running AWB/scripts and overlooking changes to filenames is a common mistake. I am no saint myself: my first interaction with KatnissEverdeen was when she made me aware that I had overlooked a script-assisted change of a dash to emdash endash in a filename. The more "permanent" solution to these scenarios is to create redirects on Commons. I wish we had a little script for doing that, and if any of you have a good idea where to propose it, I would appreciate your feedback. Galobtter, thanks for coding the bot, I for one would like to know when I screwed something up. Sam Sailor 21:55, 27 October 2018 (UTC) (Amended. Sam Sailor 20:37, 29 October 2018 (UTC))
@Sam Sailor, KatnissEverdeen, and Galobtter: As a commons admin - I know that will be - c:Commons:Bots/Work_requests to request someone to invent/run a bot, and c:COM:BRFA for bot approvals. Ronhjones  (Talk) 00:40, 28 October 2018 (UTC)
@Ronhjones and Galobtter: I would agree with the script idea, not sure of the technical lingo I would need to use to request it though. I'm sure you all would be much better at wording the request than I would. Sam Sailor "I am no saint myself: my first interaction with KatnissEverdeen was when she made me aware that I had overlooked a script-assisted change of a dash to emdash endash in a filename." - Haha, I totally forgot about that...very easy thing to screw up and nobody's perfect. Katniss May the odds be ever in your favor 15:39, 29 October 2018 (UTC) (Amended "emdash" to "endash" in quote per WP:TPO for clarity. Sam Sailor 20:37, 29 October 2018 (UTC))
@Sam Sailor, KatnissEverdeen, and Galobtter: I'm not sure that commons would like such a bot. With 50 million images on site, it might be quite a few redirects! I'll post a question over there and see what they say. Ronhjones  (Talk) 15:45, 29 October 2018 (UTC)

──────────@Ronhjones: I have no clue if that is a job for a bot, I was thinking about a script that would make it a bit easier to create redirects on Commons.
Suppose you patrol CAT:MISSFILE, and you "correct" a spelling correction only to be undone which again causes a redlinked file. Here it would save some seconds with a script that could load up https://commons.wikimedia.org/wiki/File:Nutrient_absorbtion_to_blood_and_lymph.png and pop up a box containing the string File:Nutrient absorbtion to blood and lymph.png where you could change it to File:Nutrient absorption to blood and lymph.png, press Create redirect, and a redirect would be created from the latter to the former. Sam Sailor 20:37, 29 October 2018 (UTC)
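For what it's worth, the page such a script would create is just plain redirect wikitext at the corrected title, pointing at the file that actually exists; only the save step needs the API. A minimal sketch (the function name is invented for illustration):

```python
def redirect_for(correct_title, existing_title):
    """Title and wikitext for a file redirect: a page at the corrected
    name pointing at the (misspelled) file that actually exists."""
    return correct_title, "#REDIRECT [[%s]]" % existing_title

title, text = redirect_for(
    "File:Nutrient absorption to blood and lymph.png",
    "File:Nutrient absorbtion to blood and lymph.png",
)
```

Saving would be an ordinary API edit (e.g. via pywikibot); as it turned out later in this thread, for Commons-hosted files the redirect has to be created on Commons itself.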

Love this idea! I think this would be a super easy solution to quite a few of our issues here. Katniss May the odds be ever in your favor 20:40, 29 October 2018 (UTC)
@Sam Sailor, KatnissEverdeen, and Galobtter: Interesting. I don't write scripts very well at all, and I've no idea how well a script on en-wiki would work with Commons - there are still some old users who have different usernames on Commons - might cause issues! However, you don't need a Commons redirect - it could be a local redirect on en-wiki (it does not matter if it redirects to a Commons image), which would keep it much simpler. Maybe you should ask at Wikipedia:User scripts/Requests Ronhjones  (Talk) 21:28, 29 October 2018 (UTC)
@Ronhjones: thank you, I created a local redirect at File:Nutrient absorption to blood and lymph.png to File:Nutrient absorbtion to blood and lymph.png, but it did not work. Are there special requirements to the syntax of redirects in file space? Sam Sailor 12:06, 8 November 2018 (UTC)
@Sam Sailor: Very odd and very unusual page. How did you create it? Wikitext or visual editor or dummy upload? See User:Ronhjones/Sandbox2 - three images are File:Testorientation.jpg, File:Testorientation.JPG, File:Testorientationtest.JPG - compare the last one to File:Nutrient absorption to blood and lymph.png Ronhjones  (Talk) 16:48, 8 November 2018 (UTC)

──────────@Ronhjones: Yours are working, mine are not. I tried substituting underscores for spaces in the filename in the redirect (diff); it did not change a thing. The redirect was created with Sagittarius+, but that should not be the culprit, and starting File:Nutrient absorption to blood and lymph TEST.png "manually" in the normal editor did not change anything. (I hardly ever use Visual Editor.)
I notice two things:

I wonder if my lack of the movefile flag is causing this. Would you grant me, at least temporarily, the file mover right? If you do, would you also delete File:Nutrient absorption to blood and lymph.png, so I can recreate it with the file mover right, and in any case delete File:Nutrient absorption to blood and lymph TEST.png, thanks. Sam Sailor 19:39, 8 November 2018 (UTC)

@Sam Sailor: I think you have been here long enough not to go mad with it  Done (and page deleted) Ronhjones  (Talk) 20:00, 8 November 2018 (UTC)
Thanks, Ronhjones. Recreated File:Nutrient absorption to blood and lymph.png, but the problem persists. Any ideas? Ask at VPT? Sam Sailor 20:05, 8 November 2018 (UTC)
@Sam Sailor: Bonkers! It won't work for me. I made a redirect for my balloon pic with a space - no problem, and I took out the spaces File:Nutrientabsorptiontobloodandlymph.png. The only difference I can see is that mine is jpg and yours is a png. Let me find a different png and try something. Ronhjones  (Talk) 20:26, 8 November 2018 (UTC)
@Sam Sailor: Not the png - I made File:7 and 35 shields.png, all OK. Anything based on File:Nutrient absorbtion to blood and lymph.png fails. Suggest VPT, I'm now lost... :-( Ronhjones  (Talk) 20:34, 8 November 2018 (UTC)
Redirects on enwiki to files on Commons do not work. Redirects to Commons's files must be created on Commons. — JJMC89(T·C) 04:03, 9 November 2018 (UTC)

─────────────────────────Ahh, of course, thank you JJMC89. Could you, with your expertise in programming, by any chance write a script that facilitates creating redirects on Commons? Sam Sailor 08:33, 13 November 2018 (UTC)

Sam, A bot or a user script? — JJMC89(T·C) 02:37, 14 November 2018 (UTC)
JJMC89, a script something like this. Sam Sailor 08:23, 14 November 2018 (UTC)

Bots in a trial period

ZackBot 12

Operator: Zackmann08 (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 18:55, Friday, October 26, 2018 (UTC)

Function overview: Replace deprecated parameters on {{Infobox islands}}

Automatic, Supervised, or Manual: automatic

Programming language(s): Ruby

Source code available: User:ZackBot/infobox islands

Links to relevant discussions (where appropriate): Template_talk:Infobox_islands#Can_we_clean_up_these_params?

Edit period(s): one time run

Estimated number of pages affected: all transclusions - currently 6,821 pages.

Namespace(s): Mainspace

Exclusion compliant (Yes/No): yes

Function details: Very straightforward. There are a whole bunch of parameters on this template that have been deprecated (replacing spaces with underscores, so for example {{{image name}}} → {{{image_name}}}). This bot would be a one-time run to go through and replace the deprecated parameters with their new versions. A very simple find and replace that would only make changes within the infobox. @Frietjes and Plastikspork: pinging you both in case you wish to chime in. --Zackmann (Talk to me/What I been doing) 18:55, 26 October 2018 (UTC)

Discussion

  • I generally support the task. I would like to see the bot function similarly to SporkBot, which preserved comments and fixed indentation at the same time. I don't know how hard that would be to code, or if it would be better to just use SporkBot. Frietjes (talk) 13:58, 30 October 2018 (UTC)
    • For a simple parameter replacement, I'm fine with leaving the indentation alone if it seems like too much extra work to normalize (speaking from experience, it can be quite hard to automate in complex cases). That said, find + replace should preserve comments and other idiosyncrasies, so I think this is safe. Zackmann08, let us know when it's ready and I think a trial is in short order afterwards. — Earwig talk 02:35, 1 November 2018 (UTC)
      • @The Earwig: all ready to go! Let me know when I'm clear to test it out. :-) --Zackmann (Talk to me/What I been doing) 04:36, 1 November 2018 (UTC)
        • @Zackmann08: Cool. One thing sticks out from the code: I think the replacements will be made anywhere within the template body, not just parameter names. The risk of this is probably quite low, but if a string like "capital city" was found in an image caption, wouldn't the bot replace it with "capital_city"? I suspect this is easy to fix by testing for \s*= after the parameter name in each case. — Earwig talk 04:51, 1 November 2018 (UTC)
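The `\s*=` guard Earwig suggests, which limits the replacement to actual parameter names (after a pipe, before the equals sign) rather than arbitrary text such as captions, can be sketched like this. Python is used purely for illustration (the bot itself is Ruby), and the parameter map is truncated:

```python
import re

# Deprecated name -> replacement (small sample of the real list)
DEPRECATED = {"image name": "image_name", "capital city": "capital_city"}

def rename_params(template_body):
    """Rename deprecated parameters only where they appear as parameter
    names, i.e. after a '|' and before '=', leaving captions etc. alone."""
    for old, new in DEPRECATED.items():
        pattern = r"(\|\s*)%s(\s*=)" % re.escape(old)
        template_body = re.sub(pattern, r"\g<1>%s\g<2>" % new, template_body)
    return template_body
```

The capture groups preserve whatever whitespace the article already uses around the pipe and equals sign.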

───────────────────────── @The Earwig: done. Did the first few one at a time and found a few small typos which I corrected (both in my script and in the page itself). But other than that, looks good to me. let me know if you see anything. --Zackmann (Talk to me/What I been doing) 06:18, 1 November 2018 (UTC)

@The Earwig: any update? Would love to unleash this bad boy. --Zackmann (Talk to me/What I been doing) 17:15, 6 November 2018 (UTC)
Hi. Looked through the edits; I see a couple small issues/nitpicks. Nothing too serious, but a few points worth fixing.
  • In [2], the bot replaced the valid |width min km= and |width max km= with the invalid |width_min km= and |width_max km=, causing the width to get removed from the infobox. (This is the only change I saw affecting an actual rendered page.) The page remains in the deprecated params maintenance cat, so we would've caught it eventually, but still should be fixed.
  • In [3], [4], [5], and [6], the bot either (1) messes with a (fake?) parameter, turning |locator map size= into |locator_map_size=, or (2) replaces "native name:" with "native_name:" in a comment that's not referring to a parameter name. Since the parameter doesn't seem to exist, it's not a serious issue (arguably GIGO), but I would still prefer not touching it in that case. For the latter, again, it's very minor, but we would ideally leave it alone.
  • In [7], the bot changed some text in the value of the |ethnic_groups= parameter. No actual effect because it was a link title, but if that link wasn't piped, it would have been visible. This is basically what I was worried about above, but it seems you didn't apply the fix to every parameter, only some of them?
Thanks! — Earwig talk 05:15, 8 November 2018 (UTC)
@The Earwig: thank you for taking the time to go through all of these! Great catches. I've got a number of projects I'm working on right now but I will get to this ASAP. --Zackmann (Talk to me/What I been doing) 01:27, 11 November 2018 (UTC)
@The Earwig: I have completely redone the bot. Can I get another 50 edit trial? --Zackmann (Talk to me/What I been doing) 19:38, 12 November 2018 (UTC)
@Zackmann08: Sure. Approved for trial (50 edits). — Earwig talk 03:07, 13 November 2018 (UTC)
@The Earwig: Done! diffs. Let me know. :-) --Zackmann (Talk to me/What I been doing) 03:22, 13 November 2018 (UTC)
@The Earwig: any update on this? --Zackmann (Talk to me/What I been doing) 18:12, 16 November 2018 (UTC)

MusikBot II 2

Operator: MusikAnimal (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 03:23, Monday, September 24, 2018 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Typically I use Ruby, but here it may have to be PHP or possibly Node.js.

Source code available: GitHub

Function overview: Syncs Wikipedia:Geonotice/list.json (to be created, fully-protected) with MediaWiki:Gadget-geonotice-list.js.

Links to relevant discussions (where appropriate): Special:PermaLink/862124571#Geonotices closed discussion, support of usage at Wikipedia talk:Interface administrators (see also RFC for IAdmins at top of that page allowing bot access where bot operator is also an IAdmin)

Edit period(s): Continuous

Estimated number of pages affected: 1

Namespace(s): MediaWiki

Exclusion compliant (Yes/No): No, not applicable.

Adminbot (Yes/No): Yes

Function details: First, some background: With the advent of the interface administrator user group, sysops can no longer edit MediaWiki:Gadget-geonotice-list.js. Many of these users are not particularly tech-savvy and have no use for editing site-wide JS beyond configuring geonotices for outreach purposes, etc. The configuration is literally just a JavaScript object with key/value pairs, so using a JSON page, which they would be able to edit, makes much more sense. However, we currently cannot put JSON pages behind ResourceLoader (phab:T198758), so for performance reasons we need to continue to maintain the JS page. The proposed workaround is to have a bot sync a JSON page with the JS page. This is in our best interests for security reasons (fewer accounts with access to site JS); JSON is also easier to work with and gives you nice formatting, making mistakes less likely.

Implementation details:

  1. Check the time of the last edit to Wikipedia:Geonotice/list.json.
  2. If it is after the time of the last sync by the bot (tracked by local caching), process the JSON.
  3. Perform validations, which include full JSON validation, validating the date formats, country code (going off of ISO 3166), and format of the corners.
  4. If validations fail, report them at User:MusikBot II/GeonoticeSync/Report (example) and do nothing more.
  5. If validations pass, build the JS and write to MediaWiki:Gadget-geonotice-list.js (example), and update the report stating there are no errors (example).
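Steps 3-5 above might look roughly like this in outline. This is a sketch for illustration only (the task may end up in PHP or Node.js per the request above), and the `GeoNotice.notices` variable name and header comment are assumptions, not the gadget page's actual contents:

```python
import json

COMMENT_BLOCK = "// Edit [[Wikipedia:Geonotice/list.json]], not this page.\n"

def build_js(json_text):
    """Validate the JSON config and render the JS page body.
    json.loads raises ValueError if the config is not valid JSON,
    which is where the error report (step 4) would be generated."""
    config = json.loads(json_text)
    # json.dumps output is also a valid JS object literal, so the page
    # is just the retained comment block plus one assignment.
    return (COMMENT_BLOCK
            + "window.GeoNotice = window.GeoNotice || {};\n"
            + "GeoNotice.notices = %s;\n" % json.dumps(config, indent=4))
```

Catching the ValueError and writing its message to the report page would cover step 4; a successful return value is what gets written in step 5.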

The comment block at the top of MediaWiki:Gadget-geonotice-list.js can be freely edited. The bot will retain this in full.

Discussion

Per Xaosflux I've preemptively started this BRFA. I haven't done any coding but I think this is a good time to discuss implementation details. Concerns that come to mind:

  • What to do when there are syntax errors. The native JSON editor should mean admins won't introduce syntax errors, because it won't even let you save. But it can happen -- say the admin ironically has JavaScript disabled. As a safeguard, the bot can validate the JSON, too (easy, thanks to existing libraries). Similar to User:MusikBot/PermClerk/Report, the bot could have a status report page, transcluded at Wikipedia talk:Geonotice/list.json. This way they can get some debugging info should something go wrong. If we want to get real fancy, the bot could also report when the configuration doesn't match the expected format, as described in the comments at MediaWiki:Gadget-geonotice-list.js. I think that would be a nice feature, but not a requirement.
  • After deployment, we'd need to update the JS page to clearly say it should not be edited directly. We could do a two-way syncing, but I'd prefer not to, just to keep it simple.
  • I can confirm MusikBot II's account is secure and 2FA is enabled (with some caveats). The bot solution still puts us on the winning end, as there will be fewer int-admin accounts than if we granted the right to everyone who manages geonotices.
  • Anything else? MusikAnimal talk 03:23, 24 September 2018 (UTC)
  • @MusikAnimal: for the directional sync concerns, a defined "section" (delineated by comments) should be the only area edited - this section should be marked "do not edit directly" - and the bot should only edit within the section. This way if other changes to the page are needed they won't interfere. — xaosflux Talk 04:25, 24 September 2018 (UTC)
    • This should work fine, just like her PERMclerking, right? Would be good if there are rush edits, last-minute-changes, etc. ~ Amory (utc) 16:22, 24 September 2018 (UTC)
    • Yeah, we can definitely reserve a part of the JS page for free editing, much like we do at WP:AWB/CP. MusikAnimal talk 16:41, 24 September 2018 (UTC)
  • I'd like to see some tests over at testwiki that can be used to demonstrate the edits. — xaosflux Talk 04:25, 24 September 2018 (UTC)
    • No problem. Though I don't think we need to test Geonotice itself (could be tedious), rather just that the JS was generated properly. MusikAnimal talk 16:41, 24 September 2018 (UTC)
      • Agree, don't need to actually implement the geonotice, just that things work as expected in the namespaces and content types. — xaosflux Talk 01:21, 25 September 2018 (UTC)
  • Syntax errors could still occur in the data - will you validate this as well? For example putting start/end dates in their own cells, validate that this is date data and not something else? Everything should be validated (e.g. this should not be a route to inject new javascript). — xaosflux Talk 04:25, 24 September 2018 (UTC)
    • Perhaps make the mock-up json page to demonstrate? — xaosflux Talk 04:25, 24 September 2018 (UTC)
    • JS injection shouldn't be possible, unless there are vulnerabilities in Geonotice itself. I would hope it doesn't use eval on the strings. Arbitrary code (e.g. begin: alert('foo!')) isn't valid JSON and hence would fail the initial validation (and the MediaWiki JSON editor won't let you save it, either). We can still validate it ourselves, to be sure. As I said, this would be a nice feature. I don't know that I want to validate things like the country, though. We could validate the 'begin'/'end' date format in particular, but for everything else I think the bot will just look for permissible keys and the data type of the values ('country' is a string, 'corners' is an array of two arrays, each with two integers). MusikAnimal talk 16:41, 24 September 2018 (UTC)
      • Injection would be if you accepted arbitrary "text" and just made it js, where the text could contain characters that would terminate the text field and then continue along in javascript. — xaosflux Talk 17:11, 24 September 2018 (UTC)
  • For the JSON page, not the bot: we'll also have to move the normal explanation text into an editnotice or regular notice, since comments are stripped on save for pages with the JSON content model. Enterprisey (talk!) 23:32, 24 September 2018 (UTC)
  • Got a prototype working, see Special:Diff/861109475. There are quotation marks around all the keys, but JavaScript shouldn't care. Maybe we should test against testwiki's Geonotice to be sure. This does mean the rules have changed -- you don't need to escape single quotes ', but you do for double quotation marks ". This is just a consequence -- that's how JSON wants it. I think single quotes are probably more commonly used in the geonotice text anyway, so this might be a welcome change. The bot could find/replace all "'s to ', but this would be purely cosmetic and error-prone when it is not really needed. Other formatting has changed, mostly whitespace. Also, in the edit summary we're linking to the combined diff of all edits made to the JSON page since the last sync. That way we can easily verify it was copied over correctly. We do lose attribution here (as opposed to linking to individual diffs), but I think that's okay? Source code (work in progress) is on GitHub. I've made this task translatable, should other wikis be interested in it. I'm going to stop here until the bot proposal discussion has closed. MusikAnimal talk 04:55, 25 September 2018 (UTC)
    I agree with the quoting change. You may want to specify the number of edits if it's more than one, but I don't know if that's required for attribution. (And it's displayed on the diff page anyway.) Enterprisey (talk!) 06:16, 25 September 2018 (UTC)
  • I started adding some 'directions' at Template:Editnotices/Page/User:MusikBot II/GeonoticeSync/Test config.json, please fill out with more directions, requirements, etc. As far as attribution, in the edit request at least pipe to the name of the source page to make it clear where the source is without having to follow the diff. — xaosflux Talk 15:21, 28 September 2018 (UTC)
    • Another option there is to put the whole attribution (source, diff, time, user of diff) into the comments of the .json, and only minimal in the edit summary (revid, sourcepage). --Dirk Beetstra T C 13:59, 2 October 2018 (UTC)
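The quoting change discussed above (escape " but not ') follows directly from JSON string syntax, and is easy to check (illustrative only):

```python
import json

# Double quotes inside a JSON string must be escaped with a backslash...
notice = json.loads('"it\'s a \\"geonotice\\""')

# ...but escaping a single quote, as the old JS page encouraged,
# is not a valid JSON escape sequence at all.
def is_valid_json(text):
    try:
        json.loads(text)
        return True
    except ValueError:
        return False
```

So `"don't"` parses fine, while `"don\'t"` is rejected outright, which is also why the MediaWiki JSON editor refuses to save it.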

─────────────────────────

  • Update I've resumed work on this and am ready for more feedback. The current implementation is described in the "function details" above. I still need to work on filling out Template:Editnotices/Page/User:MusikBot II/GeonoticeSync/Test config.json, please feel free to help. That page will be moved to Template:Editnotices/Page/Wikipedia:Geonotice/list.json when we're ready to go live.

    For validations, see Special:PermaLink/863494086 for an example invalid config (with lots of errors!) and Special:PermaLink/863494234 for generated report. A few notes:

    • I'm using Ruby's internal methods to tell if a date is valid. This works for "Invalid date" or "35 January 2018 00:00 UTC" but not for invalid month names, as with "15 Foobar 2018 00:00 UTC". By some logic I don't understand, it chooses some other valid month. I could use regular expressions to ensure the month names are valid, but I want this bot task to work in other languages, where I assume they're able to put in localized month names, if not a different format entirely (which Ruby should still be able to validate). Anyway, I think this is fine. There were no validations before, after all :)
    • Validating the country code actually works! It's going off of the ISO 3166 spec, which is what's advertised as the valid codes Geonotice accepts.
    • Coordinates are validated by ensuring there are two corners, and each with two values (lat and lng), and that the values are floats and not integers or strings.
    • The keys of each list item are also validated, ensuring they only include "begin", "end", "country", and either "corners" or "text".
    • I added code to check if they escaped single quotations (as with \'), since Geonotice admins probably are used to doing this. Amazingly, MediaWiki won't even let you save the JSON page if you try to do this, as indeed it is invalid JSON. So turns out no validation is needed for this, or any other JSON syntax errors for that matter. This should mean we don't need to worry about anyone injecting malicious code.
    • The comment block at the top of the JS page is retained and can be freely edited.
    • Back to the issue of attribution in the edit summary, I went with Xaosflux's recommendation and am using a combined diff link, piped with the title of the JSON page. I'm not sure it's worth the hassle of adding in comments in the generated JS code directly, but let me know if there are any strong feelings about that.

Let me know if there's anything else I should do, or if we're ready for a trial! MusikAnimal talk 04:35, 11 October 2018 (UTC)
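For concreteness, the per-entry validations described in the bullets above could look something like this sketch. The names and sample country list are mine (the bot is written in Ruby and uses the full ISO 3166 list), the either-corners-or-text rule is not enforced here, and integer coordinates are accepted alongside floats, matching the fix discussed later in this thread:

```python
from datetime import datetime

ALLOWED_KEYS = {"begin", "end", "country", "corners", "text"}
COUNTRY_CODES = {"US", "GB", "IN", "AU"}  # sample; the bot uses full ISO 3166

def entry_errors(entry):
    """Return a list of human-readable validation errors for one entry."""
    errors = []
    unknown = set(entry) - ALLOWED_KEYS
    if unknown:
        errors.append("unknown keys: %s" % sorted(unknown))
    for key in ("begin", "end"):
        try:
            datetime.strptime(entry.get(key, ""), "%d %B %Y %H:%M UTC")
        except ValueError:
            errors.append("bad '%s' date" % key)
    if "country" in entry and entry["country"] not in COUNTRY_CODES:
        errors.append("unknown country code")
    if "corners" in entry:
        c = entry["corners"]
        ok = (isinstance(c, list) and len(c) == 2
              and all(isinstance(p, list) and len(p) == 2
                      and all(isinstance(v, (int, float)) for v in p)
                      for p in c))
        if not ok:
            errors.append("corners must be two [lat, lng] pairs")
    return errors
```

Note that strptime rejects "35 January 2018 00:00 UTC" (day out of range) but, as observed above, month-name validation is locale-dependent.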

MediaWiki won't let you save the page with invalid JSON even if you turn off JS or use the API, right? Because if it does you may want to validate for that case. Enterprisey (talk!) 04:43, 11 October 2018 (UTC)
Luckily it's server side. It shows the error "Invalid content data", even if you have JS turned off. I haven't tested the API yet, but if it does work it's probably a bug in MediaWiki :) MusikAnimal talk 16:38, 11 October 2018 (UTC)
But I should clarify, the bot does validate JSON content, but I haven't tested to see if this works because I am unable to create invalid JSON :) At any rate, we would not end up in a situation where an invalid JS object is written to MediaWiki:Gadget-geonotice-list.js, because the core JSON methods that we're using would error out before this happens. MusikAnimal talk 20:15, 11 October 2018 (UTC)
This seems fine to me. My main concern (and it is fairly minor) is making sure the bot's role is clear in documentation/notices, and that people will know how to look for errors if something doesn't get updated because validation failed (because there won't be immediate user feedback as there is with the basic MW-side JSON validation). I'm giving this a two-week trial, pending completion of the editnotice(s) and related pages and granting of the i-admin flag; based on history, that should allow for at least a handful of edits to test with, but feel free to extend if more time is required. Approved for trial (14 days). — Earwig talk 05:51, 14 October 2018 (UTC)
@The Earwig: will this be trialing on the actual pages or in userspace? Ping me if you need a flag assigned for trialing. — xaosflux Talk 00:06, 15 October 2018 (UTC)
@Xaosflux: My intention is for a full trial. I saw there were already reasonable tests done in the userspace, so given that MA feels comfortable working with the actual pages now, I'm fine with that too. As for the IA flag, it's not clear to me from the policy whether we can do that here or a request needs to be explicitly made to BN? I would prefer MA post something to BN to be safe, but I suppose one interpretation of the policy would let you grant it immediately without the waiting period. — Earwig talk 03:22, 15 October 2018 (UTC)
@MusikAnimal: what authentication options do you have configured for this bot account? (e.g. 2FA, BotPasswords, OAuth) — xaosflux Talk 11:57, 15 October 2018 (UTC)
@Xaosflux: 2FA is enabled. Historically I have not had a good solution for OAuth, but times have changed. I'll try to look into this today. For the record the current consumer for MusikBot II can only edit protected pages, all other admin rights are not permitted. We will use a different consumer here, and all related edits will be tagged with the application name. We could use "GeonoticeSync 1.0" (what I've dubbed the task, and then a version number), or is there a better name? For permissions, I believe the consumer only needs editsiteconfig.
So no need to grant int-admin just yet -- although it should be safe to do so, because we have 2FA enabled and the current consumer can't edit sitewide or user JS/CSS.
The outstanding to-dos:
  1. Create OAuth consumer and rework the bot to use it.
  2. Create Wikipedia:Geonotice/list.json to reflect current configuration, fully protect it, and move Template:Editnotices/Page/User:MusikBot II/GeonoticeSync/Test config.json to Template:Editnotices/Page/Wikipedia:Geonotice/list.json.
  3. Update documentation at Wikipedia:Geonotice and also describe the new process in the comment block at MediaWiki:Gadget-geonotice-list.js.
  4. Ping all the current Geonotice admins to make sure they know about the new system, and the new rules (don't escape single quotes, but do for double, etc.).
  5. Grant int-admin to MusikBot II, and enable the task.
I'll ping you when I'm done with steps 1-3, and once given the final okay we'll do 4-5. Sound like a plan? If we have to rollback or abandon the new system, I'll be sure to let everyone know that they can go back to editing MediaWiki:Gadget-geonotice-list.js directly. MusikAnimal talk 16:58, 15 October 2018 (UTC)
Sounds fine, let us know when you are ready. — xaosflux Talk 18:37, 15 October 2018 (UTC)
@Xaosflux and The Earwig: I spoke with Bawolff, a security expert working for the Foundation. It would seem in this case, bot passwords is no less secure than OAuth. OAuth is more about editing on behalf of users, or authorizing users to use some centralized service. This is fantastic news, because I found the library I was going to use is more for webservices (specifically Ruby on Rails or similar), which doesn't apply here. I would have to implement my own client. To illustrate the complexity, have a look at this "simple" example written in PHP.
So if it's alright, I'd like to move forward with the current bot infrastructure. I have gone ahead and set up the JSON config to reflect the current JS config, and filled in the edit notice. I'm going to be out of town this weekend, so I can start the trial early next week if we're ready to move forward (first doing steps 3-5 above). MusikAnimal talk 03:09, 19 October 2018 (UTC)
@MusikAnimal: it's not "as good", but it is still much better than using standard passwords. Please let us know which BP grants you are including (you don't have to disclose the allowed IP ranges, which you should also use if you can). — xaosflux Talk 03:15, 19 October 2018 (UTC)
Just an update that I haven't forgotten about this. I would like to resume work soon. Bawolff had another great idea to use the MediaWiki parser API on the "text". This virtually eliminates security concerns, and above all, gets rid of all that HTML. It'll be easier for Geonotice managers to test what the output looks like, and I assume wikitext is more familiar to them than using <a>...</a>, etc. @Xaosflux and The Earwig: Thoughts? MusikAnimal talk 23:41, 4 November 2018 (UTC)
Seems OK, the sooner you can get a mock up running outside of the production page the sooner people can start test cases. — xaosflux Talk 16:03, 5 November 2018 (UTC)
  • @MusikAnimal, Xaosflux, and Earwig: Thanks for developing this bot! I have a few comments:
    • Coordinates are validated by ensuring there are two corners, and each with two values (lat and lng), and that the values are floats and not integers or strings. - I'm not sure why you want to treat integers and floats differently? Ignore this concern if you've already tested the bot with integer coordinates.
    • Thanks for creating a "comments" field in the draft JSON format. There's no need to copy comments in the JSON to the JavaScript page when the bot syncs notices.
    • Geonotices are a low-volume process and the team of maintainers is small, especially after the IAdmin right got unbundled. I think we should roll out the bot first and then update the documentation as we go along.
      Deryck C. 17:39, 7 November 2018 (UTC)
      • I've imported all current geonotices into Wikipedia:Geonotice/list.json (involving a bit of search and replace, and a bit of human legwork) so we can test the bot. If it runs correctly none of the current geonotices should change. Deryck C. 17:52, 7 November 2018 (UTC)
        Good point about the coordinates. I don't know why I decided to enforce floats. This has been fixed.
        For the record, I've tested that the Geonotice actually shows on testwiki. At this point I'm confident to move forward. Glad to hear you are ready as well!
        @Xaosflux: Shall we make a request at BN for the int-admin flag, or can you handle that? MusikAnimal talk 22:23, 7 November 2018 (UTC)
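The corner-coordinate check discussed above can be sketched as follows. This is an illustrative outline only, not the bot's actual code: the function name and the nested-list shape of the input are assumptions, and (per the fix noted above) both ints and floats are accepted as coordinate values.

```python
def valid_corners(corners):
    """Return True if `corners` is two [lat, lng] pairs of numbers."""
    if not isinstance(corners, list) or len(corners) != 2:
        return False  # must be exactly two corners
    for corner in corners:
        if not isinstance(corner, list) or len(corner) != 2:
            return False  # each corner must be a [lat, lng] pair
        for value in corner:
            # bool is a subclass of int in Python, so exclude it explicitly;
            # strings and other types are rejected here as well
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                return False
    return True
```

With this relaxed check, `[[51.5, -0.1], [52, 0]]` validates even though two of the values are integers, while a string such as `"51.5"` is still rejected.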

Ready to deploy

  • @The Earwig and MusikAnimal: as The Earwig has approved a trial, I'll do it for the trial period. Are you ready to begin? — xaosflux Talk 23:38, 7 November 2018 (UTC)
    @Xaosflux: Deryck says we can go ahead without updating documentation (which makes sense, in case we have to rollback), so yes, I'm ready :) Just a note that I'm not going to turn the bot on until the int-admin rights have been granted, hence don't expect MediaWiki:Gadget-geonotice-list.js to update immediately. MusikAnimal talk 02:21, 8 November 2018 (UTC)
    @MusikAnimal: int-admin access added for 14 days to support this trial. Any admin should block this bot immediately if it is malfunctioning related to this access. — xaosflux Talk 02:37, 8 November 2018 (UTC)
    @Xaosflux: Thanks! I apologize, I've changed my mind -- I'm going to start it first thing in the morning (my time). No need to extend the int-admin access, the ~12 hour difference is trivial. I'm going to ping everyone at Wikipedia talk:Geonotice#New bot-assisted system just after the first edit.
    If something does go wrong, do we need to block the bot? I know admin/intadmin bots are more sensitive, and I'm even advertising the block button on the userpage. You definitely should revert the edits to the last stable version, but ideally you'd use the "disable task" link (making it anything other than true), that way the other bot task won't be interrupted. That link works, as evidenced by Special:Diff/867621853 after a vandal cleverly disabled the AWB task yesterday. I've upped the protection level on the run pages to prevent this from happening again. MusikAnimal talk 03:24, 8 November 2018 (UTC)
    @MusikAnimal: - Clarifying: if it is doing things off-task with this access, malfunctions should be handled in the normal escalating manner. — xaosflux Talk 04:30, 8 November 2018 (UTC)
    @MusikAnimal: The bot seems to have synced a spurious newline in the last sync test. Is this intentional? Deryck C. 12:59, 8 November 2018 (UTC)
    @Deryck Chan: Good catch, this has been fixed (see testwiki:Special:Diff/364384). I noticed you added "begincomments" and "endcomments" to the JSON config. Currently the only supported key is comments; you have to throw everything in there. Is that satisfactory?
    I'm otherwise ready to go! The generated JS of the current JSON can be viewed at testwiki:Special:PermaLink/364390. It looks right to me, minus the red links, since those pages don't exist on testwiki. MusikAnimal talk 19:02, 8 November 2018 (UTC)
    @MusikAnimal: Thank you so much! Regarding "begincomments" and "endcomments", I was hoping the bot would ignore JSON fields that aren't in the bot specification, allowing admins to use arbitrary field names to leave additional notes about the JSON without changing the actual geonotice. But it seems that you've built the bot to police all fields and throw an error if somebody adds unrecognised field names. That's fine with me too. Deryck C. 23:16, 8 November 2018 (UTC)
  • It seems that we've done enough testing on testwiki - should we try to set the bot to work on en.wp? Deryck C. 23:16, 8 November 2018 (UTC)
    We are live! Let me know of any problems. Thanks for your help. MusikAnimal talk 17:35, 9 November 2018 (UTC)
    @Cyberpower678: Conveniently you made an edit right after the first sync! It took another 15 minutes or so, but this was because the cron was not set up on Toolforge yet (I always do the first run locally). Special:Diff/868053189 was the first true, fully-automated edit, and it looks good :)
    This got me thinking -- the bot could automatically remove expired notices. @Deryck Chan: What do you think? MusikAnimal talk 18:04, 9 November 2018 (UTC)
    MusikAnimal, Well let's keep some things in place for us int-admins to do. :p —CYBERPOWER (Chat) 18:34, 9 November 2018 (UTC)
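The expiry cleanup suggested above was only a proposal at this point, not implemented behaviour. A minimal sketch of what such a check might look like, assuming notices carry an `end` field in the "1 January 2018 00:00 UTC" style (the field name and date format are assumptions for illustration):

```python
from datetime import datetime, timezone

def is_expired(notice, now=None):
    """True if the notice's `end` timestamp is in the past (UTC)."""
    now = now or datetime.now(timezone.utc)
    # Parse e.g. "1 January 2018 00:00 UTC"; format is an assumption
    end = datetime.strptime(notice["end"], "%d %B %Y %H:%M UTC")
    return end.replace(tzinfo=timezone.utc) < now

notices = {
    "expired_example": {"end": "1 January 2018 00:00 UTC"},
    "active_example": {"end": "1 January 2100 00:00 UTC"},
}
# The bot would keep only the notices that have not yet expired
current = {key: n for key, n in notices.items() if not is_expired(n)}
```

After filtering, only `active_example` would remain in `current`.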
  • @MusikAnimal: MediaWiki:Gadget-geonotice-list.js has become harder to read since the sync. Can you look over options for whitespace separation around the notices, such as the human editors were previously adding? — xaosflux Talk 18:18, 9 November 2018 (UTC)
    @Xaosflux: Eh, it'd be quite tricky as I'm using a library to generate the JS object, and believe it or not the method is called pretty_generate =P Manipulating the result may be error-prone. I realize it's probably difficult for bot approvers to evaluate the JS, but the page geonotice managers will be using, Wikipedia:Geonotice/list.json, is more readable than ever! Hopefully that's OK? MusikAnimal talk 18:27, 9 November 2018 (UTC)
    Not a show stopper, and agree Wikipedia:Geonotice/list.json is much much easier to edit and review. — xaosflux Talk 20:21, 9 November 2018 (UTC)
    I personally prefer the JS formatting style of keeping the contents of each square bracket within the same line. On the other hand, the JSON viewing mode of MediaWiki does make it easier to review content, and the fact that the bot has standardised the presentation of JS on geonotice-list.js also adds to readability. All in all I think we are okay with this change in code formatting. Deryck C. 21:11, 10 November 2018 (UTC)
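The formatting issue discussed above comes from the bot serialising the JSON source into the JS gadget page with a library pretty-printer (the method named above, `pretty_generate`, suggests Ruby's JSON module). The same idea can be illustrated in Python; the `window.GeoNotice.notices` wrapper, the variable names, and the sample notice fields here are assumptions, not the gadget's actual contents:

```python
import json

# A sample notice in the style of Wikipedia:Geonotice/list.json
# (field names and date format are illustrative assumptions)
notices = {
    "example2018": {
        "begin": "25 November 2018 00:00 UTC",
        "end": "2 December 2018 00:00 UTC",
        "text": "An example notice.",
        "country": "GB",
    }
}

# Serialise with a standard pretty-printer: indentation and line breaks
# are fixed by the library, which is why hand-tuned whitespace between
# notices is hard to preserve across syncs
js_page = (
    "window.GeoNotice = {};\n"
    "window.GeoNotice.notices = " + json.dumps(notices, indent=2) + ";"
)
print(js_page)
```

Because the pretty-printer fully determines the output layout, every sync produces the same standardised formatting, which is the trade-off accepted in the discussion above.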

Bots that have completed the trial period

Approved requests

Bots that have been approved for operations after a successful BRFA will be listed here for informational purposes. No other approval action is required for these bots. Recently approved requests can be found here (edit), while old requests can be found in the archives.


Denied requests

Bots that have been denied for operations will be listed here for informational purposes for at least 7 days before being archived. No other action is required for these bots. Older requests can be found in the Archive.

Expired/withdrawn requests

These requests have either expired, as information required by the operator was not provided, or been withdrawn. These tasks are not authorized to run, but such lack of authorization does not necessarily follow from a finding as to merit. A bot that, having been approved for testing, was not tested by an editor, or one for which the results of testing were not posted, for example, would appear here. Bot requests should not be placed here if there is an active discussion ongoing above. Operators whose requests have expired may reactivate their requests at any time. The following list shows recent requests (if any) that have expired, listed here for informational purposes for at least 7 days before being archived. Older requests can be found in the respective archives: Expired, Withdrawn.


Retrieved from "https://en.wikipedia.org/w/index.php?title=Wikipedia:Bots/Requests_for_approval&oldid=868844933"