Checking Wikipedia style guidelines using FiveUI

Wikipedia maintains a Manual of Style that prescribes details of content and formatting for Wikipedia articles. The manual promotes consistency, which makes articles accessible and easy to understand. The guidelines are extensive; the downside is that it is difficult for casual contributors to read and internalize all of the rules.

FiveUI is an extensible tool for evaluating HTML user interfaces against sets of codified UI guidelines. The Wikipedia style guidelines can be codified and checked automatically. After making edits to an article, a contributor using FiveUI sees a list of any items that should be changed before publishing.

FiveUI also simplifies auditing articles in bulk. It supports a headless mode that uses Selenium to test large numbers of pages and produces a report of issues with each page.

We have codified a portion of the Manual of Style, and that portion can be used right now. It serves as a starting point to demonstrate the usefulness of FiveUI. The codified guidelines are included in the FiveUI source distribution under guidelines/wikipedia/.

Getting started

FiveUI is implemented as a browser extension for Firefox and Chrome and as a headless script. To install the browser extension see the Install Guide. To get to the options page for the extension see Getting Started. In the extension options page:

  1. Click 'add a rule set'.
  2. In the first text box enter the URL to the Wikipedia guidelines manifest:
    https://raw2.github.com/GaloisInc/FiveUI/master/guidelines/wikipedia/package.json
    Then click the small 'save' icon.
  3. In the second text box enter a pattern for URLs that should be checked against the guidelines:
    http*://*.wikipedia.org/wiki/*
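
To make the URL pattern in step 3 concrete, here is a minimal sketch of how a wildcard pattern of this kind could be matched. It assumes that `*` matches any run of characters and that everything else is literal; FiveUI's own matcher may behave differently, and `globToRegExp` is a hypothetical helper, not part of FiveUI.

```javascript
// Hypothetical helper: FiveUI's own pattern matcher may behave differently.
// Assumes "*" matches any run of characters (including none) and that
// every other character must match literally.
function globToRegExp(pattern) {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp("^" + escaped + "$");
}

const wikipediaPattern = globToRegExp("http*://*.wikipedia.org/wiki/*");

// An article URL matches; a non-article URL on the same host does not.
console.log(wikipediaPattern.test("https://en.wikipedia.org/wiki/Haskell"));
console.log(wikipediaPattern.test("https://en.wikipedia.org/w/index.php"));
```

Under these assumptions the pattern covers both http and https article URLs on any Wikipedia subdomain.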

Now navigate to some Wikipedia articles. The FiveUI icon should change from gray to red. If any style issues are found then a problem count will appear superimposed over the icon.

The guidelines that are checked

The guidelines that we have codified so far focus mainly on accessibility and formatting.

Guidelines that deal with prose (word choice, tone, voice, punctuation, etc.) are more difficult to verify with an automated process. In some cases we can use heuristics to identify cases where there might be a problem, and produce warnings to encourage the user to take a closer look.

Wikimedia content exists in two forms: markup and rendered HTML. FiveUI checks HTML, not markup. To be more precise, FiveUI checks fully rendered web pages. That provides certainty that details such as colors, paragraph width, and image placement are checked in the context of what readers will see. But it does mean that some translation is required in cases where the Manual of Style makes statements about conventions that should be used in markup.

The rules that we have implemented so far are listed below. Rules are specified in JavaScript; the headings indicate the source file that defines each rule.

headingOrder.js

Checks that heading levels are not skipped; for example, that there is an h2 before each h3.[1] Also reports an error if any h1 tags are found in article content.[2]
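
The heading check can be sketched as a pure function over heading levels in document order. This is an illustration under stated assumptions, not the code in headingOrder.js: `checkHeadingOrder` is a hypothetical name, and the sketch assumes the page title is the only legitimate h1 and sits outside the article body.

```javascript
// Sketch only: takes heading levels in document order (2 for <h2>, etc.)
// and returns a list of problem descriptions. Not FiveUI's implementation.
function checkHeadingOrder(levels) {
  const problems = [];
  // Assumption: the page title is an h1 that sits outside the article body.
  let previous = 1;
  levels.forEach((level, index) => {
    if (level === 1) {
      problems.push(`heading ${index}: h1 used in article content`);
    } else if (level > previous + 1) {
      problems.push(`heading ${index}: h${level} without a preceding h${level - 1}`);
    }
    previous = level;
  });
  return problems;
}

// The h4 at index 4 follows an h2 directly, so it is reported.
console.log(checkHeadingOrder([2, 3, 3, 2, 4]));
```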

bullets.js

Produces warnings to remind users to minimize use of bullet points.[3] The output from this rule is a bit noisy at the moment.

strikeout.js

Produces error reports if an article contains struck-out text.[4] That includes <del> and <strike> tags, and <span> tags with a text-decoration: line-through style.

color.js

Produces an error if the contrast ratio between text and background color falls below the WCAG 2.0 AA level:[5]

  • less than 4.5:1 for most text
  • less than 3:1 for normal text with a point size of at least 18
  • less than 3:1 for bold text with a point size of at least 14.

Produces a warning if contrast falls below the AAA level:

  • less than 7:1 for most text
  • less than 4.5:1 for normal text with a point size of at least 18
  • less than 4.5:1 for bold text with a point size of at least 14.
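
The thresholds above can be illustrated with the contrast ratio defined by WCAG 2.0, which compares the relative luminance of two sRGB colors. The sketch below follows the WCAG formulas; the function names and the "error" / "warning" / "ok" return values are assumptions made for illustration and are not taken from color.js.

```javascript
// Relative luminance of an sRGB color given as [r, g, b] with 0-255 channels,
// following the WCAG 2.0 definition.
function relativeLuminance([r, g, b]) {
  const [R, G, B] = [r, g, b].map((channel) => {
    const c = channel / 255;
    return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  });
  return 0.2126 * R + 0.7152 * G + 0.0722 * B;
}

// WCAG contrast ratio: (lighter + 0.05) / (darker + 0.05).
function contrastRatio(foreground, background) {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

// Hypothetical classifier mirroring the thresholds listed above.
// "Large" text is at least 18pt, or at least 14pt when bold.
function classifyContrast(ratio, pointSize, isBold) {
  const large = pointSize >= 18 || (isBold && pointSize >= 14);
  if (ratio < (large ? 3 : 4.5)) return "error";   // below AA
  if (ratio < (large ? 4.5 : 7)) return "warning"; // below AAA
  return "ok";
}

// Black on white gives the maximum ratio, approximately 21.
const ratio = contrastRatio([0, 0, 0], [255, 255, 255]);
console.log(ratio, classifyContrast(ratio, 12, false));
```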

To keep reports focused on article content rather than site design, this rule has some built-in exceptions that skip over styles produced by predefined Wikipedia templates. At the moment the rule ignores links with the "new", "external", or "extiw" class and ignores links in navbox group cells (<th> elements with the "navbox-group" class).

spaceBetweenParagraphs.js

Produces error reports when extra space is found between paragraphs. This corresponds to markup with more than one blank line between paragraphs. In the rendered HTML the result is one or more paragraphs that contain <br> tags but no text.[6]

paragraphLength.js

Produces a warning for any paragraphs that contain only a single sentence.[7]
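
Counting sentences automatically is necessarily heuristic. The sketch below shows one rough approach, counting runs of sentence-ending punctuation that are followed by whitespace or the end of the text. It is not the logic used in paragraphLength.js; the function names are hypothetical, and abbreviations such as "e.g." will be miscounted.

```javascript
// Rough heuristic, not FiveUI's implementation: counts runs of ".", "!"
// or "?" that are followed by whitespace or the end of the text.
function countSentences(text) {
  const matches = text.trim().match(/[.!?]+(?=\s|$)/g);
  return matches ? matches.length : 0;
}

// A paragraph is flagged when exactly one sentence terminator is found.
function isSingleSentenceParagraph(text) {
  return countSentences(text) === 1;
}

console.log(isSingleSentenceParagraph("FiveUI checks rendered pages."));
console.log(isSingleSentenceParagraph("FiveUI checks rendered pages. It does not check markup."));
```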

imageSize.js

Produces warnings for images that are wider than 400px and that are left- or right-aligned. Also produces warnings for any images that are more than 500px tall.[8]
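
As an illustration, the two size conditions can be written as a small function over a plain description of an image. The object shape (`width`, `height`, `align`) and the name `checkImageSize` are assumptions for this sketch; the actual rule in imageSize.js inspects rendered elements.

```javascript
// Sketch only: `image` is a hypothetical plain object, not a DOM element.
// width and height are in CSS pixels; align is "left", "right", or "none".
function checkImageSize(image) {
  const warnings = [];
  const floated = image.align === "left" || image.align === "right";
  if (floated && image.width > 400) {
    warnings.push("left- or right-aligned image is wider than 400px");
  }
  if (image.height > 500) {
    warnings.push("image is taller than 500px");
  }
  return warnings;
}

console.log(checkImageSize({ width: 450, height: 600, align: "right" }));
```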

horizontalRule.js

Produces warnings indicating that horizontal rules are deprecated.[9]

pseudoHeadings.js

Detects uses of bold text, or of a description list with a single <dt> and no <dd>, to emulate a heading. Produces error reports.[10]

pseudoIndent.js

Detects uses of a description list with a single <dd> and no <dt> to indent content.[11]

It seems that indenting content in this way may now be an accepted practice, so this rule may be a candidate for removal.

floatSandwiches.js

Produces warnings in cases where two images occupy overlapping vertical space, where one image is left-aligned and the other is right-aligned.[12]
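
The overlap test behind this rule is a standard interval comparison. A minimal sketch follows, assuming each image is described by its top offset, height, and alignment; the object shape and the name `findFloatSandwiches` are hypothetical, and floatSandwiches.js may compute positions differently.

```javascript
// Sketch only: each image is a hypothetical plain object
// { top, height, align } with top and height in pixels.
function findFloatSandwiches(images) {
  const pairs = [];
  for (let i = 0; i < images.length; i++) {
    for (let j = i + 1; j < images.length; j++) {
      const a = images[i];
      const b = images[j];
      const opposite =
        (a.align === "left" && b.align === "right") ||
        (a.align === "right" && b.align === "left");
      // Two vertical ranges overlap when each starts before the other ends.
      const overlap = a.top < b.top + b.height && b.top < a.top + a.height;
      if (opposite && overlap) pairs.push([i, j]);
    }
  }
  return pairs;
}

// Images 0 and 1 face each other across the text; image 2 sits lower down.
console.log(findFloatSandwiches([
  { top: 0, height: 100, align: "left" },
  { top: 50, height: 100, align: "right" },
  { top: 200, height: 50, align: "left" },
]));
```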

spaceBetweenListItems.js

Checks for list items that have been (probably mistakenly) separated by blank lines in wiki markup. This results in adjacent lists that each contain only a single item.[13]
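
One way to picture this check is as a scan over sibling blocks in document order, looking for neighbouring lists of the same type that each hold one item. The block descriptors and the name `findSplitLists` below are assumptions for illustration, not the structure used by spaceBetweenListItems.js.

```javascript
// Sketch only: each block is a hypothetical plain object such as
// { tag: "ul", items: 1 } or { tag: "p" }, listed in document order.
function findSplitLists(blocks) {
  const isList = (block) => block.tag === "ul" || block.tag === "ol";
  const hits = [];
  for (let i = 1; i < blocks.length; i++) {
    const prev = blocks[i - 1];
    const curr = blocks[i];
    if (isList(prev) && prev.tag === curr.tag && prev.items === 1 && curr.items === 1) {
      hits.push(i); // index of the second list in the adjacent pair
    }
  }
  return hits;
}

// The first two lists look like one list split by a blank line.
console.log(findSplitLists([
  { tag: "ul", items: 1 },
  { tag: "ul", items: 1 },
  { tag: "p" },
  { tag: "ul", items: 3 },
  { tag: "ul", items: 1 },
]));
```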

imageAlt.js

Reports warnings upon encountering images without an "alt" attribute.[14]

imageCaption.js

Checks images wider or taller than 50px for a caption. Reports a warning if none is found.[15]

inlineStyle.js

Reports errors when encountering inline styles.[16]

A number of the standard templates that Wikipedia provides include inline styles, so this rule produces some false positives. One of our work items is to expand and refine the list of exceptions included with this rule.