nodemw

MediaWiki API client written in node.js

Requirements

Installation

Using npm

npm install nodemw

Or download the latest stable version via GitHub.

Development version

git clone https://github.com/macbre/nodemw.git

Note

To run integration tests against production Wikipedia and WikiData servers, you first need to create your own bot account at https://test.wikipedia.org/wiki/Special:BotPasswords.

Then set the TEST_BOT_USERNAME and TEST_BOT_PASSWORD env variables when running tests. Otherwise, the requests get rate-limited (HTTP 429 responses).

CI checks are already set up.

Features

  • HTTP requests are queued and performed in parallel with a limited number of "threads" (i.e. there's no risk of flooding the server)
  • article creation / editing / moving / deletion
  • file uploads (using given content or via provided URL)
  • Special:Log processing
  • listing articles in categories
  • getting claims from WikiData
  • and much more
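The request queue mentioned in the first item can be pictured with a small sketch. This is not nodemw's actual implementation, only an illustration of the idea; createQueue and its stats() method are hypothetical names introduced here:

```javascript
// Minimal callback-based queue: at most `concurrency` tasks run at once.
// Illustration only; nodemw's internal queue may work differently.
function createQueue(concurrency) {
  const pending = [];
  let running = 0;

  function next() {
    // start waiting tasks while there is a free slot
    while (running < concurrency && pending.length > 0) {
      const task = pending.shift();
      let finished = false;
      running += 1;
      task(function done() {
        if (finished) return; // ignore repeated done() calls
        finished = true;
        running -= 1;
        next();
      });
    }
  }

  return {
    // a task is a function that receives a `done` callback
    push(task) {
      pending.push(task);
      next();
    },
    stats() {
      return { running: running, pending: pending.length };
    },
  };
}

const queue = createQueue(2);
for (let i = 1; i <= 5; i++) {
  queue.push(function (done) {
    // simulate a request that takes 10 ms
    setTimeout(done, 10);
  });
}
console.log(queue.stats()); // { running: 2, pending: 3 }
```

With a limit of 2, the remaining three tasks wait until a running one calls done().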

Where it's used

First script

An example script can be found in the /examples directory.

cd examples
node pagesInCategory.js

You can enter debug mode by setting the DEBUG environment variable:

DEBUG=1 node examples/pagesInCategory.js

You can enter dry-run mode (all "write" operations like edits and uploads will be disabled) by setting the DRY_RUN environment variable (or the dryRun entry in the config):

DRY_RUN=1 node examples/pagesInCategory.js
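If you prefer to set these switches from code, the following is a minimal sketch. configFromEnv and isOn are hypothetical helpers, not part of nodemw; debug and dryRun are the config entries described in this README, and treating "0" or an empty value as "off" is an assumption of the sketch:

```javascript
// Hypothetical helpers (not part of nodemw): derive config flags from env variables.
function configFromEnv(baseConfig, env) {
  return Object.assign({}, baseConfig, {
    debug: isOn(env.DEBUG) || Boolean(baseConfig.debug),
    dryRun: isOn(env.DRY_RUN) || Boolean(baseConfig.dryRun),
  });
}

// Assumption of this sketch: any non-empty value other than "0" means "on".
function isOn(value) {
  return typeof value === "string" && value !== "" && value !== "0";
}

const config = configFromEnv(
  { protocol: "https", server: "en.wikipedia.org", path: "/w" },
  process.env,
);
console.log(config);
```

The resulting object can be passed to the bot constructor like any other configuration object.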

Running unit tests

npm test

How to use it?

Creating a bot instance

var bot = require("nodemw");

// pass configuration object
var client = new bot({
  protocol: "https", // Wikipedia now enforces HTTPS
  server: "en.wikipedia.org", // host name of MediaWiki-powered site
  path: "/w", // path to api.php script
  debug: false, // is more verbose when set to true
});

client.getArticle("foo", function (err, data) {
  // error handling
  if (err) {
    console.error(err);
    return;
  }

  // ...
});

Config file

nodemw can read its configuration from a file as well as from an object passed directly to the bot constructor.

// read config from external file
var client = new bot("config.js");

The config file is a JSON-encoded object with the following fields (see the /examples/config-DIST.js file):

{
      "protocol": "https",           // protocol to use (defaults to 'https')
      "server": "en.wikipedia.org",  // host name of MediaWiki-powered site
      "port": 443,                   // port to use (optional, defaults to protocol default)
      "path": "/w",                  // path to api.php script
      "debug": false,                // is more verbose when set to true
      "username": "foo",             // account to be used when logIn is called (optional)
      "password": "bar",             // password to be used when logIn is called (optional)
      "domain" : "auth.bar.net",     // domain to be used when logIn is called (optional)
      "userAgent": "Custom UA",      // define custom bot's user agent
      "concurrency": 5,              // how many API requests can be run in parallel (defaults to 3)
      "proxy": "http://proxy:8080",  // HTTP proxy to use (optional)
      "referer": "https://example.com", // referer header to send (optional)
      "dryRun": false                // dry-run mode - disables write operations (optional)
}
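To see how protocol, server, port and path fit together, here is a small sketch that builds the api.php URL they describe. apiUrl is a hypothetical helper written for this README; how nodemw assembles the URL internally is not shown here and may differ:

```javascript
// Hypothetical helper: build the api.php URL described by a config object.
function apiUrl(config) {
  const protocol = config.protocol || "https";
  // omit the port when it is not set, so the protocol default applies
  const port = config.port ? ":" + config.port : "";
  // drop trailing slashes so "/w/" and "/w" give the same result
  const path = (config.path || "").replace(/\/+$/, "");
  return protocol + "://" + config.server + port + path + "/api.php";
}

console.log(
  apiUrl({ protocol: "https", server: "en.wikipedia.org", path: "/w" }),
);
// prints https://en.wikipedia.org/w/api.php
```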

Making direct API calls

nodemw allows you to make direct calls to the MediaWiki API (this example queries the Semantic MediaWiki API):

var bot = require("nodemw"),
  client = new bot({
    server: "semantic-mediawiki.org",
    path: "/w",
  }),
  params = {
    action: "ask",
    query:
      "[[Modification date::+]]|?Modification date|sort=Modification date|order=desc",
  };

client.api.call(
  params /* api.php parameters */,
  function (
    err /* Error instance or null */,
    info /* processed query result */,
    next /* more results? */,
    data /* raw data */,
  ) {
    console.log(data && data.query && data.query.results);
  },
);

Bot methods

The last parameter of each function in nodemw API is a callback which will be fired when the requested action is done.

Callbacks use node.js style - err is always passed as the first argument.
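Because the callbacks follow this convention, they can be wrapped with Node's util.promisify if you prefer Promises. A minimal sketch: promisifyMethod is a hypothetical helper, and fakeClient is a stand-in used only so the snippet runs without network access; with nodemw you would pass the real client instance. Note that util.promisify resolves with the first result value only, so any extra callback arguments are dropped.

```javascript
const util = require("util");

// Hypothetical helper: turn one callback-style method into a Promise-returning function.
function promisifyMethod(client, methodName) {
  // bind() keeps `this` pointing at the client inside the method
  return util.promisify(client[methodName].bind(client));
}

// Stand-in client for illustration; it mimics the (err, data) callback style.
const fakeClient = {
  getArticle(title, callback) {
    callback(null, "content of " + title);
  },
};

const getArticle = promisifyMethod(fakeClient, "getArticle");

getArticle("foo").then(
  (content) => console.log(content), // prints "content of foo"
  (err) => console.error(err),
);
```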

bot.logIn(username, password, callback)

Log-in using given credentials - read more

You can also call logIn(callback) without arguments to use credentials from config file.

bot.getCategories(prefix, callback)

Gets the list of all categories on a wiki (optionally filtered by prefix)

bot.getAllCategories(callback)

Gets the list of all categories on a wiki

bot.getAllPages(callback)

Gets the list of all pages from the main namespace (excludes redirects) - read more

bot.getPagesInCategory(category, callback)

Gets the list of pages in a given category - read more
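A short usage sketch follows. titlesInCategory is a hypothetical helper, and the shape of the result (an array of objects with a title field) is an assumption; check the actual response on your wiki. fakeClient stands in for a real nodemw client so the snippet runs offline:

```javascript
// Hypothetical helper: collect page titles from a category.
// `client` is any object exposing getPagesInCategory(category, callback).
function titlesInCategory(client, category, done) {
  client.getPagesInCategory(category, function (err, pages) {
    if (err) {
      done(err);
      return;
    }
    // assumption: each entry is an object with a `title` field
    done(null, (pages || []).map((page) => page.title));
  });
}

// Stand-in client for illustration only.
const fakeClient = {
  getPagesInCategory(category, callback) {
    callback(null, [{ title: "Foo" }, { title: "Bar" }]);
  },
};

titlesInCategory(fakeClient, "Bots", function (err, titles) {
  if (err) {
    console.error(err);
    return;
  }
  console.log(titles.join(", ")); // prints "Foo, Bar"
});
```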

By providing Category:Foo as titles argument to bot.purge you can purge all pages in a given category (available since MW 1.21)

bot.getPagesInNamespace(namespace, callback)

Gets the list of pages in a given namespace - read more

bot.getPagesByPrefix(prefix, callback)

Gets the list of pages by a given prefix - read more

bot.getPagesTranscluding(page, callback)

Gets the list of pages that transclude the given page - read more

bot.getPagesBySearch(query, callback)

Performs a search and returns matching pages

bot.getPagesBySearchSorted(query, sort, callback)

Performs a search with specified sort order

bot.getArticle(title, [redirect,] callback)

Gets article content and redirect info - read more

bot.getArticleRevisions(title, callback)

Gets all revisions of a given article - read more

bot.getArticleCategories(title, callback)

Gets all categories a given article is in - read more

bot.getArticleInfo(title, callback)

Gets all info of a given article - read more

bot.getArticlePages(title, callback)

Gets list of all pages that are used on a given page

bot.expandTemplates(content, title, callback)

Returns XML with preprocessed wikitext (expanded templates) - read more

bot.parse(content, title, callback)

Returns parsed wikitext (HTML output) - read more

bot.edit(title, content, summary, minor, callback)

Creates / edits an article (and marks the edit as minor if minor is set to true) - read more
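Edits usually require a logged-in session, so logIn and edit are typically chained. A minimal sketch using the two signatures documented above; logInAndEdit is a hypothetical helper, and fakeClient is a stand-in that only records calls so the snippet runs without touching a wiki:

```javascript
// Hypothetical helper: log in with the credentials from the config, then edit a page.
function logInAndEdit(client, page, done) {
  client.logIn(function (err) {
    if (err) {
      done(err); // do not attempt the edit when login fails
      return;
    }
    client.edit(
      page.title,
      page.content,
      page.summary,
      Boolean(page.minor),
      done,
    );
  });
}

// Stand-in client for illustration only; it records the calls it receives.
const calls = [];
const fakeClient = {
  logIn(callback) {
    calls.push("logIn");
    callback(null);
  },
  edit(title, content, summary, minor, callback) {
    calls.push("edit:" + title);
    callback(null, { title: title });
  },
};

logInAndEdit(
  fakeClient,
  { title: "Sandbox", content: "Hello", summary: "test edit" },
  function (err) {
    if (err) {
      console.error(err);
      return;
    }
    console.log(calls.join(" -> ")); // prints "logIn -> edit:Sandbox"
  },
);
```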

bot.append(title, content, summary, callback)

Adds given content to the end of the page - read more

bot.prepend(title, content, summary, callback)

Adds given content to the beginning of the page - read more

bot.addFlowTopic(title, topic, content, callback)

Add a Flow topic - read more

bot.move(from, to, summary, callback)

Moves (aka renames) given article - read more

bot.delete(title, reason, callback)

Deletes an article - read more

bot.undelete(title, reason, callback)

Undeletes an article (restores all revisions) - read more

bot.purge(titles, callback)

Purge a given list of articles (titles or page IDs can be provided) - read more

By providing Category:Foo as titles argument you can purge all pages in a given category (available since MW 1.21)

bot.sendEmail(username, subject, text, callback)

Send an email to a user - read more

bot.getToken(title, action, callback)

Returns token required for a number of MediaWiki API operations - read more / for MW 1.24+

bot.upload(filename, content, summary /* or extraParams */, callback)

Uploads a given raw content as a File:[filename] - read more

bot.uploadByUrl(filename, url, summary /* or extraParams */, callback)

Uploads a given external resource as a File:[filename]

bot.uploadVideo(fileName, url, callback)

Uploads a given video as a File:[filename] (Wikia-specific API)

bot.whoami(callback)

Gets information about current bot's user (including rights and rate limits) - read more

bot.whois(username, callback)

Gets information about a specific user (including rights, current block, groups) - read more

bot.whoare(usernames, callback)

Gets information about specific users (including rights, current block, groups) - read more

bot.createAccount(username, password, callback)

Create account using given credentials - read more

bot.getImages(callback)

Gets list of all images on a wiki

bot.getImageUsage(filename, callback)

Gets list of all articles using given image

bot.getImagesFromArticle(title, callback)

Get list of all images that are used on a given page - read more

bot.getImageInfo(filename, callback)

Gets metadata (including uploader, size, dimensions and EXIF data) of given image

bot.getLog(type, start, callback)

Get entries from Special:Log - read more

bot.getLogByType(type, start, callback)

Get log entries of a specific type from Special:Log

bot.getRecentChanges(start, callback)

Returns entries from recent changes (starting from a given point)

bot.getSiteInfo(props, callback)

Returns site information entries - read more

bot.getSiteStats(props, callback)

Returns site statistics (number of articles, edits etc) - read more

bot.getQueryPage(queryPage, callback)

Returns entries from QueryPage-based special pages

bot.getMediaWikiVersion(callback)

Returns the version of MediaWiki given site uses - read more

bot.getTemplateParamFromXml(tmplXml, paramName)

Gets a value of a given template parameter from article's preparsed content (see expandTemplates)

bot.getExternalLinks(title, callback)

Gets all external links used in article

bot.getBacklinks(title, callback)

Gets all articles that link to given article

bot.search(query, callback)

Performs a search

bot.searchByTitle(query, callback)

Performs a search limited to page titles

bot.getUserContribs(username, callback)

Gets contributions of a given user

Helpers

bot.log(msg)

Log a message using the bot's logger

bot.logData(obj)

Log a JSON object to the console

bot.error(msg)

Log an error message

bot.getRand()

Returns a random string (useful for generating unique edit summaries or tokens)

bot.getConfig(key, def)

Gets config entry value (returns def value if not found)

bot.setConfig(key, val)

Sets config entry value

bot.diff(old, current)

Returns a diff colored using ANSI colors (powered by diff)

bot.fetchUrl(url, callback, encoding)

Makes a GET request to provided resource and returns its content. Optional encoding parameter (defaults to 'utf-8', use 'binary' for binary data)

Wikia-specific bot methods

They're grouped in bot.wikia "namespace".

bot.wikia.getWikiVariables(callback)

Get wiki-specific settings (like ThemeDesigner colors and hubs).

bot.wikia.getUser(userId, callback)

Get information (avatar, number of edits) about a given user

bot.wikia.getUsers(userIds, callback)

Get information (avatar, number of edits) about a given set of users (by their IDs)

WikiData

The WikiData client is Promise-based; use the await keyword.

Examples:

const wikidata = require("nodemw/lib/wikidata");
const client = new wikidata();

// Where is Saksun, Faroe Islands located?
const geo = await client.getEntityClaim(
  "Q928875" /* Saksun */,
  "P625" /* coordinate location */,
);

// will give you the geolocation of the place
expect(geo[0].mainsnak.datavalue.value).toMatchObject({
  latitude: 62.248888888889,
  longitude: -7.1758333333333,
});

// When was Albert Einstein born?
const res = await client.getArticleClaims("Albert Einstein");

const dateOfBirth = res.P569[0].mainsnak.datavalue.value;
expect(dateOfBirth.time).toMatch(/1879-03-14/);

const dateOfDeath = res.P570[0].mainsnak.datavalue.value;
expect(dateOfDeath.time).toMatch(/1955-04-18/);

// interwiki links for a given article
const links = await client.getArticleSitelinks("Albert Einstein");
console.log(links.enwiki); // {site: "enwiki", title: "Albert Einstein", badges: ["Q17437798"]}
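The claim objects above are deeply nested, so a small accessor can help when a property may be missing. A minimal sketch: firstClaimValue is a hypothetical helper, and the sample object is hand-written in the shape used by the examples above, not real API output:

```javascript
// Hypothetical helper: return the value of the first statement for a property,
// or undefined when the property (or its value) is missing.
function firstClaimValue(claims, property) {
  const statements = claims && claims[property];
  if (!Array.isArray(statements) || statements.length === 0) {
    return undefined;
  }
  const snak = statements[0].mainsnak;
  return snak && snak.datavalue ? snak.datavalue.value : undefined;
}

// Hand-written sample mirroring the structure used in the examples above.
const sample = {
  P569: [
    { mainsnak: { datavalue: { value: { time: "+1879-03-14T00:00:00Z" } } } },
  ],
};

console.log(firstClaimValue(sample, "P569").time); // prints "+1879-03-14T00:00:00Z"
console.log(firstClaimValue(sample, "P570")); // prints undefined
```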

Stargazers over time