Extension:Scribunto/Victor's API proposal

From mediawiki.org

There are several packages which will be shipped with Scribunto by default.

MediaWiki Lua API

The Scribunto in-script Lua API aims to give scripts an interface to certain MediaWiki features that are written in PHP, most of which would not be feasible to reimplement in Lua. The first priority is to provide access to all the interfaces that were previously exposed to the parser as magic words and parser functions.

We keep to the table-arguments convention wherever feasible. The notation func{a, b, c} means that the function should be invoked with named arguments, as in func{a = "whatever", b = "argument", c = "trolling"}.
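
For readers new to Lua, a minimal sketch of the convention follows. Lua allows the parentheses to be omitted when a function is called with a single table constructor, so func{...} is simply func({...}). The function describe and its fields are hypothetical and are not part of the proposed API.

```lua
-- Hypothetical function following the table-arguments convention.
-- It receives a single table and reads its parameters by name.
local function describe(args)
  -- Optional arguments can be given defaults with `or`.
  local c = args.c or "default"
  return args.a .. ", " .. args.b .. ", " .. c
end

-- describe{...} is shorthand for describe({...}).
print(describe{a = "whatever", b = "argument", c = "trolling"})
-- The order of named arguments does not matter; c falls back to its default.
print(describe{b = "argument", a = "whatever"})
```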

Whenever a property is defined, it is either read-only ([ro]), write-only ([wo]) or read-write ([rw]).
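
One way such access rules could be enforced in pure Lua is with metatables. The sketch below shows the read-only case only; the helper readOnly and the sample value are assumptions made for illustration, and the proposal does not say how properties would actually be implemented.

```lua
-- Hypothetical helper: returns a proxy whose fields can be read but not written.
local function readOnly(values)
  return setmetatable({}, {
    -- Reads fall through to the wrapped table.
    __index = values,
    -- The proxy itself stays empty, so every assignment reaches __newindex.
    __newindex = function(_, key)
      error("attempt to write to read-only property '" .. tostring(key) .. "'", 2)
    end,
  })
end

local site = readOnly{ siteName = "Example Wiki" }
print(site.siteName)

-- Writing raises an error, which pcall reports as a false status.
local ok = pcall(function() site.siteName = "Other" end)
print(ok)
```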

Provided interfaces

All interfaces are part of the mw package.

  • mw.lang — internationalization-related functions.
  • mw.page — information about the current page (title, etc.) and direct manipulation of it.
  • mw.query — functions which require database queries in order to work. The total number of calls to these functions is limited; the limit is shared with the parser's expensive function count.
  • mw.site — functions which provide information about the site.
  • mw.time — functions for manipulating time values.
  • mw.title — functions for parsing and manipulating titles.
  • mw.text — functions for handling wikitext.
  • mw.url — URL-related functions.

Data structures

Time values are passed in the following structure, which extends Lua's standard date/time table:

  • Standard members:
    • year
    • month
    • day
    • hour
    • min
    • sec
    • wday — week day (Monday is 1)
    • yday — day of the year
  • Extensions:
    • monthname — localized month name
    • timezone — the timezone in which timestamp is supplied
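
As an illustration only, the sketch below builds such a structure for the Unix epoch from Lua's own os.date. Standard Lua numbers weekdays from Sunday = 1, so the sketch renumbers wday to match the Monday = 1 convention above. The monthname and timezone values are hand-written placeholders; how the real API would fill them is not specified here.

```lua
-- os.date("!*t", 0) returns the standard date table for the Unix epoch in UTC.
local t = os.date("!*t", 0)

-- Standard Lua: Sunday = 1 ... Saturday = 7.
-- The structure described above: Monday = 1 ... Sunday = 7.
t.wday = (t.wday + 5) % 7 + 1

-- Extension fields (placeholder values, assumed for illustration).
t.monthname = "January"
t.timezone = "UTC"

print(t.year, t.month, t.day, t.wday, t.yday)
```

1 January 1970 was a Thursday, so wday ends up as 4 under the Monday = 1 numbering.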

The revision information is passed in the following structure:

  • id
  • author
  • timestamp (always UTC)

The parsed title data has the following fields:

  • namespace — namespace ID
  • namespaceName — namespace name (localized)
  • name — the name of the page, without namespace
  • fullName — full name of the page, with namespace
  • fullText — the full normalized title, including interwiki prefix
  • interwiki — the interwiki prefix, if there is one
  • fragment — the destination fragment
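
For example, parsing the text Help:Contents#Editing on an English-language wiki might yield a structure like the hand-written table below. The values are illustrative, not the output of a real implementation; 12 is the standard MediaWiki ID of the Help namespace. Whether fullText would also carry the fragment is not stated above, so the sketch leaves it out.

```lua
-- Hand-written example of the parsed title structure (values are illustrative).
local title = {
  namespace     = 12,              -- namespace ID of "Help"
  namespaceName = "Help",          -- localized namespace name
  name          = "Contents",      -- page name without namespace
  fullName      = "Help:Contents", -- page name with namespace
  fullText      = "Help:Contents", -- equals fullName here: no interwiki prefix
  interwiki     = nil,             -- no interwiki prefix in this example
  fragment      = "Editing",       -- the part after "#"
}

-- For a local title outside the main namespace,
-- fullName is namespaceName .. ":" .. name.
assert(title.fullName == title.namespaceName .. ":" .. title.name)
print(title.fullName, title.fragment)
```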

Detailed interface description

mw.lang

  • mw.lang.contentLanguage [ro] — the language code of the content language, i.e. the main language of the wiki.
  • mw.lang.UILanguage [ro] — the language code of the UI language, i.e. the language in which the current user sees the interface.
  • mw.lang.languageName{code[, language]} — returns the name of the language identified by code. If language is not specified, the name is returned in that language itself.
  • mw.lang.message(messageName, ...) — formats the message and returns it.
  • mw.lang.plural(number, form1, form2...) — similar to {{plural:number|form1|form2|...}}.
  • mw.lang.formatNumber(number) — formats the number according to the language conventions.
  • mw.lang.gender(username, ...) — picks the right version of the string depending on the user's gender.
  • mw.lang.specialPageName(page) — returns the localized name of a given special page.

mw.page

  • mw.page.title [ro] — returns the title structure
  • mw.page.currentRevision [ro] — returns the revision structure
  • mw.page.defaultSort [wo] — similar to {{DEFAULTSORT}}
  • mw.page.displayTitle [wo] — similar to {{DISPLAYTITLE}}

mw.query

The query module has several configurable limits:

  • blockSize — defaults to 100
  • listLimit — defaults to 500

If a limit is exceeded, an error is thrown.

  • mw.query.blockSize [ro] — the blockSize.
  • mw.query.listLimit [ro] — the listLimit.
  • mw.query.expensiveFunctionLimit [ro] — the limit of allowed calls to expensive functions.
  • mw.query.expensiveFunctionRemaining [ro] — how many more calls to expensive functions are allowed.
  • mw.query.pagesExist(pages) — checks whether the pages exist and returns the result as a table mapping each page name to its existence. Note that the page names in the resulting table are normalized. This is counted as one expensive query, but for every blockSize pages the count is increased by 1.
  • mw.query.pageInformation{pages, props} — returns information about pages. The information to return is specified in the props array. The currently available properties are size and is_redirect. This is counted as one expensive query, but for every blockSize pages the count is increased by 1.
  • mw.query.prefixIndex{prefix, startWith, limit} — lists the pages beginning with prefix, starting with startWith. Returns at most limit pages or listLimit pages, whichever is smaller.
  • More functions will be added at a later stage.
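
The accounting rules above can be read in more than one way. The sketch below takes one reading: a call costs one expensive query plus one more for each complete block of blockSize pages, and prefixIndex caps limit at listLimit. The names queryCost and effectiveLimit are hypothetical helpers, not part of the proposed API.

```lua
local blockSize = 100  -- default from the text above
local listLimit = 500  -- default from the text above

-- Assumed reading: 1 base query, plus 1 for every full block of pages.
local function queryCost(pageCount)
  return 1 + math.floor(pageCount / blockSize)
end

-- prefixIndex returns at most `limit` pages or `listLimit`, whichever is smaller.
-- A missing limit is treated here as "use listLimit" (an assumption).
local function effectiveLimit(limit)
  return math.min(limit or listLimit, listLimit)
end

print(queryCost(30))         -- 1
print(queryCost(250))        -- 3
print(effectiveLimit(50))    -- 50
print(effectiveLimit(2000))  -- 500
```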

mw.site

  • mw.site.siteName [ro] — returns the name of the site.
  • mw.site.version [ro] — returns MediaWiki software version.
  • mw.site.namespaces [ro] — returns localized namespace ID to namespace name map.
  • mw.site.canonicalNamespaces [ro] — returns non-localized namespace ID to namespace name map.
  • mw.site.interwikiTable [ro] — returns the interwiki table in format { interwiki prefix -> { url, api, wikiID, isLocal, isTrans } }
  • mw.site.numberOfPages [ro]
  • mw.site.numberOfArticles [ro]
  • mw.site.numberOfFiles [ro]
  • mw.site.numberOfEdits [ro]
  • mw.site.numberOfViews [ro]
  • mw.site.numberOfUsers [ro]
  • mw.site.numberOfAdmins [ro]
  • mw.site.numberOfActiveUsers [ro]

mw.title

  • mw.title.parse(text) — parses the text and returns either the title structure or nil
  • mw.title.normalize(text) — normalizes the title; returns nil if the input is an invalid title
  • mw.title.isLocal(text) — returns true if the input is a valid title and is not an interwiki destination

mw.time

  • mw.time.UTC [ro] — returns the current time in UTC
  • mw.time.local [ro] — returns the current time in the wiki timezone
  • mw.time.unixTimestamp [ro] — returns the current Unix timestamp in seconds, with the highest floating-point precision available
  • mw.time.toLocal(timestamp)
  • mw.time.toUTC(timestamp)
  • mw.time.parse(text) — parses the text and returns a timestamp object; unless the text specifies otherwise, the time zone is assumed to be timezone (UTC if not specified).
  • mw.time.format{timestamp, format} — formats the date according to the format specification.

mw.text

  • mw.text.escape(text) — escapes wikitext.
  • mw.text.tag{name, contents, params} — creates a tag marker for tag named name. Similar to {{#tag}}.

mw.url

In these functions, a title may be passed either as text or as a title structure.

  • mw.url.encode(text) — escapes a URL string
  • mw.url.encodeAnchor(text) — escapes a URL anchor string
  • mw.url.local{title, query} — returns a local (relative) URL to title, optionally with query
  • mw.url.full{title, query} — same as above, but returns a full URL (including the server name) instead of a local one.
  • mw.url.canonical{title, query} — same as above, but has a protocol prefix.
  • mw.url.server [ro] — similar to {{SERVER}}.
  • mw.url.serverName [ro] — similar to {{SERVERNAME}}.
  • mw.url.scriptPath [ro] — similar to {{SCRIPTPATH}}.
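
The proposal does not say which escaping rules mw.url.encode would follow. As a rough illustration of what escaping a URL string involves, the sketch below percent-encodes every byte other than ASCII letters, digits and - _ . ~ (the unreserved characters of RFC 3986). The function percentEncode is a hypothetical stand-in that assumes the default C locale; the real function might, for instance, encode spaces as + instead.

```lua
-- Hypothetical stand-in for URL escaping; not the proposed implementation.
local function percentEncode(text)
  -- The outer parentheses discard gsub's second return value (the match count).
  return (text:gsub("[^%w%-_%.~]", function(ch)
    -- Replace each remaining byte with % followed by two uppercase hex digits.
    return string.format("%%%02X", ch:byte())
  end))
end

print(percentEncode("Main Page"))
print(percentEncode("a&b=c"))
```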

ustring API

The ustring module provides functions for manipulating UTF-8 strings. It aims to be similar to Lua's built-in string module; however, it extends it with some features and does not provide pattern matching (a separate regular expression library will be provided for that later). It also does not provide an OOP interface to strings[1]. The ustring library contains the following functions:

  • ustring.find(s, needle[, init]) — does a substring search, and returns the start and the end point of the match (or nil, if not found). Note that the needle argument is not a pattern.
  • ustring.len(s) — returns the string length in code points.
  • ustring.lower(s) — converts the string to all-lowercase
  • ustring.pairs(s[, start, end]) — iterates over all code points in the string, or in a substring (from start to end).
  • ustring.split(str, separator[, limit]) — splits the str into at most limit substrings (default limit is infinity)
  • ustring.sub(s, i[, j]) — returns the substring; the syntax is similar to string.sub.
  • ustring.trim(s) — trims all the whitespace at the beginning and at the end of the string.
  • ustring.upper(s) — converts the string to all-uppercase
  • ustring.upperFirst(s) — converts the first character of the string into uppercase
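
A byte-level sketch of the split semantics follows. It assumes that, once limit is reached, the remainder of the string is kept in the last piece; the description above does not say this explicitly. Because the separator is a plain string and not a pattern, searching by bytes gives correct results for valid UTF-8 input. The function split here is a hypothetical stand-in, not the library's code.

```lua
-- Hypothetical stand-in for ustring.split.
local function split(str, separator, limit)
  assert(separator ~= "", "separator must not be empty")
  local parts, start = {}, 1
  -- Keep cutting until only one slot is left (or forever if there is no limit).
  while not limit or #parts < limit - 1 do
    -- The fourth argument `true` turns off pattern matching.
    local first, last = str:find(separator, start, true)
    if not first then break end
    parts[#parts + 1] = str:sub(start, first - 1)
    start = last + 1
  end
  -- Whatever remains becomes the final piece.
  parts[#parts + 1] = str:sub(start)
  return parts
end

print(table.concat(split("a,b,c", ","), " | "))
print(table.concat(split("a,b,c", ",", 2), " | "))
```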

All functions measure string offsets in code points, not bytes. If invalid UTF-8 is supplied, an error is raised.
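
To make the difference between bytes and code points concrete, here is a minimal pure-Lua sketch of a length function in the spirit of ustring.len. The name utf8len is hypothetical and the validation is simplified: it rejects bad lead bytes, bad continuation bytes and truncated sequences, but not every overlong form or surrogate.

```lua
-- Hypothetical stand-in for ustring.len; counts code points, not bytes.
local function utf8len(s)
  local count, i, n = 0, 1, #s
  while i <= n do
    local b = s:byte(i)
    local width
    -- The lead byte determines how many bytes the sequence occupies.
    if b < 0x80 then width = 1
    elseif b >= 0xC2 and b <= 0xDF then width = 2
    elseif b >= 0xE0 and b <= 0xEF then width = 3
    elseif b >= 0xF0 and b <= 0xF4 then width = 4
    else error("invalid UTF-8 lead byte at offset " .. i) end
    if i + width - 1 > n then
      error("truncated UTF-8 sequence at offset " .. i)
    end
    -- Every following byte of the sequence must be in the range 0x80..0xBF.
    for j = i + 1, i + width - 1 do
      local c = s:byte(j)
      if c < 0x80 or c > 0xBF then
        error("invalid UTF-8 continuation byte at offset " .. j)
      end
    end
    count = count + 1
    i = i + width
  end
  return count
end

local word = "h\195\169llo"      -- "héllo"; é is encoded as two bytes
print(#word, utf8len(word))      -- byte length, then code point length
print(pcall(utf8len, "\255"))    -- invalid input raises an error
```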

Footnotes

  1. Such an interface was considered, but it is impossible to implement adequately in pure Lua.