User:John Cummings/Icelandic Women Report
Background
Wikipedia, the free and open encyclopaedia, is one of the most used educational resources in the world: its articles are read over 20 billion times per month by over 500 million people. It has become an important part of many people's education. However, there is a significant gender gap on Wikipedia: there are four times as many articles about men as about women, and a majority of editors identify as male.
Wikidata, Wikipedia's database, is becoming a worldwide database of women and gender-related topics, and it provides substantial value in closing the gender gap on the web:
- Identifying who is missing from Wikipedia: Wikidata can be used to understand which people are covered by which articles, and in which languages; the tools to do this already exist. Women in Red (Wikipedia’s project to address the gender gap) is already using Wikidata to identify women missing from Wikipedia.
- Using Wikidata to create Wikipedia articles: Many language Wikipedias, especially those in underrepresented languages, use data from Wikidata to create articles, as well as information boxes, maps and other resources. Investing in a larger database of women on Wikidata also lays the groundwork for future developments, including software-created articles.
- As a resource for journalists: Only 1 in 5 experts interviewed in the media are women. This database will be a resource for journalists, making information on experts easily accessible and searchable by geography, profession and so on. Part of the gender gap on Wikipedia is caused by the lack of coverage of women in reliable media sources. Wikipedia requires references about a subject to establish ‘notability’, which means Wikipedia duplicates the gender bias of the media. This data could easily be developed into a tool for journalists or other researchers. For example, they could enter an occupation, field of work or some other attribute of interest, and a query could be auto-generated to show them a matching list. This tool is an interesting example showing similar functionality: wiser.d4science.org/search. Depending on the richness of the data available, this could even include contact methods such as the Twitter handles of the individuals.
- Make information easier to find: Wikidata is used by Google, Siri and other search tools that provide answers to the public. Example: the natural-language Google search ‘What date was Emmeline Pankhurst born’ is answered using data from Wikidata.
- Educational use: Wikidata can also be used by external educational tools to generate data visualisations, e.g. timelines.
- Example: Timeline of 430 prominent women computer scientists
- Example: Timeline of Astronauts by nationality (click a picture to go to the Wikipedia article)
- A tool for researchers: Wikidata connects information sources from across the web on a subject through its ‘external sources’ section. Example: The external sources section of the structured Wikidata page for Marie Curie.
- Visualising the gender gap: By populating Wikidata with information about women, researchers can see the gender gap and how efforts to address it are progressing. More information
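
The journalist lookup described in the list above (enter an occupation or another attribute and get a matching list) could be sketched as a small query builder. This is a minimal illustration rather than an existing tool: the function name `build_expert_query` is hypothetical, and the sketch assumes the Wikidata identifiers P31/Q5 (instance of human), P21 (sex or gender), Q6581072 (female), P27 (country of citizenship) and P106 (occupation).

```python
def build_expert_query(occupation_qid: str, country_qid: str, limit: int = 100) -> str:
    """Return a SPARQL query listing women with a given occupation and citizenship.

    Hypothetical helper for illustration; it only builds the query text.
    """
    for qid in (occupation_qid, country_qid):
        # Reject anything that is not shaped like a Wikidata item ID (Q + digits).
        if not (qid.startswith("Q") and qid[1:].isdigit()):
            raise ValueError(f"not a Wikidata item ID: {qid!r}")
    return f"""
SELECT ?person ?personLabel WHERE {{
  ?person wdt:P31 wd:Q5 ;
          wdt:P21 wd:Q6581072 ;
          wdt:P27 wd:{country_qid} ;
          wdt:P106 wd:{occupation_qid} .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {limit}
""".strip()


# Example (IDs assumed): women from Iceland (Q189) with the occupation composer (Q36834).
query = build_expert_query("Q36834", "Q189")
print(query)
```

The generated text can be pasted into the Wikidata Query Service; a real tool would need to map the words a journalist types to item IDs first, which this sketch leaves out.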
Case study: Iceland
We have worked with many organisations in Iceland to massively increase the number of women included on Wikidata. This can act as a model for understanding sources of data to create more complete coverage of people from a country or subject area. Iceland is a good place to start, as it has strong policies on and a history of gender equality, a relatively small population and a small number of institutions and sources of information.
Process
We worked with Saeunn Gísladóttir, an Icelandic staff member at UNESCO, to collect the data. She is well connected, understands the cultural context and speaks the language. Our process was:
- Saeunn creates a list of all the major Icelandic institutions that could be investigated for possible data to import
- Contact these institutions to request data for import into Wikidata
- Match the data to Wikidata to identify any items that already exist (usually using the Mix'n'Match tool, or OpenRefine).
- Create missing items if necessary, then import source data into Wikidata (usually using QuickStatements, or editing via the Wikimedia API)
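
The last two steps can be sketched as follows: records that were matched to an existing item keep their item ID, and unmatched records get a new item. This is a minimal sketch, not the script used for the import. The function name `to_quickstatements`, the record layout and the sample records are all invented for illustration; the output follows the tab-separated QuickStatements version 1 command syntax (`CREATE`, `LAST`, `Len` for an English label).

```python
from typing import Optional

HUMAN = "Q5"          # Wikidata item for "human"
FEMALE = "Q6581072"   # Wikidata item for "female"


def to_quickstatements(records: list[dict]) -> str:
    """Turn matched/unmatched records into QuickStatements V1 commands.

    Each record is a dict with "name", "country" (a QID) and "qid"
    (the matched item ID, or None if no existing item was found).
    """
    lines = []
    for rec in records:
        qid: Optional[str] = rec.get("qid")
        if qid is None:
            # No match: create a new item and give it a label and type.
            lines.append("CREATE")
            subject = "LAST"
            lines.append(f'{subject}\tLen\t"{rec["name"]}"')
            lines.append(f"{subject}\tP31\t{HUMAN}")
        else:
            subject = qid
        # Statements added for every record, new or existing.
        lines.append(f"{subject}\tP21\t{FEMALE}")
        lines.append(f"{subject}\tP27\t{rec['country']}")
    return "\n".join(lines)


# Placeholder records: the names are invented, and Q4115189 is the
# Wikidata sandbox item, used here only as a stand-in for a matched item.
records = [
    {"name": "Example Person A", "qid": None, "country": "Q189"},
    {"name": "Example Person B", "qid": "Q4115189", "country": "Q189"},
]
batch = to_quickstatements(records)
print(batch)
```

A real import would also attach a source reference to each statement and would normally be reviewed before being run.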
Outcomes
As a starting point, we imported the women listed in the Ísmús online database of Icelandic music and cultural heritage, as the data was available in a well-structured format (see dataset imports tracking page). With just this small dataset, the number of known Icelandic women in Wikidata has increased by 250%. As Wikidata contains information about the rest of the world, the imported data can be cross-referenced and compared to find new insights and information that would not be possible if it existed in isolation. Here is a summary of the impact on the coverage of Icelandic women in Wikidata.
Description | Count |
---|---|
Icelandic Women before import | 771 |
New people added to Wikidata | 1,924 |
Total statements added | 11,925 |
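
As a quick check, the 250% figure can be reproduced from the table. The short calculation below assumes that every one of the new people added is an Icelandic woman, which the table does not state explicitly.

```python
before = 771    # Icelandic women in Wikidata before the import
added = 1924    # new people added by the import (assumed to all be Icelandic women)

after = before + added
increase_pct = round(added / before * 100, 1)

print(after)         # 2695
print(increase_pct)  # 249.5, i.e. roughly 250%
```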
Visualisations
Once data is imported, we can easily visualise it using the many tools available.
Compare with other areas of Wikidata
After this import, Iceland has entered the top 50 countries by count of women on Wikidata (43rd position).
Examples of where we have used the data
Understand who is missing from Wikipedia
Any new items that are created can immediately be used to inform editors about missing content on Wikipedia in any language. For example, this can be used to generate automated lists for the Wiki Women in Red project or other communities of interest.
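
One way such a list could be generated is with a query for women from a country who have a Wikidata item but no article in a chosen language edition. The sketch below only builds the query text; the function name `build_missing_articles_query` is hypothetical, and it is not necessarily how Women in Red produces its lists. It assumes the identifiers P31/Q5 (instance of human), P21/Q6581072 (female), P27 (country of citizenship) and Q189 (Iceland).

```python
def build_missing_articles_query(wiki_language: str, country_qid: str = "Q189") -> str:
    """Return SPARQL listing women from a country with no article on one Wikipedia.

    wiki_language is a Wikipedia language code such as "is" or "en".
    """
    site = f"https://{wiki_language}.wikipedia.org/"
    return f"""
SELECT ?person ?personLabel WHERE {{
  ?person wdt:P31 wd:Q5 ;
          wdt:P21 wd:Q6581072 ;
          wdt:P27 wd:{country_qid} .
  # Keep only people with no sitelink to the chosen Wikipedia.
  FILTER NOT EXISTS {{
    ?article schema:about ?person ;
             schema:isPartOf <{site}> .
  }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{wiki_language},en". }}
}}
""".strip()


# Example: Icelandic women without an article on Icelandic Wikipedia.
missing_query = build_missing_articles_query("is")
print(missing_query)
```

Changing the language code gives the equivalent list for any other language edition, which is what makes newly created items useful to editors across Wikipedias.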