Wikidata JSON Dump File Format

Data dump files:

  1. Weekly dumps of the entire Wikidata database can be downloaded from http://dumps.wikimedia.org/other/wikidata/
  2. JSON (JavaScript Object Notation) is the recommended format
  3. JSON dump files are named YYYYMMDD.json
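Because the file name encodes the dump date, that date can be recovered from the name alone. The following is a minimal sketch in Python; the helper name dump_date and the sample file name are made up for illustration, and it assumes the name starts with exactly eight digits:

```python
from datetime import date, datetime
from pathlib import PurePosixPath


def dump_date(filename: str) -> date:
    """Return the date encoded in a dump file name such as 20150105.json.

    Hypothetical helper: assumes the name begins with YYYYMMDD.
    """
    # Drop any directory part, then keep the text before the first dot.
    stem = PurePosixPath(filename).name.split(".", 1)[0]
    # Raises ValueError if the stem is not a valid YYYYMMDD date.
    return datetime.strptime(stem, "%Y%m%d").date()


# The file name below is an example, not a reference to a specific dump.
print(dump_date("20150105.json"))
```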

File format:

  • The file is encoded as a single JSON list (i.e. a sequence of elements between the characters [ and ])
  • The opening and closing brackets each sit on their own line
  • Every other line holds one compact JSON-encoded object
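Since each object sits on its own line, the dump can be read one line at a time instead of loading the whole list into memory. Here is a rough sketch of that idea in Python. The function name iter_entities is made up, and the code assumes that object lines may end with a comma separating list elements, which the description above does not spell out:

```python
import io
import json
from typing import Iterable, Iterator


def iter_entities(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON object per object line of a dump."""
    for line in lines:
        # Assumption: a trailing comma may separate list elements.
        text = line.strip().rstrip(",")
        # Skip the bracket lines and any blank lines.
        if text in ("", "[", "]"):
            continue
        yield json.loads(text)


# Tiny hand-written stand-in for a dump file; a real file would be
# opened with open(path, encoding="utf-8") and passed in the same way.
sample = io.StringIO(
    '[\n'
    '{"id": "Q1", "type": "item"},\n'
    '{"id": "P31", "type": "property"}\n'
    ']\n'
)
for entity in iter_entities(sample):
    print(entity["id"], entity["type"])
```

Reading line by line keeps memory use proportional to the largest single object rather than to the whole dump.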

There are two types of JSON objects:

  1. Items (e.g. universe, happiness, Jack Bauer, etc.)
  2. Properties (e.g. father, instance of, employer, etc.)

Both types have these common elements:

  • id: “Q###” for items, “P###” for properties
  • type: “item” or “property”
  • labels: name of the item or property in different languages
  • descriptions: description in different languages
  • aliases: lists of aliases in different languages
  • claims: statements, grouped by property
  • sitelinks: links to pages on different sites describing the item
  • lastrevid: The JSON document’s MediaWiki revision ID
  • modified: The JSON document’s publication date
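As an illustration of how these elements might be read once an object has been decoded, here is a small Python sketch. The function name summarize and the sample object are made up for the example. It also assumes that each entry under labels is itself an object with a "value" field keyed by language code, and that claims is an object keyed by property id; neither detail is stated in the list above:

```python
from typing import Any, Dict


def summarize(entity: Dict[str, Any], lang: str = "en") -> Dict[str, Any]:
    """Pick out a few of the common elements of a decoded item or property."""
    # Assumption: labels look like {"en": {"language": "en", "value": "..."}}.
    label = entity.get("labels", {}).get(lang, {}).get("value")
    return {
        "id": entity.get("id"),
        "type": entity.get("type"),
        "label": label,
        # Assumption: claims is keyed by property id ("grouped by property").
        "claim_properties": sorted(entity.get("claims", {})),
        "lastrevid": entity.get("lastrevid"),
    }


# Hand-written sample, not copied from a real dump.
sample_entity = {
    "id": "Q1",
    "type": "item",
    "labels": {"en": {"language": "en", "value": "universe"}},
    "claims": {"P31": []},
    "lastrevid": 12345,
}
print(summarize(sample_entity))
```

Using .get() with defaults means an object that lacks one of these elements produces None or an empty list rather than an exception.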

To look up an id, browse to http://www.wikidata.org/Q### (or P###)

About jimbelton

I'm a software developer, and a writer of both fiction and non-fiction, and I blog about movies, books, and philosophy. My interest in religious philosophy and the search for the truth inspires much of my writing.
This entry was posted in programming, wikidata.