Wikidata JSON Dump File Format

Data dump files:

  1. Weekly dumps of the entire Wikidata database can be downloaded from http://dumps.wikimedia.org/other/wikidata/
  2. JSON (JavaScript Object Notation) is the recommended format
  3. JSON dump files are named YYYYMMDD.json

File format:

  • The file is encoded as a single JSON array (i.e. a sequence of elements between the characters [ and ])
  • Each bracket is on its own line in the file
  • Every other line holds one compact JSON-encoded object
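Because each object sits on its own line, the file can be read one line at a time instead of loading the whole array into memory. Here is a minimal sketch in Python of that idea. The function name iter_entities and the sample lines are made up for illustration, and the trailing-comma handling is an assumption about how the array elements are separated, not something stated above.

```python
import json


def iter_entities(lines):
    """Yield one decoded JSON object per non-bracket line."""
    for line in lines:
        line = line.strip()
        # The opening and closing brackets sit on their own lines.
        if line in ("", "[", "]"):
            continue
        # Assumption: an element line may end with a separating comma.
        if line.endswith(","):
            line = line[:-1]
        yield json.loads(line)


# Made-up sample standing in for the lines of a dump file.
sample = [
    "[\n",
    '{"id": "Q1", "type": "item"},\n',
    '{"id": "P22", "type": "property"}\n',
    "]\n",
]
entities = list(iter_entities(sample))
```

Any iterable of lines works, so an open file handle for a downloaded YYYYMMDD.json file could be passed in place of the sample list.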

There are two types of JSON objects:

  1. Items (e.g. universe, happiness, Jack Bauer, etc.)
  2. Properties (e.g. father, instance of, employer, etc.)

Both types have these common elements:

  • id: “Q###” for items, “P###” for properties
  • type: “item” or “property”
  • labels: name of the item or property in different languages
  • descriptions: description in different languages
  • aliases: lists of aliases in different languages
  • claims: statements, grouped by property
  • sitelinks: links to pages on different sites describing the item
  • lastrevid: The JSON document’s MediaWiki revision ID
  • modified: The JSON document’s publication date
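As a rough illustration of how these elements might be read once an object has been decoded, here is a small Python sketch. The function name summarize and every value in the sample object are made up. It also assumes that each language entry under labels and descriptions is either a plain string or an object with a "value" key, which the list above does not specify.

```python
def summarize(entity, lang="en"):
    """Collect a few of the common elements of an item or property."""

    def text(field):
        # Assumption: the field maps language codes to either a plain
        # string or an object carrying the text under "value".
        values = entity.get(field) or {}
        entry = values.get(lang)
        if isinstance(entry, dict):
            entry = entry.get("value")
        return entry

    return {
        "id": entity.get("id"),
        "type": entity.get("type"),
        "label": text("labels"),
        "description": text("descriptions"),
        # Claims are grouped by property, so the keys are property ids.
        "claim_properties": sorted(entity.get("claims") or {}),
        "lastrevid": entity.get("lastrevid"),
        "modified": entity.get("modified"),
    }


# Made-up object for illustration only; not copied from a real dump.
sample = {
    "id": "Q1",
    "type": "item",
    "labels": {"en": {"language": "en", "value": "universe"}},
    "descriptions": {"en": {"language": "en", "value": "sample description"}},
    "claims": {"P31": []},
    "lastrevid": 12345,
    "modified": "2014-01-01T00:00:00Z",
}
summary = summarize(sample)
```

Missing elements come back as None rather than raising an error, which is convenient when scanning a large file where not every object has a label in the requested language.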

To look up an id, browse to http://www.wikidata.org/Q### (or P###)

About jimbelton

I'm a software developer, and a writer of both fiction and non-fiction, and I blog about movies, books, and philosophy. My interest in religious philosophy and the search for the truth inspires much of my writing.