Tag Archives: big data

Big Data: How Much Space is Needed to Unzip a Gzip?

When you start working with big data files like Wikimedia dumps, you need a lot of disk space. Normally, running zcat -l file.gz will tell you how much. But for very large files: wikidata/data $ zcat -l …
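The reason zcat -l misreports very large files is that the gzip trailer stores the uncompressed size (ISIZE) as a 32-bit field, so anything over 4 GiB is reported modulo 2^32. A minimal sketch of reading that field directly (the isize function name is my own, not from the post):

```python
import gzip
import struct

def isize(path):
    """Read the ISIZE field from a gzip file's 8-byte trailer.

    Per the gzip spec (RFC 1952), the last 4 bytes hold the
    uncompressed size as a little-endian 32-bit integer -- i.e.
    the true size modulo 2**32, which is why zcat -l understates
    files larger than 4 GiB.
    """
    with open(path, "rb") as f:
        f.seek(-4, 2)          # seek to 4 bytes before end of file
        return struct.unpack("<I", f.read(4))[0]
```

For files that may exceed 4 GiB, the only reliable way to get the exact uncompressed size is to decompress and count, e.g. zcat file.gz | wc -c.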

Posted in programming

Working with Freebase Data

I’ve been playing with the data produced by http://www.freebase.com/ for a while now. This site, recently acquired by Google, allows community editing of a shared knowledge base. The snapshot I’ve been using is 83 Gbytes in size, pushing it into …

Posted in freebase