Over 1 million records that disappeared from WikiLeaks are still missing

Update: As of April 15, WikiLeaks restored access to the Podesta emails, the Macron emails, the HBGary emails, the Sony files, the Saudi Cables and the Embassy Shopping list.

In November 2022, Gizmodo and the Daily Dot reported that WikiLeaks’ website was falling apart and struggling to stay online as millions of documents disappeared. According to the Daily Dot, as few as 3,000 of the ten million documents that once populated the WikiLeaks website remained accessible. Fifteen months later, many of the documents have returned to the website – but over 1 million are still missing, ranging from diplomatic records to 2016 election emails. Attempting to load them produces only errors: 404 pages, server errors and even prompts for passwords.

All of the missing data appears to date from 2015 or later, with the earliest dataset being the republication of the ICWatch documents, followed soon by the Sony documents. While the Sony emails are once again accessible, the 30,287 documents are still missing. Attempting to search the Sony documents directly – or view the results from the site’s general search – returns an error page promising the page “is currently being rebuilt.” Unfortunately, it’s unclear if the page is actually being rebuilt, as this is the site’s 404 page, which is shown for any URL – whether it’s nonsense or is, was, or ever will be legitimate.

A little over a month later, WikiLeaks began publishing over 500,000 diplomatic records from Saudi Arabia, dubbed the Saudi Cables. Attempting to load them now produces a server error – 504 Gateway Time-out. It’s unclear why the cables were never restored to the website as the raw data appears to still be downloadable on WikiLeaks’ file server, though it’s worth noting that WikiLeaks faced accusations of privacy violations after making them searchable.

On July 19, 2016, WikiLeaks released the first part of the AKP emails, followed by the rest of the emails in August. Between the two tranches of data, there were over 400,000 emails. Trying to load the page now produces the 404 error page again.

Days later, WikiLeaks began releasing the DNC emails, which eventually included over 35,000 emails and attachments. Trying to load the DNC emails now produces an “Internal Server Error.” Later in the 2016 U.S. Presidential election cycle, WikiLeaks began releasing the Podesta emails, which now produce another 404 error. The exact number of Podesta emails remains slightly unclear, though various counts have placed it between 50,000 and 60,000. It’s unclear why this data was never restored to the website or any searchable form since, like the Saudi Cables, multiple versions of the Podesta emails appear to still be downloadable on WikiLeaks’ file server.

WikiLeaks made two more datasets searchable in 2016 that have since disappeared from the website. At the end of November, WikiLeaks republished more than 60,000 hacked emails from HBGary. Currently, the page returns another 404 error. The original data from 2011 remains readily accessible by torrent and archived in various places (including DDoSecrets), so it’s again unclear why this data was never restored to the website. It’s worth noting, however, that HBGary appears to be referenced in the second superseding indictment against Assange.

At the beginning of December, WikiLeaks republished over 57,000 emails from Berat Albayrak, Turkey’s Minister of Energy. Trying to load the page now results in WikiLeaks asking for a password. It’s unclear why this data was never restored to the website or any searchable form since, like the Saudi Cables and the Podesta emails, Berat’s Box appears to still be downloadable on WikiLeaks’ file server.

In July 2017, WikiLeaks republished about 20,000 verified and 50,000 unverified Macron emails. If you load the page now, it produces another “Internal Server Error.” In December 2018, the site republished more than 16,000 embassy procurement requests. It’s unclear why the dataset is still displayed and linked prominently on the front page, since it produces a “502 Bad Gateway” error. In November 2019, WikiLeaks began making the Fishrot Files public with a database of over 30,000 documents. Trying to access the database now produces another “504 Gateway Time-out.” (The source has since given DDoSecrets the same data and more.)

At least 1,250,000 records are still missing from the website (in addition to the million missing Syria files), though up to half seem to remain on WikiLeaks’ file server in some form. Many (if not all) of the previously published documents remain indexed and searchable in WikiLeaks’ master search, but inaccessible. WikiLeaks hasn’t given any explanation, though it did respond to errors on its site in 2018, and Assange said in mid-December 2023 that WikiLeaks wasn’t able to publish the way it did before.

Correction: This article previously omitted one of the missing datasets, ICWatch. It has been added.