Something strange is happening with WikiLeaks’ DNC Emails

The strange happenings on WikiLeaks’ server, which seem to have begun the day before Assange stepped down as editor in chief, continue. WikiLeaks previously dismissed content breaking and disappearing from its website as a side effect of “changing” its Content Management System (CMS), an explanation that falls short of plausibility. While the CMS explanation seemed extremely unlikely before, it cannot account for the most recently discovered batch of errors. The previous errors involved pages that were part of WikiLeaks’ CMS, so it was at least theoretically possible that a change to the CMS was responsible. The newest errors do not involve the CMS in any way – they involve the raw source files of the DNC emails posted by WikiLeaks.

1/ #WikiLeaks has released a statement in response:
“lizard
people

… or maybe changing CMS is not an easy task with a site of that scale.

While we also have to improve on security and reliability.

but, probably lizard people.” https://t.co/Zkxcq7L4O6 https://t.co/ersTMjk9Fx

— Emma Best (ᴜ//ғᴏᴜᴏ) (@NatSecGeek) October 6, 2018

While the contents of the emails themselves don’t appear to have been altered since their original release (see below for an explanation of the false positive), the raw source files for a number of the emails have demonstrably changed. When first released, the .eml files (the raw source files) for the DNC emails were all given names based on an internal message ID, such as [email protected] AKA 00000083.eml, WikiLeaks email ID 325. (This particular file name has since been restored, though other errors have not been corrected.) While the majority of the .eml files appear to have been unaffected, some have mysteriously changed to a naming scheme that resembles that of WikiLeaks’ Podesta emails.

Far more strangely, at least three dozen of the DNC emails have had their file names changed to match WikiLeaks’ email ID – something not seen elsewhere. To make matters stranger still, the file extensions for these files have inexplicably changed from .eml to .txt. The contents of the source files remain unchanged: an .eml file is a plain text file whose extension identifies it as an email and tells the computer to open it with an email client, as users would almost always prefer, instead of a text editor. The known instances are WikiLeaks email IDs 796, 836, 1880, 2899, 3146, 3157, 3584, 4520, 5162, 5530, 6070, 6291, 6819, 6874, 7688, 9097, 9288, 9693, 10369, 10902, 10979, 11508, 12496, 12578, 16192, 16441, 16566, 17055, 17248, 17341, 18497, 19190, 19551, 21586, 21776, and 21994 – each of which has had its inexplicable .txt format saved in the Wayback Machine.
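The point about extensions can be checked directly. The following is a minimal Python sketch, not WikiLeaks’ tooling: it writes the same bytes to a file ending in .eml and a file ending in .txt, then parses both with the standard library’s email parser. The sample message, its addresses and its Message-ID are hypothetical placeholders, and the two file names are borrowed from the examples above purely for illustration.

```python
import tempfile
from email import policy
from email.parser import BytesParser
from pathlib import Path

# Hypothetical sample message; not taken from the DNC emails.
SAMPLE = (
    b"From: sender@example.org\r\n"
    b"To: recipient@example.org\r\n"
    b"Subject: Hypothetical test message\r\n"
    b"Message-ID: <hypothetical-0001@example.org>\r\n"
    b"\r\n"
    b"Body text.\r\n"
)


def parse_email_file(path):
    """Parse a file as an email message, regardless of its extension."""
    with open(path, "rb") as fh:
        return BytesParser(policy=policy.default).parse(fh)


def same_message_under_both_names(raw):
    """Write identical bytes as .eml and .txt and compare the parsed results."""
    with tempfile.TemporaryDirectory() as tmp:
        eml_path = Path(tmp) / "00000083.eml"
        txt_path = Path(tmp) / "3584.txt"
        eml_path.write_bytes(raw)
        txt_path.write_bytes(raw)
        eml_msg = parse_email_file(eml_path)
        txt_msg = parse_email_file(txt_path)
        # The parser never looks at the file name, only at the bytes.
        return (
            eml_msg["Message-ID"] == txt_msg["Message-ID"]
            and eml_msg.get_content() == txt_msg.get_content()
        )


if __name__ == "__main__":
    print(same_message_under_both_names(SAMPLE))
```

The extension only affects which program the operating system chooses to open the file with; the parsed headers and body are the same either way.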

After this was pointed out on Twitter, along with a list of the 36 known instances, one user checked a copy of the DNC emails they had retrieved months before. They found what appeared to be a modification to one email – a missing piece of metadata that identified the internal IP address that sent the email. After several hours of searching and comparing five different caches of DNC emails, the difference was both confirmed and explained – WikiLeaks’ copy of the DNC emails comes from several accounts, which resulted in some duplicates in their cache. The internal message ID for the duplicates would be the same, but differences in metadata would appear based on whether the email was being sent or received, and in the case of the former, what device and client were sending the email. Since the x-originating-ip metadata, which seemed to appear and then disappear, is added by the server when the email is sent, it would naturally be missing from the sender’s copy of the email. This addresses the most alarming question regarding the DNC emails, but does nothing to address the rest.
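The kind of comparison described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method actually used: the two sample messages, their addresses, the Message-ID and the IP address (taken from a range reserved for documentation) are all hypothetical. The sketch confirms that two copies share a Message-ID, then lists which header names appear in only one of them.

```python
from email import message_from_string

# Hypothetical sender's copy: no X-Originating-IP header.
SENT_COPY = """\
From: sender@example.org
To: recipient@example.org
Subject: Hypothetical message
Message-ID: <hypothetical-0001@example.org>

Body text.
"""

# Hypothetical recipient's copy: the server has added X-Originating-IP.
RECEIVED_COPY = """\
From: sender@example.org
To: recipient@example.org
Subject: Hypothetical message
Message-ID: <hypothetical-0001@example.org>
X-Originating-IP: [192.0.2.10]

Body text.
"""


def header_names(msg):
    """Return the set of header names in a message, lowercased."""
    return {name.lower() for name in msg.keys()}


def compare_copies(raw_a, raw_b):
    """Compare two copies of one email and report header differences."""
    a = message_from_string(raw_a)
    b = message_from_string(raw_b)
    if a["Message-ID"] != b["Message-ID"]:
        raise ValueError("not copies of the same message")
    return {
        "only_in_a": sorted(header_names(a) - header_names(b)),
        "only_in_b": sorted(header_names(b) - header_names(a)),
        "same_body": a.get_payload() == b.get_payload(),
    }


if __name__ == "__main__":
    print(compare_copies(SENT_COPY, RECEIVED_COPY))
```

A header that appears in only one copy while the bodies match points to a duplicate captured from a different mailbox, rather than to an edited email.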

While some of the naming errors have resolved themselves, others persist and remain inexplicable. DNC Email ID 3584 continues to produce a raw source file named 3584.txt, and the other email IDs listed above behave the same way. These changes are ultimately innocent, but inexplicable. More significantly, they further undermine WikiLeaks’ dismissal of their website errors as being the result of CMS changes, as the raw source files are entirely separate from their CMS. Nor can these changes be readily explained by improvements to security and/or reliability. While some filename changes have been corrected and some questions answered, others persist. What significance, if any, this actually has remains to be seen – but WikiLeaks’ dismissal of the strange occurrences involving their server(s) unquestionably falls short of actually explaining them.

Unless lizard people.

Update: As of October 23rd 2018, WikiLeaks has addressed the .txt errors.