Analysing Data Leaks and avoiding early Attribution

René Pfeiffer/ January 4, 2019/ High Entropy

[Image: Hex dump of compressed Linux 4.20 kernel image.]

The new year starts with the same old issues we have been dealing with for years. German politicians, journalists, and other prominent figures were (and are) affected by a data leak. A Twitter account started tweeting bits of the leaked data on 1 December 2018 in the fashion of an Advent calendar. The account was closed today. You will find articles describing individual parts of what may have happened, along with tiny bits of information. Speculation is running high at the moment. So we would like to give you some ideas on how to deal with incomplete information about a security event floating around on the Internet and elsewhere.

Attributing data leaks of this kind is very difficult. Without thoroughly investigating and understanding the situation, proper attribution is next to impossible. Because of the method of disclosure, the leak has not been published in full. While the links published on the Twitter account led to a data sharing platform, there is no way of knowing how much data was really copied, or from where. Analysing where the data came from is only possible with the help of its owners. The type of dumped data varies. There were mobile phone numbers, addresses, internal political party communications, photographs of ID cards, letters, emails, invoices, chat transcripts, and credit card information. This selection points to a communication device such as an email client or a smartphone, since personal communication is often governed by the need to access data while on the move. Again, this is speculation. Given the variety of data owners, it is likely that several accounts were compromised. Which kinds of accounts exactly is guesswork. You would have to see a more complete picture of the dumped data.

The leaked bits of data also do not form a complete picture in terms of chronology. Some of the data was described as having been copied months ago. Leaked data usually gets post-processed into collections. These collections are refined and verified in order to increase the value of the data. Apparently this wasn't important to whoever put the data online.

It's always a good idea to look for the agenda. Look at the way the data is leaked, and ask who benefits from this. Just dumping data somewhere is not very smart. Using the data without publishing it has a lot more advantages. Publicity is a sign of the dreaded manipulation of the mind: information warfare. Advertising works the same way. Publish something that sticks in people's thoughts. It works almost all of the time, especially in all kinds of cyber. But again, this is speculation.

If you read about issues like this, there is a simple rule: Do not read any articles with a question mark (this "?") in titles or subtitles. The "?" is usually a sign of speculation. No offence, but you do not get anywhere in an analysis by asking your audience questions. The audience wants to know your facts, not your questions.
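
If you collect coverage of an incident in a feed or a reading list, the question mark rule is simple enough to automate. The following is only a minimal sketch in Python. The function names, the dictionary layout, and the sample headlines are all made up for illustration and do not refer to real articles.

```python
# Sketch of the "no question marks" reading rule.
# All names and sample headlines here are hypothetical.

QUESTION_MARKS = ("?", "\uff1f")  # ASCII and full-width question mark


def is_speculative(title: str, subtitle: str = "") -> bool:
    """Return True if the title or subtitle contains a question mark."""
    return any(mark in text
               for text in (title, subtitle)
               for mark in QUESTION_MARKS)


def filter_articles(articles):
    """Keep only articles whose title and subtitle pass the rule."""
    return [a for a in articles
            if not is_speculative(a["title"], a.get("subtitle", ""))]


# Invented sample data, for demonstration only.
sample = [
    {"title": "Who is behind the leak?"},
    {"title": "Leaked data published via Twitter account",
     "subtitle": "What is known so far"},
    {"title": "Timeline of the leak", "subtitle": "Was it an insider?"},
]

kept = filter_articles(sample)
for article in kept:
    print(article["title"])
```

A filter like this only removes the most obvious speculation. It says nothing about the quality of the articles that remain, so you still need to check what they state as fact.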

About René Pfeiffer

System administrator, lecturer, hacker, security consultant, technical writer and DeepSec organisation team member. Has done some particle physics, too. Prefers encrypted messages for the sake of admiring the mathematical algorithms at work.