DeepSec Scuttlebutt: Tech Monsters from Novels and the Call for Papers Reminder
[This message was published via our DeepSec Scuttlebutt mailing list. The text was written by a human. This is a repost via our blog and Mastodon. Our Call for Papers for DeepSec 2023 is still running. If you have interesting content, please submit your idea.]
Dear readers,
the wonderful world of computer science and teaching courses has kept me busy. The Scuttlebutt mailing list aims to deliver at least one letter per month. It is now the end of June, and summer has begun here in Vienna. The university courses have finished, the grades are in, and more projects are waiting. In the information society, it is never a good idea to wait until something happens. A lot of blue teams are busy improving defences, testing configurations, and rehearsing their processes. However, one skill tends to get lost between tight schedules and to-do lists. What about practising the art of observation? Let me explain.
In 2005, a hacker called lcamtuf wrote a book titled “Silence On The Wire”. The content addresses how computers and networks work, starting with protocols and processors. The focus is on how information is processed and delivered. Data is the key to attacking organisations and to defending against attacks. There have to be some side effects, some changes of state, or the attack won’t work. So the idea is to educate yourself about what the network does. What is transported? What are the endpoints? What does the data look like in transit? Do systems and applications have unique fingerprints? As with intrusion detection, the first step is to learn your organisation’s baseline. Few people do this anymore. I don’t want to dive deep into the reasons why time isn’t spent on this important aspect of defence. Everything that follows this observation phase is heavily influenced by what you have learned about your infrastructure. Your network traffic, system fingerprints, and application peculiarities are useful tools. Make sure you know how to get a good view of your data flows.
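If you want to get a feel for this kind of observation, a few lines of code are enough to start. The following is only a minimal sketch, assuming Scapy is installed and you are allowed to capture on the interface; it counts protocols, destination ports, and TTL values as a crude first baseline. It is an illustration, not a recommendation of a particular tool.

# Minimal passive baseline sketch (assumption: Scapy installed, capture privileges available).
from collections import Counter
from scapy.all import sniff, IP, TCP, UDP

proto_counts = Counter()   # which IP protocols are actually on the wire
port_counts = Counter()    # which destination ports see traffic
ttl_counts = Counter()     # TTL values per source as a crude fingerprint hint

def observe(pkt):
    if IP in pkt:
        ip = pkt[IP]
        proto_counts[ip.proto] += 1
        ttl_counts[(ip.src, ip.ttl)] += 1
        if TCP in pkt:
            port_counts[("tcp", pkt[TCP].dport)] += 1
        elif UDP in pkt:
            port_counts[("udp", pkt[UDP].dport)] += 1

# Observe passively for five minutes, then print the most common items as a baseline.
sniff(prn=observe, store=False, timeout=300)
print(proto_counts.most_common(10))
print(port_counts.most_common(20))
print(ttl_counts.most_common(20))

Running something like this repeatedly over a few days already tells you what “normal” looks like before you buy anyone else’s idea of it.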
What happens if you don’t spend some time mapping network traffic and data flows? Well, you can still defend your assets, but you will have to rely on generic knowledge from outside your own organisation. You can buy signatures and anomaly profiles from vendors around the world. Tools are your friends, but in effect you are using someone else’s view of network and attack data flows. Your defence might still work, yet you may not be using the tools and configurations that fit your environment best. You are not doing anything wrong. Compliance is satisfied, and as long as nothing serious happens, you are probably fine.
Now consider the current hype around large language models (LLMs). The science-fiction writer Bruce Sterling has presented a brilliant analogy based on Mary Shelley’s Frankenstein. Sterling describes Frankenstein as “the original big tech monster.” He continues: “Mind you, Large Language Models are remarkably similar to Mary Shelley’s Frankenstein monster—because they’re a big, stitched up gathering of many little dead bits and pieces, with some voltage put through them, that can sit up on the slab and talk.” [Source] Tools have their uses. For example, I use a writing aid that shows me typos, grammatical errors, and misused words directly while writing. The same goes for code. Most people have tools running in the background when working, whether they write code, novels, or articles. This is where a big misunderstanding comes into play. A lot of tools that work correctly and go unnoticed are not based on extensive data collections covering a wide spectrum of content. Most tools employ algorithms or filters with a very specific purpose and a narrow rule set. Even the writing tool I use for the Scuttlebutt letters is much simpler than the LLMs published in the past few years. If you have done the observations described in lcamtuf’s book right, then you do not need huge and complex tools such as GPT-x and company. The Keep It Simple principle still applies in the age of LLMs.
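To make the contrast concrete, here is a toy example of the kind of narrow, rule-based check such tools rely on: a short Python script that flags doubled words such as “the the”. The rule and the names are made up for illustration; the point is that a tiny, transparent rule set does this job without any model at all.

# Deliberately narrow checker: flag repeated words ("the the"), nothing else.
import re
import sys

DOUBLED_WORD = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)

def check(text):
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in DOUBLED_WORD.finditer(line):
            findings.append("line %d: repeated word '%s'" % (lineno, match.group(1)))
    return findings

if __name__ == "__main__":
    for finding in check(sys.stdin.read()):
        print(finding)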
Our call for papers still runs until 31 July 2023. Apart from the blue team side, we are also interested in hearing about how to use LLMs for offensive purposes. Mass-producing convincing messages is the obvious choice, but are there other ways to make (ab)use of this technology? And what about attacking the LLMs themselves? Poisoning the training data corpus is difficult given the size of the input. Another method is to unlock certain modes or language constructs by varying prompts. The opportunities are endless; you can spend a lot of time talking to algorithms these days. If you have some ideas, please let us know, too. We have an extra track for potential Birds of a Feather sessions or small break-out workshops.
I almost forgot to mention the code. Secure coding benefits from tools inspecting code and data flow. LLMs may or may not improve the security of code. Having an algorithm as a partner to discuss code with can be beneficial. However, the output is based on undisclosed training data (in the case of GPT-4, for example). If you play with this, bring your generated code and critical test cases (i.e. ask yourself if you would trust this generated code with your life or health).
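As a hypothetical illustration of what such critical test cases can look like: imagine an LLM produced the small parse_port() function below. The tests around it are the part you write yourself before trusting the output. The function name and the scenario are invented for this example, and pytest is assumed to be available.

# Hypothetical generated function plus the hand-written tests that gate it.
import pytest

def parse_port(value):
    """Generated code under review: parse a TCP/UDP port number from a string."""
    port = int(value)
    if not 1 <= port <= 65535:
        raise ValueError("port out of range")
    return port

@pytest.mark.parametrize("bad", ["0", "65536", "-1", "http", ""])
def test_rejects_invalid_ports(bad):
    with pytest.raises(ValueError):
        parse_port(bad)

@pytest.mark.parametrize("good,expected", [("22", 22), ("443", 443), ("65535", 65535)])
def test_accepts_valid_ports(good, expected):
    assert parse_port(good) == expected

If the generated code fails the tests you actually care about, the conversation with the algorithm was entertainment, not engineering.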
Best regards,
René.