The Physicality of Data and the Road to Cybersecurity
By David Kruger (featured on Forbes.com)
This article is the second in a series on the physicality of data. The series begins here. I’ll follow up with additional installments of this series over the next several weeks, so check back to see those as they become available.
Cybersecurity failures have been trending sharply upwards in number and severity for the past 25 years.
The target of every cyberattack is data — i.e., digitized information that is created, processed, stored and distributed by computers. Cyberattackers seek to steal, corrupt, impede or destroy data. Users, software, hardware and networks aren’t the target; they’re vectors (pathways) to the target. To protect data, the current strategy, “defense in depth,” seeks to shut off every possible vector to data by erecting layered defenses. Bad news: That’s mathematically impossible.
Let’s do an easy word problem; you won’t even need a calculator.
1. Count vulnerabilities: Add up every type of user (human or computer), hardware, software and network that currently has an exploitable vulnerability.
2. Count vectors: Add up the total of users, networks and instances of software and hardware that contain the above-counted vulnerabilities.
3. Multiply vulnerabilities by vectors to get “total cyberattack potential.”
Now, let’s figure out the “total cyberdefense potential”:
1. Add up every currently available defense, including technological defenses and human defenses such as cybersecurity training and education.
2. Subtract defenses that haven’t been erected because (a) cybersecurity personnel, money or time are insufficient, (b) the defense doesn’t yet exist, owing to the lag between vulnerability discovery and defense development, or (c) the vulnerability is known to cyberattackers but unknown to cyberdefenders.
3. Subtract erected defenses that (a) cyberattackers can defeat or (b) are improperly implemented.
Which is greater, total cyberattack potential or total cyberdefense potential? Cyberattack potential is always greater than cyberdefense potential.
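The word problem above can be written out as a few lines of arithmetic. The Python sketch below is illustrative only: every number is a made-up placeholder rather than a measurement, and the function names are hypothetical. It shows the shape of the calculation, with attack potential as a product and defense potential as whatever remains after the subtractions.

```python
def attack_potential(vulnerability_types: int, vectors: int) -> int:
    """Total cyberattack potential: vulnerability types multiplied by vectors."""
    return vulnerability_types * vectors


def defense_potential(available: int, unerected: int, defeated_or_misapplied: int) -> int:
    """Total cyberdefense potential: available defenses minus both subtractions."""
    return max(0, available - unerected - defeated_or_misapplied)


# Placeholder inputs, chosen only to show the structure of the calculation.
attack = attack_potential(vulnerability_types=200, vectors=5_000)
defense = defense_potential(available=1_000, unerected=300, defeated_or_misapplied=150)

print(attack, defense)   # 1000000 550
print(attack > defense)  # True
```

In this sketch the attack side grows multiplicatively with every added vector, while the defense side can only shrink from its starting count.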
Defense in depth can’t close off every vector every time, not just because possible attacks always outnumber possible defenses but also because cyberwarfare is immensely asymmetrical. If a cyberdefender scores 1,000,000 and a cyberattacker scores 1, the cyberattacker wins.
So, why is defense in depth recommended if it can’t succeed? Because it’s the only possible strategy if data is inherently defenseless. Cyberattackers must be kept away from the data.
Software manufactures data objects in conformance with a design. Data is inherently defenseless because that’s how the software makes it. (To understand why software makers aren’t already making self-defending, self-directing data, see the first article of this series.)
Manufacturing. We don’t think of computers as miniaturized manufacturing plants, but that’s what they are. They receive raw information in the form of language (human or machine), sound and imagery, and convert it into physical data objects composed of patterns of ones and zeros that are applied to “quantum small” physical substrates: microscopic transistors, electrical pulses, light, radio waves, magnetized particles or pits on a CD/DVD.
A data object is like other kinds of man-made, mass-produced physical objects — known physical properties enable manufacturing systems to build data objects in conformance with a design, creating objects that can be “processed” — that is, combined with other objects, stored, reused, modified, copied, shipped or scrapped.
Design. Information was first digitized in the early 1950s. The software controlling the manufacture and processing of data objects was primitive, so the first data objects had to be simple. Objects only had two components: digitized information (data) and metadata (data about the data) — a name and physical address so objects could be found later. Anyone with access to the software could find, reuse, modify, delete or copy data objects without limit. Because data objects were shared by saving a copy at another destination, every copy was limitlessly reusable — and therein lies the problem.
The vast majority of data objects today use the same nearly 70-year-old design: Simple data/metadata objects that are inherently defenseless because they have no built-in capacity to defend themselves or direct their own use.
If software controls the design and manufacture of data objects, can it manufacture self-defending, self-directing data? Of course it can. The design and manufacture of physical data objects are controlled by the software maker. So, how does data defend and direct itself?
Self-Defense. Data objects defend themselves with encryption, which renders them unusable if captured by cyberattackers. Unfortunately, little existing or newly produced data is encrypted, and when encryption is used, it’s usually applied only partially. Most encryption is applied only to copies of data placed in third-party encryption enclaves, meaning that any number of copies outside the enclave remain defenseless, and cyberattackers know exactly where to look for them. The only way to achieve full-coverage encryption is to apply it by default: the software that creates the data encrypts each data object at creation, then stores and transports it encrypted.
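As a minimal sketch of what “encrypted by default” could look like, the Python class below never stores plaintext: the only way to construct an object is through a method that encrypts first. Several things here are assumptions made for illustration. The names `SelfDefendingObject` and `create` are hypothetical, and a one-time pad (XOR with a random key as long as the data, built from the standard library’s `secrets` module) stands in for a production cipher such as AES-GCM so the example runs without third-party packages. A one-time pad offers no tamper detection and needs a fresh key for every object, so real software would rely on a vetted encryption library and proper key management.

```python
from __future__ import annotations

import secrets
from dataclasses import dataclass


def _xor(data: bytes, key: bytes) -> bytes:
    """XOR two equal-length byte strings (the one-time-pad operation)."""
    if len(data) != len(key):
        raise ValueError("one-time-pad key must match the data length")
    return bytes(d ^ k for d, k in zip(data, key))


@dataclass(frozen=True)
class SelfDefendingObject:
    """Hypothetical data object that only ever holds ciphertext."""
    name: str
    ciphertext: bytes

    @classmethod
    def create(cls, name: str, plaintext: bytes) -> tuple[SelfDefendingObject, bytes]:
        """Encrypt at creation; return the object and its key separately."""
        key = secrets.token_bytes(len(plaintext))  # fresh random key, used once
        return cls(name, _xor(plaintext, key)), key

    def decrypt(self, key: bytes) -> bytes:
        """Recover the plaintext; only a holder of the key can do this."""
        return _xor(self.ciphertext, key)


# Illustrative values only.
obj, key = SelfDefendingObject.create("payroll.csv", b"alice,52000")

# Every copy that is stored or transported is ciphertext, so a copy
# captured without the key is unusable on its own.
stolen_copy = obj.ciphertext
print(obj.decrypt(key) == b"alice,52000")  # True
```

The design choice worth noticing is that encryption is not an optional step applied later to selected copies; it is the only path by which a data object comes into existence.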
Self-Direction. Cybersecurity fails when cyberattackers gain control of usable data in violation of a predetermined set of rules. Common cybersecurity and usage policy rules include restrictions as to who, on what hardware, where geographically, when, for how long and for what purposes data may be used. Data is made self-directing by software tightly binding decryption rules and usage rules for decrypted data to the data itself, so the rules go wherever the data goes.
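One way to picture rules that “go wherever the data goes” is a data object that carries its usage rules alongside the ciphertext, with a keyed hash covering both so the rules can’t be stripped or edited without detection. The Python sketch below assumes that HMAC-based binding; the article doesn’t prescribe a mechanism. The names `seal` and `may_use` and the rule fields are hypothetical, the ciphertext is treated as opaque bytes, and in practice the check would run inside trusted software that holds the key.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone


def _tag(key: bytes, rules: dict, ciphertext: bytes) -> str:
    """Keyed hash computed over the rules and the ciphertext together."""
    message = json.dumps(rules, sort_keys=True).encode() + b"|" + hashlib.sha256(ciphertext).digest()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def seal(key: bytes, rules: dict, ciphertext: bytes) -> dict:
    """Bundle ciphertext, usage rules and a tag that binds the two."""
    return {"rules": rules, "ciphertext": ciphertext.hex(), "tag": _tag(key, rules, ciphertext)}


def may_use(key: bytes, obj: dict, user: str, region: str, now: datetime) -> bool:
    """Return True only if the tag verifies and every rule is satisfied."""
    expected = _tag(key, obj["rules"], bytes.fromhex(obj["ciphertext"]))
    if not hmac.compare_digest(expected, obj["tag"]):
        return False  # rules or data were altered after sealing
    rules = obj["rules"]
    return (
        user in rules["users"]
        and region in rules["regions"]
        and now < datetime.fromisoformat(rules["not_after"])
    )


# Illustrative key, rules and bytes; none of these values come from the article.
key = b"demo-key-held-by-trusted-software"
rules = {"users": ["alice"], "regions": ["US"], "not_after": "2030-01-01T00:00:00+00:00"}
obj = seal(key, rules, ciphertext=b"\x8f\x02\x1c")  # stand-in for real encrypted bytes

now = datetime(2025, 6, 1, tzinfo=timezone.utc)
print(may_use(key, obj, "alice", "US", now))    # True
print(may_use(key, obj, "mallory", "US", now))  # False: not an authorized user

obj["rules"]["users"].append("mallory")         # tampering with the attached rules
print(may_use(key, obj, "mallory", "US", now))  # False: the tag no longer matches
```

The who, where and how-long restrictions are checked at the moment of use, and because the tag covers both rules and data, a copy of the object carries the same restrictions as the original.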
The design of self-defending, self-directing data assumes compromise: credentials have been stolen, the network is penetrated, malware is installed and negligent or malicious insiders are at work. Why? Because, as we’ve seen, it’s impossible for defense in depth to close off every vector, every time.
So, what’s the road to cybersecurity? It begins with recognizing that the manufacture of inherently defenseless data induces cyberattack. Inherently defenseless data is the cause; cyberattack is its logical effect. If cyberattackers can only obtain unusable data, they’ve little reason to attack.
Movement accelerates as business executives, government leaders and individuals consistently ask themselves and their software makers a simple question: “Why am I using software that needlessly puts my organization’s data (or my customers’ data) at risk?”
New and existing applications can automatically encrypt data and add usage rules to it — and they should. So, keep asking the question until your software makers answer, “Your data isn’t at risk; we’ve made it self-defending and self-directing” — and we’ll get to where we want the cybersecurity road to take us.