These came across the SIRA mailing list. They were so good, I had to share:
http://eight2late.wordpress.com/2009/10/06/on-the-limitations-of-scoring-methods-for-risk-analysis/
Thanks to Kevin Riggins for finding them and pointing them out.
The Blog Inspired By The Book
So if you don’t follow the folks over at OKCupid, you are missing out on some hot data. In case you’re not aware of it, OKCupid is:
the best dating site on earth. Compiling our observations and statistics from the hundreds of millions of user interactions we’ve logged, we use this outlet to explore the data side of the online dating world.
And in their latest post, they explore what brand of camera makes you look good. You should go read “Don’t be ugly by accident.” I’ll wait.
You’re back? Ok. So here, let me lay this out for you. These folks are applying science, not to dating, but to online dating profiles. They’re not slinging some best practice shtick, or re-writing profiles at $50 a pop, they’re telling you exactly what photos work and which ones don’t. How are they doing this? Data. Experiment. Analysis.
I don’t want to understate the importance of finding a good partner, but I will say how sad it is that they have all this data on this highly intimate activity, and we have 2,000 entries in DatalossDB.
Using a dish full of marshmallows. We’re doing this with my oldest kids, and while I was reading up on it, I had to laugh out loud at the following:
…now you have what you need to measure the speed of light. You just need to know a very fundamental equation of physics:
Speed of a Wave (c) = Frequency (f) x Wavelength (L)
The distance between the melted sections of the marshmallow is in fact L/2, because there are two nodes for each wave (see animation). So if you have measured 6 cm and your oven operates at 2450 MHz, then your measured speed of light is (0.12 x 2,450,000,000) 294,000,000 metres per second.
The agreed value of the speed of light through a vacuum is 299,792,458 metres per second. See how accurately you can measure it? What could you do to make the experiment better, and thus get a closer answer?
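The arithmetic in that quote is easy to check for yourself. Here is a minimal sketch, using only the numbers quoted above (6 cm between melted spots, a 2450 MHz oven); the function and variable names are mine, not from the original write-up.

```python
ACCEPTED_C = 299_792_458  # metres per second, the agreed value quoted above


def speed_from_marshmallows(melt_spacing_m: float, frequency_hz: float) -> float:
    """Estimate wave speed as c = f x L, where the melt spacing is L/2."""
    wavelength_m = 2.0 * melt_spacing_m
    return frequency_hz * wavelength_m


measured = speed_from_marshmallows(0.06, 2_450_000_000)
relative_error = abs(measured - ACCEPTED_C) / ACCEPTED_C

print(f"measured speed: {measured:,.0f} m/s")
print(f"relative error: {relative_error:.1%}")
```

The kitchen measurement lands within about 2% of the agreed value, which is the whole point: a crude instrument and a clever method still get you a usable number.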
IMHO, we need more published security metrics (and risk analytics) that don’t worry about those few million meters per second, and focus rather on the cleverness of using marshmallows and microwaves.
For your consideration, two articles in today’s New York Times. First, “How to Remind a Parent of the Baby in the Car?:”
INFANTS or young children left inside a vehicle can die of hyperthermia in a few hours, even when the temperature outside is not especially hot. It is a tragedy that kills about 30 children a year, according to the National Safety Council.
…
Janette Fennell is the founder and president of KidsAndCars.org, a safety advocacy group based in Leawood, Kan., that focuses on issues involving children and automobiles. In a telephone interview, Ms. Fennell made her view clear, saying she believed that carmakers must develop reminder devices to warn drivers if a child is left behind.
Second, “The Hard Sell on Salt:”
High blood pressure is rising among adults and children. Government health experts estimate that deep cuts in salt consumption could save 150,000 lives a year.
Bets on which problem is “addressed” first are encouraged in the comments.
If you are developing or using security metrics, it’s inevitable that you’ll have to deal with the dimension of time. It’s harder than it looks, and I’ve seen many people make mistakes with it and, in doing so, render their overall metrics faulty or worse. The problems often start with our basic concepts and how we use words.
“Data” is the output of some observation or measurement process. If your data is about some states of the world, then by definition your data lives in the past. You did your measurements or your experiments, generated your data, and then time passed as you assess it, report it, and act on it. Thus, your data is reporting on history. Only by acts of inference can you connect your data with the present state of the world or the future state.
In the physical sciences and engineering, they can safely assume that the system under study is the same over time — past, present, and future. This is called the ergodic hypothesis. In statistics, the underlying stochastic process is treated as stationary. This makes it possible to extrapolate the past into the present and future using regression and other techniques.
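To see why stationarity matters for extrapolation, here is a small sketch. Everything in it is an assumption made for illustration: the exponential distribution, the means, the sample sizes, and the point at which the process shifts. It compares how well the average of past observations predicts the average of future ones, first for a stationary process and then for one whose behavior changes halfway through.

```python
import random
import statistics


def simulate(n_periods, mean, shift_at=None, shifted_mean=None, seed=0):
    """Draw n_periods exponential samples; optionally change the mean at shift_at."""
    rng = random.Random(seed)
    samples = []
    for t in range(n_periods):
        shifted = shift_at is not None and t >= shift_at
        mu = shifted_mean if shifted else mean
        samples.append(rng.expovariate(1.0 / mu))
    return samples


def extrapolation_error(samples, split):
    """Absolute gap between the mean of the past and the mean of the future."""
    past, future = samples[:split], samples[split:]
    return abs(statistics.mean(past) - statistics.mean(future))


stationary = simulate(2000, mean=100.0)
shifting = simulate(2000, mean=100.0, shift_at=1000, shifted_mean=500.0)

print("stationary error:", round(extrapolation_error(stationary, 1000), 1))
print("shifting error:  ", round(extrapolation_error(shifting, 1000), 1))
```

In the stationary case the past is a decent guide to the future. In the shifting case the same procedure is badly wrong, and nothing in the historical data warns you.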
There are people in the security metrics community who only want to operate on data. They view anything that is not the result of empirical measurement as pure speculation or a dangerously-seductive “model”. (See Models are Distracting, and Measurement over Models.) Being an engineer myself, I’m all in favor of empirical data, measurement, and experiments. But I contend that we will never get to measures of “security” or “risk” through empirical data alone. Our systems are non-stationary and non-ergodic.
If we start with the simple high-level question: “Am I secure?”, it becomes clear that any measurement of security must relate to the present time (or possibly a retrospective view on a previous time, i.e. past perfect tense, or prospective view on a future time, i.e. “will I be secure?”). I call the answer a “judgement” because security depends on the threats you are facing. (I play a historically-realistic computer game with my son, called Total War, that includes features that allow you to invest in offensive and defensive capabilities. How much to invest and how fast to invest depends on who you are facing. A wooden palisade will be an adequate defense against peasants and spear militia, but hopelessly inadequate against onagers and trebuchets, backed by armored cavalry!)
Thus, you can measure anything and everything you want about security, generating tons of data, and in the end you will have to make a judgement: “Am I secure?” — or are my security provisions adequate given the threats we face? Seen this way, your data is really just evidence that is used in this judgement (and inference) process. What I mean by this is that I don’t think you can simply calculate your way from ground-truth data to any overall security metrics. There will always be a judgement or inference step(s).
Why? Because we must account for events, circumstances, and scenarios that haven’t happened yet, or happen so rarely that we have no relevant data, or are beyond the reach of measurements. (After all, the miscreants often do their best to hide their actions.) On top of this, the security landscape changes rapidly and occasionally dramatically. Our judgement about security must factor in these changes, to the best of our knowledge. Finally, our judgement about “are we secure?” is predicated on our risk tolerance. But what is “risk”?
This is the economist’s definition of risk, where “cost” here means downside cash flows that are beyond some threshold of expectation or variability. Those costs become “risk” when you can account for them in present dollars using some discounting and insurance method. (This says nothing about the “insurability” of the risk, only about the theoretical possibility of accounting for risk in present dollars by some reasonable method. The “insurance method” might be diversification, hedging, self-insurance, risk pooling, contingent contracts, or traditional insurance.)
This parallels Peter Drucker’s characterization of profit: “Profit is … needed to pay for attainment of the objectives of the business. Profit is a condition of survival. It is the cost of the future. The cost of staying in business.” [emphasis added] Ontologically, “profit” and “risk” are in the same category, which is why it makes sense to measure “risk-adjusted return” and the like.
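Going back to the economist’s definition, the “accounting in present dollars” step can be sketched as a plain discounted sum. This is only a rough illustration under my own assumptions: the yearly excess costs and the discount rate are hypothetical inputs, and the sketch covers the discounting part only, not the choice of insurance method.

```python
def present_value_of_risk(expected_excess_costs, discount_rate):
    """Discount a list of expected future excess costs (year 1, 2, ...) to today.

    expected_excess_costs and discount_rate are hypothetical inputs.
    """
    return sum(
        cost / (1.0 + discount_rate) ** year
        for year, cost in enumerate(expected_excess_costs, start=1)
    )


# Hypothetical example: expected excess costs of 110 next year and 121 the
# year after, discounted at 10% per year.
risk_today = present_value_of_risk([110.0, 121.0], 0.10)
print(f"risk in present dollars: {risk_today:.2f}")
```

Note that nothing about past spending appears in the calculation. Only expected future cash flows enter it.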
From the viewpoint of risk, what you have spent in the past is irrelevant (“sunk costs”). All rational decisions are based on future cash flows and options. The only value of the past is if it helps you predict or forecast the future. Thus, you can’t reach a final judgement about security in the present if you don’t also have some useful estimate of risk in the future. If the answer to “Am I secure?” is “Yes”, then the implication is that you can live with the risk associated with this level of security. By “useful”, I mean sufficiently discriminating to inform the judgement — “bigger than a breadbox, smaller than a house”.
This is where information security deviates from reliability engineering. In the latter, the ergodic hypothesis holds and the dynamics are sufficiently “tame” to permit statistical data analysis for inference and forecasting. Even when there are “humans in the loop”, their behavioral tendencies can often be characterized by stable probability distributions. In information security, we are dealing with adaptive, intelligent, strategic players: not only miscreants, but also “ancillary players” like end-users, auditors, supply chain partners, and so on. This makes risk estimation a “wicked problem”. But is it hopeless?
Plenty of smart security people contend that quantitative risk estimation is impossible or infeasible in principle. Proving or disproving this assertion would take heavy-duty theoretical analysis (and I may do it some day). But for now consider two extreme situations.
Think of security and risk as a black-box process that generates a continuous stream of cash flows in time (i.e. total spending on security and losses in that time period). At one extreme, the output is a stationary function or stochastic process. This is the realm that Nassim Nicholas Taleb called “Mediocristan”, since the data stream is well-behaved enough that nothing very surprising happens. With enough historical data and enough data analysis, I think we’d all agree that risk estimation is feasible with current methods.
At the other extreme, the output is generated by a strategic agent (inside the box) whose sole purpose is to screw up our risk estimation process. Let’s call this Descartes’ Demon, after René Descartes, who introduces a skeptical scenario called the deceiving demon argument to challenge our beliefs that an external world exists; in particular, it raises the possibility that some sort of malicious, demonic non-God has “employed all his energies in order to deceive me”. If Descartes’ Demon can maintain a history of the output and also has information about our risk estimation process, he can mimic any output pattern and change those patterns arbitrarily to defeat any estimation process we might apply. (This is more extreme than Taleb’s “Extremistan” in terms of defying estimation or prediction.) In this case, I believe it could be proved that estimation is impossible (or undecidable or infeasible from a computational point of view).
Some people might argue that information security is exactly in this latter extreme situation, but I don’t think so. The reason is that all the players have much stronger motives and forcing functions than to subvert the risk estimation processes. Bad guys want to make money or cause harm. End users want to avoid hassles and minimize effort and get their job done. Managers want to manage their business while avoiding negative repercussions. All of these factors add some elements of predictability and understandability.
But it may only be possible to factor all of these in through the use of models and simulations that represent our best knowledge, our best estimates, and our best beliefs about how they all relate to each other and the overall results.
Putting this all together, we need to gather a lot of empirical data to understand relationships, patterns, and dependencies. But to measure security we need to add inference and judgement processes that extend our data into the present, given the threat landscape we believe we are facing. But to make a judgement about security and make decisions about alternative security postures, we need a useful estimate of risk to decide how much security is enough. To tie these all together over time requires effective social learning processes, including model validation through experiments and data analysis. Likewise, risk estimation and security judgement processes tell us what data we need to collect and how to analyze it.
Whether you agree with this framework or not, you should make explicit and consistent definitions of the time dimension relative to your metrics.
One of the reasons I like climate studies is because the world of the climate scientist is not dissimilar to ours. Their data is fraught with uncertainty, it has gaps, and it might be kind of important (regardless of your stance on anthropogenic global warming, I think we can all agree that when the climate changes, crazy things can happen).
Recently, the mainstream press has begun to pick up on this and is trying to explain what science is doing. One such example is this Times (UK) story called
Scientists Need The Guts To Say, “I Don’t Know”
In it, the author (David Spiegelhalter – Professor of the Public Understanding of Risk at the University of Cambridge) discusses uncertainty in past (and forward) looking predictions. Yes, it’s worth noting that the science of prediction applies to all three states of time: past, present, and future.
As a security professional, I always encourage the representation of uncertainty. Depending on the audience, I’ll represent uncertainty technically, or at a high level with words like “back of the napkin, very rough, a lot of unknowns, fairly certain, pretty good idea…” I’ve found that as long as they are properly qualified, demonstrations of risk with high degrees of uncertainty are not unuseful.
HEY, YOU GOT YOUR VISIBILITY INTO MY UNCERTAINTY!!! AND YOU GOT YOUR UNCERTAINTY IN MY VISIBILITY!!!
They really *are* two great tastes that taste great together….
One of the great reasons for the IT Risk management/Security team to communicate uncertainty (esp. to others with money) is that if you say “here’s what we think but we’re not sure”, you can then tell the business owner “and if you give me $funding we can decrease that uncertainty by gaining visibility into $whatever”. If they decline, they’re accepting both the risk and the probability that you’re wrong. But if they’re uncomfortable with the uncertainty, now you have a pretty good qualitative way of knowing that their tolerance for this level of risk is pretty low, and you might even be able to skip right past the “buy more visibility” step above and move right into “of course, we can just spend $Y and take care of the whole thing, visibility, risk reduction and all….”
Similarly, if you, the security manager, keep getting risk analyses back that have significant uncertainty in them – you know that these are areas where you really don’t have much control. They may represent reasons or opportunities to strengthen policies, processes, capabilities (w00t everybody goes to training in Cancun!) and so forth.
So while it’s also the enemy of accuracy, uncertainty can also be your friend.
One last note, having to do with uncertainty: in the article the author uses the Taleb definition of “Black Swan”. Again, calling a rare event a “Black Swan” is a misnomer. Rarity in frequency is only one aspect of what the concept of Black Swan represents. A much better definition of a Black Swan is “an occurrence which is not representable at all given our prior distributions.” Certainly, even before Prof. Spiegelhalter corrected the model for double-yolked eggs, the occurrence of 6 is not a true Black Swan. We could have run MCMC sims until our computers melted into hot lumps of toxic waste and various occurrences of double-yolked eggs would/could have been represented.
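To make “representable” concrete, here is a minimal sketch using a plain binomial model rather than MCMC. The per-egg rate of 1 in 1,000 and the assumption that eggs are independent are mine, chosen purely for illustration; they are not figures taken from the article.

```python
from math import comb


def prob_double_yolks(k: int, n: int = 6, p: float = 0.001) -> float:
    """Binomial probability of exactly k double-yolked eggs in a box of n.

    p is a hypothetical per-egg rate, and eggs are assumed independent.
    """
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


for k in range(7):
    print(f"{k} double yolks: {prob_double_yolks(k):.3e}")
```

Even under this naive prior, six out of six gets a probability that is absurdly small but not zero. The event is rare, yet it sits squarely inside the model, which is exactly why it is not a Black Swan.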
There’s a good post at Gartner pointing out the lack of data reported by vendors or customers regarding the false positive rates for anti-spam solutions.
Although Gartner customers almost never complain about false positive rates, I wonder if false positives are underestimated. End users rarely complain about false positives, but they are very vocal about reporting Spam in their inbox. Box Sentry (www.boxsentry.com) recently did tests in a number of organizations and found the false positive rate in some organizations using popular anti-spam tools was as high as 13% of legitimate emails. The largest proportion of false positives in their study was legitimate person-to-person traffic. While it could be that these organizations have over-tuned their systems to block more Spam at the expense of quarantining more legit email, the reality was the email administrators had no idea they had such a high false positive rate because they never checked. Have you?
Going further, it would be very valuable to estimate the cost of false positives.
As I’ve discussed in a previous post, this is just another instance of a general problem in the security industry. You can’t do rational analysis of effectiveness, cost-effectiveness, risk, and the rest without some estimate of false positive rates and their costs.
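As a back-of-the-napkin illustration of what such an estimate might look like, here is a minimal sketch. Only the 13% rate comes from the Box Sentry figure quoted above; the mail volume and the cost per false positive are hypothetical placeholders you would replace with your own measurements.

```python
def false_positive_rate(legit_quarantined: int, legit_total: int) -> float:
    """Fraction of legitimate messages that were wrongly quarantined."""
    if legit_total <= 0:
        raise ValueError("legit_total must be positive")
    return legit_quarantined / legit_total


def annual_false_positive_cost(legit_per_year, fp_rate, cost_per_false_positive):
    """Expected yearly cost: volume x rate x cost of each missed message."""
    return legit_per_year * fp_rate * cost_per_false_positive


# 130 of 1,000 sampled legitimate messages quarantined: the 13% worst case.
rate = false_positive_rate(130, 1_000)

# Hypothetical: 500,000 legitimate messages a year, $0.50 lost per false positive.
cost = annual_false_positive_cost(500_000, rate, 0.50)
print(f"false positive rate: {rate:.0%}, estimated annual cost: ${cost:,.0f}")
```

The hard part is not the multiplication. It is sampling the quarantine to get the rate in the first place, which is precisely what those email administrators never did.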
In Verizon’s post, “A Comparison of [Verizon's] DBIR with UK breach report,” we see:

Quick: which is larger, the grey slice on top, or the grey slice on the bottom? And ought grey be used for “sophisticated” or “moderate”?
I’m confident that both organizations are focused on accurate reporting. I am optimistic that this small example of the utility of pie charts will inform report writers. The report writers and their graphics departments, loving their customers, will move to bar charts to help them compare numbers between sources.
I’m confident that not using pie charts is a best practice.
Elsewhere: “The only time it makes sense to use a pie chart.”
And elsewhere: “The Visual Display of Quantitative Information, 2nd edition“
The EFF is doing some measurement of browser uniqueness and privacy. It takes ten seconds.
Before you go, why not estimate what fraction of users have the same transmitted/discoverable browser settings as you, and then check your accuracy at https://panopticlick.eff.org. Or start at http://www.eff.org/deeplinks/2010/01/help-eff-research-web-browser-tracking for a bit more detail.
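If you want to compare your guess with a result expressed in bits of identifying information, the conversion is a one-liner. This is a minimal sketch of the standard surprisal calculation; the function name and the example fraction are mine, and I am assuming the fraction you estimate is the share of browsers with the same settings as yours.

```python
from math import log2


def bits_of_identifying_information(fraction_sharing: float) -> float:
    """Convert 'this fraction of browsers look like mine' into bits."""
    if not 0.0 < fraction_sharing <= 1.0:
        raise ValueError("fraction_sharing must be in (0, 1]")
    return log2(1.0 / fraction_sharing)


# Hypothetical guess: 1 in 1,024 browsers share my settings.
print(bits_of_identifying_information(1 / 1024), "bits")
```

Every halving of the fraction adds one bit, so a guess that is off by a factor of two is only off by a single bit.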