Archive for the 'metrics' Category

What They Know (From the WSJ)

Interesting interactive data app from the Wall Street Journal about your privacy online and what various websites track/know about you.

http://blogs.wsj.com/wtk/

Full disclosure, our site uses Mint for traffic analytics.

Measuring The Speed of Light Using Your Microwave

Using a dish full of marshmallows.  We’re doing this with my oldest kids, and while I was reading up on it, I had to laugh out loud at the following:

…now you have what you need to measure the speed of light. You just need to know a very fundamental equation of physics:

Speed of a Wave (c) = Frequency (f) x Wavelength (L)

The distance between the melted sections of the marshmallow is in fact L/2, because there are two nodes for each wave (see animation). So if you have measured 6cm and your oven operates at 2450 MHz, then your measured speed of light is (0.12 x 2,450,000,000) 294,000,000 metres per second.

The agreed value of the speed of light through a vacuum is 299,792,458 metres per second. See how accurately you can measure it? What could you do to make the experiment better, and thus get a closer answer?
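For anyone who wants to check the arithmetic in that quote, here is a minimal sketch. The function and variable names are mine, not the original article's; the numbers are the ones quoted above. With a 6 cm spacing the result lands within about 2% of the accepted value.

```python
# Worked version of the marshmallow arithmetic quoted above.
# Names are illustrative; the figures come from the quoted text.

ACCEPTED_C = 299_792_458  # metres per second, speed of light in a vacuum


def measured_speed_of_light(melt_spacing_m: float, frequency_hz: float) -> float:
    """Melted spots sit half a wavelength apart, so wavelength = 2 * spacing."""
    wavelength_m = 2 * melt_spacing_m
    return frequency_hz * wavelength_m


c_measured = measured_speed_of_light(melt_spacing_m=0.06, frequency_hz=2450e6)
relative_error = abs(c_measured - ACCEPTED_C) / ACCEPTED_C

print(f"measured: {c_measured:,.0f} m/s")
print(f"relative error: {relative_error:.1%}")
```

Try plugging in your own melt spacing to see how sensitive the answer is to a few millimetres of measurement error.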

IMHO, we need more published security metrics (and risk analytics) that don’t worry about those few million meters per second, and focus rather on the cleverness of using marshmallows and microwaves.

Getting the time dimension right

If you are developing or using security metrics, it’s inevitable that you’ll have to deal with the dimension of time.  It’s harder than it looks, and I’ve seen many people make mistakes with it that render their overall metrics faulty or worse.  The problems often start with our basic concepts and how we use words.

"Time flies like an arrow, but fruit flies like bananas" -- Groucho Marx

“Data” tells you about the past

“Data” is the output of some observation or measurement process.  If your data is about some states of the world, then by definition your data lives in the past.  You did your measurements or your experiments, generated your data, and then time passed as you assess it, report it, and act on it.  Thus, your data is reporting on history.  Only by acts of inference can you connect your data with the present state of the world or the future state.

In the physical sciences and engineering, they can safely assume that the system under study is the same over time — past, present, and future.  This is called the ergodic hypothesis.  In statistics, the underlying stochastic process is treated as stationary.   This makes it possible to extrapolate the past into the present and future using regression and other techniques.

There are people in the security metrics community who only want to operate on data.  They view anything that is not the result of empirical measurement as pure speculation or a dangerously-seductive “model”.  (See Models are Distracting, and Measurement over Models.)  Being an engineer myself, I’m all in favor of empirical data, measurement, and experiments.  But I contend that we will never get to measures of “security” or “risk” through empirical data alone.  Our systems are non-stationary and non-ergodic.

“Security” is a judgement about the present

If we start with the simple high-level question: “Am I secure?”, it becomes clear that any measurement of security must relate to the present time (or possibly a retrospective view on a previous time, i.e. past perfect tense, or prospective view on a future time, i.e. “will I be secure?”).  I call it a “judgement” because security depends on the threats you are facing.  (I play a historically-realistic computer game with my son, called Total War, that includes features that allow you to invest in offensive and defensive capabilities.  How much to invest and how fast to invest depends on who you are facing.  A wooden palisade will be an adequate defense against peasants and spear militia, but hopelessly inadequate against onagers and trebuchets, backed by armored cavalry!)

Thus, you can measure anything and everything you want about security, generating tons of data, and in the end you will have to make a judgement:  “Am I secure?” — or are my security provisions adequate given the threats we face?   Seen this way, your data is really just evidence that is used in this judgement (and inference) process.   What I mean by this is that I don’t think you can simply calculate your way from ground-truth data to any overall security metrics.  There will always be a judgement or inference step(s).

Why?  Because we must account for events, circumstances, and scenarios that haven’t happened yet, or happen so rarely that we have no relevant data, or are beyond the reach of measurements.  (After all, the miscreants often do their best to hide their actions.)  On top of this, the security landscape changes rapidly and occasionally dramatically.  Our judgement about security must factor in these changes, to the best of our knowledge.  Finally, our judgement about “are we secure?” is predicated on our risk tolerance.  But what is “risk”?

“Risk” is a cost of the future, brought to the present

This is the economist’s definition of risk, where “cost” here means downside cash flows that are beyond some  threshold of expectation or variability.  Those costs become “risk” when you can account for them in present dollars using some discounting and insurance method.  (This says nothing about the “insurability” of the risk, only about the theoretical possibility of accounting for risk in present dollars by some reasonable method.  The “insurance method” might be diversification, hedging, self-insurance, risk pooling, contingent contracts, or traditional insurance.)
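To make “brought to the present” a little more concrete, here is a deliberately simplified sketch. It reads risk as probability-weighted downside costs discounted back to present dollars. The scenarios, probabilities, dollar amounts, and flat discount rate are all assumptions invented for illustration, and the function names are hypothetical. It also ignores the “threshold of expectation” part of the definition.

```python
# Hypothetical sketch: every figure and the flat discount rate below are
# assumptions for illustration, not data from this post.

def present_value(future_cost: float, annual_rate: float, years: float) -> float:
    """Discount a single future cash flow back to present dollars."""
    return future_cost / (1 + annual_rate) ** years


def risk_in_present_dollars(scenarios, annual_rate: float) -> float:
    """Sum probability-weighted, discounted downside costs.

    scenarios: iterable of (probability, cost, years_from_now) tuples.
    """
    return sum(
        probability * present_value(cost, annual_rate, years)
        for probability, cost, years in scenarios
    )


scenarios = [
    (0.10, 500_000, 1),    # assumed: 10% chance of a $500k loss in one year
    (0.02, 5_000_000, 3),  # assumed: 2% chance of a $5M loss in three years
]
risk = risk_in_present_dollars(scenarios, annual_rate=0.05)
print(f"risk in present dollars: ${risk:,.2f}")
```

The point is only that a future cost becomes comparable with today’s spending once it is weighted and discounted. Choosing the scenarios and probabilities is where the judgement comes in.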

This parallels Peter Drucker’s characterization of profit: “Profit is … needed to pay for attainment of the objectives of the business. Profit is a condition of survival. It is the cost of the future.  The cost of staying in business.” [emphasis added]   Ontologically, “profit” and “risk” are in the same category, which is why it makes sense to measure “risk-adjusted return” and the like.

From the viewpoint of risk, what you have spent in the past is irrelevant  (“sunk costs”).  All rational decisions are based on future cash flows and options.  The only value of the past is if it helps you predict or forecast the future.  Thus, you can’t reach a final judgement about security in the present if you don’t also have some useful estimate of risk in the future.   If the answer to “Am I secure?” is “Yes”, then the implication is that you can live with the risk associated with this level of security.   By “useful”, I mean sufficiently discriminating to inform the judgement — “bigger than a breadbox, smaller than a house”.

This is where information security deviates from reliability engineering.  In the latter, the ergodic hypothesis holds and the dynamics are sufficiently “tame” to permit statistical data analysis for inference and forecasting.  Even when there are “humans in the loop”, their behavioral tendencies can often be characterized by stable probability distributions.  In information security, we are dealing with adaptive, intelligent, strategic players: not only miscreants, but also “ancillary players” like end-users, auditors, supply chain partners, and so on.  This makes risk estimation a “wicked problem”.  But is it hopeless?

Estimating risk may be hard, but not impossible

Plenty of smart security people contend that quantitative risk estimation is impossible or infeasible in principle.  Proving or disproving this assertion would take heavy-duty theoretical analysis (and I may do it some day).  But for now consider two extreme situations.

Think of security and risk as a black-box process that generates a continuous stream of cash flows in time (i.e. total spending on security and losses in that time period).  At one extreme, the output is a stationary function or stochastic process.  This is the realm that Nassim Nicholas Taleb called “Mediocristan”, since the data stream is well-behaved enough that nothing very surprising happens.  With enough historical data and enough data analysis, I think we’d all agree that risk estimation is feasible with current methods.

At the other extreme, the output is generated by a strategic agent (inside the box) whose sole purpose is to screw up our risk estimation process.  Let’s call this Descartes’ Demon, after René Descartes, who introduced a skeptical scenario called the deceiving demon argument to challenge our belief that an external world exists; in particular, it raises the possibility that some sort of malicious, demonic non-God has “employed all his energies in order to deceive me”.  If Descartes’ Demon can maintain a history of the output and also has information about our risk estimation process, he can mimic any output pattern and change those patterns arbitrarily to defeat any estimation process we might apply.  (This is more extreme than Taleb’s “Extremistan” in terms of defying estimation or prediction.)  In this case, I believe it could be proved that estimation is impossible (or undecidable or infeasible from a computational point of view).

Some people might argue that information security is exactly in this latter extreme situation, but I don’t think so.  The reason is that all the players have much stronger motives and forcing functions than to subvert the risk estimation processes.  Bad guys want to make money or cause harm.  End users want to avoid hassles and minimize effort and get their job done.  Managers want to manage their business while avoiding negative repercussions.  All of these factors add some elements of predictability and understandability.

But it may only be possible to factor all of these in through the use of models and simulations that represent our best knowledge, our best estimates, and our best beliefs about how they all relate to each other and the overall results.

The marriage of data, security, and risk = social learning processes

Putting this all together, we need to gather a lot of empirical data to understand relationships, patterns, and dependencies.  But to measure security we need to add inference and judgement processes that extend our data into the present, given the threat landscape we believe we are facing.  But to make a judgement about security and make decisions about alternative security postures, we need a useful estimate of risk to decide how much security is enough.  To tie these all together over time requires effective social learning processes, including model validation through experiments and data analysis.  Likewise, risk estimation and security judgement processes tell us what data we need to collect and how to analyze it.

Whether you agree with this framework or not, you should make explicit and consistent definitions of the time dimension relative to your metrics.

On Uncertain Security

One of the reasons I like climate studies is that the world of the climate scientist is not dissimilar to ours.  Their data is fraught with uncertainty, it has gaps, and it might be kind of important (regardless of your stance on anthropogenic global warming, I think we can all agree that when the climate changes, crazy things can happen).

Recently, the mainstream press has begun to pick up on this and is trying to explain what science is doing.  One such example is this Times (UK) story called

Scientists Need The Guts To Say, “I Don’t Know”

In it, the author (David Spiegelhalter – Professor of the Public Understanding of Risk at the University of Cambridge) discusses uncertainty in past (and forward) looking predictions.  Yes, it’s worth noting that the science of prediction applies to all three states of time: past, present, and future.

As a security professional, I always encourage the representation of uncertainty.  Depending on the audience, I’ll represent uncertainty technically, or at a high level with words like “back of the napkin, very rough, a lot of unknowns, fairly certain, pretty good idea…”  I’ve found that as long as they are properly qualified, demonstrations  of risk with high degrees of uncertainty are not unuseful.

HEY, YOU GOT YOUR VISIBILITY INTO MY UNCERTAINTY!!! AND YOU GOT YOUR UNCERTAINTY IN MY VISIBILITY!!!

They really *are* two great tastes that taste great together….

One of the great reasons for the IT Risk management/Security team to communicate uncertainty (esp. to others with money) is that if you say “here’s what we think but we’re not sure “,  you can then tell the business owner “and if you give me $funding we can decrease that uncertainty by gaining visibility into $whatever”.  If they decline, they’re accepting both the risk and the probability that you’re wrong.  But if they’re uncomfortable with the uncertainty, now you have a pretty good qualitative way of knowing that their tolerance for this level of risk is pretty low, and you might even be able to skip right past the “buy more visibility” step above and move right into “of course, we can just spend $Y and take care of the whole thing, visibility, risk reduction and all….”

Similarly, if you, the security manager, keep getting risk analyses back that have significant uncertainty in them – you know that these are areas where you really don’t have much control.  They may represent reasons or opportunities to strengthen policies, processes, capabilities (w00t everybody goes to training in Cancun!) and so forth.

So while it’s the enemy of accuracy, uncertainty can also be your friend.

One last note, having to do with uncertainty: in the article the author uses the Taleb definition of “Black Swan”.  Again, calling a rare event a “Black Swan” is a misnomer.  Rarity in frequency is only one aspect of what the concept of Black Swan represents.  A much better definition of a Black Swan is “an occurrence which is not representable at all given our prior distributions”.  Certainly, even before Prof. Spiegelhalter corrected the model for double-yolked eggs, the occurrence of 6 is not a true Black Swan.  We could have run MCMC sims until our computers melted into hot lumps of toxic waste and various occurrences of double-yolked eggs would/could have been represented.

Data void: False Positives

There’s a good post at Gartner pointing out the lack of data reported by vendors or customers regarding the false positive rates for anti-spam solutions.  

Although Gartner customers almost never complain about false positive rates, I wonder if false positives are underestimated. End users rarely complain about false positives, but they are very vocal reporting Spam in their inbox. Box Sentry (www.boxsentry.com) recently ran tests in a number of organizations and found the false positive rate in some organizations using popular anti-spam tools was as high as 13% of legitimate emails. The largest proportion of false positives in their study was legitimate person-to-person traffic.  While it could be that these organizations have over-tuned their systems to block more Spam at the expense of quarantining more legit email, the reality was the email administrators had no idea they had such a high false positive rate because they never checked.  Have you?

Going further, it would be very valuable to estimate the cost of false positives.

As I’ve discussed in a previous post, this is just another instance of a general problem in the security industry.  You can’t do rational analysis of effectiveness, cost-effectiveness, risk, and the rest without some estimate of false positive rates and their costs.
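As a rough sketch of what “checking” might look like, the snippet below computes a false positive rate the way the quote frames it (wrongly quarantined legitimate mail as a share of all legitimate mail) and attaches a cost to it. The message counts and the per-message cost are placeholders I made up, and the function names are hypothetical. Real numbers would have to come from a quarantine audit.

```python
# Hypothetical sketch: the counts and per-message cost are made-up placeholders.

def false_positive_rate(legit_blocked: int, legit_total: int) -> float:
    """Fraction of legitimate messages that the filter wrongly quarantined."""
    if legit_total <= 0:
        raise ValueError("legit_total must be positive")
    return legit_blocked / legit_total


def false_positive_cost(legit_blocked: int, cost_per_message: float) -> float:
    """Crude cost estimate: a flat, assumed cost per wrongly blocked message."""
    return legit_blocked * cost_per_message


rate = false_positive_rate(legit_blocked=1_300, legit_total=10_000)
cost = false_positive_cost(legit_blocked=1_300, cost_per_message=2.50)
print(f"false positive rate: {rate:.0%}, estimated cost: ${cost:,.2f}")
```

A flat per-message cost is the crudest possible model. A missed customer order and a missed newsletter clearly do not cost the same, so a more careful estimate would weight by message type.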

Does It Matter If The APT Is “New”?

As best as I can describe the characteristics of the threat agents that would fit the label of APT, that threat community is very, very real.  It’s been around forever (someone mentioned first use of the term being 1993 or something) – we dealt with threat agents you would describe as “APT” at MicroSolved when I was there in 2001-2005.  We dealt with it as a firewall vendor at Progressive Systems in 1998.  This isn’t an “is the APT real?” blogpost.

That said, I wanted to talk about why there should be still more discussion around the APT.  Hogfly at the Forensic Incident Response blog asks:

“What should matter is how successful they have been. What should matter is defending ourselves. What should matter is how and where we share this information. What should matter is taking this information to those with the ability to do something about it. What should matter is taking the fight to the enemy.

So I ask again, does it matter if this threat is new?”

My response is that it actually matters very much.

We are hearing a new label.  Whether the label originated from “the cool kids” or not, it’s being co-opted by marketing.  And right now, we’re sort of in this important window of trying to get some understanding, some significant amount of intersubjectivity about what the APT is and what it means to a broader audience.  Once that’s established, then we can try to understand what to do.  But why does it matter if the threat is new or old?

There is a significant increase in the use of the term.  When it’s a BusinessWeek cover story (2008, btw), it gets seen by people.  What we need to understand is whether this “new” visibility is the result of a change in the threat landscape or a change in the marketing landscape.

IS APT A SHIFT IN FREQUENCY, A SHIFT IN CAPABILITY, OR A SHIFT IN BOTH FREQUENCY AND CAPABILITY?

If it is a change in the threat landscape, we need to understand what aspect of the landscape is changing.  The shift could be said to be one of a few scenarios:

1.)  More attacks on the same targets by the same actors. That is, are the government, the defense industrial base, or other targets attractive to certain nation-states experiencing a new volume of threat events?

2.) More attacks on new targets by the same actors. That is, are the nation-state actors finding new targets?  If so, are their targets of choice changing from organizations that are antagonistic to the policy desires of the sponsor state (certainly the Mandiant report reads like the Chinese are after anyone who threatens their political stability), to other targets – like retailers or hospitals (has, as Mandiant says, the APT become *everyone’s* problem)?

3.)  More attacks on the same targets by new actors. That is, it’s not just the usual suspects.  If *this* is the case, then we’re seeing a fundamental shift in the capabilities of threats.  That is, bad guys who used to be dumb just got a lot smarter thanks to the dissemination of skills/resources (sharing of technique, new access to advanced toolsets, etc) and they are going after all those people who were worrying about the APT in 2003.

4.)  More attacks on new targets by new actors. That is, the bad guys who used to be dumb just got a lot smarter and are now trying to use their new smarts against victims who heretofore had not had to worry about the APT.

Finally, the other option is that there is no shift in frequency or capability, but there is a shift in marketing budgets.  I tried to run a Google Trends search on “Advanced Persistent Threat” but got:

Your terms – “Advanced Persistent Threat” – do not have enough search volume to show graphs.

And “APT” trend search was clouded by other things that shared the same TLA.

WHAT DO YOU THINK?

I’m not sure what we’re seeing.  I was personally disappointed by the Mandiant report’s lack of demographics and frequency information.  I’m ready to believe that we’re seeing a fundamental shift in distributions concerning the threat agents, but there wasn’t anything in the report to support that notion.  I will leave you with a couple of items from the Verizon Report, though, and I’ll let you draw your own conclusions, given that the Verizon data set isn’t heavy on what we might call the Defense Industrial Base – those folks already live and breathe this stuff  – and this data is from 2008.

[Charts from the Verizon report: source of attacking IP; targeted vs. opportunistic attacks; trend in use of customized malware; time to discovery]

V-22 Osprey Metrics

As Angry Bear notes, metrics seem to be yet another way in which the V-22 Osprey program has hidden from its failure to deliver on its promises:

Generally, mission capability runs 20% higher than availability, but availability is hidden on new stuff, while shouted about on older stuff, because there would be severe embarrassment if you considered that 40% of the brand new V-22 were not available (okay 60% available sounds much better, buy a car which is broke 40% of the time, how good does the warranty service need to be?).
The Navy and GAO are not sure which metrics to use. One of the reasons that US quality fell in the 70’s was avoiding measuring the hard things [that] gets you in trouble; a weakness of the DoD acquisition process. But the spending is more important than meaningful results.
Missing mission capable suggests that basic reliability and maintenance performance are not part of V-22 repertoire. Quality may not have been affordable during the long development cycle, and the savings are now costing in added support and lost use of the V-22

And as one commenter notes, the problem is even more fundamental than poor quality–the Osprey “cannot do a lot of what it is replacing:  HH 53 and HH 46.”  I would pretty much guarantee that no one is measuring the number of missions that are not performed by the Osprey but which could have been by the helicopters it replaced.

Metrics are powerful tools, but they can be as much a force for evil as a force for good.  Choosing the easy-to-gather metrics or the metrics that make the thing being measured look better may play well in Slide-Deck-Land, but it doesn’t change the fact that there is still a reality lurking underneath there which isn’t going away just because someone refuses to measure it.

What people choose to measure can tell you a lot about both their competence and their motivations.  Ignore it at your peril.

Help EFF Measure Browser Uniqueness

The EFF is doing some measurement of browser uniqueness and privacy. It takes ten seconds.

Before you go, why not estimate what fraction of users have the same
transmitted/discoverable browser settings as you, and then check your
accuracy at https://panopticlick.eff.org. Or start at http://www.eff.org/deeplinks/2010/01/help-eff-research-web-browser-tracking for a bit more detail.

NotObvious On Heartland

I posted this also to the securitymetrics.org mailing list.  Sorry if discussing in multiple  venues ticks you off.

The Not Obvious blog has an interesting write up on the Heartland Breach and impact.  From the blog post:

“Heartland has had to pay other fines to Visa and MasterCard, but the total of $12.6 million they have set aside to handle the one-time costs is a drop in the bucket compared $1.5 billion in 2008 revenue and does not really even skim much off the top of the $161 million in profits from that same year (the numbers for 2009 look to be tracking the same). It is almost a guarantee that any member of the class action who submits a claim will see many years of scrutiny before receiving any payment, something which Heartland can factor into their yearly financial plans (and accommodate for by increasing fees).”
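To put “drop in the bucket” into numbers, here is the ratio arithmetic using only the figures in that quote (the variable names are mine). The set-aside works out to under 1% of 2008 revenue and roughly 8% of 2008 profit.

```python
# Figures as quoted in the Not Obvious post (US dollars).
set_aside = 12.6e6      # one-time breach costs set aside
revenue_2008 = 1.5e9    # 2008 revenue
profit_2008 = 161e6     # 2008 profit

share_of_revenue = set_aside / revenue_2008
share_of_profit = set_aside / profit_2008

print(f"{share_of_revenue:.2%} of revenue, {share_of_profit:.1%} of profit")
```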

For thought:

  1. One wonders how much a “sufficient” (loaded term, of course) InfoSec program for a company like Heartland costs on an annual basis.
  2. Does this set a sort of “worst case” bounds to impact distributions?
  3. If so, how does a worst case impact of ~$13million (US) impact security management at retailers (politically)?

Sweden: An Interesting Demographic Case Study In Internet Fraud

(quietly, wistfully singing “Yesterday” by the Beatles)

From my favorite Swedish Infosec Blog, Crowmoor.se. I don’t speak Swedish, so I couldn’t really read the fine article they linked to.  Do go read their blog post, I’ll wait here.

Back?  Great.  Here are my thoughts on those numbers:

SWEDISH FRAUD STATISTICS RELEASED

The World Bank estimates the population of Sweden to be 9,220,986 (2008).

For reference, London was 7.5 million (2006 figures) and New York City was 8.275 million in 2007.

So the Swedish “market” for fraud was around 60,000 people out of a total population of roughly 9,000,000, suffering an average loss of €1050-1100 each.  This line of thinking draws the inevitable comparison to what VCs call The Chinese Soft Drink Argument (if we can just get each person in China to buy one drink, we’ll make a billion!), obviously, but I thought it was interesting to put this into context.
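Here is that context as back-of-the-napkin arithmetic, using only the approximate figures above (the victim count and average loss are the rough numbers from this post, not official statistics, and the variable names are mine).

```python
# Back-of-the-napkin numbers taken from the post; treat them as approximate.
population = 9_220_986        # World Bank estimate for Sweden, 2008
victims = 60_000              # approximate number of people defrauded
avg_loss_eur = (1050, 1100)   # low and high average loss per victim, in euros

victim_share = victims / population
total_low, total_high = (victims * loss for loss in avg_loss_eur)

print(f"{victim_share:.2%} of the population")
print(f"total losses: EUR {total_low:,} to EUR {total_high:,}")
```

That is well under 1% of the population, with total losses somewhere in the tens of millions of euros.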

When I saw those numbers, I thought of a couple of other stats I’d like to have at hand:

Breakdown of types of “attacks” that resulted in fraud (was the attack primarily hacking, was there SE involved, was it phishing, etc.), estimated number of attack attempts, number of arrests, demographics around Internet banking and broadband penetration…

What other information do you think would be helpful to you as a practitioner?

obligatory Swedish Chef reference: