Monthly Archive for September, 2009

Models are Distracting

claire-cropped.jpg

So Dave Mortman wrote:

I don’t disagree with Adam that we need raw data. He’s absolutely right that without it, you can’t test models. What I was trying to get at was that, even though I would absolutely love to have access to more raw data to test my own theories, it just isn’t realistic to expect that sort of access in the legal and business environment we have today. So until things change, we have to figure out another way to get at the data.

First off, I don’t disagree with why Dave is going where he’s going, but I think it’s built on a misread of where we are, and a strategic error regardless.

Where we are: we do have raw data. It’s coming to us from unexpected sources, and we’re getting more of it day by day. We’d like more details, we’d like more consistency and we’d like more depth, and each of those will come.

But far more important is the strategic error of asking for something that isn’t the fullness of what we want, and the risk that the cover-up club will use it to avoid the real goal by talking about how much progress we’ve made sharing models.

You almost never get anything you don’t ask for. If we have a list of requests, the top of the list is data, data and data.

Further, I declare that this is a realistic request, and attach precisely the level of proof that the good Mr. Mortman did when asserting that “it just isn’t realistic.”

Not that I’m opposed to model sharing. We just need to recognize it for the poor substitute that it is, and keep our eyes on the real goal.

Speaking of where your eyes are, that’s Claire. She’s represented by Specs Model Management and, as the title says, quite distracting.

Meta-Data?

So a while back, I posted the following to Twitter:

Thought of the Day: We don’t need to share raw data if we can share meta-data generated using uniform analytical methodologies.

Adam disagreed:

@mortman You can’t test & refine models without raw data, & you can’t ask people with the same orientation to bring diverse perspectives.

We went back and forth a bit until it became clear that this needed an actual blog post, so here it is:

I don’t disagree with Adam that we need raw data. He’s absolutely right that without it, you can’t test models. What I was trying to get at was that, even though I would absolutely love to have access to more raw data to test my own theories, it just isn’t realistic to expect that sort of access in the legal and business environment we have today. So until things change, we have to figure out another way to get at the data.

One thing that has become increasingly popular is for vendors to publish aggregate data about what they’ve seen with their customers or on their networks. Verizon and WhiteHat have used this model to great effect. Not only has it generated a lot of press for them, but we as an industry have learned a lot from these reports.

What would be even better is if people would share the models they are using when generating their data. This way, other organizations could use the models and as reports were published, the rest of us could actually compare apples to apples. This would also allow us to more quickly identify issues/errors in the models, allow for public discussion of necessary tweaks and then test said changes while limiting liability for the data owners.

This is really where I was going with my initial thought above: that we need common models so we can have an intelligent discussion. This is also how things generally work in the sciences (yes, Alex, I know, we’re not a science yet :) ). Researchers almost never publish their raw data, but just their models, methods and results. I feel strongly that until we can convince people to share raw data more openly, this is our best shot at figuring out real information about what’s going on in the security world. It’s also what drove me to start developing the soon-to-be-renamed Mortman/Hutton Model that Alex and I presented at Blackhat and BSides Las Vegas.

More data, even if it’s aggregate, is better than no data.
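A minimal sketch of what "meta-data generated using uniform analytical methodologies" might look like in practice: two organizations run the same shared summary function over raw records they keep private, and publish only the output. The function name, the record fields (`vector`, `days_to_detect`), and all of the data are invented for illustration; this is not the Mortman/Hutton Model or anyone's actual reporting format.

```python
from collections import Counter
from statistics import median


def summarize_incidents(incidents):
    """Hypothetical shared 'model': reduce raw incident records to aggregates.

    Each incident is a dict with the assumed keys 'vector' and
    'days_to_detect'. The list must be non-empty.
    """
    by_vector = Counter(incident["vector"] for incident in incidents)
    return {
        "incident_count": len(incidents),
        "by_vector": dict(by_vector),
        "median_days_to_detect": median(
            incident["days_to_detect"] for incident in incidents
        ),
    }


# Made-up raw data; in this scenario it never leaves each organization.
org_a_raw = [
    {"vector": "sql_injection", "days_to_detect": 30},
    {"vector": "phishing", "days_to_detect": 4},
    {"vector": "sql_injection", "days_to_detect": 12},
]
org_b_raw = [
    {"vector": "phishing", "days_to_detect": 2},
    {"vector": "lost_laptop", "days_to_detect": 1},
]

# Only these summaries would be published, and because both came from the
# same function they can be compared apples to apples.
report_a = summarize_incidents(org_a_raw)
report_b = summarize_incidents(org_b_raw)
print(report_a)
print(report_b)
```

Because the function itself is public, anyone can argue about whether the median is the right statistic or whether the vector categories are too coarse, without ever seeing the underlying records.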

Visualization Friday – Beautiful, Functional, and Effective

We can all learn from this great role model, aimed at personal nutrition awareness and education: Nutritiondata.com.

I encourage you to click on the images below to visit the site and explore interactive features. 

nutritiondata-dot-com1
nutritiondata-dot-com2

If only security awareness web sites aimed at end-users and consumers were this good.

VP’s residence is still blurred on Google Earth (political influence on data and its long shadow)

Amusement: Some of you may have heard that former VP Dick Cheney pulled some strings to get Google (or rather their third party supplier) to blur the image of his residence in DC  (One Observatory Circle), presumably for security reasons.   Cheney is out and Biden is in, so you’d think that the image would now be unblurred.  Not so.  Here’s the current image.   Compare it to the neighboring buildings across the street  and you can clearly see that the VP’s residence is still blurred.    What about the more important targets?  Both the White House (1600 Pennsylvania Ave.) and 10 Downing St. in London  are not blurred.  Maybe they didn’t have the same clout as Cheney. 

Lesson: Politics and power can manipulate the data, and also leave a shadow. Could this happen to information security data if it were more visible and public? You bet. I’m not being cynical, just realistic. Reminds me of a team motto from a project long ago: “Trust no one. Believe nothing.” In other words, don’t take any data at face value. Always inquire about the interests of the parties who produce or publish the data.

National Cyber Leap Year Summit reports now available

I believe these are the final deliverables:

Also worth noting is that three task groups came up with a similar idea: a Federal government function similar to CDC or NTSB for collecting, analyzing, and disseminating security data. (See the participants’ report, pp. 50 and 77, and also p. 21 of the co-chairs’ report.)

It’s going to take me a while to digest the 166 pages total, but I plan to post comments here. I wonder what the policy makers and budget writers are going to make of this.

Making Sense of the SANS “Top Cyber Security Risks” Report

The SANS Top Cyber Security Risks report has received a lot of positive publicity (19 online stories, at last count). (TippingPoint and Qualys were partners in the report.) But none of the reporters or bloggers analyzed the report, the methods, or the data. They just repeated the main points from the report.

I applaud the effort and goals of the study and it may have some useful conclusions. We should have more of this type of study, especially at a large scale.

Unfortunately, the report has some major problems, listed roughly in order of severity:  (for details, read on…)

Visualization Friday – Improving a Bad Graphic

We can learn from bad examples and how they can be corrected. Case in point: the newly released SANS “Top Cyber Security Risks” report. Here’s the first graphic in the report:

SANS graphic

(I imagine that this graphic was created by a professional designer based on some simple sketch or even just notes from an expert. I assume that the designer picked all the colors and shapes. Pretty, isn’t it?)

But what does this upside-down pyramid mean, with the arrow by its side and a bar labeled “number of vulnerabilities”? My first several interpretations turned out to be mistaken. I thought the pyramid shape was significant, which it isn’t. There is no meaningful horizontal axis. I thought the colors might be significant (i.e. red = “most severe” and green = “least severe”) but that was mistaken. I thought the bar labeled “number of vulnerabilities” was some separate quantity being represented, which it isn’t. For a moment, I thought the arrow might signify some sort of migration or causation path. Wrong again. Finally, I thought the vertical size of each segment of the pyramid was significant, as if it were proportional to the number of vulnerabilities in that category. Muy wrong-o! The three top slices are all the same size, which suggests they are sized to allow the text to fit comfortably, not to represent any quantity.

I puzzled over this graphic for almost 15 minutes before I was confident I knew what it was trying to communicate. 

I created this alternative graphic to communicate the same message more clearly:

Alternative SANS graphic

This simple graphic expresses the essential message that “total vulnerabilities” is the sum of each of the components.  That is the whole message – nothing more.  Topologically, it’s basically a Venn diagram.  Because the individual boxes are not quite touching, there is less chance for the reader to assume that the box size (or shape) matters in any quantitative sense.  You aren’t visually adding them up in any direction, and therefore you don’t need any axis or axis label.

OK. This graphic isn’t as pretty as the one above, but at least it’s clean and the meaning is very apparent. There’s little room for misinterpretation.

The bigger question is why would this piece of information merit a graphic in the first place?  Wouldn’t a sentence with a bulleted list work just as well?  I think it would.  Thus, this is another case of graphics overkill.

Moral of this story:  don’t simply hand your graphics to a designer with the instructions to “make this pretty”.   Yes, the resulting graphic may be pretty, but it may lose its essential meaning or it might just be more confusing than enlightening.  Someone has to take responsibility for picking the right visualization metaphor and structures.

Proskauer Rose Crows “Rows of Fallen Foes!”

Over on their blog, the law firm announces that yet another class action suit over a breach letter has been dismissed. Unfortunately, that firm is doing a fine business in getting rid of such suits. I say it’s unfortunate for two reasons: first, the sued business has to lay out a lot of money (not as much as a full trial, but it’s not socially useful to transfer money from shareholders to lawyers after a breach). Second, there may be some real harms, but those are not the subject of most of these suits.

As we see more and more breach notices, and as the number of social security numbers exposed comes to exceed the number issued, showing that a particular crime can be traced to a particular breach is going to get harder. The data is traded freely in markets and aggressively stirred together to make it harder to track origins.

Putting together a real case that this breach led to that problem and thus that company is liable is going to be tricky. (And then there’s the question of what actions a company must take, but that’s another post.)

So having learned to mow down all these lawsuits (and Proskauer has it down to a science), I’m going to propose that there’s something else they should be advising their clients to do: notify early and often. The more notices that are out there, the harder it becomes to pin liability for any incident on any one company. So embrace the brave new world in which disclosure is required, and don’t worry about it so much. And while you’re at it, tell us what happened so we can learn from it and start making new and innovative mistakes.

Notes to the Data People

Over on his Guerilla CISO blog, Rybolov suggests that we ask the Data.gov folks for infosec data using their Suggest a data set page. It sounds like a good idea to me! I took his request and built on it. Rather than breaking the flow with quotes and edit marks, I’ll simply say the requests are mostly his work, the context mine.

I’d love feedback before I submit this next week.

Dear Data.gov,

Thank you for the opportunity to suggest data which would be in the public interest to release or make more available. I applaud your mission of improving and updating your site with a wide variety of data.

Today, the public is sorely lacking in data about information security outcomes, that is, what really goes wrong, and how often. The government gathers a great deal of data on federal enterprise security under FISMA and many regulations. It gathers a good deal of information about consumer issues at the FBI and FTC. It may seem that this sort of information falls under the sensitive data exemption which you call out on your suggestion page. I believe that’s not the case, and so before summarily rejecting this request, I ask that you consider the following.

First, it is understood that enterprise security is a challenge, and security failures are widespread. These range from unsecured tax records to sensitive plans showing up on peer-to-peer filesharing networks. A great many government security failures are documented in DatalossDB.org, a project of the Open Security Foundation. It is disgraceful that hobbyists must comb through news reports to make this data available in a better way than Data.gov.

Second, security failings are consistent and not improving. Failings are documented as far back as a GAO report, “Computer Related Crimes in Federal Programs,” submitted to Congress in April 1976. Stripped of jargon and brands, and updated to reflect re-organizations of government departments, the issues and recommendations could be issued today and few people would notice. Data.gov could make available data about what goes wrong that would allow researchers and scientists to assess their advice. The importance of cyber-security has caused President Obama to dedicate a speech to it, order a 60-day special review, etc. The general availability of this data would support and enhance the President’s goals in securing cyber-space.

Third, some small subset of the data may represent on-going issues which are not yet remediated, rather than past issues which have been addressed. These are clearly sensitive, and drawing attention to them would have negative operational consequences. At the same time, there is a public interest in oversight and accountability, and I urge you to consider partial, redacted, or summarized data releases as you balance that sensitivity. For example, information on how many issues each department has open, how long they have been open, and how severe they are is unlikely to change the daily flood of attacks focusing on the Federal information infrastructure.

Therefore, most of the data I am requesting is not sensitive, and its rapid release serves an important public interest.

I am requesting:

Complete responses from the Departments and Agencies to the FISMA reporting requirements for FY2004-2009 based on OMB Memoranda 04-25, 05-15, 06-20, 07-19, 08-21, and 09-29.

Raw incident data for years 2005-2007 as reported to OMB and summarized in their report to Congress on FY2007 FISMA performance and published at http://www.whitehouse.gov/omb/inforeg/reports/2007_fisma_report.pdf

Raw incident data for years 2007 and later in any type and format which would allow a researcher to compile data similar to the Verizon Data Breach Incident Report available at http://www.verizonbusiness.com/resources/security/reports/2009_databreach_rp.pdf

Data collected by the FTC and/or FBI on identity theft, broken down by type and duration, making clear the differences between credit card and other short-term thefts and SSN, driver’s license, or other longer-term impersonation.

This information is necessary for researchers to study the effectiveness of information security management techniques and regulatory schemes and for industry to propose changes to national-level information security management frameworks and legislation such as FISMA. This information for the most part has been released in a summary format to Congress and the release of the complete dataset on data.gov would greatly aid the information security community.
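To make the letter's third point concrete, here is a minimal sketch of the kind of summarized release it describes: per department, how many issues are open, how long the oldest has been open, and how severe they are. The department names, dates, severities, and field names are all invented for illustration, and nothing here reflects an actual FISMA reporting format.

```python
from collections import defaultdict
from datetime import date

# Hypothetical open-issue records; every value here is made up.
open_issues = [
    {"department": "Dept A", "opened": date(2009, 1, 15), "severity": "high"},
    {"department": "Dept A", "opened": date(2009, 6, 1), "severity": "low"},
    {"department": "Dept B", "opened": date(2008, 11, 3), "severity": "high"},
]


def summarize_open_issues(issues, as_of):
    """Roll individual issues up into per-department counts and ages."""
    summary = defaultdict(
        lambda: {"open": 0, "oldest_days": 0, "by_severity": defaultdict(int)}
    )
    for issue in issues:
        row = summary[issue["department"]]
        row["open"] += 1
        age_days = (as_of - issue["opened"]).days
        row["oldest_days"] = max(row["oldest_days"], age_days)
        row["by_severity"][issue["severity"]] += 1
    # Convert nested defaultdicts to plain dicts for publication.
    return {
        dept: {**row, "by_severity": dict(row["by_severity"])}
        for dept, row in summary.items()
    }


summary = summarize_open_issues(open_issues, as_of=date(2009, 9, 30))
for dept, row in sorted(summary.items()):
    print(dept, row)
```

The output says nothing about what any individual issue is or where it lives, which is the point: the sensitive detail stays behind while the accountability numbers go out.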

12 Tips for Designing an InfoSec Risk Scorecard (it’s harder than it looks)

A few months ago on the Securitymetrics.org mailing list, someone bravely posted their draft of an Information Security (InfoSec) Risk scorecard, asking for feedback.  I sent feedback via private email, and then forwarded it to specific people who asked for a copy.  Several of those folks, including the original poster, said I should generalize the feedback and post it some place to help anyone who is trying to design an InfoSec risk scorecard.  Here it is in the form of “12 tips”.

Why is it important to get the design right?  A risk scorecard is often the first step an organization takes toward the risk management approach to InfoSec.  If it’s done poorly, it might be their last step, too.

(For the tips, read on…)