<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The New School of Information Security &#187; Data Analysis</title>
	<atom:link href="http://newschoolsecurity.com/category/data-analysis/feed/" rel="self" type="application/rss+xml" />
	<link>http://newschoolsecurity.com</link>
	<description>The Blog Inspired By The Book</description>
	<lastBuildDate>Mon, 06 Feb 2012 16:09:00 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Sharing Research Data</title>
		<link>http://newschoolsecurity.com/2012/01/sharing-research-data/</link>
		<comments>http://newschoolsecurity.com/2012/01/sharing-research-data/#comments</comments>
		<pubDate>Mon, 30 Jan 2012 15:45:38 +0000</pubDate>
		<dc:creator>adam</dc:creator>
				<category><![CDATA[Data Analysis]]></category>
		<category><![CDATA[disclosure]]></category>
		<category><![CDATA[research papers]]></category>

		<guid isPermaLink="false">http://newschoolsecurity.com/?p=2484</guid>
		<description><![CDATA[I wanted to share an article from the November issue of the Public Library of Science, both because it&#8217;s interesting reading and because of what it tells us about the state of security research. The paper is &#8220;Willingness to Share Research Data Is Related to the Strength of the Evidence and the Quality of Reporting [...]]]></description>
			<content:encoded><![CDATA[<p>I wanted to share an article from the November issue of the Public Library of Science, both because it&#8217;s interesting reading and because of what it tells us about the state of security research.  The paper is &#8220;<a href="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0026828">Willingness to Share Research Data Is Related to the Strength of the Evidence and the Quality of Reporting of Statistical Results</a>.&#8221;  I&#8217;ll quote the full abstract, and encourage you to read the entire 6 page paper.</p>
<blockquote><p>
<b>Background</b><br />
The widespread reluctance to share published research data is often hypothesized to be due to the authors&#8217; fear that reanalysis may expose errors in their work or may produce conclusions that contradict their own. However, these hypotheses have not previously been studied systematically.</p>
<p><b>Methods and Findings</b><br />
We related the reluctance to share research data for reanalysis to 1148 statistically significant results reported in 49 papers published in two major psychology journals. We found the reluctance to share data to be associated with weaker evidence (against the null hypothesis of no effect) and a higher prevalence of apparent errors in the reporting of statistical results. The unwillingness to share data was particularly clear when reporting errors had a bearing on statistical significance.</p>
<p><b>Conclusions</b><br />
Our findings on the basis of psychological papers suggest that statistical results are particularly hard to verify when reanalysis is more likely to lead to contrasting conclusions. This highlights the importance of establishing mandatory data archiving policies.
</p></blockquote>
<p>Despite the fact that the research was done on papers published in psychology journals, it can teach us a great deal about the state of security research.<br />
<P><br />
First, <a href="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0026828">the full paper</a> is available for free online.  Compare and contrast with too many venues in information security.</p>
<p>
Second, the paper considers and tests alternative hypotheses: </p>
<blockquote><p>
Although our results are consistent with the notion that the reluctance to share data is generated by the author&#8217;s fear that reanalysis will expose errors and lead to opposing views on the results, our results are correlational in nature and so they are open to alternative interpretations. Although the two groups of papers are similar in terms of research fields and designs, it is possible that they differ in other regards. Notably, statistically rigorous researchers may archive their data better and may be more attentive towards statistical power than less statistically rigorous researchers. If so, more statistically rigorous researchers will more promptly share their data, conduct more powerful tests, and so report lower p-values. However, a check of the cell sizes in both categories of papers (see Text S2) did not suggest that statistical power was systematically higher in studies from which data were shared.  [Ed: "Text S2" is supplemental data considering the discarded hypothesis.]
</p></blockquote>
<p>But most important, what does it say about the quality of the data we so avariciously hoard in information security?  Could it have something to do with higher prevalence of apparent errors?</p>
<p>
Probably not.  It might surprise you to hear me saying that, but hear me out. We almost never have hypotheses to test, and so our ability to perform statistical re-analysis is almost irrelevant.  We&#8217;re much more fond of saying things like &#8220;It calls the same DLLs as Stuxnet, so it&#8217;s clearly also by the Israelis.&#8221;  Actually, there are several implied hypotheses in there:</p>
<ol>
<li>No code by different authors calls the same DLL
<li>No code calls any undocumented APIs
<li>Stuxnet DLLs are not documented
</ol>
<p>Stuxnet being written by the Israelis is clearly not a hypothesis, but a fact, as documented by Nostradamus.</p>
<p>
More seriously, read the paper, see how good science is done, and ask if anyone is holding us back but ourselves.</p>
<p>
Thanks to Cormac Herley for the pointer.</p>
<p>
]]></content:encoded>
			<wfw:commentRss>http://newschoolsecurity.com/2012/01/sharing-research-data/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The New School of Software Engineering?</title>
		<link>http://newschoolsecurity.com/2012/01/the-new-school-of-software-engineering/</link>
		<comments>http://newschoolsecurity.com/2012/01/the-new-school-of-software-engineering/#comments</comments>
		<pubDate>Wed, 11 Jan 2012 16:33:27 +0000</pubDate>
		<dc:creator>adam</dc:creator>
				<category><![CDATA[blogs & podcasts]]></category>
		<category><![CDATA[Data Analysis]]></category>

		<guid isPermaLink="false">http://newschoolsecurity.com/?p=2447</guid>
		<description><![CDATA[This is a great video about how much of software engineering runs on folk knowledge about how software is built: &#8220;Greg Wilson &#8211; What We Actually Know About Software Development, and Why We Believe It&#8217;s True&#8221; There&#8217;s a very strong New School tie here. We need to study what&#8217;s being done and how well it [...]]]></description>
			<content:encoded><![CDATA[<p>This is a great video about how much of software engineering runs on folk knowledge about how software is built:<br />
<iframe src="http://player.vimeo.com/video/9270320?title=0&#038;byline=0&#038;portrait=0" width="400" height="225" frameborder="0" webkitAllowFullScreen mozallowfullscreen allowFullScreen></iframe></p>
<p>
&#8220;<a href="http://vimeo.com/9270320">Greg Wilson &#8211; What We Actually Know About Software Development, and Why We Believe It&#8217;s True</a>&#8221;</p>
<p>
There&#8217;s a very strong New School tie here.  We need to study what&#8217;s being done and how well it works to figure out how to make better software more reliably.<br />
<P><br />
Incidentally, at around 28 minutes in, Wilson mentions <a href="http://research.microsoft.com/en-us/people/nachin/">Nachi Nagappan</a>&#8217;s work on physical distance versus managerial distance, and then jumps to remote hires at a startup.  While I&#8217;m not sure which paper Wilson is discussing, almost all of Nagappan&#8217;s work is done with Microsoft developers and products.  As such, both have to be seen in the context of Microsoft&#8217;s deep and shared experience in shipping software.  By definition, that <em>shared</em> experience doesn&#8217;t exist at a startup.  And as to the managerial distance issue, it&#8217;s satirically discussed <a href="http://www.joeydevilla.com/2011/07/03/org-charts-of-the-big-tech-companies-plus-an-enhancement/">here</a>.  Assuming that his results generalize is a large jump, and one that I&#8217;m not sure I&#8217;d make.</p>
<p>
]]></content:encoded>
			<wfw:commentRss>http://newschoolsecurity.com/2012/01/the-new-school-of-software-engineering/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>More on Authorization Persistence Threats</title>
		<link>http://newschoolsecurity.com/2011/11/more-on-authorization-persistence-threats/</link>
		<comments>http://newschoolsecurity.com/2011/11/more-on-authorization-persistence-threats/#comments</comments>
		<pubDate>Fri, 18 Nov 2011 16:16:03 +0000</pubDate>
		<dc:creator>adam</dc:creator>
				<category><![CDATA[Data Analysis]]></category>
		<category><![CDATA[Reports and Data]]></category>

		<guid isPermaLink="false">http://newschoolsecurity.com/?p=2327</guid>
		<description><![CDATA[Wade Baker has a quick response to my &#8220;Thoughts on the 2011 DBIR and APT,&#8221; including the data that I was unable to extract. Thanks!]]></description>
			<content:encoded><![CDATA[<p>Wade Baker has a <a href="http://securityblog.verizonbusiness.com/2011/11/17/quick-response-to-thoughts-on-the-2011-dbir-and-apt/">quick response</a> to my &#8220;<a href="http://newschoolsecurity.com/2011/11/thoughts-on-the-2011-dbir-and-apt-authorization-preservation-threats/">Thoughts on the 2011 DBIR and APT</a>,&#8221; including the data that I was unable to extract.</p>
<p>
Thanks!</p>
<p>
]]></content:encoded>
			<wfw:commentRss>http://newschoolsecurity.com/2011/11/more-on-authorization-persistence-threats/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Diginotar Quantitative Analysis (&#8220;Black Tulip&#8221;)</title>
		<link>http://newschoolsecurity.com/2011/09/diginotar-quantitative-analysis-black-tulip/</link>
		<comments>http://newschoolsecurity.com/2011/09/diginotar-quantitative-analysis-black-tulip/#comments</comments>
		<pubDate>Tue, 13 Sep 2011 15:12:05 +0000</pubDate>
		<dc:creator>adam</dc:creator>
				<category><![CDATA[Data Analysis]]></category>
		<category><![CDATA[disclosure]]></category>
		<category><![CDATA[Doing it Differently]]></category>
		<category><![CDATA[measurement]]></category>
		<category><![CDATA[Reports and Data]]></category>

		<guid isPermaLink="false">http://newschoolsecurity.com/?p=2272</guid>
		<description><![CDATA[Following the Diginotar breach, FOX-IT has released analysis and a nifty video showing OCSP requests. As a result, lots of people are quoting a number of &#8220;300,000&#8221;. Cem Paya has a good analysis of what the OCSP numbers mean, what biases might be introduced at &#8220;DigiNotar: surveying the damage with OCSP.&#8221; To their credit, FoxIt [...]]]></description>
			<content:encoded><![CDATA[<p>Following the Diginotar breach, FOX-IT has released <a href="http://www.rijksoverheid.nl/bestanden/documenten-en-publicaties/rapporten/2011/09/05/diginotar-public-report-version-1/rapport-fox-it-operation-black-tulip-v1-0.pdf">analysis</a> and a nifty <a href="http://www.youtube.com/watch?v=wZsWoSxxwVY">video</a> showing OCSP requests.</p>
<p>
As a result, lots of people are quoting a number of &#8220;300,000&#8221;.  </p>
<p>
Cem Paya has a good analysis of what the OCSP numbers mean, what biases might be introduced at &#8220;<a href="http://randomoracle.wordpress.com/2011/09/11/diginotar-surveying-the-damage-with-ocsp/">DigiNotar: surveying the damage with OCSP</a>.&#8221;</p>
<blockquote><p>
To their credit, FoxIt  tried to investigate the extent of the damage by monitoring OCSP logs for users checking on the status of the forged Google certificate. There is a neat YouTube video showing the geographic distribution of locations around the world over time. Unfortunately while this half-baked attempt at forensics makes for great visualization, it presents a very limited picture of impacted users.
</p></blockquote>
<p>Diginotar and Fox-IT released enough that a dedicated secondary analyst like Cem can see methodological flaws in what they did.  What else could we learn if we had more of the raw observations?  When I read the report, I noticed the claim &#8220;A number of malicious/hacker software tools was found. These vary from commonly used tools such a the famous Cain &#038; Abel tool to tailor made software.&#8221;   This claim mixes analysis and observation.  The observation is that there was software with which the analyst was not familiar.  It may be that it was a perl script or other code that can be easily skimmed to see that it was &#8220;tailor made.&#8221;  It may be that it was just something re-compiled to not match a hash.  We don&#8217;t know.  Similarly, the report claims (4.1) &#8220;In at least one script, fingerprints from the hacker are left on purpose, which were also found in the Comodo breach investigation of March 2011.&#8221;  Really?  On purpose?  Perhaps the fingerprints were inserted as a matter of dis-information.  Perhaps the Fox-IT analyst called the intruder on the phone, and he owned up to it.  We don&#8217;t know.</p>
<p>
I want to be clear that I don&#8217;t mean to be picking on Fox-IT here.  My understanding is that the report they prepped came out incredibly quickly, and kudos to them for that.  I&#8217;ve cherry picked two areas where I can ask for better editing, but I&#8217;m very aware that that editing comes at a cost in timeliness.</p>
<p>
Cem&#8217;s article is very much worth reading, as is the Fox-IT report.  But Cem&#8217;s analysis helps illustrate a theme of the New School, which is that we need diverse perspectives and analysis brought to bear on each report.  The more data we see, the more we can learn from it.  No single analysis will tell us everything we might learn.  (I made a similar point <a href="http://newschoolsecurity.com/2011/06/how-the-epsilon-breach-hurts-consumers/">here</a>.)</p>
<p>
I am left with a question for Cem, which I would have added to his post, but couldn&#8217;t comment there.  My question is, having given all that thought to all the biases, what do you think is the probable true number (or range) of affected people?</p>
<p>
]]></content:encoded>
			<wfw:commentRss>http://newschoolsecurity.com/2011/09/diginotar-quantitative-analysis-black-tulip/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Securosis goes New School</title>
		<link>http://newschoolsecurity.com/2011/08/securiosis-goes-new-school/</link>
		<comments>http://newschoolsecurity.com/2011/08/securiosis-goes-new-school/#comments</comments>
		<pubDate>Wed, 10 Aug 2011 20:12:42 +0000</pubDate>
		<dc:creator>Russell</dc:creator>
				<category><![CDATA[Data Analysis]]></category>
		<category><![CDATA[Doing it Differently]]></category>
		<category><![CDATA[metrics]]></category>

		<guid isPermaLink="false">http://newschoolsecurity.com/?p=2254</guid>
		<description><![CDATA[The fine folks at Securosis are starting a blog series on &#8220;Fact-based Network Security: Metrics and the Pursuit of Prioritization&#8221;, starting in a couple of weeks.  Sounds pretty New School to me!  I suggest that you all check it out and participate in the dialog.  Should be interesting and thought provoking. [Edit -- fixed my [...]]]></description>
			<content:encoded><![CDATA[<p>The fine folks at Securosis are starting a blog series on &#8220;<a href="http://www.securosis.com/blog/new-blog-series-fact-based-network-security-metrics-and-the-pursuit-of-prio">Fact-based Network Security: Metrics and the Pursuit of Prioritization</a>&#8221;, starting in a couple of weeks.  Sounds pretty New School to me!  I suggest that you all check it out and participate in the dialog.  Should be interesting and thought provoking.</p>
<p><em>[Edit -- fixed my misspelling of company name.  D'oh!]</em></p>
]]></content:encoded>
			<wfw:commentRss>http://newschoolsecurity.com/2011/08/securiosis-goes-new-school/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Fixes to Wysopal’s Application Security Debt Metric</title>
		<link>http://newschoolsecurity.com/2011/03/fixes-to-wysophal%e2%80%99s-application-security-debt-metric/</link>
		<comments>http://newschoolsecurity.com/2011/03/fixes-to-wysophal%e2%80%99s-application-security-debt-metric/#comments</comments>
		<pubDate>Sat, 05 Mar 2011 09:47:27 +0000</pubDate>
		<dc:creator>Russell</dc:creator>
				<category><![CDATA[data]]></category>
		<category><![CDATA[Data Analysis]]></category>
		<category><![CDATA[metrics]]></category>
		<category><![CDATA[Science of Risk Management]]></category>

		<guid isPermaLink="false">http://newschoolsecurity.com/?p=2099</guid>
		<description><![CDATA[In two recent blog posts (here and here), Chris Wysopal (CTO of Veracode) proposed a metric called “Application Security Debt”.  I like the general idea, but I have found some problems in his method.  In this post, I suggest corrections that will be both more credible and more accurate, at least for half of the [...]]]></description>
			<content:encoded><![CDATA[<p>In two recent blog posts (<a href="http://www.veracode.com/blog/2011/02/application-security-debt-and-application-interest-rates/" target="_blank">here</a> and <a href="http://www.veracode.com/blog/2011/03/a-financial-model-for-application-security-debt/" target="_blank">here</a>), Chris Wysopal (CTO of Veracode) proposed a metric called “Application Security Debt”.  I like the general idea, but I have found some problems in his method.  In this post, I suggest corrections that will be both more credible and more accurate, at least for half of the formula.  The second half is harder to do right and needs more thinking.</p>
<p><span id="more-2099"></span><span style="font-weight: bold;">Overview</span></p>
<p>Application Security Debt is based on the concept of “technical debt” proposed by Ward Cunningham (the programmer who developed the first wiki program), who describes it like this:</p>
<blockquote><p>Shipping first time code is like going into debt. A little debt speeds development so long as it is paid back promptly with a rewrite… The danger occurs when the debt is not repaid. Every minute spent on not-quite-right code counts as interest on that debt. Entire engineering organizations can be brought to a stand-still under the debt load of an unconsolidated implementation, object-oriented or otherwise.</p></blockquote>
<p>Chris adds:</p>
<blockquote><p>The cost of technical debt is the time and money it will take to rewrite the poor code after you ship and bring it back to the quality required to maintain the software over the long haul.</p></blockquote>
<p>Here is Chris’ summary of <strong>Application Security Debt</strong>:</p>
<blockquote><p>Security debt is similar to technical debt. Both debts are design and implementation constructions that have negative aspects that aggregate over time and the code must be re-worked to get out of debt. Security debt is based on the latent vulnerabilities within an application. Application interest rates are the real world factors outside of the control of the software development team that lead to vulnerabilities having real cost. These factors include the cost of a security breach and attacker motivation to discover and exploit the latent vulnerabilities.</p></blockquote>
<p>Chris’ <a href="http://www.veracode.com/blog/2011/03/a-financial-model-for-application-security-debt/" target="_blank">second post</a> describes a financial model that estimates the cost of Application Security Debt.  Framing the metric in financial terms will presumably help managers compare the cost of the “debt” to the cost of developing more secure software or costs of fixing the vulnerabilities.  (Note: Veracode provides a range of <a href="http://www.veracode.com/solutions/application-security-testing.html" target="_blank">application security testing services</a>, so they have an interest in economically justifying their services.  This isn’t a criticism of Veracode, Chris, or his proposal.  Just a reality.)</p>
<p>Chris’ model is focused on the simplest case where the application developer and application user are the same organization, so that it bears the costs of development, maintenance, and also any security breaches that result.  Starting with the simplest case is a great idea when proposing a new method.  So far so good.</p>
<p>Chris defines his financial model this way:</p>
<blockquote><p>The basic financial model for security debt is monetary risk that can be expressed as <em>expected loss</em>. The formula for expected loss is <strong>event likelihood X impact in dollars</strong>. Event likelihood is based on the makeup of vulnerabilities in the application and the likelihood that the vulnerabilities will be discovered and exploited. The impact is the cost of a security breach based on an exploit of one of those vulnerabilities.  [Emphasis in original]</p></blockquote>
<p>This is, of course, a version of the bottom-up Annualized Loss Expectancy (ALE) formula for individual risk elements:</p>
<ul>
<li>ALE = Single Loss Expectancy X Annual Rate of Occurrence</li>
</ul>
<p>(Mike Rothman recently <a href="http://securosis.com/blog/firestarter-risk-metrics-are-crap" target="_blank">crapped on all “risk metrics”</a> by lumping them all into the ALE formula.  I’ll critique ALE and Mike’s post in a separate blog post.)</p>
<p>ALE issues aside, I think Chris is making mistakes in his definition of Application Security Debt that will lead to serious confusion.</p>
<h4>Debt = Expected Principal + Interest Costs</h4>
<p>Chris makes a mistake when he defines the monetary value of Application Security Debt as the expected loss due to security breaches.  Instead, the &#8216;Principal&#8217; part of the debt formula is the cost of fixing security problems beyond what is budgeted. Chris had it right in his summary in the first article:</p>
<blockquote><p>The cost of technical debt is the time and money it will take to rewrite the poor code after you ship and bring it back to the quality required to maintain the software over the long haul.</p></blockquote>
<p>Expected losses are in the category of “Interest Costs” as Chris said in his summary:</p>
<blockquote><p>Application interest rates are the real world factors outside of the control of the software development team that lead to vulnerabilities having real cost.</p></blockquote>
<p>Putting this together in simple language:</p>
<p><em>“Application Security Debt is a ‘loan’ with variable principal which could range from 0% to 100% of your original project costs. The &#8216;principal&#8217; is what you&#8217;ll eventually have to pay to fix security bugs or rewrite the code.  It also has varying and uncertain &#8216;interest costs&#8217;, which are the costs of security breaches due to these vulnerabilities. This includes the possibility of the mother-of-all balloon payments (i.e. a huge loss event).”</em></p>
<p>The good news is that Expected Principal is relatively easy to estimate with good accuracy and without a lot of outside data.  The not-so-good-news is that Interest Cost is a bear to estimate.</p>
<h4>Estimating ‘Expected Principal’</h4>
<p>For simplicity, let’s assume that cost of fixing code (above the budgeted costs) occurs in discrete increments, <em>F</em>:</p>
<ol>
<li>Zero  (i.e. your debt is ‘forgiven’)</li>
<li>Minor fixes and patches (&#8216;Principal&#8217; = 10% increase in project cost)</li>
<li>Major fixes and patches  (&#8216;Principal&#8217; = 25% increase in project cost)</li>
<li>Substantial rewrite (&#8216;Principal&#8217; = 50% increase in project cost)</li>
<li>Total rewrite   (&#8216;Principal&#8217; = 100% increase in project cost, or more)</li>
</ol>
<p>Thus, the best case is that you owe no principal and the worst case is that you owe principal equal to the entire cost of the project.  You could include other factors such as external costs of schedule delays, costs of rehiring your programmers after you fire them all <img src='http://newschoolsecurity.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> , or whatever.  My point is that these costs are not open-ended, but are a multiplier on your original development costs.</p>
<p>The Expected Principal (EP) is equal to each of these cost scenarios multiplied by their probability of management choosing that option:</p>
<p><a href="http://newschoolsecurity.com/wp-content/uploads/2011/03/EP-formula.png"><img class="aligncenter size-full wp-image-2100" src="http://newschoolsecurity.com/wp-content/uploads/2011/03/EP-formula.png" alt="" width="272" height="130" /></a></p>
<p>For example, if the original cost of the application development project is $1 million, and there is a 5% chance of Zero costs, an 80% chance of Minor code fix costs, and a 15% chance of Substantial rewrite costs, then the Expected Principal would be $155,000 (0.05 × $0 + 0.80 × $100,000 + 0.15 × $500,000), or about 16% of the original cost.</p>
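<p>As a rough check of that arithmetic, here is a minimal Python sketch.  The scenario names and the <em>expected_principal</em> helper are illustrative assumptions, not part of Chris’ model, and the cost multipliers are simply the increments assumed in the list above.</p>

```python
# Assumed cost multipliers from the scenario list above
# (fraction of the original project cost owed as 'principal').
COST_MULTIPLIER = {
    "zero": 0.00,
    "minor": 0.10,
    "major": 0.25,
    "substantial": 0.50,
    "total": 1.00,
}


def expected_principal(project_cost, scenario_probabilities):
    """Sum of p(F) * F over the code-fix scenarios (hypothetical helper)."""
    total_probability = sum(scenario_probabilities.values())
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("scenario probabilities must sum to 1")
    return sum(
        probability * COST_MULTIPLIER[scenario] * project_cost
        for scenario, probability in scenario_probabilities.items()
    )


# Worked example from the text: a $1 million project.
example = {"zero": 0.05, "minor": 0.80, "substantial": 0.15}
ep = expected_principal(1_000_000, example)
print(f"Expected Principal: ${ep:,.0f} ({ep / 1_000_000:.1%} of project cost)")
```

<p>Changing the probabilities in <em>example</em> shows how sensitive the estimate is to management’s ‘threshold of pain’.</p>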
<p><strong>This is important: </strong>Expected Principal is ultimately determined by management decisions and ‘threshold of pain’.  This means that the value of <em>p(F)</em>, above, is a subjective probability.  It would be an ideal metric to estimate using prediction markets (PMs).   (PMs have been used successfully in software development to estimate shipment dates and defect rates, for example.)</p>
<p><strong>Another implication</strong>: you don’t need to accurately forecast future loss events or their economic impact to get a decent estimate of Expected Principal.  Instead, you only need to estimate the Interest Costs very roughly to determine which code fix scenario is most likely.    You could even estimate <em>p(F)</em> by setting thresholds for the number and severity of vulnerabilities discovered by certain levels of effort.  Better, you could combine these methods to ‘triangulate’ on estimates of <em>p(F).</em></p>
<p>To calibrate these subjective probability estimates, it would be <strong>very helpful to collect historical data on the % of applications that have some level of rewrite or schedule delay due to security problems</strong>.  (Hint hint!)</p>
<h4>Estimating ‘Interest Costs’ on the Debt will be Hard</h4>
<p>The second part of the Application Security Debt formula is ‘Interest Costs’.  This is where things get hairy.   All the members of the ALE family of risk calculations have similar flaws: 1) prodigious data requirements and 2) propagation of uncertainty through the calculations.  Furthermore, some suffer by using only mean values and ignoring extreme values (i.e. the “tails” of the probability distribution curves).</p>
<p>Chris acknowledges these issues, at least the requirement for more and better data:</p>
<blockquote><p>Now you are probably thinking that this is getting a little tenuous and it is. We need better data on likelihood type and likelihood of an application breach by industry and other factors like company size.</p></blockquote>
<p>Data issues aside, I think there are flaws in his use of ALE and calculation methods.  Here’s one thought experiment to show how it could lead to the wrong conclusions, in my opinion.</p>
<p>Let’s use Chris’ ‘baseline expected loss’ table, where he calculates the expected loss for each type of vulnerability.  Imagine that we are comparing two similar applications, A and B.  Assume that each project is expected to have the same number of vulnerabilities, five each.  Let’s say the development cost of each project is $1 million.  Application A has five SQL injection vulnerabilities while application B has one SQL Injection vulnerability and four Remote File Inclusion vulnerabilities.  Doing the calculations:</p>
<ul>
<li>A’s expected losses = $19,220,000</li>
<li>B’s expected losses = $5,074,080</li>
</ul>
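<p>Both totals are what you get by adding up a fixed expected loss per vulnerability.  In the Python sketch below, the per-vulnerability figures are back-solved from the two totals above rather than copied from Chris’ table, so treat them as assumptions; the <em>linear_expected_loss</em> helper is hypothetical as well.</p>

```python
# Per-vulnerability expected losses back-solved from the two totals above:
#   5 * SQLi = $19,220,000            -> SQLi = $3,844,000
#   1 * SQLi + 4 * RFI = $5,074,080   -> RFI  = $307,520
# These are inferred values, not figures quoted from Chris' table.
EXPECTED_LOSS = {
    "sql_injection": 3_844_000,
    "remote_file_inclusion": 307_520,
}


def linear_expected_loss(vulnerability_counts):
    """Add up a fixed expected loss per vulnerability (linear assumption)."""
    return sum(
        count * EXPECTED_LOSS[kind]
        for kind, count in vulnerability_counts.items()
    )


app_a = linear_expected_loss({"sql_injection": 5})
app_b = linear_expected_loss({"sql_injection": 1, "remote_file_inclusion": 4})
print(f"A: ${app_a:,}  B: ${app_b:,}  ratio: {app_a / app_b:.2f}")
```

<p>Under this linear assumption, every additional vulnerability adds the same amount of expected loss, regardless of context.</p>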
<p><em>Does project A really have nearly four times the risk of project B?</em> Probably not.  From what I know, the number of vulnerabilities in an application is not proportional to the likelihood that the application will be breached.  Instead, I’d guess that the likelihood of being breached is a function of where the application is in the IT architecture, how accessible it is, how important it is to attackers, etc.</p>
<p>Also, there’s the ‘weakest link’ effect: “given enough random attackers or one persistent attacker, it only takes one vulnerability to lead to a breach”.  Assuming all SQL Injection vulnerabilities are equally discoverable and equally exploitable, then we should estimate that application B with one SQL Injection vulnerability is just as likely to get breached as application A with five, all other things being equal.</p>
<p>(I confess I’m not an expert in application security or vulnerability analysis, so these comments are my interpretation of what others have written or said.)</p>
<p>Even if my logic here is flawed somewhat, my main point is that the relation between number of vulnerabilities and likelihood of being breached is non-linear and it may even be indeterminate if contextual factors dominate.</p>
<p>This example also hints at another severe weakness in the ALE method – it ignores correlation and dependence between risk elements and factors.  We know from forensic analysis and the DBIR that severe security breaches involve a sequence of exploits and attacks.  This means that the likelihood of breach in one application is dependent on the likelihood of breach in other applications and systems.  An application might appear unimportant, but it might be a stepping-stone to other applications, databases, and networks.</p>
<p>It’s hard to account for all these factors and influences together without some sort of over-arching model for enterprise-level information security and risk.   Basically, you are looking for the ‘risk contribution’ of those specific application vulnerabilities to total costs, now and in the uncertain future.    Formally, the ‘Interest Cost’ for any given set of application vulnerabilities is the difference between the <a href="http://meritology.com/resources/Total%20Cost%20of%20Cyber%20(In)security.ppt" target="_blank">Total Cost of Security (TCoS)</a> in two possible worlds: World 1) application A has X vulnerabilities, vs. World 2) application A does not have X vulnerabilities (or if application A is not deployed at all).</p>
<p>What we really need are some short-cut approximations for this that don’t require a complete data set and risk estimates for the whole enterprise.  One approach I’m interested in is using modern AI methods (data mining, machine learning, inference methods).  This is on-going research.</p>
<h4>Summary</h4>
<p>I’m glad Chris proposed his Application Security Debt metric.  I hope my post has been helpful in correcting some of the errors, as I see them.  The good news is that the “Expected Principal” component of the metric looks like it can be estimated fairly easily and with good accuracy.  On the other hand, the “Interest Cost” component needs a lot of work.  I’m happy to collaborate with Chris or anyone else who wants to work on this.</p>
]]></content:encoded>
			<wfw:commentRss>http://newschoolsecurity.com/2011/03/fixes-to-wysophal%e2%80%99s-application-security-debt-metric/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Infosec&#8217;s Flu</title>
		<link>http://newschoolsecurity.com/2011/02/infosecs-flu/</link>
		<comments>http://newschoolsecurity.com/2011/02/infosecs-flu/#comments</comments>
		<pubDate>Fri, 04 Feb 2011 16:53:59 +0000</pubDate>
		<dc:creator>adam</dc:creator>
				<category><![CDATA[best practice]]></category>
		<category><![CDATA[Data Analysis]]></category>
		<category><![CDATA[research papers]]></category>

		<guid isPermaLink="false">http://newschoolsecurity.com/?p=2059</guid>
		<description><![CDATA[In &#8220;Close Look at a Flu Outbreak Upends Some Common Wisdom,&#8221; Nicholas Bakalar writes: If you or your child came down with influenza during the H1N1, or swine flu, outbreak in 2009, it may not have happened the way you thought it did. A new study of a 2009 epidemic at a school in Pennsylvania [...]]]></description>
			<content:encoded><![CDATA[<p>In &#8220;<a href="http://www.nytimes.com/2011/02/08/health/research/08flu.html?_r=1">Close Look at a Flu Outbreak Upends Some Common Wisdom</a>,&#8221; Nicholas Bakalar writes:
</p><blockquote>
<p>If you or your child came down with influenza during the H1N1, or swine flu, outbreak in 2009, it may not have happened the way you thought it did.</p>
<p>
A new study of a 2009 epidemic at a school in Pennsylvania has found that children most likely did not catch it by sitting near an infected classmate, and that adults who got sick were probably not infected by their own children.</p>
<p>
Closing the school after the epidemic was under way did little to slow the rate of transmission, the study found, and the most common way the disease spread was through a child’s network of friends.
</p></blockquote>
<p>The work he discusses is &#8220;<a href="http://www.pnas.org/content/early/2011/01/28/1008895108.full.pdf+html">Role of social networks in shaping disease transmission during a community outbreak of 2009 H1N1 pandemic influenza</a>&#8221; by Simon Cauchemez, Achuyt Bhattarai, Tiffany L. Marchbanks, Ryan P. Fagan, Stephen Ostroff, Neil M. Ferguson, David Swerdlow, and the Pennsylvania H1N1 working group.</p>
<p>
The first thing that comes to mind is that closing schools is a best practice.  It&#8217;s something that makes so much sense that it&#8217;s hard to argue against, even if it does no good.  The next thing is to look at what happens when they have data available to them.  They can study their prescriptions and test to see if they did any good.  But note how detailed the data is: social graphs, seating charts.  This isn&#8217;t something we would obviously get from more detailed breach notices.  It&#8217;s going to require in-depth investigations, and investigators who talk about their methods.  VERIS is a step in this direction, and I&#8217;m looking forward to seeing critiques or even competitors that can help us move forward and learn.<br />
<P><br />
But the data we have is the data we have, and while we work to get more, there&#8217;s a good deal that we can probably learn from what&#8217;s out there.  We just have to be willing to ask if our practices really work.</p>
<p>
]]></content:encoded>
			<wfw:commentRss>http://newschoolsecurity.com/2011/02/infosecs-flu/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Dark Reading Virtual Event &amp; Evidence-Based Risk Management</title>
		<link>http://newschoolsecurity.com/2011/02/dark-reading-virtual-event-evidence-based-risk-management/</link>
		<comments>http://newschoolsecurity.com/2011/02/dark-reading-virtual-event-evidence-based-risk-management/#comments</comments>
		<pubDate>Thu, 03 Feb 2011 13:54:18 +0000</pubDate>
		<dc:creator>alex</dc:creator>
				<category><![CDATA[Data Analysis]]></category>
		<category><![CDATA[Reports and Data]]></category>

		<guid isPermaLink="false">http://newschoolsecurity.com/?p=2057</guid>
		<description><![CDATA[Hey, I know it&#8217;s late notice, but I&#8217;ll be speaking at 10:30 EST today on EBRM and the Verizon DBIR: https://www.techwebonlineevents.com/ars/eventregistration.do?mode=eventreg&#038;F=1002809&#038;K=CAA1BC&#038;tab=agenda Alex]]></description>
			<content:encoded><![CDATA[<p>Hey, I know it&#8217;s late notice, but I&#8217;ll be speaking at 10:30 EST today on EBRM and the Verizon DBIR:</p>
<p><a href="https://www.techwebonlineevents.com/ars/eventregistration.do?mode=eventreg&amp;F=1002809&amp;K=CAA1BC&amp;tab=agenda">https://www.techwebonlineevents.com/ars/eventregistration.do?mode=eventreg&amp;F=1002809&amp;K=CAA1BC&amp;tab=agenda</a></p>
<p>Alex</p>
]]></content:encoded>
			<wfw:commentRss>http://newschoolsecurity.com/2011/02/dark-reading-virtual-event-evidence-based-risk-management/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Another critique of Ponemon&#8217;s method for estimating &#8216;cost of data breach&#8217;</title>
		<link>http://newschoolsecurity.com/2011/01/another-critique-of-ponemons-method-for-estimating-cost-of-data-breach/</link>
		<comments>http://newschoolsecurity.com/2011/01/another-critique-of-ponemons-method-for-estimating-cost-of-data-breach/#comments</comments>
		<pubDate>Wed, 26 Jan 2011 01:55:14 +0000</pubDate>
		<dc:creator>Russell</dc:creator>
				<category><![CDATA[breaches]]></category>
		<category><![CDATA[Data Analysis]]></category>
		<category><![CDATA[Reports and Data]]></category>

		<guid isPermaLink="false">http://newschoolsecurity.com/?p=2031</guid>
		<description><![CDATA[I have fundamental objections to Ponemon's methods used to estimate 'indirect costs' due to lost customers ('abnormal churn') and the cost of replacing them ('customer acquisition costs').  These include sloppy use of terminology, mixing accounting and economic costs, and omitting the most serious cost categories.]]></description>
			<content:encoded><![CDATA[<p>Adam just <a href="http://newschoolsecurity.com/2011/01/a-critique-of-ponemon-institute-methodology-for-churn/">posted is general critiques</a> of the annual <a href="http://www.encryptionreports.com/download/Ponemon_COB_2009_US.pdf">US Cost of Data Breach Study</a>.  I agree with his critique about survey methods, but I have more fundamental objections to their methods used to estimate &#8216;indirect costs&#8217; due to lost customers (&#8216;abnormal churn&#8217;) and the cost of replacing them (&#8216;customer acquisition costs&#8217;).</p>
<h4>A noble effort, but&#8230;</h4>
<p>Before I start chopping it up, let me say that I think their annual survey is a good effort and it&#8217;s positive that they can get sponsorship and also readership for the results.  I think they have good intentions and try to give a fair, balanced, and reasonable estimate.  Our field would be better off if there were similar data gathering efforts in other areas of InfoSec.  I also don&#8217;t believe that any of the errors are due to intentions to &#8216;spin&#8217; or mislead.  It looks like they didn&#8217;t have sufficient expertise on their team in business finance, marketing analysis, and economics.</p>
<p>But I see some serious problems with their methods.  This is a big deal since &#8216;indirect costs&#8217; make up a majority (68%) of their estimate of total costs.</p>
<h4><span id="more-2031"></span>Problem #1: A fog of buzz words</h4>
<p>If their data and analysis were bulletproof, then maybe we could forgive sloppy use of terms.  But it isn&#8217;t bulletproof and their use of terms is actually misleading because it gives the impression that the method is well established and well executed when it really isn&#8217;t.  Furthermore, it&#8217;s a sign that whoever is doing this part of the analysis doesn&#8217;t know what they are talking about.  Examples:</p>
<ul>
<li>&#8220;<strong>The survey design relied upon a <em>shadow costing</em> method used in applied economic research.</strong>&#8221; (p. 36) &#8212; <em>There is no such method as &#8216;shadow costing&#8217;</em>.  Do a web search if you doubt me.  The only examples of &#8216;shadow costing&#8217; are economic studies that use &#8216;<a href="http://en.wikipedia.org/wiki/Shadow_price">shadow prices</a>&#8217; multiplied by input quantities to derive &#8216;shadow costs&#8217; for certain manufacturing or service processes.  Having just completed a <a href="http://www.omar.ec/index.php?option=com_content&amp;task=view&amp;id=57&amp;Itemid=0">Mathematical Economics</a> class last semester, I can assure you that the Ponemon method has nothing to do with shadow prices or shadow costs.</li>
<li>&#8220;<strong>Utilizing <em>activity-based costing</em></strong>&#8230;&#8221; (p. 3) and &#8220;<strong>The diagram below illustrates the <em>activity-based costing</em> schema</strong>&#8230;&#8221; (p. 36) &#8212; <em>They do not use activity-based costing</em>.  Activity-based costing (ABC) is a way of allocating overhead costs by measuring some &#8216;activity&#8217; in operations that is thought to drive those overhead costs.   The ratio of &#8216;activity&#8217; for each business unit to the total is used to allocate the overhead cost to that business unit (or product line or customer segment or whatever).  You can read more about ABC <a href="http://www.sas.com/resources/whitepaper/wp_5073.pdf">here</a> and <a href="http://hbswk.hbs.edu/item/4587.html">here</a>.   How big a flub is this?  Big.  It&#8217;s like labeling signature-based AV software as an &#8216;expert system&#8217;.  Anyone who uttered such a statement would immediately be dismissed by security experts.  Whatever the Ponemon method is, it&#8217;s not ABC.  Just because costs are related to activities does not mean you are doing activity-based costing.</li>
<li>&#8220;&#8230;<strong>most companies experience <em>opportunity costs</em> associated with a breach incident</strong>&#8230;&#8221; &#8212; <em>No, these aren&#8217;t &#8216;opportunity costs&#8217;</em>.  The term &#8216;<a href="http://en.wikipedia.org/wiki/Opportunity_cost">opportunity cost</a>&#8217; has a very specific meaning in microeconomics.  Basically, an opportunity cost is the cost of giving up your next-best alternative when you make a decision.  They misuse the term here to refer to costs that are expected in the future, i.e. beyond the historical frame of the breach incident and post-incident remediation.  Their use of the term here is just sloppy and could easily mislead someone who doesn&#8217;t know economics.</li>
</ul>
<h4>Problem #2: Mixing accounting costs with economic costs</h4>
<p>This is a subtle but fundamental problem, and it&#8217;s why people get degrees in accounting and economics.  Accounting costs are those that appear in a financial statement somewhere and follow specific costing rules, e.g. GAAP.  They have already occurred (historical costs) or they are forecasted to occur (pro forma costs).  In the Ponemon method, they list four categories of accounting costs:</p>
<ol>
<li>Detection or discovery</li>
<li>Escalation</li>
<li>Notification</li>
<li>Ex-post response</li>
</ol>
<p>Now it&#8217;s probably true that most organizations do not have explicit accounts for these costs so they have to be derived from other accounting costs.  But it&#8217;s pretty easy to slice and dice accounting data (i.e. general ledger entries) to get decent estimates of these costs.  It&#8217;s also possible to estimate costs by using per-resource costs (labor cost per hour) multiplied by the usage of those resources (hours to resolve an incident).  In the Ponemon survey, they ask their point-of-contact for their <em>estimate</em> of these costs.  That&#8217;s probably OK, given that the point-of-contact is a privacy/security person directly involved in the incident.</p>
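<p>The per-resource approach mentioned above is simple arithmetic.  Here is a minimal Python sketch; the roles, hours, and hourly rates are hypothetical placeholders, not figures from the Ponemon report.</p>

```python
def incident_labor_cost(hours_by_role, hourly_rate_by_role):
    """Sum of (hours spent on the incident) x (loaded labor cost per hour) per role."""
    return sum(
        hours * hourly_rate_by_role[role]
        for role, hours in hours_by_role.items()
    )


# Hypothetical usage and rates for one breach incident.
hours = {"incident_responder": 120, "legal_counsel": 30, "help_desk": 200}
rates = {"incident_responder": 95.0, "legal_counsel": 250.0, "help_desk": 35.0}

print(incident_labor_cost(hours, rates))
```

<p>An estimate built this way can be checked against timesheets and invoices, which is what makes it an accounting cost rather than a guess.</p>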
<p>But then they mix in future economic costs (what they mislabel as &#8216;opportunity costs&#8217;):</p>
<ol>
<li>Turnover intentions of existing customers</li>
<li>Diminished new customer acquisition</li>
</ol>
<p>(Leave aside for a moment that they are asking about &#8220;intentions&#8221; of customers to defect.  Adam discussed this in his post.)</p>
<p>These are both economic costs.  (See this <a href="http://www.willamette.edu/~fthompso/501/Ch7.pdf">slide deck, slide #2</a>.  The wikipedia article on this topic is not good.)  Basically, economic costs are all in the future.  There is no such thing as &#8216;historical economic costs&#8217;.   Only cash flows count in economic costs &#8212; no &#8216;intangibles&#8217;, no depreciation, no &#8216;good will&#8217;.  Those can only be included in the form of future cash flows discounted for time and risk (and uncertainty).  Economic costs include opportunity costs (see above), which are the cash flows associated with the next-best alternative.  However, opportunity costs will <em>never</em> appear on a financial statement, now or in any future.  Economic costs are incurred when the commitment is made, not when they are recognized in the accounting system.</p>
<p><strong>Most important:</strong><em> All cash flows are discounted for the time value of money and the riskiness of the cash flow. </em> This feature is essential for rational decision-making over time and over risky alternatives, but it also guarantees that no estimate of economic costs will ever equal the corresponding accounting costs because accounting systems do not adjust for the time value of money or risk.  Finally, a full estimate of economic costs includes the present value of &#8216;real options&#8217; and  should be adjusted for risk (i.e. derating by using the costs of insuring against unexpected/extreme events, cost of lowered credit rating, etc.).</p>
<p>The element of their method that specifically invokes economic cost is &#8216;lifetime value of customers&#8217; (LTV).  In their method, the cost associated with lost customers is estimated by multiplying the % of breached customer records that will defect (&#8216;abnormal customer churn rate&#8217;) by LTV.  (LTV originated in direct marketing in the 1980s.  Wikipedia has <a href="http://en.wikipedia.org/wiki/Customer_lifetime_value">a decent article</a> that explains it. Here&#8217;s <a href="http://hbsp.harvard.edu/multimedia/flashtools/cltv/index.html">a good demonstration</a>.)   LTV is a net present value, discounted by the cost of capital associated with the riskiness of the cash flow.  It&#8217;s an economic profit, not an accounting profit.</p>
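<p>To make the discounting concrete, here is a minimal Python sketch of LTV as a net present value, and of the churn-times-LTV calculation described above.  It assumes a simple model with a constant annual cash flow, retention rate, and discount rate; the function names and every input value are hypothetical.</p>

```python
def customer_ltv(annual_net_cash_flow, retention_rate, discount_rate, years):
    """Net present value of the expected cash flows from one customer.

    Each year's cash flow is weighted by the chance the customer is still
    around (retention_rate ** t) and discounted back to today.
    """
    return sum(
        annual_net_cash_flow * retention_rate ** t / (1 + discount_rate) ** t
        for t in range(1, years + 1)
    )


def lost_customer_cost(records_breached, abnormal_churn_rate, ltv):
    """Customers expected to defect because of the breach, valued at LTV."""
    return records_breached * abnormal_churn_rate * ltv


# Hypothetical inputs: $100 per year, 80% retention, 10% cost of capital.
ltv = customer_ltv(100.0, retention_rate=0.8, discount_rate=0.10, years=5)
print(round(ltv, 2))
print(round(lost_customer_cost(10_000, 0.037, ltv), 2))
```

<p>Note how sensitive the result is to the discount and retention rates.  Anyone guessing at LTV without knowing those inputs is unlikely to land close.</p>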
<p>Putting this all together, either the costing method should use <em>only</em> accounting costs (historical and/or pro forma) or it should <em>only</em> use economic costs (prospective discounted cash flows, risk-adjusted).  Otherwise, the numbers don&#8217;t add up, literally.</p>
<h4>Problem #3 : Decision policies matter</h4>
<p><strong>(i.e. cheap short-sighted bastards can have lower costs than prudent socially responsible managers)</strong><br />
Here&#8217;s another problem with mixing accounting costs with economic costs.  Let me illustrate this with a story.  There are two companies &#8212; Cheap Bastard, Inc. (CBI) and Nice Guys R Us (NGRU).  CBI has decision policies to spend as little as possible on InfoSec, especially in incident detection and incident response.  They push all liability onto their customers, suppliers, and contractors.  They systematically downplay evidence of breaches, and downplay the severity or costs of breaches.  They avoid forensic analysis if they can get away with it.  And so on.</p>
<p>In contrast, NGRU puts a lot of attention on pro-active security and detection and goes out of its way to mitigate the costs of insecurity on its ecosystem.  They are especially eager to spend money post-breach to restore public trust and to learn from the event to get at root causes.</p>
<p>How would CBI and NGRU show up in the Ponemon survey?  My guess is that NGRU would have a cost per record 2X or 3X greater than CBI&#8217;s, primarily because CBI will have much lower accounting costs (as covered by the survey) by decision policy. It&#8217;s also likely because CBI can &#8216;safely&#8217; ignore the probable future costs of their rapacious behavior (i.e. class action lawsuits, regulatory penalties, even larger security breaches).   I put &#8216;safely&#8217; in quotes because such corporate behavior is only safe until you get caught or get screwed.</p>
<p>I don&#8217;t see a way around this if you only use accounting costs.  If you use a sufficiently broad framework for economic costs, then you stand a chance of understanding the &#8216;total costs&#8217; that expose the riskiness of CBI&#8217;s decision policy.</p>
<h4>Problem #4 : Do respondents really know anything about customer LTV or &#8216;churn&#8217; intentions?</h4>
<p>I&#8217;m surprised that no one has brought this up before.  As someone who has calculated and published LTV metrics for a business unit, I can say with some confidence that almost no one who didn&#8217;t read those reports would have been able to guess the LTV of customers, including accounting people who knew about the cost and revenue categories but never put them together into LTV.  <a href="http://en.wiktionary.org/wiki/swag">SWAGs (as in def. 5) </a>could be off by an order of magnitude.</p>
<p>My opinion is that asking a privacy/security/incident response specialist to estimate LTV is fundamentally flawed unless that person has access to their own company&#8217;s management reports that include LTV.  It might be possible to elicit useful estimates from them after their estimates are calibrated through exercises, including exercises that estimate the weighted average cost of capital, average lifetime of a customer, acquisition costs, retention costs, etc.</p>
<p>Same goes for &#8216;churn&#8217; rate (percentage of customers who leave because their records were breached).  To estimate &#8216;abnormal churn&#8217; due to the breach, the point-of-contact would need to know something about &#8216;normal churn rate&#8217; and, as Adam says in his post, the variability of churn rate.  If churn rate varies widely from year to year, then a small increase in churn due to a data breach would be washed out by the other factors driving variability.</p>
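<p>One way to see the &#8216;washed out&#8217; problem is to compare the post-breach churn rate against the historical year-to-year spread.  The following Python sketch does this with a simple z-score.  The churn histories are made up for illustration, and a z-score is just one rough yardstick, not the method used in the Ponemon study.</p>

```python
from statistics import mean, stdev


def churn_z_score(historical_churn_rates, observed_churn_rate):
    """How many standard deviations the observed churn sits above the baseline."""
    baseline = mean(historical_churn_rates)
    spread = stdev(historical_churn_rates)
    return (observed_churn_rate - baseline) / spread


# Two hypothetical companies with the same 10% average churn.
stable_history = [0.100, 0.102, 0.098, 0.101, 0.099]
volatile_history = [0.06, 0.14, 0.09, 0.13, 0.08]

# Same post-breach year for both: baseline plus 3.7 points of extra churn.
observed = 0.137

print(round(churn_z_score(stable_history, observed), 1))
print(round(churn_z_score(volatile_history, observed), 1))
```

<p>For the stable company the breach year stands far outside normal variation.  For the volatile one it is roughly one standard deviation above the mean, which is indistinguishable from an ordinary bad year.</p>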
<p>It would be <em>much</em> more useful to find out if the company increased their marketing budget as a direct consequence of a given data breach.  If they did, then this would be credible evidence that the number and value of lost customers was great enough for the company to change its spending decisions.</p>
<h4>Problem #5: Leaving out significant cost categories</h4>
<p>This problem may be bigger than all the others combined.  If they left out major categories of cost, then their estimate of cost per breach could be off by 50% or more.</p>
<p>To answer this question, you first need to decide between estimating accounting costs vs. estimating economic costs.  Every economist and every B-school professor will advise you to estimate economic costs.  It may be useful to analyze historical accounting costs as a way to estimate future economic costs, but that is a separate exercise.</p>
<p>As an economic cost analysis, it might be best to frame the decisions this way:</p>
<ul>
<li>Given a breach of customer data of size X records (same size as historical breach), how would the firm&#8217;s economic costs change vs. no breach?</li>
</ul>
<p>I&#8217;ll point out two categories of cost that this analysis would include that are currently excluded in the Ponemon survey.</p>
<p><strong>Cost of additional spending on security</strong><br />
If a firm incurs incremental spending on security due to a breach, shouldn&#8217;t those costs be included in the &#8216;cost of a data breach&#8217;?  This goes back to my story about fictional companies CBI and NGRU, above.  If CBI is likely to spend more to fix their crappy security in the future if they experience a large breach today, then they will be forced to &#8216;pay the piper&#8217; in economic terms and their decision policy of spending as little as possible won&#8217;t help them avoid the full costs of data breaches.  This will also capture the cost of half-measures, since trying to get off cheap on the security upgrade will still show up as higher expected costs for future breaches.</p>
<p>Of course, this raises sensitive political issues with respondents to the survey.  They may be reluctant to answer questions about actual spending on improvements to security or, even more, to speculate about possible future costs.  For example, what if a company&#8217;s outsourcing strategy is hopelessly insecure and the firm is forced to reverse those decisions and insource those processes.  What if a company is forced to exit a line of business because the security risks and costs are too high?  What if a data breach leads to process changes that diminish or eliminate their key competitive advantage?</p>
<p>Factoring in these costs could increase the cost of a breach by 0.5X to 10X.  It would make it much harder to do cross-company and cross-industry comparisons.  But wouldn&#8217;t it make the true economic costs of data breaches more relevant to management decision-making?</p>
<p><strong>Social costs</strong><br />
The other &#8216;elephant in the room&#8217; is the cost to consumers or employees for having their private data breached.   Add these all up and you get &#8216;social cost&#8217;, or, appropriately adjusted, &#8216;<a href="http://en.wikipedia.org/wiki/Welfare_economics">social welfare</a>&#8217;.  I understand that the Ponemon survey is estimating only costs that are incurred by a single organization that experiences the breach, not by any other stakeholders in that firm&#8217;s ecosystem.</p>
<p>There are plenty of studies on the direct and indirect costs of identity theft and the perceived costs of breaches of privacy.  Drawing on these studies might make it possible to estimate the social cost of a breach.  Then the estimation question is &#8220;What portion of social cost will the firm have to bear?&#8221;</p>
<p>The answer to this question depends on firm policy (see CBI and NGRU story, above), the legal system, the regulatory system, and also the legislative system.  Basically, if a firm or collection of firms consistently and egregiously impose large costs on their customers or employees, then one or more of these other social/political mechanisms might kick in to impose an &#8216;equity remedy&#8217;.</p>
<p>The most immediate remedy, from the American firm&#8217;s point of view, is a class action lawsuit.  Of course, estimating the likelihood of getting sued, the damages sought, and the likelihood of losing such a suit is risky business <img src='http://newschoolsecurity.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> .  But should it be excluded just because it&#8217;s difficult to estimate with precision?</p>
<p>Again, including this cost category might increase the cost per breach by 2X to 10X in some cases.  But it might also shift management attention to crucial questions such as &#8220;What is our role in our value network regarding information security and risk?&#8221;</p>
<h4>Problem #6: Unsupportable inferences</h4>
<p>Given that their survey method is not statistically robust (see p. 33), they do not have sufficient confidence to make the inferences summarized on p. 28.   I won&#8217;t go through these one by one, but anyone who has done statistical sampling and inference knows how sample size and variability affect confidence intervals.  If the difference in question does not exceed the confidence interval, then you cannot support the inference from the data.  The best they can do is say, &#8220;we saw X% of companies report Y, vs. A% of companies reporting B.  This suggests that&#8230;&#8221;.  All such suggestions would then need to be subjected to additional tests.</p>
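<p>Here is a minimal Python sketch of that check for a difference between two proportions.  It uses the textbook normal-approximation (Wald) interval, which is itself only rough at sample sizes this small, and the group sizes and counts are hypothetical.</p>

```python
from math import sqrt


def diff_of_proportions_ci(successes_a, n_a, successes_b, n_b, z=1.96):
    """Approximate 95% confidence interval for (p_a - p_b), Wald method."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    std_err = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_a - p_b
    return diff - z * std_err, diff + z * std_err


# Hypothetical split of 45 responses: 12 of 20 in group A report Y (60%),
# versus 10 of 25 in group B (40%).
low, high = diff_of_proportions_ci(12, 20, 10, 25)
print(round(low, 3), round(high, 3))
```

<p>A 20-point gap looks impressive in a summary table, but with groups this small the interval still spans zero, so the data cannot support the claim that the two groups really differ.</p>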
<h4>Problem #7: Is &#8216;Cost per Record&#8217; the best measure?</h4>
<p>It appears that only a few costs truly vary by the number of records breached.  These include costs of &#8216;notification&#8217; and some of &#8216;ex-post costs&#8217;.  But &#8216;discovery&#8217;, &#8216;escalation&#8217;, and &#8216;indirect costs&#8217; are mostly independent of size of breach measured by number of records.  Some might be fixed costs that are independent of the size of breach.  Some might be increasing functions, perhaps relative to some threshold that defines &#8216;big&#8217; or &#8216;material&#8217; (to use the accountant&#8217;s term).</p>
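<p>A small Python sketch shows why this matters.  It assumes the simplest possible alternative formulation, a fixed cost per incident plus a variable cost per record; both dollar figures are hypothetical.</p>

```python
def cost_per_record(records, fixed_cost, variable_cost_per_record):
    """Average cost per record when part of the total does not scale with size."""
    total = fixed_cost + variable_cost_per_record * records
    return total / records


# Hypothetical: $500,000 of discovery and escalation cost regardless of size,
# plus $15 per record for notification and similar per-record work.
for records in (5_000, 100_000):
    print(records, cost_per_record(records, 500_000, 15))
```

<p>Under these assumptions the small breach costs $115 per record and the large one $20 per record, so a single average &#8216;cost per record&#8217; describes neither of them.</p>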
<p>This problem may not be significant compared to the others.  I just think it needs to be justified by comparing it to alternative formulations.</p>
<h4>Summary</h4>
<p>The summary result ($204 per record) reported in the Ponemon survey is not reliable.  No one should rely on the absolute value of this measure.  Some of the relative measures might be informative, especially the direct costs that the point-of-contact respondents are qualified to answer.  Trend analysis might be somewhat informative.  None of the recommendations reported (i.e. the value of hiring outside IT security consultants) can be supported by statistically significant inferences.</p>
<p>Getting a reliable measure of Cost of a Data Breach will require substantial revision to the survey instrument, sampling method, analysis methods, and reliability controls.  I&#8217;m guessing that this is beyond the appetite of PGP, the sponsor of the survey.</p>
<p><strong>Call to action</strong></p>
<p>What would it take to launch a Version 2.0 of this study with more robust methods and a stronger team of experts to execute it and analyze the results?  There&#8217;s no mystery about how to do Version 2.0.  The only obstacles are resources and commitment.</p>
<p><em>&lt;Addendum:  For a related discussion, see my previous post: </em><a href="http://newschoolsecurity.com/2009/10/the-cost-of-a-near-miss-data-breach/"><em>&#8220;Cost of a Near-miss Data Breach</em></a><em>&#8221;&gt;</em></p>
]]></content:encoded>
			<wfw:commentRss>http://newschoolsecurity.com/2011/01/another-critique-of-ponemons-method-for-estimating-cost-of-data-breach/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>A critique of Ponemon Institute methodology for &#8220;churn&#8221;</title>
		<link>http://newschoolsecurity.com/2011/01/a-critique-of-ponemon-institute-methodology-for-churn/</link>
		<comments>http://newschoolsecurity.com/2011/01/a-critique-of-ponemon-institute-methodology-for-churn/#comments</comments>
		<pubDate>Tue, 25 Jan 2011 16:56:10 +0000</pubDate>
		<dc:creator>adam</dc:creator>
				<category><![CDATA[argument]]></category>
		<category><![CDATA[Data Analysis]]></category>
		<category><![CDATA[Reports and Data]]></category>

		<guid isPermaLink="false">http://newschoolsecurity.com/?p=2028</guid>
		<description><![CDATA[Both Dissent and George Hulme took issue with my post Thursday, and pointed to the Ponemon U.S. Cost of a Data Breach Study, which says: Average abnormal churn rates across all incidents in the study were slightly higher than last year (from 3.6 percent in 2008 to 3.7 percent in 2009), which was measured by [...]]]></description>
			<content:encoded><![CDATA[<p>Both Dissent and George Hulme took issue with <a href="http://newschoolsecurity.com/2011/01/a-day-of-reckoning-is-coming/">my post Thursday</a>, and pointed to the Ponemon  <a href="http://www.encryptionreports.com/download/Ponemon_COB_2009_US.pdf">U.S. Cost of a Data Breach Study</a>, which says:</p>
<blockquote><p>
Average abnormal churn rates across all incidents in the study were slightly higher than last year (from 3.6 percent in 2008 to 3.7 percent in 2009), which was measured by the loss of customers who were directly affected by the data breach event (i.e., typically those receiving notification). The industries with the highest churn rate were pharmaceuticals, communications and healthcare (all at 6 percent), followed by financial services and services (both at 5 percent.)
</p></blockquote>
<p>Some comments:</p>
<ul>
<li>126 of the hundreds of organizations that suffered a breach were selected (no word on how) to receive a survey.  45 responded, which might be a decent response rate, but we need to know how the 126 were selected from the set of breached entities.
<li>We don&#8217;t understand the baseline for customer churn.  What is normal turnover?  Is it the median for the last 3 years for that company?  The mean for the sector last year?  If we knew how normal turnover was defined, and its variance, then we could ask questions about what abnormal means.  Is it the difference between <em>management estimates</em> and prior years?  Is it the difference between a standard deviation above the mean for the sector for the past 3 years and the observed?
<li>Most importantly, it&#8217;s not an actual measure of customer churn.  The report states that it measured not actual customer loss, but the results of a survey that asked for:<br />
<blockquote><p>
The estimated number of customers who will most likely terminate their relationship as a result of the breach incident. The incremental loss is abnormal turnover attributable to the breach incident. This number is an annual percentage, which is based on <em>estimates provided by management</em> during the benchmark interview process.  [Emphasis added.]
</p></blockquote>
</ul>
<p>The report has other issues, and I encourage readers to examine its claims and evidence closely.  I encourage this in general; it&#8217;s not a comment unique to the Ponemon report.  Some examples from a number of additional surveys that George Hulme raised in argument in this <a href="http://www.informationweek.com/blog/main/archives/2011/01/security_doesnt_1.html">blog post</a>:  </p>
<p>Briefly, the CMO Council found concern about security, not any knowledge of breaches.  Forrester showed that some folks are scared to shop online, which means brand doesn&#8217;t matter, or they&#8217;d shop online from trusted brands.  Javelin reports 40% of consumers reporting that their relationship &#8220;changed,&#8221; and 30% reporting a choice to not purchase from the organization again.  That is at odds with even the most &#8216;consumer-concerned&#8217; estimates from Ponemon, and is aligned with the idea that surveys are hard to do well.</p>
<p>
]]></content:encoded>
			<wfw:commentRss>http://newschoolsecurity.com/2011/01/a-critique-of-ponemon-institute-methodology-for-churn/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
	</channel>
</rss>

