Archive for April 2012
What do you do when you have to make decisions in an uncertain environment with only mediocre data? Startup founders and investors face this question all the time.
I had an interesting email exchange on this topic with Brad Feld of Foundry Group. First, let me say that I like Brad and his firm. If I were the founder of a startup for whom VC funding made sense, Foundry would be on my short list.
Now, Brad has a Master’s in Management Science from MIT and was in the PhD program. I have a Master’s in Engineering-Economic Systems from Stanford, specializing in Decision Theory. So we both have substantial formal training in analyzing data and are both focused on investing in startups.
But we evidently take opposing sides on the question of how data should inform decision-making. Here’s a highly condensed version of our recent conversation on my latest “Seed Bubble” post (don’t worry, I got Brad’s permission to excerpt):
Brad: Do you have a detailed spreadsheet of the angel seed data or are you using aggregated data for this?… I’d be worried if you are basing your analysis… without cleaning the underlying data.
Kevin: It’s aggregated angel data…. I’m generally skeptical of the quality of data collection in both… data sets…. But the only thing worse than using mediocre data is using no data.
Brad: I hope you don’t believe that. Seriously – if the data has selection bias or survivor bias, which this data likely does, any conclusions you draw from it will be invalid.
Kevin: …of course I believe it…. Obviously, you have to assess and take into account the data’s limitations… But there’s always some chance of learning something from a non-empty data set. There’s precisely zero chance of learning something from nothing.
Brad: … As a result, I always apply a qualitative lens to any data (e.g. “does this fit my experience”), which I know breaks the heart of anyone who is purely quantitative (e.g. “humans make mistakes, they let emotions cloud their analysis and judgement”).
I don’t want to focus on these particular data sets. Suffice it to say that I’ve thought reasonably carefully about their usefulness in the context of diagnosing a seed investment bubble. If anyone is really curious, let me know in the comments.
Rather, I want to focus on Brad’s and my positions in general. I absolutely understand Brad’s concerns. Heck, I’m a huge fan of the “sanity check”. And I, like most people with formal data analysis training, suffer a bit from How The Sausage Is Made Syndrome. We’ve seen the compromises made in practice and know there’s some truth to Mark Twain’s old saw about “lies, damned lies, and statistics.” When data is collected by an industry group rather than an academic group (as is the case with the NVCA data) or an academic group doesn’t disclose the details of their methodology (as is the case with the CVR angel data), it just feeds our suspicions.
I think Brad zeroes in on our key difference in the last sentence quoted above:
…which I know breaks the heart of anyone who is purely quantitative (e.g. “humans make mistakes, they let emotions cloud their analysis and judgement”).
I’m guessing that Brad thinks the quality of human judgement is mostly a matter of opinion or that it can be dramatically improved with talent/practice. Actually, the general inability of humans to form accurate judgements in uncertain situations has been thoroughly established and highly refined by a large number of rigorous scientific studies, dating back to the 1950s. It’s not quite as “proven” as gravity or evolution, but it’s getting there.
At Stanford, I mostly had to read the original papers on this topic. Many of them are, shall we say, “difficult to digest.” But now, there are several very accessible treatments. For a general audience, I recommend Daniel Kahneman’s Thinking, Fast and Slow, where he recounts his journey exploring this area, from young researcher to Nobel Prize winner. For a more academic approach, I recommend Hastie and Dawes’ Rational Choice in an Uncertain World. If you need to make decisions in uncertain environments and aren’t already familiar with the literature, I cannot recommend strongly enough reading at least one of these books.
But in the meantime, I will sum up. Humans are awful at forming accurate judgements in situations where there’s a lot of uncertainty and diversity (known as low validity environments). It doesn’t matter if you’re incredibly smart. It doesn’t matter if you’re highly experienced. It doesn’t even matter if you know a lot about cognitive biases. The fast, intuitive mechanisms your brain uses to reach conclusions just don’t work well in these situations. If the way quantitative data analysis works in practice gives you pause, the way your brain intuitively processes data should have you screaming in horror.
Even the most primitive and ad hoc quantitative methods (such as checklists) generally outperform expert judgements, precisely because they disengage the intuitive judgement mechanisms. So if you actually have a systematically collected data set, even if you think it almost certainly has some issues, I say the smart money still heavily favors the data rather than the expert.
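To make the checklist idea concrete, here is a minimal sketch of an equally weighted checklist score. The item names and the example answers are hypothetical, invented purely for illustration. They are not drawn from any of the studies or books mentioned here, and this is not a recommended set of investment criteria.

```python
from typing import Mapping

# Hypothetical checklist items; the names are illustrative assumptions only.
CHECKLIST = (
    "founders_have_domain_experience",
    "product_has_paying_users",
    "market_is_growing",
)


def checklist_score(answers: Mapping[str, bool]) -> int:
    """Count the satisfied items, giving every item the same weight."""
    return sum(1 for item in CHECKLIST if answers.get(item, False))


# Made-up answers for one hypothetical startup.
answers = {
    "founders_have_domain_experience": True,
    "product_has_paying_users": False,
    "market_is_growing": True,
}
score = checklist_score(answers)
print(f"{score} of {len(CHECKLIST)} items satisfied")
```

The point of something this crude is consistency: every case gets scored against the same items in the same way, with no room for a gut feeling to reweight things on the fly.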
By the way, lots of studies also show that people tend to be overconfident. So thinking that you have a special ability or enough expertise so that this evidence doesn’t apply to you… is probably a cognitive illusion too. I say this as a naturally confident guy who constantly struggles to listen to the evidence rather than my gut.
My recommendation: if you’re in the startup world, by all means, have the confidence to believe you will eventually overcome all obstacles. But when you have to make an important estimate or a decision, please, please, please, sit down and calculate using whatever data is available. Even if it’s just making a checklist of your own beliefs.
Back in April 2011, I crunched the data on seed investing dollars to show there was probably no generalized bubble. Then in November, I updated the numbers for the first half of 2011 and showed that seed investing was pretty flat.
Now that the full year 2011 angel data is out from the CVR, I have once again combined it with the VC data from the NVCA and super angel data from EDGAR listings. (My current collation of the data is available in this Excel file.) There is a healthy uptick, but it still looks much more like a recovery than a bubble. Here are the dollar volume charts:
As you can see, angel activity is up substantially. Looking at the detailed CVR reports, seed dollar volume went from a $6.9B annual rate in 1H2011 to a $12.1B annual rate in 2H2011, for a total of $9.5B in 2011. The fraction going to seed and early stage deals ticked up slightly from 39% to 42%. So for the full year, angel seed/early funding is still down 25% from its 2005 peak. However, the 2H2011 rate was about the same as in the peak years 2004-2006. I’d say that seed funding from angels has recovered, and if it continues growing, we might see bubble territory in 2012 or 2013.
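For anyone who wants to check the arithmetic, here is a rough sketch using only the figures quoted above. The one assumption is that the "down 25%" figure is measured against the same $9.5B full-year number. The implied 2005 peak is therefore a back-of-the-envelope value, not one taken from the CVR reports.

```python
# Annualized rates for each half of 2011, in billions of dollars (from the text).
rate_1h = 6.9
rate_2h = 12.1

# Each half contributes half of its annualized rate to the full-year total.
total_2011 = rate_1h / 2 + rate_2h / 2

# Growth in the annualized rate from the first half to the second half.
growth = (rate_2h - rate_1h) / rate_1h

# Assumption: "down 25% from the 2005 peak" applies to the full-year total,
# so the peak implied by that statement is total / (1 - 0.25).
implied_peak = total_2011 / (1 - 0.25)

print(f"2011 total: ${total_2011:.1f}B")
print(f"2H vs 1H growth in annual rate: {growth:.0%}")
print(f"Implied 2005 peak: ${implied_peak:.1f}B")
```

Under that assumption the implied peak comes out a bit under $13B, which is why a $12.1B annual rate in 2H2011 reads as roughly back to peak levels.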
VC seed funding dropped dramatically in 2011. Down 47% in just one year! Average “seed” deal size was down from $4.6M to $2.3M. I’m always hesitant to generalize from one year’s data, but it certainly looks like something might be changing for VCs.
Which brings us to the super angels. If you look at my spreadsheet, I’ve gotten a bit more structured in this analysis. Per the comments from the last edition, I now break out the planned versus actual fund sizes when looking at the SEC data.
Interestingly, Jeff Clavier’s SoftTech VC actually exceeded his planned number, hitting $55M instead of $35M. Of course, this doesn’t affect my analysis because the firm is a member of the NVCA and presumably included in their numbers. Roger Ehrenberg’s IA Ventures hit $98M out of an originally planned $100M and then increased the planned size to $110M. Ron Conway’s SV Angel only had $12M out of a planned $40M, but I’m pretty confident he can hit whatever number he wants. IMAF looks to only have raised $1.5M out of their planned $13M. Note that super angels are still less than 5% of the seed funding market.
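As a quick illustration of the planned-versus-actual breakout, the sketch below computes how much of each plan was raised, using only the figures quoted above. For IA Ventures it uses the originally planned $100M rather than the revised $110M. The layout is mine for illustration and is not necessarily how the spreadsheet is organized.

```python
# (raised, planned) in millions of dollars, as quoted in the text.
# IA Ventures uses the originally planned $100M, not the revised $110M.
funds = {
    "SoftTech VC": (55.0, 35.0),
    "IA Ventures": (98.0, 100.0),
    "SV Angel": (12.0, 40.0),
    "IMAF": (1.5, 13.0),
}

# Fraction of the planned fund size actually raised.
fill = {name: raised / planned for name, (raised, planned) in funds.items()}

for name, ratio in fill.items():
    print(f"{name}: {ratio:.0%} of plan")
```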
Looking forward to 2012, Dave McClure’s 500 Startups is planning to raise a $50M fund and Chris Sacca’s LOWERCASE Capital is planning to raise $65M. Healthy increases for both of them, but nothing that will fundamentally shift the industry. Individual angels and traditional angel groups are still driving total volume.