I benchmark the investment screening performance of 111 venture capital (VC) investment professionals against a supervised gradient boosted tree (“XGBoost”) classification algorithm, with the aim of creating trust in machine learning (ML)-based screening approaches, accelerating their adoption and ultimately enabling the traditional VC model to scale. Using a comprehensive dataset of 77,279 European early-stage companies, I train a variety of ML algorithms to predict the success/failure outcome in a 3- to 5-year simulation window. XGBoost algorithms perform particularly well in terms of accuracy and recall, the most important metrics in my setup. I benchmark the selected algorithm against the VC investment professionals by giving respondents the same information in the form of 10 company one-pagers via an online survey and asking them to select the five most promising companies for further evaluation. In addition to finding characteristic-specific performance dependencies for VCs, I find that the XGBoost algorithm outperforms the median VC by 25% and the average VC by 29%. Although I do not suggest replacing humans with ML-based approaches, I recommend an augmented solution in which intelligent algorithms narrow down the upper part of the deal-flow funnel, allowing VC investment professionals to focus their manual efforts on the lower part. With this approach, they can rely on a scalable and objective pre-selection and concentrate their manual resources on evaluating the most promising opportunities and putting themselves in the best position to secure these deals.
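To make the benchmarking setup concrete, the following is a minimal sketch of how a "select five of ten" comparison could be scored. It is an illustration under stated assumptions, not the scoring procedure used in the study: the abstract does not define the performance metric, so the sketch assumes a simple count of successful companies among the five selected. All names (`hits`, `top_k`, `relative_outperformance`) and all data values are hypothetical.

```python
from statistics import mean, median

# Hypothetical outcome labels for 10 companies (1 = success, 0 = failure).
# These values are invented for illustration and are not data from the study.
outcomes = [1, 0, 1, 0, 0, 1, 0, 1, 0, 0]


def hits(selected, outcomes):
    """Count how many of the selected company indices turned out successful."""
    return sum(outcomes[i] for i in selected)


def top_k(scores, k=5):
    """Return the indices of the k highest-scoring companies."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def relative_outperformance(model_hits, reference_hits):
    """Relative gain of the model over a reference, e.g. 0.25 for +25%.

    Assumes reference_hits is non-zero.
    """
    return (model_hits - reference_hits) / reference_hits


# Hypothetical model scores (predicted success probabilities per company).
model_scores = [0.9, 0.2, 0.8, 0.4, 0.1, 0.7, 0.3, 0.35, 0.5, 0.05]
model_pick = top_k(model_scores)
model_hits = hits(model_pick, outcomes)

# Hypothetical picks from three survey respondents (five companies each).
vc_picks = [[0, 1, 2, 3, 4], [0, 2, 5, 8, 9], [1, 3, 4, 6, 7]]
vc_hits = [hits(pick, outcomes) for pick in vc_picks]

print("model hits:", model_hits)
print("vs. median VC:", relative_outperformance(model_hits, median(vc_hits)))
print("vs. average VC:", relative_outperformance(model_hits, mean(vc_hits)))
```

Comparing against both the median and the mean respondent mirrors the two figures reported above; the two references differ whenever the distribution of respondent scores is skewed.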
Authors
Andre Retterath
Technical University Munich