7 Further Research
A promising direction for further work is to use additional sources of text data about companies, founders, and investors. This could include social media platforms such as Twitter and LinkedIn, as well as text parsed from the companies' own websites.
Additionally, it may be worth moving the foundation date filter back so that it includes companies founded from 1995 onward, rather than from the current start date of 2000-01-01. However, this could bring in a large number of companies from the "dotcom bubble" period.
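The effect of moving the start date can be inspected before retraining by listing the companies that only the extended filter admits. The following is a minimal sketch, not the authors' pipeline: the records, the function name `filter_by_founding_date`, and the exact 1995-01-01 cutoff are illustrative assumptions.

```python
from datetime import date

# Hypothetical records (company name, founding date); not taken from the paper's dataset.
companies = [
    ("Alpha", date(1996, 5, 1)),
    ("Beta", date(1999, 11, 15)),
    ("Gamma", date(2003, 2, 20)),
    ("Delta", date(1993, 7, 4)),
]


def filter_by_founding_date(records, start):
    """Keep only the companies founded on or after `start`."""
    return [(name, founded) for name, founded in records if founded >= start]


# Current filter (2000-01-01) versus the proposed earlier start date (assumed 1995-01-01).
current = filter_by_founding_date(companies, date(2000, 1, 1))
extended = filter_by_founding_date(companies, date(1995, 1, 1))

# Companies admitted only by the extended filter: the dotcom-era candidates to review.
added = [record for record in extended if record not in current]

for name, founded in added:
    print(name, founded.isoformat())
```

Reviewing the `added` group separately makes it easier to judge whether dotcom-era companies distort the class balance before they are merged into the training set.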
The current strict filters used to determine successful companies (IPO/ACQ/UNICORN) could also be loosened to capture more companies in the "gray area" between success and failure.
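One way to experiment with looser labels is to keep the strict rule intact and add an optional intermediate class. The sketch below is only an illustration under stated assumptions: the `Company` fields, the `label` function, the `gray_area` class, and the 50 million USD funding cutoff are all hypothetical and are not the criteria used in this work. The 1 billion USD valuation is the usual definition of a unicorn.

```python
from dataclasses import dataclass


@dataclass
class Company:
    # Hypothetical schema for illustration; field names are assumptions.
    name: str
    had_ipo: bool = False
    was_acquired: bool = False
    valuation_usd: float = 0.0
    total_funding_usd: float = 0.0


UNICORN_VALUATION_USD = 1_000_000_000  # common unicorn definition
GRAY_AREA_FUNDING_USD = 50_000_000     # assumed cutoff, chosen only for this example


def label(company, loosened=False):
    """Return 'success' under the strict IPO/ACQ/UNICORN rule.

    With loosened=True, companies that miss the strict rule but raised at
    least GRAY_AREA_FUNDING_USD are labeled 'gray_area' instead of 'other'.
    """
    strict_success = (
        company.had_ipo
        or company.was_acquired
        or company.valuation_usd >= UNICORN_VALUATION_USD
    )
    if strict_success:
        return "success"
    if loosened and company.total_funding_usd >= GRAY_AREA_FUNDING_USD:
        return "gray_area"
    return "other"


well_funded = Company("Example Co", total_funding_usd=80_000_000)
print(label(well_funded))                 # strict rule
print(label(well_funded, loosened=True))  # loosened rule
```

Keeping the gray-area companies in a separate class, rather than merging them into the successful class, lets their effect on the model be measured before the labels are changed for good.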
Finally, it may be worth conducting experiments to determine the optimal threshold value for adding companies to the portfolio, taking into account the size of the portfolio.
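Such an experiment could take the form of a simple sweep over candidate thresholds with a cap on portfolio size. The sketch below assumes each company has a model score in [0, 1] and a known outcome from a historical backtest window. The data, the function name `sweep_thresholds`, and the use of the share of successful picks as the quality measure are illustrative assumptions, not the evaluation protocol of this work.

```python
def sweep_thresholds(scored, thresholds, max_portfolio_size):
    """For each threshold, report (threshold, portfolio size, share of successes).

    `scored` is a list of (score, succeeded) pairs. Companies scoring at or
    above the threshold are candidates. If there are more candidates than
    `max_portfolio_size`, only the highest-scoring ones are kept.
    """
    results = []
    for threshold in thresholds:
        candidates = [pair for pair in scored if pair[0] >= threshold]
        candidates.sort(key=lambda pair: pair[0], reverse=True)
        portfolio = candidates[:max_portfolio_size]
        size = len(portfolio)
        hits = sum(1 for _, succeeded in portfolio if succeeded)
        share = hits / size if size else 0.0
        results.append((threshold, size, share))
    return results


# Hypothetical (score, succeeded) pairs standing in for backtest output.
scored = [
    (0.95, True),
    (0.90, True),
    (0.80, False),
    (0.70, True),
    (0.60, False),
    (0.40, False),
]

results = sweep_thresholds(scored, thresholds=[0.5, 0.75, 0.9], max_portfolio_size=3)
for threshold, size, share in results:
    print(f"threshold={threshold:.2f} size={size} success_share={share:.2f}")
```

In this toy data the highest threshold yields a smaller portfolio with a higher share of successes, which is the trade-off the experiment would need to quantify on real backtest results.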
Several further tasks could make the AI investor backtest model more effective. Analyzing the presentation materials, video interviews, and source code of software companies could give a better understanding of each company’s strategy, goals, and potential. Developing information collection systems to automate this process would save time and improve accuracy.
Evaluating the influence of macroeconomic factors and technology trends on startups may help identify potential risks and opportunities, and it can also aid in developing exit strategies. Additionally, analyzing competing studies can provide insights into the market and the competition, which can inform investment decisions.