Startup Success Prediction
Will your startup get acquired or close? Seven classification models trained on 5,500 startup outcomes.
Select a market, country, funding range and number of rounds below. The model returns the predicted probability of acquisition based on historical patterns from the Crunchbase dataset. This is not financial advice: it's a statistical lookup based on what happened to similar companies.
Your Startup Profile
Example: a Software startup in the United States with $1M–10M in funding across 2 rounds returns a 55.0% predicted probability of acquisition.
Model Performance Comparison
5-fold cross-validated metrics: seven models from linear to ensemble to instance-based
| Model | Accuracy | Precision | Recall | F1 | ROC-AUC |
|---|---|---|---|---|---|
| Random Forest (best by F1) | 74.9% | 74.5% | 75.7% | 75.1% | 81.8% |
| XGBoost | 74.7% | 74.4% | 75.4% | 74.9% | 82.4% |
| Decision Tree | 69.1% | 69.2% | 69.0% | 69.1% | 72.7% |
| SVM (RBF) | 68.2% | 69.4% | 65.1% | 67.2% | 74.9% |
| Logistic Regression | 67.7% | 68.6% | 65.4% | 66.9% | 73.5% |
| K-Nearest Neighbors | 68.6% | 71.1% | 62.9% | 66.7% | 74.8% |
| Gaussian Naive Bayes | 59.3% | 76.3% | 27.3% | 40.0% | 71.2% |
Feature Importances
What matters most in predicting startup success
How This Works
The prediction is a pre-computed lookup. We trained seven classifiers on ~5,500 startups that had a definitive outcome (acquired or closed), then selected the best performer by F1 score to generate predictions for every valid combination of market, country, funding bucket and round count.
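The lookup stage can be sketched as follows. This is a minimal illustration with a made-up four-feature integer encoding and random toy labels, not the real Crunchbase schema or the actual trained model:

```python
# Illustrative sketch of a pre-computed prediction lookup.
# Features (market, country, funding bucket, rounds) are toy integer codes.
from itertools import product

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Toy training data: 500 startups, 4 categorical features, binary outcome
# (1 = acquired, 0 = closed).
X = rng.integers(0, 4, size=(500, 4))
y = rng.integers(0, 2, size=500)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Pre-compute P(acquired) for every valid feature combination, so the
# page can serve predictions without running the model at request time.
combos = list(product(range(4), repeat=4))
probs = model.predict_proba(np.array(combos))[:, 1]
lookup = dict(zip(combos, probs))
```

At serving time the page just indexes `lookup[(market, country, bucket, rounds)]`, which is why the prediction is instant and deterministic.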
Why these seven models? They cover every major paradigm in machine learning:

- Logistic Regression: the linear baseline, with interpretable coefficients and calibrated probabilities.
- Decision Tree: a single tree; pure interpretability, shows which splits matter.
- Random Forest: averages many trees to reduce variance.
- XGBoost: builds trees sequentially, each correcting the errors of the last (the current state of the art for tabular data).
- SVM (RBF): finds a non-linear boundary in kernel space, a different inductive bias from trees.
- K-Nearest Neighbors: instance-based; no model at all, just a similarity lookup.
- Gaussian Naive Bayes: probabilistic; fast, assumes feature independence, useful as a fast probabilistic baseline.
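The comparison above can be reproduced in outline with scikit-learn's 5-fold cross-validation. This sketch uses synthetic data rather than the startup dataset, and substitutes `GradientBoostingClassifier` for XGBoost (same boosting paradigm, no extra dependency):

```python
# 5-fold cross-validated F1 for seven model paradigms on synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_features=4, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),  # XGBoost stand-in
    "SVM (RBF)": SVC(kernel="rbf"),
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Gaussian Naive Bayes": GaussianNB(),
}

# Mean F1 across 5 folds; the best scorer would be selected for the lookup.
scores = {name: cross_val_score(m, X, y, cv=5, scoring="f1").mean()
          for name, m in models.items()}
best = max(scores, key=scores.get)
```

The real evaluation also records accuracy, precision, recall, and ROC-AUC per fold; only the scoring string changes.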
Accuracy is the overall correct prediction rate. Precision measures how many predicted acquisitions were actually acquired. Recall measures how many actual acquisitions the model caught. F1 is the harmonic mean of precision and recall. ROC-AUC measures the model's ability to distinguish between outcomes across all thresholds.
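These four threshold metrics all derive from the same confusion-matrix counts. A worked example with illustrative counts (not the actual evaluation results):

```python
# Confusion-matrix counts: tp = correctly predicted acquisitions,
# fp = closed startups predicted as acquired, fn = missed acquisitions,
# tn = correctly predicted closures. Counts are illustrative.
tp, fp, fn, tn = 75, 26, 24, 75

accuracy  = (tp + tn) / (tp + fp + fn + tn)   # overall correct rate
precision = tp / (tp + fp)                    # predicted acquisitions that were real
recall    = tp / (tp + fn)                    # real acquisitions the model caught
f1        = 2 * precision * recall / (precision + recall)  # harmonic mean
```

Note how Gaussian Naive Bayes in the table trades recall for precision: it predicts "acquired" rarely (catching only 27.3% of acquisitions), but when it does, it is right 76.3% of the time. F1 punishes that imbalance, which is why it was the model-selection criterion.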