How Smart is Smart Money? An Empirical Two-Sided Matching ...


How Smart is Smart Money? An Empirical Two-Sided Matching

Model of Venture Capital

Morten Sørensen∗

November 19, 2003

Abstract

In capital markets, top-tier investors may have better abilities to monitor and manage

their investments. In addition, there may be sorting in these markets, with top-tier investors

investing in the best deals, second-tier investors investing in the second-best deals,

and so forth. To separate and quantify these two effects, a structural model of the market

for venture capital is developed and estimated. The model is a two-sided matching model

that allows for sorting in equilibrium. It is found that more experienced venture capitalists

make more successful investments. This is explained both by their value-adding influence

on their investments, and by their access to late stage and biotechnology companies,

companies that are more successful on average. Sorting is found to be prevalent and has general

implications for the interpretation of empirical evidence of the impact of investors on their

investments.

KEYWORDS: Venture Capital, Two-Sided Matching, Simultaneous Equations, Bayesian

Inference, Markov Chain Monte Carlo

∗Department of Economics, Stanford University, Email: [email protected]. Tim Bresnahan provided much inspiration and encouragement through this project. I want to thank Susan Athey, Patrick Bajari, Peter Coles, Liran Einav, Thomas Hellmann, Yael Hochberg, Laura Lindsey, Joao Manoel de Mello, Garth Saloner, Pai-Ling Yin and seminar participants at the Stanford University Industrial Organization Seminar for many helpful discussions and comments. I gratefully acknowledge support by the Danish Research Academy, the Forman Fellowship, and the Kapnick Foundation through a grant to the Stanford Institute for Economic Policy Research.


1 Introduction

Active investors have two main roles: They locate more valuable investments and direct capital

to them, and they monitor and manage their investments to increase their return. For example,

investment banks perform comprehensive due diligence before underwriting a security, venture

capitalists hold seats on the boards of the companies in their portfolios (e.g. Lerner (1995)),

and large shareholders monitor the companies in which they invest (see Shleifer and Vishny

(1997) for a survey of this literature). The objective of this paper is to investigate the investors’

access to different investments in the market, and quantify their influence on their investments.

Investors differ in their abilities to monitor and manage their investments and in their

reputations, and are often ranked based on these. For investment banks this ranking is explicit,

with bankers referring to the top-tier banks as the “bulge bracket”. For venture capitalists the

ranking is implicit, but widely recognized in the market.

An investment by a top-tier investor may benefit the company receiving the investment (the

deal) in two ways. A top-tier investor may have better abilities to monitor and manage the

deal, and directly influence the value of the deal in this manner (Gompers (1995) investigates

monitoring and management by venture capitalists). Further, a top-tier investor may certify the

quality of the deal. This benefits the deal indirectly, by conveying positive information to the

market (Megginson and Weiss (1991), and Barry et al. (1990) find evidence of certification

for venture capitalists; Carter and Manaster (1990), and Puri (1996) find evidence of the

certification role for underwriters).

Deals generally prefer top-tier investors, and these investors may, in turn, have a position

in the market with access to better deals and a larger set of feasible investments. If top-tier

investors are unable, or unwilling, to serve all deals, there will be sorting in the market. With

sorting, the top-tier investors match with the best deals, the second-tier investors match with

the second-best deals, and so forth. Capital constraints or other constraints may limit the

number of deals an investor can fund. Reputation concerns may limit the number of deals the

investor is willing to fund. Reputation is critical for the certification of deals, and an investor

will be unwilling to invest in a marginal deal, if the investor’s cost of lost reputation exceeds

the marginal benefit of the investor to the deal.

Different abilities of investors and differences across deals may create synergies between


investors and deals in the market. For venture capitalists the synergies may arise from an

investor’s experience within certain industries. Corporate investors may have stronger strategic

complementarities with deals that are more closely related to the corporation’s core business

(Gompers and Lerner (1998)).

Four features of markets for investors and deals have been observed above: Investors may

be divided into tiers, investors may have different abilities to influence their investments, there

may be sorting in the market, and synergies may exist between certain investors and deals.

These features raise several fundamental questions: Is the performance of top-tier investors a

result of their market position or their influence on their investments? What is the value to a

deal of having a top-tier investor? How significant are synergies in these markets? What are

the appropriate empirical methods to quantify these effects?

To answer these questions, I develop and estimate a model for the market for venture

capital. In this model, companies prefer venture capitalists with stronger synergies and better

value-adding abilities. Venture capitalists prefer to invest in companies with stronger synergies

and higher expected returns. Each venture capital firm makes a limited number of investments.

The investors that are most preferred by the companies (the top-tier VCs) are better positioned

in the market and can choose from larger sets of feasible investments. This allows for sorting

in equilibrium.

The market for venture capital is particularly suited for the study of investor behavior.

The companies that receive funding from venture capital investors are fairly similar, typically

young entrepreneurial high-tech companies. The investments are typically structured as staged

equity financing, giving the venture capitalists extensive control and cash-flow rights. While

most other capital markets have heterogeneous investors, venture capital firms are almost always

organized as limited partnerships with similar incentive and investment structures. The

homogeneous nature of the market makes it possible to focus on the behavior and impact of the

investors, without having to control for institutional differences and differences in investment

structures. Gorman and Sahlman (1989) and Sahlman (1990) describe the structure of this

market. Kaplan and Stromberg (2003) stress that venture capitalists are the investors that

most closely approximate investors of theory.

The venture capital literature has struggled to separate the investors’ influence from their

selection of the companies they invest in. The literature has demonstrated several differences


between VC backed companies and companies financed by other investors. VC backed companies

bring products to the market faster (Hellmann and Puri (2000)), hire key employees faster

and replace their management more frequently (Hellmann and Puri (2002)). These companies

are priced higher when they go public, and have better long term performance (Megginson and

Weiss (1991), and Brav and Gompers (1997)). When they form strategic alliances, they form

them with companies that are also funded by the same investors (Lindsey (2003)). Even after

going public and becoming independent financial entities, significant differences in their governance

structures remain (Hochberg (2003)). Gompers and Lerner (1998) find mixed results

when comparing the outcomes of investments by venture capitalists to investments by corporate

investors.

In contrast to the above literature, this paper is concerned with differences between companies

that are funded by different venture capitalists. When comparing investments, the fact

that investors do not have access to the same investment opportunities presents a challenge for

the empirical analysis. It is difficult to distinguish whether observed differences are caused by

different investors, or are a result of differences in their initial investments. This problem is

recognized in the literature and different solutions have been proposed. Kaplan and Stromberg

(2002) investigate investment analyses by the venture capitalists, and find direct evidence of

their influence on shaping and recruiting senior management. Hellmann and Puri (2002) and

Hochberg (2003) use a selection model to control for the endogeneity of receiving venture capital

financing. However, the lack of instrumental variables, and the inability of this model to fully

capture the features of the market, cast some doubt on the interpretation of their estimates.

The absence of definite evidence on the contribution of venture capitalists has led to persistent

disagreement about the role of these investors in the economy. While the venture capitalists

stress their positive value-adding influence on their investments, some entrepreneurs describe

them as purely passive investors, or even “vulture capitalists,” providing little more than an

expensive source of capital. One of the main objectives of the present work is to add to the

understanding of this issue.

The economic model developed here is a variation of a game-theoretic two-sided matching

model. The model is a special case of the one-to-many College Admissions Model (see Roth and

Sotomayor (1990)). This matching model is used to derive an econometric model of the market,

and the model is combined with a simple specification of the outcomes of the investments. The


combined model is estimated simultaneously, and the estimated parameters are used to quantify

the investors’ influence, the sorting in the market, and the extent of synergies between investors

and deals.

Exogenous variation is required to identify the economic factors that influence the investment

outcomes. This variation must influence investment decisions, yet be excluded from the

investment outcomes. Finance theory predicts that the economic factors that determine investment

outcomes are the same factors that determine investment decisions, and none of these

present valid instruments. Instead, the sorting in the market is exploited for the identification

of the model. Sorting implies that the feasible investments for each investor are, in part, determined

by the characteristics of the other agents in the market. These characteristics are thus

related to the investment decision, and unrelated to the investment outcome, and they provide

the exogenous variation used for identification of the model. The idea that characteristics of

other agents can provide exogenous variation has been successfully used in other studies, although

the particular models and their implementations differ fundamentally (e.g. Bresnahan

(1987), and Berry, Levinsohn, and Pakes (1995)).

Sorting causes interaction between investment decisions by different venture capitalists, and

this creates numerical difficulties for the estimation of the model. Bayesian estimation using

Markov Chain Monte Carlo simulation provides a computationally feasible estimation strategy

(Gelfand and Smith (1990), Albert and Chib (1993), and Geweke, Gowrisankaran, and Town

(2003) develop and employ the methods used here).

For the empirical analysis, I examine a sample of 1666 investments by 75 venture capital

firms made over a 14 year period. For each venture capital firm, at the time of each of the

investments, I calculate the VC’s experience as the number of previous investment rounds it

has participated in. This experience measure is taken as a measure of the venture capital

firm’s abilities and reputation, and the empirical analysis finds that more experienced investors

make more successful investments. Investments by the most experienced investors are twice as

likely to result in public offerings as investments by the least experienced investors. Sorting

in the market gives more experienced investors better access to investments in late stage and

biotechnology companies, companies that are more successful on average. Even when investing

in similar companies, more experienced investors are also able to bring these companies public

at a significantly higher rate than less experienced investors. Finally, the analysis shows that


significant synergies between investors and companies are present in the market, and that

estimates of investors’ influence that do not account for sorting and synergies (or assume sorting

on observed characteristics only) may overestimate investors’ influence by as much as 60%.

The paper is organized as follows: Section 2 presents the model of the matching of venture

capitalists and companies in the market. In Section 3 the empirical model of the market is

developed. This model is combined with a simple specification of investment outcomes, and

the combined model is estimated simultaneously. Section 4 describes the construction of the

sample and variables used for the estimation. Section 5 presents and interprets the estimated

parameters. Section 6 presents the differences between standard selection models and the

model developed here. Section 7 concludes. There are two appendices: Appendix A contains a

formal description of the economic model, and Appendix B contains details of the estimation

procedure.

2 Economic Model

The objective of the economic model is to provide a simple representation of a market with

synergies and sorting. The model is a special case of the one-to-many two-sided matching

model, also known as the College Admissions Model (Roth and Sotomayor (1990)). The model

developed here imposes a new additional restriction on the agents’ preferences. The restriction

ensures that there is a unique equilibrium and that this equilibrium is characterized by a set of

inequalities. Both of these features are important for the empirical model. Without a unique

equilibrium, the likelihood function is not well defined, and without a simple characterization,

the empirical model is intractable. This section discusses, in turn, the preferences of the

agents, the equilibrium concept employed, and the characterization of the equilibrium. A

formal treatment of the definitions, theorems and proofs is presented in appendix A.

2.1 Agents and Investments

In market $m$ there are two finite and disjoint sets of agents. The set of investors is given by $I_m$,

and the set of companies receiving the investments is given by $J_m$. Each investor can invest

in up to $q_{m,i}$ companies (known as the quota in the matching literature), and each company

receives an investment from a single investor. Let the set of potential investments (also known


as matches in the matching literature, or deals in the venture capital literature) be given by

$M_m = I_m \times J_m$. A matching is a subset of this set, $\mu_m \subset M_m$, and will typically contain

the investments observed in the market (so each matching consists of several matches). To

simplify the notation, let the portfolio of companies that investor $i$ invests in be given by

$\mu_m(i) = \{j \in J_m \mid ij \in \mu_m\}$, and let the investor that invests in company $j$ be denoted

$\mu_m(j) = \{i \in I_m \mid ij \in \mu_m\}$. With this notation, $ij \in \mu_m$ is equivalent to $j \in \mu_m(i)$, which is

again equivalent to $\mu_m(j) = \{i\}$.
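
The notation can be illustrated with a minimal Python sketch that stores a matching as a set of investor-company pairs. The labels `A`, `B`, `x`, `y`, `z` and the helper names `portfolio` and `investor_of` are hypothetical and serve only to mirror $\mu_m(i)$ and $\mu_m(j)$.

```python
# Hypothetical toy market; labels are illustrative and not taken from the data.
investors = {"A", "B"}
companies = {"x", "y", "z"}

# A matching is a subset of the potential investments I_m x J_m.
mu = {("A", "x"), ("A", "y"), ("B", "z")}


def portfolio(mu, i):
    """mu_m(i): the set of companies that investor i invests in."""
    return {j for (i2, j) in mu if i2 == i}


def investor_of(mu, j):
    """mu_m(j): the set holding the investor of company j (empty if unmatched)."""
    return {i for (i, j2) in mu if j2 == j}


# The three equivalent statements from the text, checked for the deal (A, x).
print(("A", "x") in mu, "x" in portfolio(mu, "A"), investor_of(mu, "x") == {"A"})
```

Returning a set from `investor_of` keeps the sketch close to the definition of $\mu_m(j)$, which is written as a set even though it contains a single investor.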

The assumption that only a single investor invests in each company is maintained since

the empirical analysis focuses on the investment by the lead investor in the initial round of

investments. It is not uncommon for the lead investor to bring other investors into the deal in

later rounds. These investors serve different purposes than the initial investor, and the market

for later rounds differs fundamentally from the market for initial investments (see Admati and

Pfleiderer (1994) and Lerner (1994a)).

The assumption that the investors make a limited number of investments, rather than invest

a limited amount of capital, is reasonable in the context of venture capital. This is consistent

with anecdotal evidence (Quindlen (2000)) that the scarce resource the investors face is time

and not money. Kaplan and Stromberg (2002) find that the time requirement of the investment

is a frequent concern for venture capitalists.

2.2 Preferences

To model the agents’ preferences over investments, let there be a valuation for each deal. The

valuation measures the attractiveness of the deal for the investor and the company. Different

venture capital firms have different levels of experience, different skills, and access to different

networks, and will typically have different abilities to monitor and manage a given company.

The valuations will reflect these different synergies.¹ Let $V_{i,j}$ be the valuation of an investment

by investor $i$ in company $j$. The collection of valuations in market $m$ is denoted $V_m = \{V_{i,j} \mid ij \in M_m\}$.

It is assumed that the valuations are distinct, and that agents are never indifferent

between relevant investment opportunities.

¹ Classical finance theory predicts that the valuation is the net present value of future cash-flows, discounted at a rate that takes into account the systematic risk of the investment, the value added by the investors, and the liquidity of the investment. Other theories, specific to the market for venture capital, suggest that better positioned investors are more attractive for reasons that go beyond cash-flow considerations. Neither theory is imposed here, and the factors that influence the valuations are left as parameters to be estimated.


A company prefers an investor who is able to add more value. Similarly, an investor prefers

to invest in more promising companies in the market, after taking synergies and value-adding

ability into account. These preferences may not be compatible, and the matching between

companies and investors must be treated as the equilibrium of a two-sided game. Let the fraction

of the return that the investor receives be given by $\lambda$. Investor $i$’s preferences over portfolios are

represented by the profit function $\Pi_i(\mu_m(i)) = \lambda \sum_{j \in \mu_m(i)} V_{i,j}$, where $\mu_m(i)$ denotes investor

$i$’s portfolio.² Similarly, company $j$’s preferences over investors are represented by the profit

function $\Pi_j(i) = (1 - \lambda) V_{i,j}$.
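
A small numerical sketch may help fix ideas. The valuations and the value of $\lambda$ below are invented for illustration, and the function names `investor_profit` and `company_profit` are hypothetical. The sketch evaluates both profit functions and shows that a company's ranking of investors depends only on the valuations, not on $\lambda$.

```python
# Hypothetical valuations V[(investor, company)] and investor share lam.
V = {("A", "x"): 3.0, ("A", "y"): 1.5, ("B", "x"): 2.0, ("B", "y"): 2.5}
lam = 0.25


def investor_profit(V, i, companies, lam):
    """Pi_i: the investor's share of the summed valuations in the portfolio."""
    return lam * sum(V[(i, j)] for j in companies)


def company_profit(V, i, j, lam):
    """Pi_j(i): the company's share of the valuation of the deal with investor i."""
    return (1.0 - lam) * V[(i, j)]


def preferred_investor(V, j, lam):
    """The investor that company j ranks first under the profit function."""
    candidates = {i for (i, j2) in V if j2 == j}
    return max(candidates, key=lambda i: company_profit(V, i, j, lam))


print(investor_profit(V, "A", {"x", "y"}, lam))
print(company_profit(V, "A", "x", lam))
print(preferred_investor(V, "x", 0.25), preferred_investor(V, "x", 0.75))
```

Because $\lambda$ multiplies every valuation on one side of the market by the same positive constant, any fixed $\lambda$ strictly between zero and one yields the same rankings.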

Compared to the College Admissions Model, the new assumption is that the valuations

determine the preferences for the agents on both sides of the market. This assumption imposes

a restriction on the preferences relative to the College Admissions Model, which allows for

general preferences over matches.

In the specification of the preferences, it is assumed that agents’ utilities are non-transferable,

and that the division of the return is fixed. The implication of this assumption is that a deal with a

lower valuation cannot attract an agent by offering a higher fraction of the return. For venture

capital these assumptions are justified by the uncertainty inherent in the investments. This

uncertainty makes it difficult to specify the division of the return ex-ante, and the division will

be determined by the valuations of later financing rounds, which are negotiated through ex-post

bargaining between the involved parties (Hellmann (1998) contains a model with bargaining

between the venture capitalist and the company). This assumption is not essential for the

analysis. A model with transferable utility, or endogenous λ, would also produce sorting in

equilibrium, and the analysis of this model would be analogous to the present analysis (this

would be a variation of the Assignment model in Roth and Sotomayor (1990)).

2.3 Equilibrium Matching

The equilibrium concept used for two-sided matching models is stability. Stability is a concept

from cooperative game theory. A matching is stable if no coalition of agents prefers to deviate

and form new matches among them. Two agents can deviate from a given matching in the

following way. An investor and a company that are not currently matched may form a new

match together. To do this the company must leave its current investor, and the investor must

give up one of its other investments.³ A matching is an equilibrium when there is no pair

of agents that prefer to deviate in this way, and this equilibrium concept is called pair-wise

stability. The alternative concept that considers deviations by coalitions of agents is called

group stability. It is known from the College Admissions Model that pair-wise stability is

equivalent to group stability. Here, a matching that is stable in either of these ways is simply

called an equilibrium. Further, it is known that an equilibrium always exists, and that the set

of equilibria equals the core of the game, as defined by weak dominance (Roth and Sotomayor

(1990)). Naturally, these results carry over to the present special case of the College Admissions

Model. The new results for the present model are that the equilibrium is unique, and that it

can be characterized by a set of inequalities. Proofs of these results are in appendix A.

² The explicit form of the investors’ profit function is given for concreteness only. A known result is that, as long as the investors’ preferences over portfolios are responsive to their preferences over individual investments (represented by $V$), the equilibrium is independent of the precise form of these preferences (see Appendix A).
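
To make pair-wise stability concrete, the following Python sketch builds a matching for a toy market and searches for blocking pairs. The greedy construction (accept deals in descending order of valuation while the company is unmatched and the investor has capacity) is an assumption of this sketch, not the procedure used in the paper; it relies on both sides ranking deals by the same distinct valuations. The valuations, quotas, and function names are hypothetical.

```python
# Hypothetical valuations V[(investor, company)] and quotas; values are illustrative.
V = {("A", "x"): 9.0, ("A", "y"): 7.0, ("A", "z"): 4.0,
     ("B", "x"): 8.0, ("B", "y"): 5.0, ("B", "z"): 6.0}
quota = {"A": 2, "B": 1}


def greedy_matching(V, quota):
    """Accept deals from highest to lowest valuation, subject to quotas
    and to each company receiving a single investor."""
    mu, matched, capacity = set(), set(), dict(quota)
    for (i, j), _ in sorted(V.items(), key=lambda kv: kv[1], reverse=True):
        if j not in matched and capacity[i] > 0:
            mu.add((i, j))
            matched.add(j)
            capacity[i] -= 1
    return mu


def blocking_pairs(V, quota, mu):
    """Unmatched pairs in which both the company and the investor gain by deviating."""
    current = {j: i for (i, j) in mu}
    port = {i: [j for (i2, j) in mu if i2 == i] for i in quota}
    blocks = []
    for (i, j), v in V.items():
        if (i, j) in mu:
            continue
        # The company gains if it is unmatched or prefers i to its current investor.
        company_gains = j not in current or v > V[(current[j], j)]
        # The investor gains if it has spare quota or can drop a weaker deal.
        investor_gains = len(port[i]) < quota[i] or (
            bool(port[i]) and v > min(V[(i, k)] for k in port[i]))
        if company_gains and investor_gains:
            blocks.append((i, j))
    return blocks


mu = greedy_matching(V, quota)
print(sorted(mu), blocking_pairs(V, quota, mu))
```

A matching with no blocking pairs is pair-wise stable. Reassigning companies in the toy market, for instance giving `z` to investor `A` and `y` to investor `B`, produces blocking pairs and is therefore not an equilibrium.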

The intuition behind the characterization is the following. Given an investor $i$ and a company

$j$ that are not matched in $\mu_m$, define

$$\overline{V}_{i,j} = \max\left[ V_{\mu_m(j),j},\ \min_{j' \in \mu_m(i)} V_{i,j'} \right]$$

This expression gives the maximum of the two agents’ opportunity costs of forming a new

match. These opportunity costs are the valuations of the matches they would have to abandon

to form a new match. For both of the agents to deviate and form this new match, $V_{i,j}$ must

exceed $\overline{V}_{i,j}$.

Similarly, given an investor $i$ and company $j$ that are matched in $\mu_m$, define

$$\underline{V}_{i,j} = \max\left[ \max_{i' \in S(j)} V_{i',j},\ \max_{j' \in S(i)} V_{i,j'} \right]$$

Here $S(j)$ contains the feasible deviations for company $j$, and $S(i)$ contains the feasible deviations

for investor $i$. These sets are given by

$$S(j) = \left\{ i \in I_m \mid V_{i,j} > \min_{j' \in \mu_m(i)} V_{i,j'} \right\} \quad \text{and} \quad S(i) = \left\{ j \in J_m \mid V_{i,j} > V_{\mu_m(j),j} \right\}$$

The expression for $\underline{V}_{i,j}$ gives the maximum of the agents’ opportunity costs of remaining in

the match. These opportunity costs are given by the valuation of their most attractive feasible

deviation. If the set of feasible deviations is empty, the opportunity cost is negative infinity.

Company $j$ and investor $i$ both prefer to stay in this match when $V_{i,j}$ is above $\underline{V}_{i,j}$.

³ For simplicity it is assumed that in equilibrium the investors have no unused quota. This assumption is relaxed in appendix A.

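In equilibrium, $\overline{V}_{i,j}$ and $\underline{V}_{i,j}$ provide upper and lower bounds on the valuations. Formally,

let the set of valuations for which $\mu_m$ is the equilibrium be given by $\Gamma_{\mu_m}$. The characterization

of the equilibrium then follows from the equivalence of these statements.

$$V_m \in \Gamma_{\mu_m} \;\Leftrightarrow\; V_{i,j} < \overline{V}_{i,j} \text{ for all } ij \notin \mu_m \;\Leftrightarrow\; V_{i,j} > \underline{V}_{i,j} \text{ for all } ij \in \mu_m$$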
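
The two sets of inequalities can be checked numerically. The Python sketch below computes the upper bound for unmatched pairs and the lower bound for matched pairs in a toy market with invented valuations. It assumes that every company is matched and that no investor has unused quota, as in the main text. It also assumes that the feasible deviations of company $j$ range over investors other than its current investor, since moving to the current partner is not a deviation; this reading is an assumption of the sketch. All function names are hypothetical.

```python
import math

# Hypothetical valuations V[(investor, company)]; all values are illustrative.
V = {("A", "x"): 9.0, ("A", "y"): 7.0, ("A", "z"): 4.0,
     ("B", "x"): 8.0, ("B", "y"): 5.0, ("B", "z"): 6.0}
investors = sorted({i for (i, _) in V})
companies = sorted({j for (_, j) in V})


def investor_of(mu, j):
    """The single investor matched to company j (every company is assumed matched)."""
    return next(i for (i, j2) in mu if j2 == j)


def weakest_deal(V, mu, i):
    """Valuation of the least valuable deal in investor i's portfolio."""
    return min(V[(i, k)] for (i2, k) in mu if i2 == i)


def upper_bound(V, mu, i, j):
    """Upper bound for an unmatched pair: the larger of the two opportunity costs."""
    return max(V[(investor_of(mu, j), j)], weakest_deal(V, mu, i))


def lower_bound(V, mu, i, j):
    """Lower bound for a matched pair: the best feasible deviation, -inf if none."""
    # S(j): other investors who would drop their weakest deal for company j.
    s_j = [a for a in investors if a != i and V[(a, j)] > weakest_deal(V, mu, a)]
    # S(i): companies that prefer investor i to their current investor.
    s_i = [b for b in companies if V[(i, b)] > V[(investor_of(mu, b), b)]]
    deviations = [V[(a, j)] for a in s_j] + [V[(i, b)] for b in s_i]
    return max(deviations, default=-math.inf)


def satisfies_bounds(V, mu):
    """Return (upper inequalities hold, lower inequalities hold) for matching mu."""
    upper_ok = all(V[ij] < upper_bound(V, mu, *ij) for ij in V if ij not in mu)
    lower_ok = all(V[ij] > lower_bound(V, mu, *ij) for ij in mu)
    return upper_ok, lower_ok


stable = {("A", "x"), ("A", "y"), ("B", "z")}
unstable = {("A", "x"), ("A", "z"), ("B", "y")}
print(satisfies_bounds(V, stable))
print(satisfies_bounds(V, unstable))
```

In this toy market the two conditions agree for both matchings, in line with the stated equivalence: they hold together for the stable matching and fail together for the unstable one.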

3 Empirical Model

In the empirical model, the two main equations are the equation determining the investment

outcome and the equation determining the valuation of the deal. In the outcome equation the

probability that an investment results in a public offering is a function of the characteristics

of the investor, the company, and the market itself. In the valuation equation the valuation of

each deal is a function of the characteristics of the company and the investor.⁴ The valuations,

together with the matching model, permit non-random investment decisions and sorting in the

market. The following section first specifies the empirical model, then shows that, while maximum

likelihood estimation is computationally intractable, standard Bayesian methods provide

a straightforward estimation strategy. Finally, the identification of the model is discussed.

3.1 Model Specification

Each market contains the companies that receive their first round of financing during the same

half-year and are located in the same geographical state, as well as the investors that make

these investments. The markets and the data are described in Section 4 below. Let $m = 1, \ldots, N$

index the markets in the sample. To repeat the notation from the matching model, the set of

investors in the market is $I_m$, the set of companies is $J_m$, and the set of potential investments

is $M_m = I_m \times J_m$. The investments observed in the market are given by $\mu_m \subset M_m$, so that

$ij \in \mu_m$ when investor $i$ is observed to invest in company $j$.

⁴ For reasons given below, the parameters on the characteristics of the market in the valuation equation are not identified, and these are excluded from the equation. This exclusion is not related to the identification of the parameters in the model.
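
The market definition can be sketched as a grouping of first-round investments by half-year and state. In the Python sketch below the record fields, the example deals, and the function name `market_key` are hypothetical, and half-years are assumed to be calendar halves (January to June, July to December), which the text does not specify.

```python
from collections import defaultdict
from datetime import date


def market_key(first_round_date, state):
    """Market identifier: (year, half-year, state). Calendar halves are assumed."""
    half = 1 if first_round_date.month <= 6 else 2
    return (first_round_date.year, half, state)


# Hypothetical first-round investments; none of these records come from the sample.
deals = [
    {"investor": "VC1", "company": "c1", "date": date(1990, 3, 15), "state": "CA"},
    {"investor": "VC2", "company": "c2", "date": date(1990, 5, 2), "state": "CA"},
    {"investor": "VC1", "company": "c3", "date": date(1990, 9, 30), "state": "CA"},
    {"investor": "VC3", "company": "c4", "date": date(1990, 4, 1), "state": "MA"},
]

markets = defaultdict(list)
for deal in deals:
    markets[market_key(deal["date"], deal["state"])].append(deal)

for key, members in sorted(markets.items()):
    investor_set = {d["investor"] for d in members}   # I_m
    company_set = {d["company"] for d in members}     # J_m
    print(key, sorted(investor_set), sorted(company_set))
```

Within each group, the observed pairs form $\mu_m$, and the product of the investor and company sets gives the potential investments $M_m$.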


Variables

In the empirical model, there are three groups of exogenous variables. The characteristics of

the company include the stage of development of the company and industry dummies. The

characteristics of the investor include his experience and amount of capital he has available.

The characteristic of the market is the year of the investment.

There are two endogenous variables in the model. The investment outcomes are given by the

variable $IPO_{i,j}$ which equals one if the investment results in a public offering, and equals zero

otherwise. The other endogenous variable is each investor’s investment decision, given by the

vector $(\mu_m(1), \mu_m(2), \ldots, \mu_m(|J_m|))'$, where $\mu_m(j)$ equals the investor that invests in company $j$.

Alternatively, it is convenient to represent the investors’ investment decisions as an endogenous

subset of the set of potential investments, $\mu_m \subset M_m$, where $ij \in \mu_m$ is equivalent to $\mu_m(j) = i$.

The model contains two latent variables. The latent outcome variable $O_{i,j}$ is used in the

specification of investment outcomes, and the latent valuation variable $V_{i,j}$ is used to model

investment decisions.

Outcome Equation

The specification of the investment outcomes is similar to a Probit model. For each deal, let

the latent outcome variable $O_{i,j}$ be given by the following outcome equation.

$$O_{i,j} = X_{i,j}'\beta + \varepsilon_{i,j} \quad \text{for all } ij \in M_m \qquad (1)$$

Let the collection of latent outcome variables in the market be given by $O_m = \{O_{i,j} \mid ij \in M_m\}$.

For each of the observed investments, the latent outcome variable is restricted to satisfy the

following equation.

$$IPO_{i,j} = I_{\{O_{i,j} > 0\}} \quad \text{for all } ij \in \mu_m$$

Here $I_{\{O_{i,j} > 0\}}$ is the indicator function that equals one when $O_{i,j} > 0$, and equals zero otherwise.

Define the set $\Gamma^{IPO}_m = \{O_{i,j} \mid IPO_{i,j} = I_{\{O_{i,j} > 0\}} \text{ for all } ij \in \mu_m\}$. The restriction on the

latent outcome variables can then be stated as $O_m \in \Gamma^{IPO}_m$. In the outcome equation, $X_{i,j}$

contains the characteristics of the investor and the company, the characteristics of the market,

and a constant term, all evaluated at the time of the investment. Let $X_m = \{X_{i,j} \mid ij \in M_m\}$


contain the characteristics of all the investments in the market. Conditional on $X_m$, the error

terms $\varepsilon_{i,j}$ are independent $N(0, \sigma_\varepsilon^2)$ random variables. The error terms contain factors that

influence the investment outcomes but are unobserved in the data. Some of these are observed

by the agents at the time of the investment, while the remainder are unobserved both in the data

and by the agents.
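
As a minimal illustration of the outcome equation, the Python sketch below draws a latent outcome for a single deal and converts it to the binary IPO indicator. The characteristics, coefficients, and the function name `simulate_outcome` are hypothetical; the sketch draws the error independently of the matching, so it illustrates only equation (1) and the indicator restriction, not the endogeneity of the observed investments.

```python
import random


def simulate_outcome(x, beta, sigma_eps, rng):
    """Draw O = x'beta + eps with eps ~ N(0, sigma_eps^2); return (O, IPO)."""
    index = sum(xk * bk for xk, bk in zip(x, beta))
    latent = index + rng.gauss(0.0, sigma_eps)
    return latent, int(latent > 0)


rng = random.Random(2003)          # fixed seed so that the draws are reproducible
x = [1.0, 0.4, 1.0]                # hypothetical: constant, experience, late-stage dummy
beta = [-0.8, 0.5, 0.3]            # hypothetical coefficients

draws = [simulate_outcome(x, beta, 1.0, rng) for _ in range(5)]
for latent, ipo in draws:
    print(round(latent, 3), ipo)
```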

The parameters estimated in the outcome equation predict the probability that an investment

by an arbitrary investor in an arbitrary company results in a public offering. If the

estimated parameters show that investments by more experienced investors, in the same company,

have larger probabilities of the company going public, this reflects the influence of these

investors, after controlling for their initial investment decisions.

In the conventional Probit model the observed investments are assumed to be independent of

$\varepsilon_{i,j}$, and the probability that an investment results in a public offering is thus $P(IPO_{i,j} = 1) = \Phi(X_{i,j}'\beta / \sigma_\varepsilon)$.

This independence assumption does not hold for the error terms in the outcome

equation, since µm is an endogenous variable. The parameters estimated by the conventional

Probit model predict the probability that the observed investments result in public offerings.

If these estimates show that investments by more experienced investors are more likely to go

public, this reflects the combined effect of their influence on the companies and of their access

to better investments in the market.

In the outcome equation, the scale of the parameters $\beta$ and $\sigma_\varepsilon^2$ is not identified, since their

scaling can be changed without changing the probability of the outcome. In a conventional

Probit model this problem is often solved by fixing $\sigma_\varepsilon^2 = 1$. In the present setting, as argued

below, a slightly different normalization is more convenient.
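
The lack of scale identification can be verified directly. The Python sketch below evaluates the conventional Probit probability $\Phi(X_{i,j}'\beta / \sigma_\varepsilon)$ and shows that multiplying $\beta$ and $\sigma_\varepsilon$ by the same positive constant leaves it unchanged. The numerical values and function names are hypothetical.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_probability(x, beta, sigma_eps):
    """P(IPO = 1) = Phi(x'beta / sigma_eps) under the conventional Probit model."""
    index = sum(xk * bk for xk, bk in zip(x, beta))
    return std_normal_cdf(index / sigma_eps)


x = [1.0, 0.4]                     # hypothetical: constant and one characteristic
beta = [-0.3, 1.2]                 # hypothetical coefficients
scale = 3.0

p_original = probit_probability(x, beta, 1.0)
p_rescaled = probit_probability(x, [scale * b for b in beta], scale * 1.0)
print(p_original, p_rescaled)
```

Since only the ratio $\beta / \sigma_\varepsilon$ enters the probability, some normalization of the scale is required before the parameters can be estimated.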

Valuation Equation

The investments observed in the market are the equilibrium of the matching model. In this

model the agents’ preferences are determined by the valuations of the deals. The valuations are

unobserved, and are modelled as latent variables. The investment decisions are observed, and

these are used to infer the preferences. For each potential investment, let investor i’s valuation

of an investment in company j be given by the following valuation equation.

$$V_{i,j} = W'_{i,j}\alpha + \eta_{i,j} \quad \text{for all } ij \in M_m \qquad (2)$$


The equilibrium condition imposes the restriction Vm ∈ Γµm on the latent valuation variables.

In the valuation equation, Wi,j includes the characteristics of the investor and the company.

Conditional on the characteristics in Wm (and Xm), the error terms ηi,j are independent N(0, 1)

random variables. The valuations represent preferences, and any positive monotone transfor-

mation leaves the preferences unchanged. The level and the scale of the valuations are thus

unidentified. To fix the level, the constant and the characteristics of the market (which are

constant for investments in the same market) are excluded from Wi,j .5 To fix the scale of the

parameters, the variance of the error term is set to equal one. The error term contains factors

that are not included in the data but that influence the valuations. These are likely related to

the unobserved factors that influence the outcome of the investment, and it is expected that

the error term in the valuation equation is positively correlated with the error term in the cor-

responding outcome equation. A convenient way to allow for correlation between these errors

is to rewrite the error in the outcome equation as $\varepsilon_{i,j} = \eta_{i,j}\delta + \xi_{i,j}$, where $\xi_{i,j}$ are independent $N(0, 1)$ random variables conditional on the characteristics. This implies that $\mathrm{cov}(\varepsilon, \eta) = \delta$, and $\sigma_\varepsilon^2 = 1 + \delta^2$, and thus normalizes the scale of the outcome equation. The parameters in the outcome equation must then be scaled by $1/\sqrt{1+\delta^2}$ to be comparable to the parameters in the conventional Probit model.
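The normalization can be checked numerically. The following is a minimal sketch, not part of the original estimation code: it simulates the decomposition of the outcome error for an illustrative value of δ (0.5 is an assumption chosen only for the example), compares the sample moments with δ and 1 + δ², and rescales a hypothetical coefficient vector to the conventional Probit scale. The function names are hypothetical.

```python
import math
import random
from statistics import fmean

def simulate_errors(delta, n, seed=0):
    """Draw (eta, eps) pairs with eps = eta * delta + xi, where eta, xi ~ N(0, 1)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        eta = rng.gauss(0.0, 1.0)
        xi = rng.gauss(0.0, 1.0)
        pairs.append((eta, eta * delta + xi))
    return pairs

def sample_moments(pairs):
    """Return (cov(eps, eta), var(eps)) estimated from the simulated pairs."""
    mean_eta = fmean(eta for eta, _ in pairs)
    mean_eps = fmean(eps for _, eps in pairs)
    cov = fmean((eta - mean_eta) * (eps - mean_eps) for eta, eps in pairs)
    var = fmean((eps - mean_eps) ** 2 for _, eps in pairs)
    return cov, var

def to_probit_scale(beta, delta):
    """Divide outcome-equation coefficients by sqrt(1 + delta^2)."""
    scale = math.sqrt(1.0 + delta ** 2)
    return [b / scale for b in beta]

delta = 0.5  # illustrative value, not an estimate from the paper
cov, var = sample_moments(simulate_errors(delta, 200_000))
print(f"cov(eps, eta) ~ {cov:.3f} (theory {delta})")
print(f"var(eps)      ~ {var:.3f} (theory {1 + delta ** 2})")
print(to_probit_scale([1.0], delta))
```

The simulated covariance and variance should lie close to δ and 1 + δ², up to sampling noise.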

The empirical model allows for sorting in the market, but it does not impose it. When α

equals zero, the model reduces to a model with random investment decisions. Similarly, the

model allows for synergies but does not impose them. Synergies are distinguished from random

shocks by their influence on the investment outcomes and the fact that this influence was taken

into account in the valuation of the deal. The strength of synergies is measured by the extent

of the correlation between the error terms in the two equations. When δ is zero, there are no

unobserved factors that influence the investment outcomes and are taken into account in the

valuation.

Assuming normal distributions of the error terms and (below) a normal prior distribution

(conjugate prior) simplifies the analysis of the posterior distribution significantly. These as-

sumptions are not essential for the model, and one straightforward way to introduce more

flexible distributions of the error terms is to use mixtures of normal distributions (see Geweke and Keane (1997), and Keisuke (2002)). The assumption that the error terms for different deals are independent is also made to simplify the analysis. At the cost of some added complexity, random or fixed effects could be introduced into the error structure.

5 This is not an exclusion restriction in the sense that these characteristics are assumed to be independent of the valuations, and the identification of parameters in the model does not depend on the exclusion of these variables from this equation.

3.2 Likelihood Function

For market m, the endogenous variables are µm and IPOi,j for ij ∈ µm. The exogenous

variables are Xi,j and Wi,j for ij ∈ Mm. Let the observed endogenous and exogenous variables

in the market be denoted µm, IPOm, Xm, and Wm. The parameters in the model are collected

in the vector, θ = (α, β, δ). The specification of the outcome equation and the valuation

equation gives the density of the joint distribution of the endogenous and the latent variables

(here and below C is a generic normalization constant).

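$$\phi_m(IPO_m, \mu_m, O_m, V_m \mid \theta, X_m, W_m) = C \times I\{V_m \in \Gamma_{\mu_m}\} \times \prod_{ij \in M_m} \exp\left(-.5\,(V_{i,j} - W'_{i,j}\alpha)^2\right) \times I\{O_m \in \Gamma_{IPO_m}\} \times \prod_{ij \in M_m} \exp\left(-.5\,\left(O_{i,j} - X'_{i,j}\beta - (V_{i,j} - W'_{i,j}\alpha)\delta\right)^2\right)$$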

The likelihood function of one market is Lm (IPOm, µm | θ, Xm,Wm). The likelihood is the

probability that the latent outcome variables are in ΓIPOm and the latent valuation variables

are in Γµm, conditional on the observed variables. Evaluating the likelihood function requires

integrating out the latent variables in the above density.

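$$L_m(IPO_m, \mu_m \mid \theta, X_m, W_m) = \int_{\eta,\varepsilon} \phi_m(IPO_m, \mu_m, O_m(\varepsilon), V_m(\eta) \mid \theta, X_m, W_m)\, dF_\delta(\eta, \varepsilon)$$

Here $O_m(\varepsilon) = \{X'_{i,j}\beta + \varepsilon_{i,j} \mid ij \in M_m\}$ and $V_m(\eta) = \{W'_{i,j}\alpha + \eta_{i,j} \mid ij \in M_m\}$. Sorting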

implies that the bounds on the valuations for each investor, imposed by Γµm, depend on the

other investors’ valuations. If there were no sorting in the market, each investor’s investment

decision would be independent of the other investors’ valuations. In this case, Γµm could be

written as a product set, and the integral could be factored into a product of corresponding low-

dimensional integrals. With sorting the integral does not factor out, and in the largest markets

the evaluation of Lm requires integrating over several thousand dimensions. Evaluation of such

integrals with the precision and speed required for maximum likelihood estimation is currently

not numerically feasible.


Bayesian estimation and the Method of Simulated Moments present feasible alternative estimation methods. Boyd et al. (2003) follow Berry (1992) and use the Method of Simulated Moments

to estimate a (slightly different) two-sided matching model of teacher employment. Geweke,

Keane and Runkle (1994) compare Bayesian estimation, Method of Simulated Moments, and

Simulated Maximum Likelihood for the estimation of discrete choice models. They find that

Bayesian estimation has attractive properties, and this method is used for the empirical analysis

below.

3.3 Determining the Quotas

The bounds $\underline{V}_{i,j}$ and $\overline{V}_{i,j}$ depend on whether the investors have unused quotas or not. For the

empirical analysis, it is assumed that the investors’ quotas equal the number of investments they

make in the market. This assumption is reasonable, since the sample used for the estimation

includes only the most active venture capitalists. (Section 4 describes the investors included

in the sample.) These investors typically consider several hundred investment opportunities

for each investment they make, and it is reasonable to expect that they are able to invest up

to their full quota. In a more general model, that does not assume that the whole market

is observed, the quotas are parameters to be estimated. In this model (not presented here),

the maximum likelihood estimate of each investor’s quota is also the number of investments

the investor makes. From this perspective, the present likelihood function is a concentrated

likelihood function of this more general model. Alternatively, a prior distribution could be

imposed on the quotas, and their posterior distributions could be derived. This approach

would introduce further complications by requiring a model of the investors’ outside options,

which depend on unobserved investment opportunities.

The lower the quotas are, the more sorting is present in the market. In the limit, when

the investors’ quotas exceed the number of companies in the market (and the investors’ outside

options are sufficiently low), the equilibrium matches each company with its most preferred

investor, independently of the other matches in the market (see appendix A). In this limit

case, the empirical matching model reduces to a conventional Probit model without sorting.
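The two points in this subsection can be illustrated with a short sketch. It is a simplified illustration under stated assumptions, not the estimation procedure: `quotas_from_matching` sets each investor's quota to its observed number of investments, and `unconstrained_matching` shows the limit case in which quotas never bind and outside options are ignored, so each company is simply matched with the investor offering the highest valuation. The function names and the example valuations are hypothetical.

```python
from collections import Counter

def quotas_from_matching(matching):
    """Quota of each investor = number of investments it makes in the market."""
    return Counter(investor for investor, _company in matching)

def unconstrained_matching(valuations):
    """Limit case with non-binding quotas: each company gets its top investor.

    valuations maps (investor, company) -> valuation of that potential deal.
    Outside options are ignored here (assumed sufficiently low).
    """
    best = {}
    for (investor, company), value in valuations.items():
        if company not in best or value > best[company][1]:
            best[company] = (investor, value)
    return {(investor, company) for company, (investor, _v) in best.items()}

# Hypothetical market: investor "1" makes two investments, investor "2" one.
observed = [("1", "A"), ("2", "B"), ("1", "C")]
quotas = quotas_from_matching(observed)

# Hypothetical valuations for a two-by-two market.
valuations = {("1", "A"): 0.9, ("2", "A"): 0.1, ("1", "B"): 0.2, ("2", "B"): 0.4}
limit_match = unconstrained_matching(valuations)
print(dict(quotas), sorted(limit_match))
```

In the limit case the match of company A does not depend on who invests in company B, which is why the integral over the valuations factors and sorting disappears.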


3.4 Prior Distribution

The prior distribution of the parameters is a normal distribution. The choice of normal dis-

tributions for the error terms together with a normal prior (conjugate prior) implies that

the posterior distributions are normal (or truncated normal), and simplifies the analysis of

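these distributions significantly. The prior distributions of α, β and δ are three independent normal distributions, $N(\bar{\alpha}, \Sigma_\alpha)$, $N(\bar{\beta}, \Sigma_\beta)$ and $N(\bar{\delta}, \Sigma_\delta)$. The joint density is

$$\pi_0(\theta) = C \times \exp\left(-.5\,(\theta - \bar{\theta})' \Sigma_\theta^{-1} (\theta - \bar{\theta})\right).$$

For the estimation, the prior distributions have means of 0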

and variances of 10. For all parameters, this variance is at least 380 times the variance of the

resulting posterior distribution. This indicates that the posterior distributions and the param-

eter estimates reflect information contained in the data. Further increases in the variance and

other changes in the prior distributions left the parameter estimates virtually unchanged.

3.5 Posterior Distribution

For Bayesian estimation, inference about the parameters is derived from the shape of the

posterior distributions. The posterior distribution is φ(θ | IPO, µ,X, W ). Albert and Chib

(1993) argue that for discrete choice models a more tractable distribution is the augmented

posterior distribution that also includes the latent variables. The augmented posterior dis-

tribution is φ(O, V, θ | IPO, µ,X, W ). The empirical model directly determines the density

φm(IPOm, µm, Om, Vm | θ, Xm,Wm) for each market. The product of these densities gives the

joint density φ(IPO, µ,O, V | θ, X, W ) for the variables in all markets. Using Bayes rule, the

augmented posterior distribution is

φ(O, V, θ | IPO, µ, X, W ) = π0(θ)× φ(IPO, µ, O, V | θ, X, W )/φ(IPO, µ | X, W )

= C × π0(θ)× φ(IPO, µ,O, V | θ, X, W )

Draws from the augmented posterior distribution are simulated with a standard Gibbs sampling

procedure (described in appendix B). The projection of the simulated augmented distributions

on the parameters (by simply discarding the simulated latent variables) gives the posterior

distributions of the parameters.
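The data augmentation idea can be illustrated on a much simpler problem. The sketch below is the textbook Albert and Chib (1993) Gibbs sampler for a one-covariate Probit model with a N(0, 10) prior, run on synthetic data. It is not the sampler of appendix B: it has no valuation equation and no matching restriction, and the data, seeds, and function names are assumptions made for the example. It alternates between drawing the latent outcomes from truncated normal distributions and drawing the parameter from its normal conditional posterior, and then discards the latent variables.

```python
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()

def draw_truncated_latent(mean, outcome, rng):
    """Draw z ~ N(mean, 1) restricted to z > 0 if outcome == 1, else z <= 0."""
    p_nonpos = STD_NORMAL.cdf(-mean)              # P(z <= 0)
    lo, hi = (p_nonpos, 1.0) if outcome == 1 else (0.0, p_nonpos)
    u = rng.uniform(lo, hi)
    u = min(max(u, 1e-12), 1.0 - 1e-12)           # keep inv_cdf inside (0, 1)
    return mean + STD_NORMAL.inv_cdf(u)

def gibbs_probit(x, y, n_iter=500, burn_in=100, prior_var=10.0, seed=1):
    """Gibbs sampler for y = 1{x * beta + e > 0}, e ~ N(0, 1), beta ~ N(0, prior_var)."""
    rng = random.Random(seed)
    beta = 0.0
    # Conditional posterior variance of beta given the latent outcomes.
    post_var = 1.0 / (1.0 / prior_var + sum(xi * xi for xi in x))
    draws = []
    for it in range(n_iter):
        # Step 1: augment with latent outcomes consistent with the observed y.
        z = [draw_truncated_latent(xi * beta, yi, rng) for xi, yi in zip(x, y)]
        # Step 2: draw beta from its normal conditional posterior (prior mean 0).
        post_mean = post_var * sum(xi * zi for xi, zi in zip(x, z))
        beta = rng.gauss(post_mean, post_var ** 0.5)
        if it >= burn_in:
            draws.append(beta)                    # latent z is discarded
    return draws

# Synthetic data with a true coefficient of 1.0 (an assumption for the example).
data_rng = random.Random(0)
x = [data_rng.gauss(0.0, 1.0) for _ in range(300)]
y = [1 if xi * 1.0 + data_rng.gauss(0.0, 1.0) > 0 else 0 for xi in x]

draws = gibbs_probit(x, y)
posterior_mean = fmean(draws)
print(f"posterior mean of beta ~ {posterior_mean:.2f}")
```

The clamping of `u` is a numerical convenience for the sketch; for extreme values of the mean it can slightly violate the truncation, so a production sampler would use a dedicated truncated normal routine.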


3.6 Identification of the Outcome Equation

The estimated parameters in the outcome equation reflect the probability that an investment

by an arbitrary investor in an arbitrary company results in a public offering. This probability

cannot be inferred directly from the observed investments, since the observed investments are

systematically selected. To identify the parameters in the outcome equation and the influence

of the investors, it is necessary to have exogenous variation in the identity of the investors in

otherwise similar companies. The instrumental variables providing this variation must thus be

related to the investment decisions but independent of the investment outcomes.

Finance theory predicts that an investor’s decision to invest in a given company depends

exclusively on the expected return of the investment. Under this theory, all characteristics of

the investor and the company that are related to the investment decision must also be related

to the outcome of the investment, and none of these characteristics present valid instruments.

Sorting in the market means that the investments that are feasible for each investor depend

on the characteristics of the other agents in the market. For any given investment, the char-

acteristics of the other agents are arguably independent of its outcome. These characteristics

thus provide valid instruments for the identification of the outcome equation.

The model gives a systematic way to summarize the characteristics of the other agents

in the market into instrumental variables. The probability of observing each of the observed

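investments is determined by the distribution of $\underline{V}_{i,j}$. This distribution determines the probability that $V_{i,j} > \underline{V}_{i,j}$, and that the investment is made. With a normal distribution of $V_{i,j}$, the mean of $\varepsilon_{i,j}$ conditional on $\underline{V}_{i,j}$ is

$$E\left[\varepsilon_{i,j} \mid \eta_{i,j} > \underline{V}_{i,j} - W'_{i,j}\alpha,\ \underline{V}_{i,j}\right] = \frac{\delta}{\sqrt{1+\delta^2}} \left(\frac{\phi(Z_{i,j})}{\Phi(-Z_{i,j})}\right), \quad \text{where } Z_{i,j} = \frac{\underline{V}_{i,j} - W'_{i,j}\alpha}{\sqrt{1+\delta^2}}.$$

For the observed investments, the mean of $\varepsilon_{i,j}$ is thus

$$E\left[\varepsilon_{i,j} \mid \eta_{i,j} > \underline{V}_{i,j} - W'_{i,j}\alpha\right] = \frac{\delta}{\sqrt{1+\delta^2}}\, E_{\underline{V}_{i,j}}\left[\frac{\phi(Z_{i,j})}{\Phi(-Z_{i,j})}\right].$$

Variation in this expression identifies the parameters in the outcome equation. The expression varies with the distribution of $\underline{V}_{i,j}$, which is determined by the number of agents in each market and their characteristics. The variation in the number of agents, the investors’ average experience, and the number of public offerings in each market is illustrated in Figures 1 to 4.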
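To see how the selection term moves with the bound on the valuation, the expression for the conditional mean can be evaluated directly. The sketch below transcribes the formula as stated in the text for a given bound; the function name and the numerical inputs are hypothetical, and averaging over the distribution of the bound is left out.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def selection_term(v_lower, w_alpha, delta):
    """Evaluate delta / sqrt(1 + delta^2) * phi(Z) / Phi(-Z),
    with Z = (v_lower - w_alpha) / sqrt(1 + delta^2), as written in the text."""
    scale = math.sqrt(1.0 + delta ** 2)
    z = (v_lower - w_alpha) / scale
    inverse_mills = STD_NORMAL.pdf(z) / STD_NORMAL.cdf(-z)
    return delta / scale * inverse_mills

# Illustrative inputs only: a higher bound means stronger selection.
for bound in (-1.0, 0.0, 1.0):
    print(bound, round(selection_term(bound, 0.0, 0.5), 4))
```

When δ is zero the term vanishes, and for positive δ it increases with the bound, which is the variation across markets that the identification argument relies on.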

The assumption that each valuation follows a normal distribution is not essential for the

identification argument; with sufficient variation in the characteristics, more flexible functional

forms could be used. The argument is analogous to the argument that the variation in the


inverse Mills ratio identifies the parameters in the standard Heckman selection model (Heck-

man (1979)). The present model further recognizes that in the market for venture capital

each investment depends on the other investment opportunities in the market, and uses this

dependence to generate instruments.

The identifying assumption is that the agents that are present in each market are exoge-

nously given, and that their characteristics are independent of the error terms in the model.

The assumption is reasonable for venture capital. The time when companies reach the point

where they need outside capital is independent of the investors and other companies that are

present in the market. Venture capital funds have a limited lifespan and need to make their investments during their early years. This again supports the assumption that the presence of the agents in each market is determined by exogenous forces.

4 Data

The data set used for estimation of the model is the SDC Venture Intelligence (now Venture

Xpert) database. SDC began compiling data about Venture Capital investments in 1977 and

has supplemented their data with investments dating back to the early 1960s. These data have

been used in several previous studies of venture capital (e.g. Lerner (1994a) and Lerner (1995)).

Lerner (1994b) and Gompers and Lerner (1999) investigate the completeness of the database

by comparing it to an alternative database with investments in the biotechnology sector made

during the period 1978 to 1989. They find that the SDC data contain more than 90% of all

investments, and that the missing investments are the less significant ones. The sample used

for the estimation is a narrow set of investments involving homogenous groups of investors

and companies. The use of this narrow set of investments eliminates most organizational and

contractual differences between the agents and the investments in the sample. This allows the

analysis to focus more directly on the influence of the investors on their investments. Further, the

construction of the experience measure makes it more informative when it is used to compare

experiences of similar investors.


4.1 Construction of the Sample

The database contains 22,747 venture capital investments made in the period from 1975 to

1995. The sample used for the estimation is constructed by restricting this database to a

narrow set of homogenous investments and companies. The sample is restricted to the first

investment received by young companies.6 This restriction eliminates investments in relation

to restructurings or buy-outs of mature companies. By restricting the sample to the first

investment received by each company, it is ensured that only one investment for each company

is included in the sample. This eliminates the potential bias that could arise if more successful

companies receive more rounds of financing. In cases where several investors participate in

the first investment, the sample is limited to the investment by the lead investor. The lead

investor is the investor that is most involved in the management of the company, and the

investor typically responsible for bringing the other investors into the deal (Brander, Amit, and

Antweiler (2002)). The experience of the lead investor is thus the most relevant experience for

the analysis.7 In the data, the lead investor is identified as the VC firm participating in the

initial round that makes the largest total investment in the company.8 The sample is finally

restricted to investments made in the period from 1982 to 1995. It is necessary to limit the

sample so that we may observe outcomes of the investments; it takes on average three to four

years for companies to go public after their initial investment.

Restricting the sample to investments made in or before 1995 thus gives the companies at least seven years to go public (the status of the companies is current as of March 2003). The fourteen-year period is chosen to ensure a sufficient sample size.

The investments are divided into markets. Each market contains companies located in the

same geographical state that received investments within the same half-year (January to June,

and July to December). The two states with most investments are California and Massachusetts,

and the sample is restricted to investments in companies located in either of these two states.

This leaves 56 markets: two states, in fourteen years, with two markets in each year. Finally,

the sample is restricted to venture capital firms making a total of 10 or more investments in these 56 markets. This restriction ensures that the investors in the final sample are active venture capital investors, and excludes investments by smaller, more idiosyncratic investors. In the final sample there are 75 investors who make a total of 1666 investments.

6 Young companies are companies with an SDC stage code of 22 or less.

7 Admati and Pfleiderer (1994), Lerner (1994a), and Brander, Amit, and Antweiler (2002) investigate different explanations for the syndication of investments.

8 Megginson and Weiss (1991) and Barry, Muscarella, Peavy, and Vetsuypens (1990) use a similar classification of the lead investor. Gompers (1996) defines the lead investor as the investor that has been present on the board longest.
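The market definition can be written as a small function. This is a minimal sketch of the rule described above (state, year, and half-year), not code from the study; the two-letter state codes and the function name are assumptions made for the example.

```python
def market_key(state, year, month):
    """Return the (state, year, half) market of an investment, or None if it
    falls outside the sample (other states, or years outside 1982 to 1995)."""
    if state not in ("CA", "MA") or not 1982 <= year <= 1995:
        return None
    half = 1 if month <= 6 else 2   # January to June, or July to December
    return (state, year, half)

# Enumerate every market in the sample window.
all_markets = {
    market_key(state, year, month)
    for state in ("CA", "MA")
    for year in range(1982, 1996)
    for month in (1, 7)
}
print(len(all_markets))
```

Two states, fourteen years, and two half-years per year give the 56 markets used in the estimation.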

4.2 Variables

There are two endogenous variables. The variable IPO contains the investment outcomes, and

equals one if the investment results in a public offering, and is zero otherwise. This measure

of the investment outcomes is frequently used in the literature (e.g. by Gompers and Lerner

(1998), and Brander, Amit, and Antweiler (2002)). Gompers and Lerner (1998) compare this

measure to a broader measure of investment outcomes that includes whether the company was

acquired or is in registration. They find that these different measures yield virtually identical

results. The second endogenous variable is the matching of investors and companies. This

matching is given by the identity of the agents in each investment.

A number of variables are used to control for market and company characteristics. One is

the year of the investment (YEAR), which varies from 1982 to 1995. The characteristics of the

company are STAGE and INDUSTRY GROUP. STAGE contains the stage of development of

the company at the time of the investment. The company is labelled as an early stage company

when it is either at the seed or the start-up stage. It is labelled as a late stage company when

it is either at the expansion stage or the late stage. This distinction roughly corresponds to

whether the company has regular revenue or not.9 In the sample, 82% of the companies are early

stage companies. INDUSTRY GROUP divides the companies into six major industry groups.

These are “Communications and Media”, “Computer Related”, “Semiconductors and other

Electronics”, “Biotechnology”, “Medical, Health, and Life Sciences”, and “Other”. Dummy

variables for these six groups are: I COMMUNICATION, I COMPUTERS, I ELECTRONICS,

I BIOTECHNOLOGY, I MEDICAL, and I OTHER.

The characteristics of the investors are FUND SIZE and the different measures of the

investors’ experience. FUND SIZE contains the size of the Venture Capital fund making the

investment, measured in millions of dollars. Venture capital firms are organized as a collection

of independent venture capital funds. These funds are usually prevented from investing in the

same companies to avoid conflicts of interest between the limited partners in each fund. The size of the fund is thus a better measure of the capital available for investments than the total size of the venture capital firm. The fund size is defined as the fund’s total investments. For a number of investments in the data only the firm and not the fund itself is disclosed. In these cases, the fund size is taken to be the average of the sizes of the firm’s funds. There are also a few cases where a venture capital firm makes investments with two different funds in the same market. This creates ambiguity for the construction of the characteristics of the potential investments (discussed below), since it is unclear which of the two fund sizes to use for these investments. In these cases the average of the two fund sizes is used.

9 Early stage companies thus have SDC Stage codes of 11, 12, or 13.

Three different measures of the investors’ experience are constructed. At the time of each

investment, each venture capital firm’s experience is constructed as the number of investments

it has participated in since 1975. This variable is called TOTAL EXPERIENCE. Investments

in companies located in all states, and at all stages of development, are counted, as well as

investments in early and late rounds of investments. The venture capital industry expanded

significantly in the early eighties10 and little is lost by excluding investments prior to 1975 from

the experience measure.

The variables EARLY STAGE EXPERIENCE and LATE STAGE EXPERIENCE contain

the stage-specific experience. They are constructed as TOTAL EXPERIENCE, but only in-

vestments in companies at either the early or the late stage are counted. Similarly, the variables

I XXXXX EXPERIENCE contain six industry specific experience measures. These variables

are interacted with the stage and industry of each company to create the two experience mea-

sures, STAGE EXPERIENCE and INDUSTRY EXPERIENCE. These variables measure the

investor’s experience with investments in companies at the same stage and in the same industry

as the present company.
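The construction of the experience measures can be sketched as running counts. The code below is an illustration under assumptions, not the procedure applied to the SDC data: it assumes a simplified record of (year, firm, stage), counts only investments made strictly earlier in the sorted order, and resolves ties within a year by input order. The field and function names are hypothetical.

```python
from collections import defaultdict

def experience_at_investment(investments):
    """investments: iterable of (year, firm, stage) tuples.

    Returns one row per investment from 1975 onwards, in chronological order,
    with the firm's number of prior investments (TOTAL_EXPERIENCE) and its
    number of prior investments at the same stage (STAGE_EXPERIENCE).
    """
    total = defaultdict(int)
    by_stage = defaultdict(int)
    rows = []
    for year, firm, stage in sorted(investments, key=lambda r: r[0]):
        if year < 1975:
            continue                      # investments before 1975 are not counted
        rows.append({
            "year": year, "firm": firm, "stage": stage,
            "TOTAL_EXPERIENCE": total[firm],
            "STAGE_EXPERIENCE": by_stage[(firm, stage)],
        })
        total[firm] += 1
        by_stage[(firm, stage)] += 1
    return rows

# Hypothetical investment history for two firms.
history = [
    (1980, "F", "early"), (1976, "F", "late"), (1982, "F", "early"),
    (1981, "G", "early"), (1970, "F", "early"),
]
rows = experience_at_investment(history)
for row in rows:
    print(row)
```

The industry-specific measure follows the same pattern with the industry group in place of the stage.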

For the observed investments the characteristics are observed in the data. For the remaining

potential investments, the characteristics are constructed by recombining these variables. For

example, let a market contain just two investors, I = {1, 2} , and two companies, J = {A,B}.

Let investor 1 invest in company A, and let investor 2 invest in company B, so µm = {1A, 2B}.

The characteristics of these deals are in the data. The characteristics of the two remaining

deals, 1B and 2A, are then constructed by combining the observed characteristics of investor 1 with those for company B, and the characteristics of investor 2 with those for company A. Descriptive statistics for the observed investments and the potential investments are presented in Tables 1 and 2.

10 In 1979 the Department of Labor’s clarification of the “Prudent Man Rule” allowed pension funds to shift investments into venture capital firms, leading to a significant increase in the number and sizes of these.
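The recombination in the two-investor, two-company example can be sketched as a cross product. This is an illustration only; the characteristic values and the function name are hypothetical, and match-specific variables such as STAGE EXPERIENCE would need to be recomputed for each pair rather than copied.

```python
from itertools import product

def potential_deals(investors, companies):
    """Combine every investor's characteristics with every company's."""
    return {
        (i, j): {**inv_chars, **comp_chars}
        for (i, inv_chars), (j, comp_chars) in product(investors.items(), companies.items())
    }

# Hypothetical characteristics for the market I = {1, 2}, J = {A, B}.
investors = {"1": {"TOTAL_EXPERIENCE": 40, "FUND_SIZE": 120.0},
             "2": {"TOTAL_EXPERIENCE": 5, "FUND_SIZE": 30.0}}
companies = {"A": {"STAGE": "early", "INDUSTRY": "Biotechnology"},
             "B": {"STAGE": "late", "INDUSTRY": "Computer Related"}}

deals = potential_deals(investors, companies)
observed = {("1", "A"), ("2", "B")}           # the matching mu_m
unobserved = set(deals) - observed            # the remaining potential deals
print(sorted(unobserved), deals[("1", "B")])
```

With |I| investors and |J| companies this yields |I| × |J| potential deals per market, of which only the observed matching appears directly in the data.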

5 Findings

Each venture capital firm’s experience is taken as a measure of its ability to monitor and

manage its investments, and as a measure of its reputation. In this section, it is shown that

investments by more experienced investors are more successful. It is then argued that there

is substantial sorting in the market, and that more experienced investors have better access

to investments in late stage and biotechnology companies. After controlling for the sorting,

more experienced investors are still able to achieve significantly higher rates of success for their

investments. Finally, the effects of the sorting, the relative importance of sorting and the

investors’ influence, and the synergies in the market are quantified.

5.1 Experience and Investment Outcomes

To verify that investments by more experienced investors are more successful, the 1666 invest-

ments in the sample are divided into 10 groups according to the investor’s experience. The

first group contains investments by investors with an experience between 0 and 24, the second

group contains the range from 25 to 49, and so forth. The number of investments in each group

is illustrated in Figure 6. In the sample, 95% of the investments are made by investors with an

experience between 0 and 225. The fraction of public offerings in each group is illustrated in

Figure 5. This figure shows that investments by more experienced investors tend to go public

more frequently than investments by less experienced investors.

This finding is supported by a Probit analysis. This analysis determines the probability that

each of the observed investments results in a public offering as a function of the characteristics

of the company, the investor, and the market. The estimated parameters for four different

specifications are reported in Tables 4.1 and 4.2.11 Again, experience has a positive and signif-

icant effect on the investment outcome (the additional experience measures in specification 4

cause the coefficients to be insignificant, as discussed below). With the estimated parameters in Table 4.1, the predicted probability of success for an investor with no prior experience is 21.4% (the predicted probabilities are presented in Table 6). The probability of success for an investor with an experience of 225 is 38.9%. In the sample, investments by the most experienced investors are thus 82% more likely to be successful, which is a substantial difference.

11 The specification in Table 4.1 is specification 1 in Table 4.2.
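The 82% figure is the relative increase implied by the two predicted probabilities reported above, as the following arithmetic shows:

```latex
\frac{P(\text{IPO} \mid \text{experience} = 225)}{P(\text{IPO} \mid \text{experience} = 0)} - 1
  = \frac{0.389}{0.214} - 1 \approx 0.82
```

In absolute terms the same comparison is a difference of 38.9 − 21.4 = 17.5 percentage points.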

In Tables 4.1 and 4.2, the other estimated parameters are as expected. The amount of

capital available to the investor has a small but significant positive effect. In the specifications

that include the location of the company, in either California or Massachusetts, this location

has no effect. The year of the financing also has no effect on the outcomes. The base year is

1982, and there is no apparent trend in the estimated coefficients on the individual years. Only

the coefficient for the year 1985 is significant, but with 14 years, random variation in the data

may well cause one coefficient to be significant at the 10% level.

In Table 4.2, comparing the estimated parameters in specification 2 and specification 3

shows that replacing the individual year dummies and the state dummy with a single linear

time trend leaves the estimated coefficients largely unchanged. Conditional on the linear time

trend, the experience measure is uncorrelated with the year and state dummies. This shows that

little information is lost by focusing on this parsimonious specification. For computational

reasons, this specification is used for the estimation of the structural model below.

5.1.1 Alternative Experience Measures

The analysis takes each venture capital firm as the unit of analysis for the investor’s experience.

Individual partners in venture capital firms regularly meet to discuss and resolve problems facing

the companies in their portfolio, and it is reasonable to think that their experience is shared

collectively within the firm, and does not follow individual partners entering or leaving the firm.

Gompers (1996) uses age as a measure of the reputation of venture capital firms. The measure

based on the number of investments has the advantage that it adjusts for the investor’s activity

level, as well as the investor’s involvement in the investments. Early investors will typically

gain more experience and reputation, and they participate in more rounds than later investors.

It would be preferable to have a measure of experience that is specific to the match between

the VC and the entrepreneur. A first attempt to improve the measure is to divide the investors’

experience into their experience with companies at the different stages and in the different

industries. The variables STAGE EXPERIENCE and INDUSTRY EXPERIENCE contain


the number of previous investments the investor has made in companies at the same stage of

development and in the same industry as the present company. In specification 4 in Table 4.2

these variables are included in the analysis. Unfortunately, the measures are highly correlated,

and the data contain insufficient information to separate their effects. This causes the estimated coefficients to be statistically insignificant, and the coefficient on INDUSTRY EXPERIENCE to be negative. If anything, this preliminary analysis suggests that stage-specific experience is more important than industry-specific experience.

5.2 Sorting in the Market

Top-tier venture capitalists may be in a position in the market with access to better investments.

For each investor, the market position and the feasible investments are determined by the

preferences of the companies in the market. The more companies that prefer an investment by

the investor, the more feasible investments are available.

The preferences are represented by the valuations of the deals. In equation (2) the valua-

tion of each deal is specified as a function of the observed characteristics, and the estimated

coefficients in this equation represent the agents’ preferences over these characteristics. The co-

efficients are presented in Table 5 (the coefficients are the means of the posterior distributions).

The coefficients on experience and fund size are positive and significant. From the perspective

of the companies, venture capitalists with more experience or more capital are more attractive

investors. From the perspective of the venture capitalists, the positive and significant coeffi-

cients on the stage and biotechnology variables show a preference for investments in companies

at this stage and in this industry (in the sample, late stage and biotechnology are negatively

correlated, and these are two distinct preferences).

To quantify the strength of the agents’ preferences, consider an investor facing a choice

between investments in two different companies. If the observed characteristics of the companies

are the same, the choice will depend entirely on the unobserved factors, and the valuation

equation predicts that the probability that either company is preferred is 50%. If the stages of

the companies differ, the predicted probability that the investor prefers the late stage company

is 59.5%, a marginal increase of 9.5 percentage points. These marginal probabilities are presented in Table 5 in the dP/dW column.12 The probability that the investor prefers to invest in a biotechnology company relative to a company in the “other” industry group is 59.9%. Taking the perspective of a company comparing two investors, one of which has an additional unit of experience, the company prefers the more experienced investor with probability 50.2%. This marginal preference of 0.2 percentage points translates into a preference for an investor with an experience level of 225 over an investor with no experience with probability 86.9%.

12 The probability that deal i′j′ is preferred to deal ij is Prob(W′i,jα + ηi,j < W′i′,j′α + ηi′,j′) = Φ((W′i′,j′ −

These preferences lead to substantial sorting in the market. To confirm this, the experience

of the investor in each investment is regressed on the other characteristics of the deal. The

estimated parameters are reported in Table 3. The three specifications control for the year of

the investment and the location of the company in different ways. The average experience of the

investors in the market and the year dummies are included to control for different experience

levels between markets. Naturally, markets in later years have more experienced investors.

Inclusion of these variables ensures that the estimated parameters reflect sorting within the

markets, rather than differences between different markets.

The coefficient on stage is statistically significant, and is around 16. Investors in late stage

companies have experience levels that are significantly higher than the levels of the investors in

early stage companies. The estimated parameters further show that biotechnology companies

have the most experienced investors, followed by companies in the electronics, medical, and

communication industry groups. This is evidence of sorting on the observed characteristics,

but sorting is also likely to extend to the unobserved factors. The total effect of sorting is

quantified below.

In a related study, Gompers (1996) finds that investments by younger and less experienced

venture capital firms (measured by their age) go public faster, are priced lower, and that the

venture capitalists hold a smaller equity share at the time of the public offering. This finding

is interpreted as “grandstanding” in the sense that younger investors rush their investments

to a public offering to build a reputation. Sorting suggests an alternative interpretation of

this finding. With sorting, the deals that are left for the inexperienced investors are the less

attractive deals, i.e. deals where the investor can secure a smaller ownership fraction, or deals

that are priced lower in a public offering. While this is not evidence against the grandstanding

hypothesis, it does suggest that the wealth loss from this practice may be smaller than found

W ′i,j)α/

√2), and dP/dW is evaluated as φ(0)α/

√2. For dummy variables the marginal effect is evaluated for a

discrete change.

25

Page 26: How Smart is Smart Money? An Empirical Two-Sided Matching ...

in Gompers (1996), and it stresses the importance of controlling for systematic differences in

the investments when investigating this hypothesis.
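The pairwise preference probabilities quoted earlier in this section can be checked against the probit formula in footnote 12. The following is a minimal sketch, not the estimation code behind Table 5: the function name `preference_probability` is hypothetical, and the experience coefficient is backed out from the rounded marginal effect of 0.2% rather than taken from the table, so the result matches the reported 86.9% only approximately.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def preference_probability(delta_w: float, alpha: float) -> float:
    """Probability that the deal with the larger characteristic is preferred.

    Evaluates Phi(delta_w * alpha / sqrt(2)), where delta_w is the difference
    in one observed characteristic between the two deals (footnote 12).
    """
    return STD_NORMAL.cdf(delta_w * alpha / sqrt(2))


# Assumption: back out alpha from the rounded marginal effect of 0.2%, using
# dP/dW = phi(0) * alpha / sqrt(2)  =>  alpha = dP/dW * sqrt(2) / phi(0).
marginal_effect = 0.002
alpha_experience = marginal_effect * sqrt(2) / STD_NORMAL.pdf(0.0)

# Investor with an experience level of 225 versus an investor with none.
p_225 = preference_probability(225, alpha_experience)
print(f"P(prefer experience 225 over 0) = {p_225:.3f}")
```

The computed probability is about 87%, in line with the figure in the text; the small gap is consistent with the marginal effect having been rounded to one decimal.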

5.3 Investors’ Influence

For an investment by a given investor in a given company, equation (1) describes

the relationship between the characteristics of the agents and the expected outcome of the

investment. The estimated parameters in this outcome equation are presented in Table 5.

The influence of the investors can be inferred from the estimated coefficients in this equation.

When holding the company fixed and changing the experience of the investor, the coefficient

on experience reflects the resulting change in the expected outcome.

The coefficient on experience is positive and significant, indicating that more experienced

investors have a greater influence on their investments than less experienced investors. The

marginal effect of experience is 0.000417. This marginal effect means that an investment by an

inexperienced investor in an average company leads to a public offering with probability 15.1%.

Holding the company fixed, and replacing the investor with an investor with an experience of

225 increases the probability of a successful investment to 25.1% (these probabilities are given

in Table 6). For an average company, the influence of a more experienced investor increases

the probability of going public by 66.0%, a substantial difference.

The other estimated parameters in the outcome equation are as expected. Fund size has a

small but significant effect on the investment outcomes. Late stage companies have a 4.6% larger

probability of going public than early stage companies. The average biotechnology company

has a 39% larger probability of going public than the average company in the “other” industry

group.

The finding that the investors have a significant influence on their investments comple-

ments previous empirical studies of the impact of venture capital investors on the companies

they fund. As discussed above, the literature has demonstrated systematic differences in the

strategies, governance structures, and financial performance of companies that receive VC fi-

nancing relative to companies that receive financing from other investors. In this literature, it

is not obvious whether these differences are caused by the venture capitalists, or whether venture

capitalists invest in companies that are inherently different from the companies funded by other

investors. The present analysis shows that even within a narrow group of venture capitalists,

there are significant differences in their impact and influence on the companies they fund. It

is likely that the difference between the influence of VCs and the influence of other investors is

larger than the differences within the sample investigated here, since these investors differ not

only in their experience levels but also in their organizational forms and investment structures.

It is striking to note the close resemblance between the factors that determine preferences

over matches and the factors that determine the investment outcomes.

In the structural model, the valuation equation and the outcome equation are estimated

from different endogenous variables. The parameters in the valuation equation are estimated

from the matching of investors and companies, and the parameters in the outcome equation

are estimated from the outcomes of the observed investments. The model and the empirical

estimation do not necessarily lead to similar coefficients in the two equations. Yet, the signs of

all significant coefficients are the same in the two equations, and their relative magnitudes are

similar (the scale of the parameters in each of the equations is arbitrary, and only their relative

magnitudes and marginal effects are comparable). In the valuation equation, biotechnology is

the most attractive industry group, and this industry group also has the largest coefficient in the

outcome equation. The marginal effect of experience is around three times the marginal effect of

fund size in both equations. In other words, companies are indifferent between an investor with

more experience and an investor with “three times” the capital, and the predicted outcomes of

these two investments are indeed similar. This supports the theory that the valuations reflect

the discounted expected outcomes of the investments. This theory is also consistent with the

finding that the marginal effect of the stage of the company is larger for the valuation of the

investment than for its expected outcome. Late stage companies are more attractive, not only

because they go public more often, but because they go public faster.

5.4 Quantifying the Effects

In Figure 7 the relationship between experience and investment outcome is decomposed into

the matching effect and the influence effect. In this figure, the solid line represents the ob-

served probability of success for investments by investors with different levels of experience.

The probability is given by the estimated parameters of the Probit model, presented in Table

4.1. This probability contains the effect of more experienced investors having access to better

investments and investments with stronger synergies in the market.

The broken line represents the probability of success for an investment in a given company,

by an investor with varying levels of experience. This probability is calculated assuming no

sorting or synergies between the investor and the company. The probability is calculated with

the estimated parameters in the outcome equation of the structural model, presented in Table 5.

If the matching of the investors and the companies were random, these would be the predicted

probabilities of successful investments.

The differences between the two lines, denoted “A” and “B” in the figure, represent the

effect of the matching in the market. The difference denoted “C” represents the effect of the

influence of the more experienced investors.

The Effect of Sorting. As argued above, sorting gives more experienced investors better

access to investments in late stage companies and companies in the biotechnology industry. In

addition, sorting will tend to match companies and investors with stronger synergies. Compared

to a market with random investments, sorting leads to more successful investments for more

experienced investors. For inexperienced investors, the effect of sorting is ambiguous. On one

hand, sorting leaves the inexperienced investors with less attractive investment opportunities.

On the other hand, sorting takes synergies between the investors and the companies into account

when forming the matches in the market. Compared to a market with random investments,

the inexperienced investors may be better or worse off depending on which effect dominates.

Compared to a market with random investments, sorting raises the probability of a successful

investment from 25.1% to 38.9% for an investor with an experience level of 225, as represented

by B in Figure 7. For an inexperienced investor, sorting raises the probability of a successful

investment from 15.1% to 21.4%, represented by A. The increase in the probability for the

more experienced investors is 13.8 percentage points, which is more than twice the increase of 6.3 percentage points for the least

experienced investors.

The Relative Importance of Sorting and Influence. If an investor without any experience

were to make random investments, the probability of success would be 15.1%. The observed

probability of success for an investor with an experience of 225 is 38.9%. In Figure 7, this

difference is represented by B plus C. The influence of the more experienced investor accounts

for 10.0 percentage points of this difference, represented by C. Sorting accounts for the remaining 13.8

percentage points, represented by B. Sorting thus explains 58% and investors’ influence explains

42% of the total increase in the probability of a successful investment for the most experienced

investors in the market.
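The decomposition above is simple arithmetic on three probabilities quoted in the text. The short sketch below reproduces it; the variable names are illustrative and the inputs are the rounded percentages from the text, not values read from Figure 7 or the underlying tables.

```python
# Success probabilities (in percent) quoted in the text.
p_random_inexperienced = 15.1  # no experience, random matching
p_random_experienced = 25.1    # experience 225, random matching
p_observed_experienced = 38.9  # experience 225, observed (sorted) market

influence = p_random_experienced - p_random_inexperienced  # "C" in Figure 7
sorting = p_observed_experienced - p_random_experienced    # "B" in Figure 7
total = influence + sorting                                # B plus C

sorting_share = sorting / total
influence_share = influence / total

print(f"sorting:   {sorting:.1f} points ({sorting_share:.0%} of the total)")
print(f"influence: {influence:.1f} points ({influence_share:.0%} of the total)")
```

The shares are fractions of the 23.8 percentage point gap between an inexperienced investor under random matching and the most experienced investor in the observed market.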

Synergies in the Market. The outcome equation describes the relationship between the

probability of a successful investment and the observed characteristics of the investor and the

company, holding fixed the quality of the match. The impact of synergies in the market can

be quantified by comparing the observed number of public offerings to the number predicted

by the outcome equation. The observed number of public offerings is 448 (26.9% of 1666 in-

vestments). The number of public offerings predicted by the outcome equation is 297 (17.8%).

The equilibrium matching of companies and investors thus raises the number of public offerings

by 151 (or 51%). This increase is attributed to the matching of investors and companies with

stronger synergies in the market.

A final finding is that studies that assume sorting on only the observed characteristics may

overestimate the investors’ influence by as much as 60%. In the Probit analysis that controls

for all observable characteristics (Table 4.2, specification 3), the estimated marginal effect of

investors’ experience is 0.000664. The estimated marginal effect in the outcome equation in the

structural model, which controls for the sorting in the market, is 0.000417 (Table 5). If the first

estimate were interpreted as investors’ influence, this would overestimate the actual influence

by 59.2%. Naturally, this problem is expected to be less severe the more characteristics are

observed.
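As a quick check of the overestimation figure, using only the two marginal effects quoted in this paragraph (the Probit estimate and the structural estimate):

```latex
\[
\frac{0.000664 - 0.000417}{0.000417}
  = \frac{0.000247}{0.000417}
  \approx 0.592 ,
\]
```

that is, reading the Probit marginal effect as investors' influence would overstate the structural estimate by roughly 59.2%, which is the basis for the "as much as 60%" statement.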

6 A Standard Selection Model

This section compares the properties of the matching model analyzed above with the properties

of a standard selection model. In a standard selection model the investment in each company

would be considered an independent observation. Since sorting leads more experienced investors

to select better investments, the experience of the investor is endogenous to each investment.

The model then uses instrumental variables to solve the resulting endogeneity problem. For the

investment in company j, let the outcome of the investment and the experience of the investor

be given by the two equations:

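\[
IPO_j = I\{X_j'\beta_1 + EXP_j\beta_2 + \varepsilon_1 > 0\}
\]

\[
EXP_j = X_j'\alpha_1 + Z_j'\alpha_2 + \varepsilon_2
\]

Here, $IPO_j$ equals one if company j goes public, and is zero otherwise. $X_j$ contains the exogenous variables, including a constant. $EXP_j$ is the experience of the investor ($EXP_j = \text{TOTAL EXPERIENCE}_{i,j}$), and $Z_j$ contains instrumental variables. Let the variance of $\varepsilon_1$ be normalized to one, and let $\varepsilon_2$ follow the normal distribution $N(0, \sigma_2^2)$, truncated to ensure that the experience is positive. The likelihood function for each observed investment is then

\[
L_j(\alpha, \beta; IPO_j, EXP_j, X_j, Z_j) =
\frac{\phi\left(\frac{EXP_j - X_j'\alpha_1 - Z_j'\alpha_2}{\sigma_2}\right) - \ln(\sigma_2)}
     {\Phi\left(\frac{X_j'\alpha_1 + Z_j'\alpha_2}{\sigma_2}\right)}
\times
\Phi\left(-\frac{X_j'\beta_1 + EXP_j\beta_2 + (EXP_j - X_j'\alpha_1 - Z_j'\alpha_2)\,\sigma_{12}/\sigma_2^2}
               {\sqrt{1 - \sigma_{12}^2/\sigma_2^2}}\right)^{I[IPO_j = 0]}
\times
\Phi\left(\frac{X_j'\beta_1 + EXP_j\beta_2 + (EXP_j - X_j'\alpha_1 - Z_j'\alpha_2)\,\sigma_{12}/\sigma_2^2}
              {\sqrt{1 - \sigma_{12}^2/\sigma_2^2}}\right)^{I[IPO_j = 1]}
\]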
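To make the structure of this likelihood concrete, the sketch below evaluates the contribution of a single investment in logs. It is an illustration under stated assumptions, not the code used for Table 7: the function name and argument names are hypothetical, the linear indices are passed in as precomputed scalars, σ12 is taken to be the covariance of the two error terms, and the scale of the experience density is entered in the standard log form, where it contributes a −ln(σ2) term.

```python
from math import log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def log_likelihood_j(ipo, exp_j, xb, xa_za, beta2, sigma2, sigma12):
    """Log-likelihood contribution of one investment (hypothetical helper).

    ipo     : 1 if the company went public, 0 otherwise
    exp_j   : experience of the investor, EXP_j > 0
    xb      : X_j' beta_1, precomputed
    xa_za   : X_j' alpha_1 + Z_j' alpha_2, precomputed
    beta2   : coefficient on experience in the outcome equation
    sigma2  : standard deviation of epsilon_2
    sigma12 : assumed covariance of epsilon_1 and epsilon_2 (Var(epsilon_1) = 1)
    """
    resid = exp_j - xa_za  # epsilon_2 implied by the experience equation

    # Log density of EXP_j for a normal error truncated so that EXP_j > 0.
    log_density = (
        log(STD_NORMAL.pdf(resid / sigma2))
        - log(sigma2)
        - log(STD_NORMAL.cdf(xa_za / sigma2))
    )

    # Probit index for the outcome, conditional on epsilon_2.
    cond_sd = sqrt(1.0 - sigma12 ** 2 / sigma2 ** 2)
    index = (xb + exp_j * beta2 + resid * sigma12 / sigma2 ** 2) / cond_sd
    prob = STD_NORMAL.cdf(index if ipo == 1 else -index)

    return log_density + log(prob)
```

Summing this function over all investments would give a sample log-likelihood that treats the investments as independent, which is exactly the assumption the matching model relaxes.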

6.1 Estimates of a Standard Selection Model

To estimate this model, the average experience of the investors in each market is used as

an instrument. This variable is related to the experience of the investor in each deal but is

independent of its outcome.13 Table 7 presents maximum likelihood estimates for three different

specifications of this model. Except for the coefficient on experience, the magnitudes of the

coefficients in the outcome equation are similar to the coefficients estimated both in the Probit

model and the structural model (Tables 4.1, 4.2 and 5). In the selection equation, the estimated

coefficient on the average experience is positive and significant, and the instrument is positively

related to the investors’ experience, as expected.

In the estimates of the standard selection model, the marginal effect of experience is ap-

proximately three times the marginal effect estimated in the Probit model, and the errors in

the two equations of the selection model are negatively correlated. If these estimates were

taken literally, they would show negative selection in the market. With negative selection,

more experienced investors would be investing in worse companies, but their influence would

be sufficiently large that these companies would still have better outcomes than the companies

with less experienced investors. A rather peculiar finding.

13 More formally, this assumption states that, conditional on the experience of the investor, the aggregate level of experience in the market is unrelated to the outcome of the investment.

In reality, the finding of negative selection is a result of the misspecification of the model

and the lack of variance in the instrument. The model is misspecified, because it does not

account for the sorting in the market, and the resulting correlation between the error terms for

different investments. The lack of variance in the instrument causes the standard errors of the

estimated parameters to be several times larger than the standard errors in the other models.

6.2 Comparison of Selection Model and Matching Model

The advantage of the standard selection model is that it is familiar and well understood, but

in the present setting, the model has several shortcomings. The shortcomings flow from the

inability of the standard selection model to capture the sorting in the market.

First, the standard selection model imposes independence on the error terms for different

investments conditional on the observed exogenous variables. With sorting this assumption

does not hold, and the severity of the problem of this misspecification is uncertain.14 Second,

the shortage of instruments for the standard selection model causes the parameters to be

imprecisely estimated. A shortage of instruments is common for capital markets, since, in these

markets, the economic forces that determine the investment outcomes are closely related to the

economic forces that determine the selection of investments. Third, the investor’s experience

is specified in a reduced form selection equation. This specification is uninformative about

the economic forces that determine the investment decision. The estimated parameters of the

reduced form equation are silent about the extent to which the investors choose the companies

or the companies choose the investors, and the specification does not permit other variables to

be endogenous to the investment. Fourth, in the standard selection model half of the companies

are selected up, in the sense that they have an investor with more experience than predicted

by the selection equation, and the other half is selected down. As observed above, synergies

combined with sorting can cause most companies to be selected up. The reduced form of the

selection equation normalizes the level of the synergies, and cannot assess the extent of these in

the market.

14 Preliminary unreported simulations suggest that the interaction between the errors for different investments in a market with sorting may indeed show up as “negative selection” in the estimates of the standard selection model.

The matching model is developed to overcome these shortcomings. First, the matching

model treats the investments as the equilibrium of a matching market, and the interaction

between the error terms for different investments is endogenous to the model. Second, the

model is able to use the characteristics of the other agents in the market as instruments, and

the estimated parameters are more precise (as reflected by the smaller standard deviations of

the posterior distributions). Third, by recovering the preferences of the agents, the model is

informative about the structure of the matching in the market: The model can quantify the

extent of synergies in the market, and can, in principle, test whether the economic forces that

determine the investment selection are the same forces that determine the investment outcomes.

This would not be possible in a standard selection model that requires exogenous variation in

the selection of the investments in order to identify the model.

7 Conclusion

This article examines the structure of the market for venture capital. It finds that more ex-

perienced investors make more successful investments, and that this is a result of both their

influence on the companies in which they invest, and of sorting in the market which gives them

access to better initial investments.

Sorting is shown to be an important feature of the market, which has implications for the

interpretation of other empirical studies of the market for venture capital. Studies have found

systematic differences between VC backed companies and companies funded by other investors.

Sorting means that care must be taken when interpreting these differences as caused by the

investors. In fact, it is found that ignoring sorting may lead to overestimation of investors’

influence by as much as 60%. On the other hand, the analysis finds that venture capitalists do

have substantial impact on the companies they fund. This finding supports the above-mentioned

studies in their investigation of the nature of this impact. Other studies have found differences

between venture capitalists with more and less experience or reputation. Again, sorting implies

that the differences in the investment opportunities available to these investors must be kept

in mind when interpreting these findings.

The positive influence of the investors has a direct economic value, but sorting may also

be valuable by facilitating an efficient allocation of capital in the market. With sorting, top-

tier investors are able to sustain a reputation for high quality investments. This reputation

allows them to certify the quality of their investments to the market. Sorting combined with

reputation and certification thus creates a mechanism for funding high quality companies and

credibly communicating their quality. Without this mechanism, a “lemons” effect could cause

the market to unravel and lead the best companies to abandon the market entirely. The move

from a “separating” market to a “pooling” market without sorting would almost certainly

result in a capital market that is less flexible and less efficient in the funding of entrepreneurial

companies.

Two limitations of the present analysis should be noted, which also point to extensions

that may be pursued in future work. First, the developed model is a static equilibrium model,

and the analysis is unable to investigate dynamic features of the market. This also results

in the somewhat arbitrary market definition that takes investments made within the same six

months to be in the same market. Second, the experience measure used to capture the investors’

abilities and reputations is a coarse measure. More specific measures that capture the synergies

between the venture capitalist and the company, would likely provide further insights into this

relationship.

The analysis may apply to capital markets more broadly. The certification effect has been

documented for the underwriting of securities by banks, and sorting and synergies may be

present here as well. Related features are also found in the markets for lending by commercial

banks, corporate investments, and the market for mergers and acquisitions. Going beyond

capital markets, sorting and synergies are present in, e.g., the labor market, the market for

education, supplier-customer networks, and strategic alliances. Extensions of the analysis to

these markets provide interesting avenues for further research.

References

Admati, Anat, and Paul Pfleiderer (1994) "Robust Financial Contracting and the Role of Venture

Capitalists," Journal of Finance, 49 (2), 371-402

Albert, James, and Siddhartha Chib (1993) "Bayesian Analysis of Binary and Polychotomous

Response Data (in Theory and Methods)," Journal of the American Statistical

Association, vol. 88, no. 442, pp. 669-679

Baker, Malcolm, and Paul Gompers (2000) "The Determinants of Board Structure at the Initial

Public Offering," Journal of Law and Economics, forthcoming

Barry, Christopher, Chris Muscarella, John Peavy, and Michael Vetsuypens (1990) "The Role of

Venture Capital in the Creation of Public Companies: Evidence From the Going-Public

Process," Journal of Financial Economics, 27, 447-472

Berry, Steven (1992) "Estimating a Model of Entry in the Airline Industry," Econometrica, 60,

889-917

Berry, Steven, James Levinsohn, and Ariel Pakes (1995) "Automobile Prices in Market

Equilibrium," Econometrica, 63 (4), 841-890

Boyd, Don, Hamp Lankford, Susanna Loeb, and Jim Wyckoff (2003) "Analyzing the

Determinants of the Matching of Public School Teachers to Jobs," NBER working paper

9878

Brander, James, Raphael Amit, and Werner Antweiler (2002) "Venture-Capital Syndication:

Improved Venture Selection vs. the Value-Added Hypothesis," Journal of Economics &

Management Strategy, 11 (3), 423-452

Brav, Alon, and Paul Gompers (1997) "Myth or Reality? The Long-Run Underperformance of

Initial Public Offerings: Evidence from Venture Capital and Nonventure Capital-Backed

Companies," Journal of Finance, 52, 1791-1822

Bresnahan, Timothy (1987) "Competition and Collusion in the American Automobile Industry:

The 1955 Price War," Journal of Industrial Economics, 35 (4), 457-482

Bresnahan, Timothy, and Peter Reiss (1991) "Econometric Models of Discrete Games," Journal

of Econometrics, 48, 57-81

Carter, Richard, and Steven Manaster (1990) "Initial Public Offerings and Underwriter

Reputation," Journal of Finance, 45, 1045-1067

Diamond, Douglas (1989) "Reputation Acquisition in Debt Markets," Journal of Political

Economy, 97, 828-862

Gelfand, Alan, and Adrian Smith (1990) "Sampling Based Approaches to Calculating Marginal

Densities," Journal of the American Statistical Association, 85, 398-409

Geman, S., and D. Geman (1984) "Stochastic Relaxation, Gibbs Distributions and the Bayesian

Restoration of Images," IEEE Transactions on Pattern Analysis and Machine Intelligence

6, 721-741

Geweke, John (1991) "Efficient Simulation from the Multivariate Normal and Student-t

Distributions Subject to Linear Constraints," in E.M. Keramidas (ed.) Computing Science

and Statistics: Proceedings of the 23rd Symposium on the Interface, 571-578

Geweke, John, Gautam Gowrisankaran, and Robert Town (2003) "Bayesian Inference for

Hospital Quality in a Selection Model," Econometrica, 71 (4), 1215-1239

Geweke, John, and Michael Keane (1997) "An Empirical Analysis of Income Dynamics Among

Men in the PSID," Federal Reserve Bank of Minneapolis, Research Department Staff

Report 233

Geweke, John, Michael Keane, and D. Runkle (1994) "Alternative Computational Approaches to

Inference in the Multinomial Probit Model," Review of Economics and Statistics,

76, 609-632

Gompers, Paul (1995) "Optimal Investment, Monitoring, and the Staging of Venture Capital,"

Journal of Finance, 50, 1461-1489.

Gompers, Paul (1996) "Grandstanding in the Venture Capital Industry," Journal of Financial

Economics, 43, 133-156

Gompers, Paul, and Josh Lerner (1996) "The Use of Covenants: An Empirical Analysis of

Venture Partnership Agreements," Journal of Law and Economics, 39, 463-498

Gompers, Paul, and Josh Lerner (1998) "The Determinants of Corporate Venture Capital

Successes: Organizational Structure, Incentives, and Complementarities," working paper,

NBER 6725

Gompers, Paul, and Josh Lerner (1999) The Venture Capital Cycle, Cambridge, MA: MIT Press

Gompers, Paul, and Josh Lerner (1999) "An Analysis of Compensation in the U.S. Venture

Capital Partnership," Journal of Financial Economics, 51, 3-44

Gompers, Paul, and Josh Lerner (2000) "Money Chasing Deals? The Impact of Fund Inflows on

Private Equity Valuations," Journal of Financial Economics, 55, 281-325

Gorman, Michael, and William Sahlman (1989) "What do Venture Capitalists Do?" Journal of

Business Venturing, 4, 231-248

Heckman, James (1979) "Sample Selection Bias as a Specification Error," Econometrica, 47,

153-162

Hellmann, Thomas (1998) "The Allocation of Control Rights in Venture Capital Contracts,"

RAND Journal of Economics, 29 (1), 57-76

Hellmann, Thomas, and Manju Puri (2000) "The Interaction Between Product Market and

Financing Strategy: The Role of Venture Capital," The Review of Financial Studies, 13

(4), 959-984

Hellmann, Thomas, and Manju Puri (2002) "Venture Capital and the Professionalization of Start-

Up Firms: Empirical Evidence," Journal of Finance, 57 (1), 169-197

Hirano, Keisuke (2002) "Semiparametric Bayesian Inference in Autoregressive Panel Data

Models," Econometrica, 70 (2), 781-799

Hochberg, Yael (2003) "Venture Capital and Corporate Governance in the Newly Public Firm,"

working paper, Cornell University

Kaplan, Steven, and Per Strömberg (2000) "Venture Capitalists As Principals: Contracting,

Screening, and Monitoring," American Economic Review, 91 (2), 426-430

Kaplan, Steven, and Per Strömberg (2002) "Characteristics, Contracts, and Actions: Evidence

From Venture Capital Analyses," Journal of Finance, forthcoming

Kaplan, Steven, and Per Strömberg (2003) "Financial Contracting Meets the Real World: An

Empirical Analysis of Venture Capital Contracts," Review of Economic Studies, 70 (2),

281-315

Keane, Michael (1992) "A Note on Identification in the Multinomial Probit Model," Journal of

Business and Economic Statistics, 10, 193-200

Kelso, Alexander, and Vincent Crawford (1982) "Job Matching, Coalition Formation, and Gross

Substitutes," Econometrica, 50, 1483-1504

Lerner, Josh (1994a) "The Syndication of Venture Capital Investments," Financial Management,

23, 16-27

Lerner, Josh (1994b) "Venture Capitalists and the Decision to go Public," Journal of Financial

Economics, 35, 293-316

Lerner, Josh (1995) "Venture Capitalists and the Oversight of Private Firms," Journal of Finance,

50, 301-318

Lindsey, Jim (1996) Parametric Statistical Inference, Oxford: Clarendon Press

Lindsey, Laura (2003) "The Venture Capital Keiretsu Effect: An Empirical Analysis of Strategic

Alliances Among Portfolio Firms," working paper

Megginson, William, and Kathleen Weiss (1991) "Venture Capitalist Certification in Initial

Public Offerings," Journal of Finance, 46, 879-903

Nobile, Agostino (1995) "A Hybrid Markov Chain for the Bayesian Analysis of the Multinomial

Probit Model," technical report #36, National Institute of Statistical Sciences

Puri, Manju (1996) "Commercial Banks in Investment Banking: Conflict of Interests or

Certification Role?" Journal of Financial Economics, 40, 373-401

Quindlen, Ruthann (2000) Confessions of a Venture Capitalist, New York, NY: Warner Books

Roberts, G., and Adrian Smith (1994) "Simple Conditions for the Convergence of the Gibbs

Sampler and Metropolis-Hastings Algorithms," Stochastic Processes and Their

Applications, 49, 207-216

Roth, Alvin (1985a) "The College Admissions Problem is not Equivalent to the Marriage

Problem," Journal of Economic Theory, 36, 277-288

Roth, Alvin (1985b) "Common and Conflicting Interests in Two-sided Matching Markets,"

European Economic Review, 27, 75-96

Roth, Alvin, and Marilda Sotomayor (1990) Two-Sided Matching: A Study in Game-Theoretic

Modeling and Analysis, Econometric Society Monograph Series, Cambridge University

Press

Sahlman, William (1990) "The Structure and Governance of Venture Capital Organizations,"

Journal of Financial Economics, 27, 473-521

Shleifer, Andrei, and Robert Vishny (1997) "A Survey of Corporate Governance," Journal of

Finance, 52 (2), 737-783

Tanner, Martin (1996) Tools for Statistical Inference: Methods for the Exploration of Posterior

Distributions and Likelihood Functions, Springer-Verlag, New York

Tanner, Martin, and W. Wong (1987) "The Calculation of Posterior Distributions by Data

Augmentation," Journal of the American Statistical Association, 82, 528-549

A Analysis of the Two-Sided Matching Model

This appendix contains a formal description of the economic model. The model is a two-sided

matching model. It is a new special case of the College Admissions Model (see Roth and

Sotomayor (1990)) with an additional restriction on the agents’ preferences.

For the College Admissions Model it is known that an equilibrium always exists, that the

set of equilibria is structured as a lattice under a particular ordering, and that the equilibrium

is determined by the agents’ preferences over individual matches when the preferences on the

“many-side” of the market are responsive (defined below). Obviously, these properties of the

general model also hold for the present special case.

Compared to the College Admissions Model, the present model imposes the additional assumption that the agents’ preferences are aligned. As shown below, aligned preferences further

imply that the model has a unique equilibrium, and that this equilibrium can be characterized

with a number of inequalities.

Although the model is a static equilibrium model, and the results and proofs are derived for

a single market, the subscript m will sometimes be used to keep the notation consistent with

the notation in the rest of the paper.

A.1 Two-Sided Matching Model

There are two finite and disjoint sets of agents: investors i ∈ Im and companies j ∈ Jm.

Investor i can invest in up to qm,i companies (called his quota), and each company can receive

an investment from a single investor only. The set of potential investments is Mm = Im × Jm.

To allow companies to remain unmatched, and investors to leave quota unused, define the set

of extended potential investments to be M̃m = (Im ∪ {0})× (Jm ∪ {0}). A matching is a subset

of this set, µm ⊂ M̃m. If investor i is matched with company j, then ij ∈ µm. If company j is

not matched with any investor, then 0j ∈ µm. If investor i has unused quota, then i0 ∈ µm.

Definition 1 A matching µm is a subset of M̃m such that, for all i′ ∈ Im and j′ ∈ Jm,

|{ij ∈ µm | i = i′}| ≤ qm,i′ and |{ij ∈ µm | j = j′}| = 1. If |{ij ∈ µm | i = i′}| < qm,i′ then

i′0 ∈ µm.

To simplify notation, let the set of companies that investor i invests in (his portfolio) be

given by µm(i) = {j ∈ Jm ∪ {0} | ij ∈ µm}, and let the investor that invests in company j be


given by µm(j) = {i ∈ Im ∪ {0} | ij ∈ µm}. Now ij ∈ µm is equivalent to j ∈ µm(i), which is

again equivalent to µm(j) = {i}.

A.1.1 Preferences

The assumption that preferences are aligned states that each (extended) potential investment

has a valuation Vi,j . Vi,j reflects the attractiveness of the investment by investor i in company j.

Vi0 reflects the value (or outside option) for investor i of remaining unmatched. V0j reflects the

value (or outside option) for company j of remaining unmatched. The valuations are assumed to

be distinct. The valuations determine the agents’ preferences, and the important assumption is

that the same valuations determine the preferences for the agents on both sides of the market.

Company j prefers to match with investor i′ to a match with investor i when Vi′,j > Vi,j .

Similarly, investor i prefers (according to his preferences over individual investments) investing

in company j′ to investing in company j when Vi,j′ > Vi,j .

The investor chooses between portfolios of investments, and in addition to his preferences

over matches with individual companies, he has preferences over portfolios. Roth (1984) shows

that as long as the investors’ preferences over portfolios are responsive to their preferences

over individual investments, the equilibrium is determined entirely by their preferences over

individual investments. Responsiveness restricts the investors’ preferences over portfolios that

differ by only a single investment, and requires the investors to prefer the portfolio with the

most preferred individual investment. The preferences used in the paper, represented by the profit function $\Pi_i(\mu(i)) = \lambda \sum_{j' \in \mu(i)} V_{i,j'}$, are one simple example of responsive preferences.

Definition 2 (Roth and Sotomayor (1990), definition 5.2) Let µ′(i) = µ(i) ∪ {j′}\{j} for some j ∈ µ(i) and some j′ ∉ µ(i). Investor i’s preferences over portfolios of companies, as represented by ≻i, are responsive to his preferences over individual companies, as represented by Vi,j, if it holds that µ′(i) ≻i µ(i) ⇔ Vi,j′ > Vi,j.

A.2 Definitions and Results from the College Admissions Model

An investor or a company that prefers to deviate from a given matching and remain unmatched

is a blocking agent. An investor and a company that both prefer to deviate from a given

matching and form a new match together are a blocking pair. A matching is pairwise stable

when it contains no blocking agents or blocking pairs.


Definition 3 (Roth and Sotomayor (1990)¹⁵) For the matching µm, investor i is a blocking

agent if Vi,0 > minj′∈µm(i)Vi,j′, and company j is a blocking agent if V0,j > Vµm(j),j.

Definition 4 (Roth and Sotomayor (1990)) Investor i and company j are a blocking pair for

the matching µm, if ij /∈ µm and Vi,j > Vµm(j),j, and Vi,j > minj′∈µm(i)Vi,j′.

Definition 5 (Roth and Sotomayor (1990), definition 5.3) A matching is pairwise stable if it

does not contain any blocking agents or blocking pairs.

A group of investors and companies form a blocking coalition when they all prefer to deviate

from a matching and form new matches among themselves (possibly keeping some matches with

agents outside the coalition).

Definition 6 (Roth and Sotomayor (1990)) The coalition of investors I′ ⊂ Im and companies J′ ⊂ Jm is a blocking coalition for the matching µm when there exists µ′m ≠ µm such that µ′m(i) ⊂ J′ ∪ µm(i) and µ′m(i) ≻i µm(i) for all i ∈ I′, and µ′m(j) ∈ I′ and Vµ′m(j),j > Vµm(j),j for all j ∈ J′.

Definition 7 (Roth and Sotomayor (1990), definition 5.4) A matching is group stable if it

does not contain any blocking coalitions.

Since a pair is also a coalition, group stability directly implies pairwise stability. In this

model these two concepts are actually equivalent, and a matching that is either pairwise stable

or group stable will simply be called a stable matching or an equilibrium. The model is a

cooperative game-theoretic model, and the set of equilibria equals the core of the game.

Theorem 8 (Roth and Sotomayor (1990), lemma 5.5) In the College Admissions Model,

pairwise stability implies group stability.

Theorem 9 (Roth and Sotomayor (1990), lemma 5.6 and theorem 2.8) In the College Admissions Model, an equilibrium always exists.

Theorem 10 (Roth and Sotomayor (1990), proposition 5.36) The set of stable matchings equals the core, defined by weak dominance.

¹⁵ Roth and Sotomayor (1990) contains a general treatment of these models. I refer to the definitions and results in this text; these are not the original references. All definitions and results are reformulated to fit the present model and notation. The formulation of definitions 3 and 4 assumes responsive preferences. More generally, they could be stated using the investors’ preferences over portfolios, ≻i.


A.3 New Results with Aligned Preferences

In the present model with aligned preferences, a simple Top-Down Sorting Algorithm can

locate the equilibrium matching. The algorithm, explained below, serves two purposes. It

provides a natural way to determine the equilibrium, and, although this paper does not consider

implementation of the equilibrium, the algorithm suggests that the equilibrium is not difficult

to implement, even in a decentralized market. Further, the proofs of the following theorems are

based on properties of the algorithm. The algorithm is simpler than the Deferred Acceptance

Algorithm that is often used to analyze the College Admissions Model, because in every iteration

the offer and acceptance is final, nothing is deferred, and the algorithm needs only as many

iterations as there are matches to be made.

A natural way to form a matching in the model is to start with the investment with the

highest valuation, then find the one with the second-highest valuation that is still feasible, then

the third-highest and so on. The Top-Down Sorting Algorithm below formalizes this idea.

Start) Initially, the set of feasible matches is the set of extended potential matches, M1 =

(Im ∪ {0}) × (Jm ∪ {0}). The set of matches already made is the empty set, µ0m = ∅. The

current iteration is the first one, n = 1.

Step 1) Let {injn} be the feasible match with the highest valuation, {injn} = arg maxij∈Mn Vi,j .

The maximizer is unique by the assumption that Vi,j are all distinct.

Step 2) Append this pair to the set of matches made, $\mu_m^n = \mu_m^{n-1} \cup \{i_n j_n\}$, and let the valuation of the iteration be the valuation of this match, $v_n = V_{i_n, j_n}$.

Step 3) Determine the set of matches that are made infeasible by this match, Rn. There

are four mutually exclusive cases:

Case 3.1) If in = {0} then company jn is unmatched, and the company cannot match with

any other investors. The matches made infeasible are Rn = Mn ∩ {ij | j = jn}.

Case 3.2) If jn = {0} then investor in prefers his outside option to the remaining feasible

companies. The matches made infeasible are Rn = Mn ∩ {ij | i = in}.

Case 3.3) If (neither in = {0} nor jn = {0} and) |{ij ∈ µnm|i = in}| = qm,in , then investor

in exhausts his quota, and the infeasible matches are Rn = Mn ∩ {ij | i = in or j = jn}.

Case 3.4) Otherwise, investor in does not exhaust his quota, and the matches made infeasible

are Rn = Mn ∩ {ij | j = jn}.

Step 4) The feasible matches left for the next iteration are Mn+1 = Mn\Rn.


Step 5) If the set of feasible matches is nonempty, i.e. if Mn+1 ≠ ∅, then reiterate from step 1).

Let µ̃m be the final matching when the algorithm ends. Obviously, the algorithm ends in

finite time, and ends with a valid matching. Note that since the maximization in step 1) is

taken over a strictly decreasing sequence of sets, the valuations of the iterations, vn, form a

strictly decreasing sequence.
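The steps of the Top-Down Sorting Algorithm can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the paper: the function name `top_down_matching`, the dictionary layout of the valuations, and the use of the integer `0` as the label for the outside option are assumptions made for the example. The pair (0, 0) is left out of the feasible set because it carries no valuation.

```python
from typing import Dict, Hashable, List, Tuple

Pair = Tuple[Hashable, Hashable]


def top_down_matching(V: Dict[Pair, float], quota: Dict[Hashable, int]) -> List[Pair]:
    """Return the matches in the order they are formed (highest valuation first).

    V maps every extended pair (i, j) to its valuation. The label 0 is the
    outside option, so V is assumed to contain (i, 0) and (0, j) entries.
    All valuations are assumed to be distinct.
    """
    feasible = dict(V)                  # M_n: matches that are still feasible
    used = {i: 0 for i in quota}        # number of companies held by investor i
    matches: List[Pair] = []
    while feasible:                     # Step 5: stop when nothing is feasible
        i, j = max(feasible, key=feasible.get)   # Step 1: highest valuation
        matches.append((i, j))                   # Step 2: the match is final
        if i != 0 and j != 0:
            used[i] += 1
        # Step 3: decide which side of the match becomes unavailable.
        if i == 0:                      # Case 3.1: company j stays unmatched
            drop_investor, drop_company = False, True
        elif j == 0:                    # Case 3.2: investor i takes outside option
            drop_investor, drop_company = True, False
        elif used[i] == quota[i]:       # Case 3.3: investor i exhausts his quota
            drop_investor, drop_company = True, True
        else:                           # Case 3.4: investor i has quota left
            drop_investor, drop_company = False, True
        # Step 4: remove the infeasible matches R_n from M_n.
        feasible = {
            (a, b): v
            for (a, b), v in feasible.items()
            if not ((drop_investor and a == i) or (drop_company and b == j))
        }
    return matches


# Hypothetical valuations for two investors (A, B) and two companies (x, y).
example_V = {
    ("A", "x"): 10.0, ("A", "y"): 8.0, ("B", "x"): 9.0, ("B", "y"): 5.0,
    ("A", 0): 1.0, ("B", 0): 2.0, (0, "x"): 0.5, (0, "y"): 0.3,
}
```

With a quota of one for each investor the sketch matches A with x and B with y. Raising the quota of A to two lets A take both companies, and B is left with his outside option, which is the pattern described by Theorem 15 for large quotas.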

Theorem 11 µ̃m is a stable matching.

Proof. Assume for contradiction that i′j′ is a blocking pair, and thus is a pair that is not matched in µ̃m. Let n′ be the iteration where j′ is matched with µ̃m(j′) ≠ i′. In this iteration i′j′ must have been in the set of feasible matches, {i′j′} ∈ Mn′, because there are only two ways the pair could have been removed from this set, both of which lead to contradictions.

1) The company could have been matched with another investor (or been unmatched) in a

previous round, but since the valuation of the iteration is decreasing this would contradict the

assumption that company j′ prefers i′ to µ̃m(j′) and the pair would not be a blocking pair. 2)

Investor i′ could have exhausted his quota (or decided to leave his quota unused) in a previous

iteration, but again, since the valuation of the iteration is decreasing, this implies that all the

matches in µ̃m(i′) have valuations that are higher than Vi′,j′ , contradicting that i′j′ is a blocking

pair (by responsiveness). So {i′j′} ∈ Mn′, but this implies that Vµ̃m(j′)j′ > Vi′j′ contradicting

that i′j′ is a blocking pair. Finally, assuming that either i′ or j′ is a blocking agent leads to

contradictions by similar arguments.

Theorem 12 The equilibrium is unique.

Proof. Assume for contradiction that µ′m is a stable matching that differs from µ̃m. Now µ̃m

must contain a match that is not contained in µ′m, so consider the first iteration where the

Top-Down Sorting Algorithm forms a match that is not matched in µ′m. Let this iteration be

n′, and let the match be i′j′. There are two cases. If j′ ≠ {0}, the investor is not leaving quota unused. In this case, since this is the first iteration where the algorithm forms a match that is not formed in µ′m, it holds that µ′m(j′)j′ ∈ Mn′. This implies that Vi′,j′ > Vµ′m(j′),j′ and thus that either j′ is a blocking agent or i′j′ is a blocking pair for µ′m. If j′ = {0}, it follows by a similar argument that investor i′ is a blocking agent for µ′m.


A.3.1 Characterization of the Equilibrium for the Empirical Model

The equilibrium imposes bounds on the valuations in equilibrium. These bounds are used to

estimate the empirical model. Companies that are unmatched provide a slight problem for the

empirical model, since they are not observed in the data. Similarly, the investors’ unused quota

are not observed. To ensure consistency between the theoretical and the empirical model, let Vmin = minij∈Mm Vi,j. A maintained consistency assumption below is that V0,j < Vmin and Vi,0 < Vmin. This assumption ensures that the agents prefer any match to being unmatched.

Theorem 13 The matching µm is stable if and only if $V_{i,j} < \max\left[ V_{\mu_m(j),j},\ \min_{j' \in \mu_m(i)} V_{i,j'} \right]$ for all i ∈ Im and j ∈ Jm such that ij ∉ µm.

Proof. The consistency assumption ensures that there are no blocking agents. The inequalities

directly imply that there are no blocking pairs either.
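The inequality in Theorem 13 translates directly into a stability check. The sketch below is illustrative only. The function name `is_stable`, the `portfolio` and `quota` dictionaries, and the use of minus infinity to stand in for an outside option (which the consistency assumption places below every valuation) are assumptions made for the example.

```python
import math
from typing import Dict, Hashable, Set, Tuple


def is_stable(
    V: Dict[Tuple[Hashable, Hashable], float],
    portfolio: Dict[Hashable, Set[Hashable]],
    quota: Dict[Hashable, int],
) -> bool:
    """Check the Theorem 13 inequality for every unmatched pair.

    V holds valuations for the potential investments (i, j) only.
    portfolio[i] is the set of companies that investor i holds.
    """
    investor_of = {j: i for i, held in portfolio.items() for j in held}
    for (i, j), v in V.items():
        if j in portfolio[i]:
            continue  # the pair is matched, so no inequality applies
        # Company side: valuation of j's current match, or its outside option.
        company_side = V[(investor_of[j], j)] if j in investor_of else -math.inf
        # Investor side: i's least valuable holding, or the outside option
        # when part of the quota is unused.
        if len(portfolio[i]) < quota[i]:
            investor_side = -math.inf
        else:
            investor_side = min((V[(i, k)] for k in portfolio[i]), default=math.inf)
        if not v < max(company_side, investor_side):
            return False  # i and j form a blocking pair
    return True


# Hypothetical valuations for two investors and two companies.
check_V = {("A", "x"): 10.0, ("A", "y"): 8.0, ("B", "x"): 9.0, ("B", "y"): 5.0}
check_quota = {"A": 1, "B": 1}
```

For these valuations the matching that pairs A with x and B with y passes the check, while the matching that pairs A with y and B with x fails because A and x form a blocking pair.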

For a given stable matching, the above theorem gives upper bounds on the valuations of

matches that are not formed in equilibrium. These bounds are increasing functions of the

valuations of the matches that are made. The functions can thus be inverted to express the

inequalities as lower bounds on the valuations of the investments that are made. The following

theorem presents the inequalities in this form.

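Theorem 14 The matching µm is stable if and only if

\[
V_{i,j} > \max\left[ \max_{i' \in S(j)} V_{i',j},\ \max_{j' \in S(i)} V_{i,j'} \right]
\]

for all ij ∈ µm ∩ Mm, where

\[
S(j) = \left\{ i \mid V_{i,j} > \min_{j' \in \mu_m(i)} V_{i,j'} \right\}
\quad \text{and} \quad
S(i) = \left\{ j \mid V_{i,j} > V_{\mu_m(j),j} \right\}.
\]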

Proof. The proof again follows directly from the definition of stability, by rephrasing the

definition as: “Consider all agents who want to deviate and match with me. Pairwise stability

requires me to not be willing to deviate and match with them.”

The final theorem shows that as the quotas increase, the equilibrium assigns each company

to its most preferred investor. This, in turn, implies that with unrestricted quotas, the empirical

model reduces to a conventional Probit model.

Theorem 15 When the quotas, q, are sufficiently large (i.e. exceed the number of companies in the market), µm(j) = argmaxi∈Im Vi,j.

Proof. Follows directly from the Top-Down Sorting Algorithm.


B Simulation of Augmented Posterior Distribution

Sampling from the joint posterior distribution can be accomplished by sampling one dimension at a time from the posterior distribution conditional on all the other parameters

and latent variables. This sampling procedure is known as Gibbs sampling. In this appendix

the conditional posterior distributions used for the sampling are derived, and the sampling

procedure is described.

B.1 Comment About Latent Outcome Variables

For every potential investment there is an associated outcome equation. For the outcome

equation, the identifying assumption is that the error term is independent of all the exoge-

nous variables. This assumption implies that the estimated parameters predict the outcome

probabilities for arbitrary deals in the market, and not just the observed ones.

The model imposes no restrictions on the latent outcome variables for the unmatched pairs,

i.e. the investments that are not made. It is thus possible to “integrate out” these variables,

and just simulate the latent variables that the model imposes restrictions on. By reducing the

number of random variables simulated, this increases the efficiency of the simulation procedure.

Let O∗ = {Oij | ij ∈ µ} contain the latent outcome variables for the matched pairs. O∗ equals O where defined, but the identifying independence assumption imposed on O does not hold

for O∗, since the variables are selected by the endogenous variable µ. The resulting posterior

distributions are unchanged, but the specifics of the distributions are slightly different when

sampling O or O∗.

B.2 Joint Distributions of Latent Variables and Parameters

Let Vm and O∗m be vectors containing the latent valuation and outcome variables in market

m, so Vm = {Vi,j | ij ∈ Mm} and O∗m = {Oi,j | ij ∈ µm}. Let V and O∗ contain all these

variables across all markets. Let O∗−i,j contain all outcome variables except O∗i,j, and define V−i,j similarly. Let Wi,j denote the exogenous variables in the valuation equation for the

investment by investor i in company j, let Wm denote all these variables in market m, and let

W denote all these variables in all markets. Define Xi,j , Xm, and X similarly for the exogenous

variables in the outcome equation. The endogenous variables are the investment outcomes and


the investment decisions, and using similar notation these are given by IPOi,j , IPOm, IPO

and µi,j , µm, and µ, respectively. The parameters in the model are collected in the vector

θ = (α, β, δ). As shown in appendix A, the equilibrium in the economic model imposes a set of

restrictions on the latent valuation variables, represented by Vm ∈ Γµm. Similarly, the Probit

specification of the outcomes imposes restrictions on the latent outcome variables, represented

by O∗m ∈ ΓIPOm. The specifications of the latent variables are given by the valuation and outcome equations.

Vi,j = W ′i,jα + ηi,j , for all ij ∈ Mm

Oi,j = X ′i,jβ + εi,j , for all ij ∈ Mm

Here ηi,j and εi,j are normally distributed errors, independent of the exogenous variables, and the variance of ηi,j is fixed to one to normalize the valuation equation. A convenient way to model the covariance between the two errors is to rewrite εi,j as εi,j = ηi,jδ + ξi,j, where ξi,j is a normal error, independent of ηi,j and of the exogenous variables, with a variance fixed at one. Now cov(ηi,j, εi,j) = δ and var(εi,j) = 1 + δ², and this normalizes the scale of the outcome equation.
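The two moments follow in one line each from the decomposition of the outcome error. The short derivation below assumes, as the decomposition implicitly requires, that ξi,j is independent of ηi,j; the correlation in the last line is added only for illustration.

```latex
\begin{align*}
\operatorname{cov}(\eta_{i,j}, \varepsilon_{i,j})
  &= \operatorname{cov}(\eta_{i,j},\ \eta_{i,j}\delta + \xi_{i,j})
   = \delta \operatorname{var}(\eta_{i,j}) + \operatorname{cov}(\eta_{i,j}, \xi_{i,j})
   = \delta, \\
\operatorname{var}(\varepsilon_{i,j})
  &= \delta^2 \operatorname{var}(\eta_{i,j}) + \operatorname{var}(\xi_{i,j})
   = 1 + \delta^2, \\
\operatorname{corr}(\eta_{i,j}, \varepsilon_{i,j})
  &= \frac{\delta}{\sqrt{1 + \delta^2}}.
\end{align*}
```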

The prior distributions of the parameters are independent normal distributions with means θ̄ = (ᾱ, β̄, δ̄) and covariances (Σα, Σβ, Σδ). Let the joint prior density be π0(θ).

The joint posterior density of the parameters and the latent variables (after integrating out

the unobserved latent outcome variables) in the model conditional on the observed endogenous

and exogenous variables is given below. The following conditional posterior densities derived

are proportional to different terms in this density.

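\[
\phi(V, O^*, \theta \mid IPO, \mu, X, W) = C \times \pi_0(\theta) \times \prod_{m=1}^{M} \Big( I\{O^*_m \in \Gamma_{IPO_m}\} \times I\{V_m \in \Gamma_{\mu_m}\} \times \phi_m(V_m, O^*_m \mid \theta, X_m, W_m) \Big)
\]

Where φm is given by

\[
\phi_m(V_m, O^*_m \mid \theta, X_m, W_m) = C \times \prod_{ij \in M_m} \exp\Big( -.5 \big( V_{i,j} - W'_{i,j}\alpha \big)^2 \Big) \times \prod_{ij \in \mu_m} \exp\Big( -.5 \big( O^*_{i,j} - X'_{i,j}\beta - (V_{i,j} - W'_{i,j}\alpha)\delta \big)^2 \Big)
\]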

B.2.1 Conditional Posterior Distribution of Latent Outcome Variables

The conditional posterior density of the latent outcome variable O∗i,j is proportional to the term in φm that the variable enters, after imposing the restriction in ΓIPOm. There are two cases. When IPOi,j equals one, the restriction O∗i,j ∈ ΓIPOm restricts O∗i,j to being positive, and the density is

\[
\pi(O^*_{i,j} \mid V, O^*_{-i,j}, \theta, IPO, \mu, X, W) = C \times I[O^*_{i,j} > 0] \times \exp\Big( -.5 \big( O^*_{i,j} - X'_{i,j}\beta - (V_{i,j} - W'_{i,j}\alpha)\delta \big)^2 \Big)
\]

This is the normal distribution $N\big(X'_{i,j}\beta + (V_{i,j} - W'_{i,j}\alpha)\delta,\ 1\big)$ truncated below at zero. When IPOi,j equals zero, the distribution is the same, but with the truncation reversed.
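A draw from this truncated normal can be sketched with the inverse-CDF method, using only the Python standard library. This is a swapped-in technique chosen for brevity, not the procedure used in the paper, and it loses accuracy when the truncation point lies far in the tail. The function names and argument names below are hypothetical.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for cdf and inv_cdf


def draw_truncated_normal(mean, lower=float("-inf"), upper=float("inf"), rng=random):
    """Draw from N(mean, 1) restricted to (lower, upper) by inverting the CDF."""
    p_lo = _STD_NORMAL.cdf(lower - mean)
    p_hi = _STD_NORMAL.cdf(upper - mean)
    u = p_lo + (p_hi - p_lo) * rng.random()
    # Keep u strictly inside (0, 1), where inv_cdf is defined.
    u = min(max(u, 1e-16), 1.0 - 1e-16)
    return mean + _STD_NORMAL.inv_cdf(u)


def draw_latent_outcome(x_beta, v_residual, delta, ipo, rng=random):
    """Draw O* given X'beta, the valuation residual V - W'alpha, delta, and IPO."""
    mean = x_beta + v_residual * delta
    if ipo == 1:
        return draw_truncated_normal(mean, lower=0.0, rng=rng)   # O* > 0
    return draw_truncated_normal(mean, upper=0.0, rng=rng)       # O* < 0
```

Passing a seeded `random.Random` instance as `rng` makes the draws reproducible.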

B.2.2 Conditional Posterior Distribution of Latent Valuation Variables

The conditional posterior distribution of the latent variable Vi,j depends on whether ij ∈ µ or not. When ij ∉ µ, no outcome is observed for this potential investment, and the density is

\[
\pi(V_{i,j} \mid V_{-i,j}, O^*, \theta, IPO, \mu, X, W) = C \times I[V_{i,j} < \overline{V}_{i,j}] \times \exp\Big( -.5 \big( V_{i,j} - W'_{i,j}\alpha \big)^2 \Big)
\]

(Remember that $\overline{V}_{i,j}$ and $\underline{V}_{i,j}$ are functions of V−i,j and µ.) When ij ∈ µ, the outcome of the investment is observed and contains additional information about the error term in the valuation equation. The density is

\[
\pi(V_{i,j} \mid V_{-i,j}, O^*, \theta, IPO, \mu, X, W) = C \times I[V_{i,j} > \underline{V}_{i,j}] \times \exp\left( -.5 \left( V_{i,j} - W'_{i,j}\alpha - \frac{(O^*_{i,j} - X'_{i,j}\beta)\delta}{1 + \delta^2} \right)^2 \times (1 + \delta^2) \right)
\]

Both of these distributions are truncated normal distributions. The first is the normal distribution $N(W'_{i,j}\alpha,\ 1)$ truncated above at $\overline{V}_{i,j}$, and the second is $N\big(W'_{i,j}\alpha + (O^*_{i,j} - X'_{i,j}\beta)\delta/(1 + \delta^2),\ 1/(1 + \delta^2)\big)$ truncated below at $\underline{V}_{i,j}$.
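The mean and variance of the second distribution can be checked by completing the square in the two factors of φm that contain Vi,j. The shorthand e = Vi,j − W′i,jα and r = O∗i,j − X′i,jβ is introduced here only for this derivation.

```latex
\begin{align*}
e^2 + (r - e\delta)^2
  &= (1 + \delta^2)\, e^2 - 2 r \delta\, e + r^2 \\
  &= (1 + \delta^2) \left( e - \frac{r\delta}{1 + \delta^2} \right)^2 + \frac{r^2}{1 + \delta^2}.
\end{align*}
```

The last term does not depend on Vi,j and is absorbed into the constant C, so e is normal with mean rδ/(1 + δ²) and variance 1/(1 + δ²), which is the distribution stated above once W′i,jα is added back.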

B.2.3 Conditional Posterior Distribution of Parameters

The conditional posterior distributions of the parameters are normal distributions (not truncated). Since each parameter enters all the factors in φ, the derivation of these distributions requires “completing the square” in a product of several normal densities. To illustrate, let γ be a random vector, distributed with a density that can be written in the form

\[
\pi(\gamma) = C_1 \times \exp\big( -.5 ( \gamma' M_\gamma \gamma + 2 \gamma' N_\gamma + C_2 ) \big)
\]

where Mγ is a corresponding matrix and Nγ a corresponding vector (C1 and C2 could, of course, be combined into the single normalization constant C = C1 exp(−.5 C2)). Completing the square in this expression shows that the distribution of γ (conditional on C1 and C2, and any variables that enter these) is the normal distribution $N(-M_\gamma^{-1} N_\gamma,\ M_\gamma^{-1})$. The matrix Mγ and the vector Nγ determine the mean and covariance of the normal distribution, and the distributions below are expressed in terms of M and N.

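Collecting the terms in φ that involve the parameter α shows that φ(α | V, O∗, β, δ, IPO, µ, X, W) equals $N(-M_\alpha^{-1} N_\alpha,\ M_\alpha^{-1})$, where (writing $\bar{\alpha}$, $\bar{\beta}$ and $\bar{\delta}$ for the prior means)

\[
M_\alpha = \Sigma_\alpha^{-1} + \sum_{m=1}^{M} \Big[ \sum_{ij \in M_m} W_{i,j} W'_{i,j} + \sum_{ij \in \mu_m} \delta^2 W_{i,j} W'_{i,j} \Big]
\]

\[
N_\alpha = -\Sigma_\alpha^{-1} \bar{\alpha} + \sum_{m=1}^{M} \Big[ \sum_{ij \in M_m} -W_{i,j} V_{i,j} + \sum_{ij \in \mu_m} W_{i,j}\, \delta\, \big( O^*_{i,j} - X'_{i,j}\beta - V_{i,j}\delta \big) \Big]
\]

Similarly, collecting the terms in φ that contain β gives

\[
M_\beta = \Sigma_\beta^{-1} + \sum_{m=1}^{M} \sum_{ij \in \mu_m} X_{i,j} X'_{i,j}
\]

\[
N_\beta = -\Sigma_\beta^{-1} \bar{\beta} - \sum_{m=1}^{M} \sum_{ij \in \mu_m} X_{i,j} \big( O^*_{i,j} - V_{i,j}\delta + W'_{i,j}\alpha\delta \big)
\]

Finally, for δ, collecting terms in φ gives

\[
M_\delta = \Sigma_\delta^{-1} + \sum_{m=1}^{M} \sum_{ij \in \mu_m} \big( V_{i,j} - W'_{i,j}\alpha \big)^2
\]

\[
N_\delta = -\Sigma_\delta^{-1} \bar{\delta} - \sum_{m=1}^{M} \sum_{ij \in \mu_m} \big( O^*_{i,j} - X'_{i,j}\beta \big) \big( V_{i,j} - W'_{i,j}\alpha \big)
\]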
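Because δ is a scalar, its conditional posterior is the easiest of the three to illustrate. The sketch below computes Mδ and Nδ from the residuals of the matched pairs and returns the mean −Nδ/Mδ and variance 1/Mδ. The function names, the flat lists of residuals, and the scalar prior variance are assumptions made for the example; it is not the author's Fortran 90 code.

```python
import math
import random
from typing import Sequence, Tuple


def delta_conditional(
    e: Sequence[float],      # valuation residuals V - W'alpha for matched pairs
    r: Sequence[float],      # outcome residuals O* - X'beta for matched pairs
    prior_mean: float,
    prior_var: float,
) -> Tuple[float, float]:
    """Return (mean, variance) of the normal conditional posterior of delta."""
    M = 1.0 / prior_var + sum(ek * ek for ek in e)
    N = -prior_mean / prior_var - sum(rk * ek for rk, ek in zip(r, e))
    return -N / M, 1.0 / M


def draw_delta(e, r, prior_mean, prior_var, rng=random) -> float:
    """Draw delta from its conditional posterior N(-N/M, 1/M)."""
    mean, var = delta_conditional(e, r, prior_mean, prior_var)
    return rng.gauss(mean, math.sqrt(var))
```

With a diffuse prior the mean is close to the least-squares slope from regressing the outcome residuals on the valuation residuals, which is a useful sanity check on the signs of M and N.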

B.3 Numerical Simulation of Augmented Posterior

Draws from the augmented posterior distribution are simulated with a standard Gibbs sampling

procedure (e.g. Gelfand and Smith (1990)). In Gibbs sampling the random variables are

partitioned into blocks. The random variables in each block are then simulated conditional on

the values of the variables in the other blocks.

In the model the two parameter vectors α and β are separate blocks. The covariance

parameter δ is a separate block, and each of the latent variables is a separate block. The

conditional distributions of these blocks are derived above. Under weak regularity conditions,

which amount to checking irreducibility and aperiodicity of the Markov chain (Roberts and

Smith (1994)), sequential draws from these blocks converge to draws from the joint distribution

of the augmented posterior.

Procedures for sampling joint normal vectors are readily available. Sampling a severely

truncated normal random variable is not entirely straightforward. Geweke (1991) provides a

simple procedure using importance sampling with the exponential distribution as base distribution. The estimates are based on 1,000,000 iterations of the procedure. These iterations take

36 hours on a personal computer running Windows with a 2.4 GHz Intel processor. Fortran 90

code is available from the author.
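To illustrate why an exponential base distribution suits the severely truncated case, the sketch below draws from the upper tail of a standard normal with an accept/reject step and a shifted exponential proposal. This is one common way to sample the far tail and is given only as an assumption-laden illustration: it is not the author's Fortran 90 code and may differ in detail from the procedure in Geweke (1991). The function name `draw_normal_tail` is hypothetical.

```python
import math
import random


def draw_normal_tail(a: float, rng=random) -> float:
    """Draw z from the standard normal conditional on z > a, for a > 0.

    Proposal: z = a + Exp(rate=a). The ratio of the normal kernel to the
    proposal density is maximized at z = a, which gives the acceptance
    probability exp(-0.5 * (z - a)**2).
    """
    if a <= 0:
        raise ValueError("this sketch is intended for a > 0 (upper tail)")
    while True:
        z = a + rng.expovariate(a)
        if rng.random() <= math.exp(-0.5 * (z - a) ** 2):
            return z
```

A draw from N(mean, 1) truncated below at a bound far above the mean is then `mean + draw_normal_tail(bound - mean)`. The acceptance rate rises toward one as the truncation point moves further into the tail, which is where inverse-CDF methods become unreliable.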

The initial 400,000 draws are discarded to allow the distribution to “burn in”. Visual

inspection of the draws shows that convergence to the posterior distribution occurs within

the first 100,000 draws. Convergence is further confirmed by comparing the first and second

moments of draws number 200,000 to 400,000, to the first and second moments of draws number

400,000 to 1,000,000. The differences are insignificant.


TABLE 1
DESCRIPTIVE STATISTICS FOR OBSERVED INVESTMENTS

                             OBS      MEAN   STD DEV     MIN      MAX
IPO                         1666      0.27      0.44       0        1
YEAR                        1666   1988.34      4.22    1982     1995
FUND_SIZE ($ MIL)           1666    128.09    206.53     1.7   2113.9
STAGE                       1666      0.18      0.38       0        1
I_COMMUNICATION             1666      0.13      0.34       0        1
I_COMPUTER                  1666      0.42      0.49       0        1
I_ELECTRONICS               1666      0.10      0.30       0        1
I_BIOTECHNOLOGY             1666      0.06      0.24       0        1
I_MEDICAL                   1666      0.13      0.34       0        1
I_OTHER                     1666      0.15      0.36       0        1
TOTAL_EXPERIENCE            1666     69.56     66.18       0      443
EARLY_STAGE_EXPERIENCE      1666     45.52     40.33       0      223
LATE_STAGE_EXPERIENCE       1666     24.05     35.30       0      299
I_COMM_EXPERIENCE           1666      9.86     15.68       0      118
I_COMPUTER_EXPERIENCE       1666     25.48     24.13       0      152
I_ELECTRONICS_EXPERIENCE    1666      8.14      8.80       0       49
I_BIOTECH_EXPERIENCE        1666      3.34      4.77       0       32
I_MEDICAL_EXPERIENCE        1666      7.61      8.35       0       43
I_OTHER_EXPERIENCE          1666     15.12     17.05       0      108
STAGE_EXPERIENCE            1666     44.35     42.45       0      299
INDUSTRY_EXPERIENCE         1666     17.66     20.46       0      130

The reported figures are averages over the observed investments in the sample. The construction of the variables is described in Section 4 in the paper. IPO equals one if the investment results in a public offering and is zero otherwise. YEAR contains the year of the investment. FUND_SIZE contains the capital available to the investor. STAGE equals one if the company is late stage and is zero otherwise. I_XXXXX are six industry dummies. TOTAL_EXPERIENCE contains the number of previous investments by VC firm. XXXXX_EXPERIENCE contain other experience measures.


TABLE 2
DESCRIPTIVE STATISTICS FOR POTENTIAL INVESTMENTS

                              OBS       MEAN   STD DEV      MIN        MAX
YEAR                        42213   1988.436     4.249     1982       1995
FUND_SIZE ($ MIL)           42213    130.699   219.576    1.772   2113.864
STAGE                       42213      0.173     0.378        0          1
I_COMMUNICATION             42213      0.129     0.335        0          1
I_COMPUTER                  42213      0.427     0.495        0          1
I_ELECTRONICS               42213      0.101     0.302        0          1
I_BIOTECHNOLOGY             42213      0.061     0.240        0          1
I_MEDICAL                   42213      0.136     0.343        0          1
I_OTHER                     42213      0.146     0.353        0          1
TOTAL_EXPERIENCE            42213     65.760    63.676        0        443
EARLY_STAGE_EXPERIENCE      42213     42.842    37.891        0        223
LATE_STAGE_EXPERIENCE       42213     22.918    34.169        0        299
I_COMM_EXPERIENCE           42213      9.100    14.915        0        118
I_COMPUTER_EXPERIENCE       42213     23.665    22.566        0        152
I_ELECTRONICS_EXPERIENCE    42213      7.410     8.211        0         49
I_BIOTECH_EXPERIENCE        42213      3.157     4.326        0         32
I_MEDICAL_EXPERIENCE        42213      7.304     8.314        0         43
I_OTHER_EXPERIENCE          42213     15.123    18.070        0        108
STAGE_EXPERIENCE            42213     39.316    37.954        0        299
INDUSTRY_EXPERIENCE         42213     15.288    19.042        0        152

The reported figures are averages over the potential investments in all markets. The construction of the variables and the markets is described in Section 4 in the paper. YEAR contains the year of the investment. FUND_SIZE contains the capital available to the investor. STAGE equals one if the company is post revenue and is zero otherwise. I_XXXXX are six industry dummies. TOTAL_EXPERIENCE contains the number of previous investments by VC firm. XXXXX_EXPERIENCE contain other experience measures.


FIGURE 1
NUMBER OF INVESTORS IN EACH MARKET

FIGURE 2
NUMBER OF INVESTMENTS MADE IN EACH MARKET

[Line charts not reproduced. Horizontal axis: year, with tick labels from 1982 to 1994. Each chart has one line for CA and one for MA.]

Figure 1 shows the number of investors in each market. The upper line represents the California markets, and the lower line represents the Massachusetts markets. Figure 2 shows the number of investments made in each market. The upper line represents the California markets, and the lower line represents the Massachusetts markets.


FIGURE 3
IPO RATE IN EACH MARKET

FIGURE 4
AVERAGE EXPERIENCE IN EACH MARKET

[Line charts not reproduced. Horizontal axis: year, with tick labels from 1982 to 1994. Each chart has one line for CA and one for MA.]

Figure 3 shows the IPO rate in each market. Figure 4 shows the average experience in each market. In each graph the two lines represent the markets for California and Massachusetts respectively.


FIGURE 5
IPO FREQUENCY IN EACH GROUP

FIGURE 6
NUMBER OF INVESTMENTS IN EACH GROUP

[Bar charts not reproduced. Horizontal axis: TOTAL EXPERIENCE, in groups labeled 0, 25, 50, 75, 100, 125, 150, 175, 200 and 225+.]

In Figure 5 the 1666 investments are divided into 10 groups according to the experience of the investor. The bars show the frequency of public offerings in each group. Figure 6 shows the distribution of the number of investments in the groups.


TABLE 3
REGRESSION OF EXPERIENCE ON CHARACTERISTICS

                    | Specification 1          | Specification 2          | Specification 3
                    | COEF      STD ERR        | COEF      STD ERR        | COEF      STD ERR
FUND_SIZE ($ MIL)   | 0.064     (.007)   ***   | 0.061     (.007)   ***   | 0.062     (.007)   ***
YEAR                | 5.011     (.355)   ***   | 0.010     (.627)         |
STAGE               | 16.610    (3.944)  ***   | 16.146    (3.841)  ***   | 16.220    (3.867)  ***
I_COMMUNICATION     | 5.583     (5.664)        | 4.598     (5.516)        | 4.623     (5.564)
I_COMPUTER          | 0.856     (4.490)        | 0.187     (4.373)        | 0.189     (4.440)
I_ELECTRONICS       | 6.751     (6.083)        | 5.011     (5.926)        | 5.042     (5.978)
I_BIOTECHNOLOGY     | 10.338    (7.188)        | 12.684    (7.003)  *     | 12.808    (7.087)  *
I_MEDICAL           | 6.017     (5.609)        | 4.695     (5.463)        | 4.715     (5.528)
AVG_EXPERIENCE      |                          | 0.963     (.101)   ***   | 0.954     (.113)   ***
CONSTANT            | -9908.1   (706.6)  ***   | -30.0     (1240.4)       | -12.2     (7.4)
STATE CONTROL       | no                       | no                       | yes
YEAR CONTROLS       | no                       | no                       | yes
N                   | 1666                     | 1666                     | 1666

The dependent variable is TOTAL_EXPERIENCE which contains number of investments by investor. FUND_SIZE ($ MIL) contains amount of capital available to investor. YEAR contains year of investment. STAGE equals one if company is late stage and is zero otherwise. I_XXXXX are five industry groups, omitted base group is I_OTHER. AVG_EXPERIENCE contains average experience of investors in each market. STATE_CONTROL controls for location of company. YEAR_CONTROLS controls for individual years. Reported coefficients are OLS estimates. STD ERR are standard errors of coefficients. *, **, and *** denote statistical significance at the 10%, 5%, and 1% levels respectively.


TABLE 4.1
PROBIT ESTIMATES OF IPO PROBABILITY

                         COEF       dF/dX     STD ERR   P-VALUE
TOTAL_EXPERIENCE     0.002260    0.000738   (.000170)      .000  ***
FUND_SIZE ($ MIL)    0.000398    0.000130   (.000052)      .012  **
STATE°               0.003601    0.001175   (.026343)      .964
Y_1983°               -0.0826     -0.0264     (.0499)      .605
Y_1984°               -0.0683     -0.0219     (.0533)      .687
Y_1985°               -0.3727     -0.1076     (.0510)      .069  *
Y_1986°                0.1328      0.0449     (.0612)      .449
Y_1987°                0.0231      0.0076     (.0554)      .890
Y_1988°                0.1158      0.0389     (.0594)      .500
Y_1989°               -0.1589     -0.0495     (.0532)      .376
Y_1990°               -0.2142     -0.0655     (.0538)      .258
Y_1991°                0.2476      0.0861     (.0701)      .196
Y_1992°               -0.2230     -0.0681     (.0499)      .204
Y_1993°                0.2181      0.0753     (.0665)      .235
Y_1994°               -0.2406     -0.0730     (.0514)      .191
Y_1995°               -0.2763     -0.0835     (.0467)      .102
CONSTANT              -0.7710
N                        1666
OBS. P                   .269
PRED. P                  .263

Dependent variable is IPO. TOTAL_EXPERIENCE contains number of previous investments by investor. FUND_SIZE ($ MIL) contains capital available to investor. STATE equals one if company is located in California and is zero otherwise. Y_XXXX controls for individual years, omitted base year is 1982. Reported coefficients are ML estimates of conventional Probit model. Marginal effect for variables marked ° is effect of discrete change. STD ERR are standard errors of marginal effects. *, **, and *** denote statistical significance at the 10%, 5%, and 1% levels respectively.


                    | COEF       dF/dX      STD ERR        | COEF       dF/dX      STD ERR        | COEF       dF/dX      STD ERR        | COEF       dF/dX      STD ERR
TOTAL_EXPERIENCE    | 0.002260   0.000738   (.000170) ***  | 0.002063   0.000664   (.000170) ***  | 0.002050   0.000664   (.000169) ***  | 0.001576   0.000508   (.000458)
STAGE_EXPERIENCE    |                                      |                                      |                                      | 0.001343   0.000432   (.000630)
INDUSTRY_EXPERIENCE |                                      |                                      |                                      | -0.001285  -0.000414  (.000943)
FUND_SIZE ($ MIL)   | 0.000398   0.000130   (.000052) **   | 0.000431   0.000139   (.000050) ***  | 0.000435   0.000141   (.000049) ***  | 0.000439   0.000141   (.000050) ***
YEAR                |                                      |                                      | -0.013598  -0.004405  (.002742)      |
STAGE°              |                                      | 0.1851     0.0618     (.0306) **     | 0.1718     0.0576     (.0303) **     | 0.2085     0.0699     (.0329) **
I_COMMUNICATION     |                                      | 0.7518     0.2738     (.0526) ***    | 0.7348     0.2681     (.0521) ***    | 0.7385     0.2686     (.0538) ***
I_COMPUTER          |                                      | 0.4528     0.1484     (.0381) ***    | 0.4307     0.1418     (.0380) ***    | 0.4572     0.1498     (.0385) ***
I_ELECTRONICS°      |                                      | 0.6047     0.2187     (.0565) ***    | 0.5687     0.2056     (.0558) ***    | 0.5878     0.2122     (.0582) ***
I_BIOTECHNOLOGY     |                                      | 1.1635     0.4352     (.0588) ***    | 1.1539     0.4322     (.0582) ***    | 1.1341     0.4246     (.0628) ***
I_MEDICAL°          |                                      | 0.5431     0.1934     (.0523) ***    | 0.5434     0.1943     (.0519) ***    | 0.5254     0.1866     (.0540) ***
CONSTANT            | -0.7710                              | -1.2501                              | 25.6832                              | -1.2497
STATE CONTROL       | yes                                  | yes                                  | no                                   | yes
YEAR CONTROLS       | yes                                  | yes                                  | no                                   | yes
N                   | 1666                                 | 1666                                 | 1666                                 | 1666
OBS P               | .269                                 | .269                                 | .269                                 | .269
PRED P              | .263                                 | .256                                 | .259                                 | .256

Dep

ende

ntva

riab

leis

IPO

.TO

TAL_

EXPE

RIE

NC

Eco

ntai

nsnu

mbe

rof

prev

ious

inve

stm

ents

byin

vest

or.S

TAG

E_EX

PER

IEN

CE

and

IND

UST

RY

_EX

PER

IEN

CE

cont

ain

num

ber

ofpr

evio

usin

vest

men

tsby

inve

stor

inco

mpa

nies

atsa

me

stag

eor

insa

me

indu

stry

aspr

esen

tcom

pany

.FU

ND

_SIZ

E($

MIL

)co

ntai

nsca

pita

lava

ilabl

eto

inve

stor

.YEA

Rco

ntai

nsye

arof

inve

stm

ent.

STA

GE

equa

lson

eif

com

pany

isla

test

age

and

isze

root

herw

ise.

I_X

XX

XX

are

five

indu

stry

grou

ps,o

mitt

edba

segr

oup

isI_

OTH

ER.S

TATE

_CO

NTR

OL

isco

ntro

lfor

loca

tion

ofco

mpa

ny.Y

EAR

_CO

NTR

OLS

are

cont

rols

fori

ndiv

idua

lyea

rs.R

epor

ted

coef

ficie

nts

are

ML

estim

ates

ofco

nven

tiona

lPro

bitm

odel

.Mar

gina

leff

ectf

orva

riab

les

mar

ked

°is

effe

ctof

disc

rete

chan

ge.S

TDER

Rar

est

anda

rder

rors

ofm

argi

nale

ffec

ts.*

,**,

and

***

deno

test

atis

tical

sign

ifica

nce

atth

e10

%,5

%,a

nd1%

leve

ls re

spec

tivel

y.

Spec

ific

atio

n 3

TA

BL

E 4

.2PR

OB

IT E

STIM

AT

ES

OF

IPO

PR

OB

AB

ILIT

Y -

AL

TE

RN

AT

IVE

SPE

CIF

ICA

TIO

NS

Spec

ific

atio

n 1

Spec

ific

atio

n 2

Spec

ific

atio

n 4


TABLE 5
BAYESIAN ESTIMATES OF STRUCTURAL MODEL

OUTCOME EQUATION
                     MEAN        MEDIAN      dF/dX       STD DEV
TOTAL_EXPERIENCE     0.001599    0.001443    0.000417    (.000513) ***
FUND_SIZE ($ MIL)    0.000398    0.000300    0.000104    (.000159) **
YEAR                 -0.000155   -0.000480   -0.000040   (.000852)
STAGE°               0.1671      0.1678      0.0458      (.0884) *
I_COMMUNICATION°     0.7185      0.7305      0.2283      (.1348) ***
I_COMPUTER°          0.4186      0.3792      0.1124      (.1137) ***
I_ELECTRONICS°       0.5645      0.5279      0.1756      (.1440) ***
I_BIOTECHNOLOGY°     1.1205      1.0516      0.3908      (.1615) ***
I_MEDICAL°           0.5243      0.4627      0.1594      (.1335) ***
CONSTANT             -1.2732     1.0369                  (1.6861)

VALUATION EQUATION
                     MEAN        MEDIAN      dP/dW       STD DEV
TOTAL_EXPERIENCE     0.007045    0.008197    0.0020      (.003448) ***
FUND_SIZE ($ MIL)    0.002770    0.002758    0.0008      (.002533) *
STAGE°               0.3399      0.2821      0.0950      (.1377) ***
I_COMMUNICATION°     0.0052      -0.0600     0.0015      (.0983)
I_COMPUTER°          -0.0273     -0.0163     -0.0077     (.0794)
I_ELECTRONICS°       0.0357      -0.0188     0.0101      (.1058)
I_BIOTECHNOLOGY°     0.3543      0.2961      0.0989      (.1365) **
I_MEDICAL°           0.0681      0.0991      0.0192      (.1029)

VARIANCE
COVARIANCE           0.1764      0.1776                  (.0840) **

Dependent variable for outcome equation is IPO. Dependent variable of valuation equation is valuation of potential investment. TOTAL_EXPERIENCE contains number of previous investments by investor. FUND_SIZE ($ MIL) contains capital available to investor. YEAR contains year of investment. STAGE equals one if company is late stage and is zero otherwise. I_XXXXX are five industry groups, omitted base group is I_OTHER. Reported estimates are Bayesian estimates. MEAN, MEDIAN, and STD DEV are mean, median, and standard deviation of simulated posterior distributions. Marginal effects are evaluated at MEAN. Marginal effect for variables marked ° is effect of discrete change. Definition of dP/dW is given in section 5.2. Estimates are based on 1,000,000 simulations of the posterior distribution, from which the initial 400,000 are discarded. *, **, and *** denote that 0 is contained in the 10%, 5%, and 1% credible intervals respectively.
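The notes to Table 5 describe the reported MEAN, MEDIAN, and STD DEV as summaries of simulated posterior draws after an initial stretch of the chain is discarded. A minimal sketch of that bookkeeping follows. The function name and the toy chain are hypothetical, and the use of the sample standard deviation is an assumption, since the notes do not say which estimator was used.

```python
from statistics import mean, median, stdev


def summarize_draws(draws, burn_in):
    """Discard the first `burn_in` draws, then summarize the remainder."""
    kept = draws[burn_in:]
    return {
        "MEAN": mean(kept),
        "MEDIAN": median(kept),
        "STD DEV": stdev(kept),  # assumption: sample standard deviation
    }


# Toy chain for illustration only; these are not draws from the paper's sampler.
toy_chain = [5.0, 4.0, 3.0, 1.0, 2.0, 3.0, 2.0, 1.0, 2.0, 3.0]

# Discarding 4 of 10 draws mirrors discarding 400,000 of 1,000,000.
summary = summarize_draws(toy_chain, burn_in=4)
print(summary)
```

The same function would be applied to each parameter's chain separately, one row of the table per parameter.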


TABLE 6
PREDICTED PROBABILITIES

TOTAL EXPERIENCE     PROBIT     OUTCOME EQUATION
0                    21.5%      15.1%
100                  28.6%      19.1%
225                  38.9%      25.1%
443                  58.3%      37.3%

FIGURE 7
PREDICTED PROBABILITIES

[Figure 7: predicted probabilities (vertical axis, 0% to 45%) plotted against TOTAL EXPERIENCE (horizontal axis, 0 to 225); plot labels A, B, and C.]

Figure 7 and Table 6 present the probabilities of public offerings predicted by the Probit model and the outcome equation of the structural model. The probabilities are evaluated with the coefficients in Table 4.1 and in Table 5. Except for experience, the probabilities are evaluated at sample averages. The solid line represents the probabilities predicted by the Probit model. The broken line represents the probabilities predicted by the outcome equation.
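The entries of Table 6 can be retraced approximately from the reported coefficients. The sketch below assumes each prediction has the Probit form Phi(c + beta * experience), with beta the TOTAL_EXPERIENCE coefficient (0.002260 from Table 4.1 for the Probit model, and the posterior mean 0.001599 from the outcome equation in Table 5). Because the sample averages of the other regressors are not reported, the constant c is backed out of the table's own probability at zero experience. That calibration and the function name are assumptions of this illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def predicted_probabilities(beta, p_at_zero, experience_levels):
    """Probit-form predictions calibrated to the probability at zero experience."""
    # Assumption: all other regressors are absorbed into this baseline index.
    baseline_index = STD_NORMAL.inv_cdf(p_at_zero)
    return [STD_NORMAL.cdf(baseline_index + beta * x) for x in experience_levels]


levels = [0, 100, 225, 443]
probit = predicted_probabilities(0.002260, 0.215, levels)
outcome = predicted_probabilities(0.001599, 0.151, levels)

for x, p, q in zip(levels, probit, outcome):
    print(f"experience {x:>3}: Probit {p:.1%}, outcome equation {q:.1%}")
```

Under these assumptions the computed values track the table to within a fraction of a percentage point. The remaining gaps are consistent with rounding in the reported probability at zero experience.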


TABLE 7
ESTIMATES OF STANDARD SELECTION MODEL

                     Specification 1                      Specification 2                      Specification 3
                     COEF       dF/dX      STD ERR        COEF       dF/dX      STD ERR        COEF       dF/dX      STD ERR
OUTCOME EQUATION
TOTAL_EXPERIENCE     0.007527   0.002316   (.003132) **   0.008376   0.002516   (.003019) **   0.007585   0.002674   (.003289) **
FUND_SIZE ($ MIL)    0.000125   0.000039   (.000230)      0.000103   0.000031   (.000220)      0.000144   0.000051   (.000230)
YEAR                 -0.024364  -0.007496  (.011985) **   -0.031541  -0.009474  (.011508) **
STAGE°                                                    0.0990     0.0304     (.0927)        0.1223     0.0439     (.0959)
I_COMMUNICATION°                                          0.6803     0.2358     (.1397) ***    0.7101     0.2702     (.1402) ***
I_COMPUTER°                                               0.4061     0.1245     (.1145) ***    0.4307     0.1534     (.1157) ***
I_ELECTRONICS°                                            0.5170     0.1762     (.1430) ***    0.5619     0.2134     (.1451) ***
I_BIOTECHNOLOGY°                                          1.0637     0.3903     (.1773) ***    1.0840     0.4121     (.1784) ***
I_MEDICAL°                                                0.4967     0.1671     (.1407) ***    0.5042     0.1897     (.1420) ***
CONSTANT             47.1841               (23.6458) **   60.9023               (22.7411) **   -1.5314               (.2193) ***
STATE CONTROLS       no                                   no                                   yes
YEAR CONTROLS        no                                   no                                   yes

SELECTION EQUATION
FUND_SIZE ($ MIL)    0.048821              (.006385) ***  0.047095              (.006395) ***  0.047100              (.006416) ***
YEAR                 -0.176156             (.480915)      -0.229230             (.483112)
STAGE                                                     9.9027                (2.9669) ***   9.9033                (2.9625) ***
I_COMMUNICATION                                           2.3659                (3.1418)       2.4961                (3.1462)
I_COMPUTER                                                0.1032                (2.4696)       0.0467                (2.4857)
I_ELECTRONICS                                             2.4699                (3.4991)       2.7424                (3.5286)
I_BIOTECHNOLOGY                                           6.8588                (4.3822)       7.0668                (4.3330)
I_MEDICAL                                                 2.5336                (3.3214)       2.8090                (3.3216)
AVG_TOTAL_EXP        0.6114                (.0928) ***    0.6105                (.0930) ***    0.6370                (.1363) ***
CONSTANT             392.31                (950.56)       494.84                (954.70)       42.11                 (4.14) ***
STATE_CONTROLS       no                                   no                                   yes
YEAR_CONTROLS        no                                   no                                   yes

VARIANCE
SIGMA_2              47.8739               (1.6369) ***   47.6747               (1.6098) ***   47.6415               (1.6072) ***
SIGMA_12             -12.6992              (7.6503) *     -15.1026              (7.3391) *     -13.0856              (7.9074) *

Dependent variable in outcome equation is IPO. Dependent variable in selection equation is TOTAL_EXPERIENCE. TOTAL_EXPERIENCE contains number of previous investments by investor. FUND_SIZE ($ MIL) contains capital available to investor. YEAR contains year of investment. STAGE equals one if company is late stage and is zero otherwise. I_XXXXX are five industry groups, omitted base group is I_OTHER. STATE_CONTROL is control for location of company. YEAR_CONTROLS are controls for individual years. SIGMA_2 and SIGMA_12 are variance of error in selection equation and covariance between errors in normal distribution (before truncation) respectively. Reported coefficients are ML estimates. Marginal effect for variables marked ° is effect of discrete change. STDERR is standard error of coefficient. *, **, and *** denote statistical significance at the 10%, 5%, and 1% levels respectively.