Econometrics is a branch of economics that confronts economic models with data. The “metrics” in “econometrics” suggests measurements. As Lawrence Klein (1974, p. 1) pointed out, measurement alone describes only the theoretical side of econometrics. Its empirical side deals with data and the estimation of relationships. Econometricians construct models, gather data, consider alternative specifications, and make forecasts or decisions based on econometric models (Granger 1999, p. 62). Many textbooks do econometrics rather than define it, mainly because it is not all science, for it requires a “set of assumptions which are both sufficiently specific and sufficiently realistic” (Malinvaud 1966, p. 514). As with any empirical discipline, econometric model building may not precede data analysis. One may be amused to find that econometrics can be used to answer the question “Which came first: the chicken or the egg?” by the use of causality testing (Thurman and Fisher 1988). Sometimes econometricians use “a minimum of assistance from theoretical conceptions or hypotheses regarding the nature of the economic process by which the variables studied are generated” (Koopmans 1970, p. 113). Other times, econometric models such as in time-series analysis use clearly defined approaches such as identification, estimation, and diagnostics.

The tradition for introductory econometrics is to start with a single equation emanating from economic theory and knowledge of how to fit the theory to a sample of data. For example, on the economic side, econometricians have some a priori notions of the demand schedule such as the law of demand, implying that more will be bought as the price falls. This is enough of a hypothesis to allow statistical testing. The econometrician needs to confront this demand hypothesis with a sample of data, which is either time-series or cross-section.

The econometrician’s best friend is randomness. One way to appreciate randomness is to assume that the econometrician wants to explain how prices vary with the quantity sold in the form of a linear single-equation model $P_t = a + bQ_t + \varepsilon_t$, where $P$ is price, $Q$ is quantity, $a$ and $b$ are coefficients to be estimated, $t$ is time, and $\varepsilon$ is an error term. The error term is the main random mechanism in this model. It is assumed to be normally distributed with a zero mean and a constant variance, independent of the independent variables, and uncorrelated across observations. Besides these assumptions, the error term makes the dependent variable probabilistic, clarifying that a statistical test may not be based on the independent variables, which are not stochastic. Another requirement of randomness is that the observations should be kept sequentially in time in order to detect whether the errors are related serially, which is called “serial correlation” or “autocorrelation” of the error terms. This is measured by the Durbin-Watson statistic, ideally equal to 2. Other preliminary diagnostic tests would require the t-statistics of the coefficients to be approximately 2 or greater in absolute value, and the adjusted R-squared should be in the 90 percent range. The test of a good econometric model “… should emphasize the quality of the output of the model rather than merely the apparent quality of the model” (Granger 1999, p. 62).
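The diagnostics named above can be illustrated with a minimal sketch. It assumes simulated data (the coefficients 50 and -2, the sample size, and the error variance are all made up for illustration) and fits the single-equation model of price on quantity by ordinary least squares using only NumPy. The function name `ols_diagnostics` is hypothetical, not part of any library.

```python
import numpy as np

def ols_diagnostics(q, p):
    """Fit p = a + b*q + e by OLS and return basic diagnostics."""
    q = np.asarray(q, dtype=float)
    p = np.asarray(p, dtype=float)
    n = len(p)
    X = np.column_stack([np.ones(n), q])       # intercept column plus regressor
    k = X.shape[1]
    xtx_inv = np.linalg.inv(X.T @ X)
    coef = xtx_inv @ X.T @ p                   # OLS estimates of (a, b)
    resid = p - X @ coef
    rss = resid @ resid                        # residual sum of squares
    sigma2 = rss / (n - k)                     # unbiased error-variance estimate
    se = np.sqrt(np.diag(sigma2 * xtx_inv))    # coefficient standard errors
    t_stats = coef / se
    tss = np.sum((p - p.mean()) ** 2)          # total sum of squares
    r2 = 1.0 - rss / tss
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k)
    dw = np.sum(np.diff(resid) ** 2) / rss     # Durbin-Watson statistic
    return {"coef": coef, "t": t_stats, "adj_r2": adj_r2, "dw": dw}

# Simulated, time-ordered sample (all numbers are illustrative assumptions)
rng = np.random.default_rng(0)
quantity = np.linspace(1.0, 20.0, 60)
price = 50.0 - 2.0 * quantity + rng.normal(0.0, 1.0, size=60)

result = ols_diagnostics(quantity, price)
print(result)
```

Because the simulated errors are drawn independently, the Durbin-Watson value should land near 2. Keeping the observations in time order matters here, since the statistic is built from differences between consecutive residuals.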

Besides single equations, econometricians study systems of equations models. A system of equations is necessary to capture interrelations or feedbacks among economic variables. In microeconomics, the demand and supply curves and their equality are thought of as a model to study market conditions such as equilibrium, excess demand, or excess supply. In macroeconomics, the Keynesian consumption and investment functions and a national income identity are required to study full employment and full production. A system of equations is usually solved or reduced to a single equation for forecasting purposes, which requires variables to be classified either as given (exogenous), such as the money supply and tax rates, or as variables determined by structural equations within the system (endogenous), such as prices and quantities. When the value of a variable is not in doubt at the current time, perhaps because we are relying on its previous values, then the variable is classified as predetermined. Structural equations are required in order to estimate the coefficients, whereas identity equations are required to sum up definitional terms, such as that gross national product is the sum of consumption and investment. The Keynesian system of equations requires that planned savings must be equal to planned investment, which is referred to as an *ex ante* condition, as opposed to an *ex post* condition, where the variables are equal from an accounting perspective. The reduced form of the model can be used for policy purposes, as in the instrument-versus-target models suggested by Jan Tinbergen (1952), for the attainment of social welfare goals as suggested by Henri Theil (1961), or to simulate probable outcomes.
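As a worked illustration of reducing a system to its reduced form, consider a deliberately small Keynesian system. This is a sketch under stated assumptions: one structural consumption function, one income identity, investment $I_t$ treated as exogenous, and the symbols $\alpha$, $\beta$, and $u_t$ introduced here for illustration only.

$$
\begin{aligned}
C_t &= \alpha + \beta Y_t + u_t && \text{(structural equation, } 0 < \beta < 1\text{)} \\
Y_t &= C_t + I_t && \text{(identity)} \\[4pt]
Y_t &= \frac{\alpha}{1-\beta} + \frac{1}{1-\beta}\, I_t + \frac{u_t}{1-\beta} && \text{(reduced form for income)} \\
C_t &= \frac{\alpha}{1-\beta} + \frac{\beta}{1-\beta}\, I_t + \frac{u_t}{1-\beta} && \text{(reduced form for consumption)}
\end{aligned}
$$

Substituting the consumption function into the identity gives $Y_t(1-\beta) = \alpha + I_t + u_t$, and the consumption line follows from $C_t = Y_t - I_t$. Each endogenous variable is then expressed in terms of the exogenous variable and the error alone, which is the form typically used for forecasting or simulation.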

Systems of equations models have peculiarities on both the modeling and the estimation sides. On the modeling side, the main difficulties reside with identification and reflection problems. Briefly stated, the identification problem requires that enough information be present in the model to make each equation represent a definite economic relation such as supply or demand. The reflection problem concerns whether data on a group can be used to explain the behavior of its individual members uniquely. Depending on the results of the identification problem, appropriate techniques for estimating a system of equations are available, such as ordinary least squares (OLS) and three-stage least squares (3SLS).
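The identification idea can be sketched with a simulated market. The example below does not implement 3SLS. It substitutes the simpler instrumental-variables estimator (which coincides with two-stage least squares when there is exactly one instrument) to show why equation-by-equation OLS can mislead in a simultaneous system. All coefficients, the `income` variable, and the sample size are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

# Exogenous demand shifter and structural shocks (illustrative assumptions)
income = rng.normal(50.0, 10.0, size=n)
u_d = rng.normal(0.0, 5.0, size=n)   # demand shock
u_s = rng.normal(0.0, 5.0, size=n)   # supply shock

# Hypothetical structural equations:
#   demand: Q = 100 - 2*P + 1.5*income + u_d
#   supply: Q = 10 + 3*P + u_s
# Equating demand and supply and solving gives the equilibrium price.
price = (90.0 + 1.5 * income + u_d - u_s) / 5.0
quantity = 10.0 + 3.0 * price + u_s

def cov(x, y):
    """Sample covariance between two 1-D arrays."""
    return np.cov(x, y)[0, 1]

# OLS of quantity on price mixes the supply and demand relations,
# because price is correlated with the supply shock u_s.
ols_slope = cov(price, quantity) / cov(price, price)

# Income shifts demand but is excluded from supply, so it identifies
# the supply slope: cov(income, Q) / cov(income, P).
iv_slope = cov(income, quantity) / cov(income, price)

print(f"true supply slope: 3.0, OLS: {ols_slope:.3f}, IV: {iv_slope:.3f}")
```

In this setup the supply equation is identified because a variable that moves demand (income) is absent from supply. If income appeared in both equations, neither estimator could separate the two curves.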

Some pitfalls are common to both single and systems of equations. Multicollinearity occurs when the independent variables are related, such as when one variable measures activity for a day and another variable measures the same activity for a week, so that one is exactly seven times the other. A dummy variable trap occurs when binary variables, such as those used for sex, seasonality, or shocks, add up to a column of ones and thus duplicate the intercept.
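Both pitfalls show up as a design matrix that has fewer independent columns than it has columns in total. A minimal sketch, assuming made-up data and quarterly seasonal dummies as the example of binary variables:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 12

# Perfect multicollinearity: the weekly measure is seven times the daily one.
daily = rng.uniform(1.0, 10.0, size=n)
weekly = 7.0 * daily
X_collinear = np.column_stack([np.ones(n), daily, weekly])
rank_collinear = np.linalg.matrix_rank(X_collinear)   # fewer than 3 columns' worth

# Dummy variable trap: an intercept plus a dummy for every quarter.
quarter = np.arange(n) % 4
dummies = (quarter[:, None] == np.arange(4)).astype(float)
X_trap = np.column_stack([np.ones(n), dummies])       # 5 columns
rank_trap = np.linalg.matrix_rank(X_trap)

# Usual remedy: drop one dummy so the rest are measured against a base quarter.
X_ok = np.column_stack([np.ones(n), dummies[:, 1:]])  # 4 columns
rank_ok = np.linalg.matrix_rank(X_ok)

print(rank_collinear, X_collinear.shape[1])
print(rank_trap, X_trap.shape[1])
print(rank_ok, X_ok.shape[1])
```

When the rank is below the number of columns, the coefficients cannot be estimated separately. Dropping the redundant column, as in `X_ok`, restores full column rank.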

Expectations can be treated in both single and systems of equations. An expected variable may be present in the model, which requires one to specify, before estimation, how expectations are formed. One method calls for an adaptive mechanism to correct for past errors. The more recent method of rational expectations requires the econometrician to adjust the expected value of the variable for all the information that is available. For instance, if one’s average commuting time to work is 10 minutes, and one hears on the news that a traffic jam has occurred, an adjustment must be made to the average time for the forecast of the arrival time to be rational. Econometricians are trying to build large-scale rational expectations models to rival standard models such as the Wharton Econometric model, the Data Resource model, or the Federal Reserve Board U.S. model, but such achievements are not in sight as yet.
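The adaptive mechanism can be sketched in a few lines. This assumes the common textbook form in which the expectation is revised by a fixed fraction of the last forecast error. The adjustment parameter `lam`, the function name, and the commuting times are hypothetical.

```python
def adaptive_expectations(actual, initial_guess, lam):
    """Return the sequence of expectations under adaptive updating.

    Each new expectation equals the previous one plus a fraction `lam`
    of the most recent forecast error (actual minus expected).
    """
    expected = [initial_guess]
    for x in actual:
        prev = expected[-1]
        expected.append(prev + lam * (x - prev))
    return expected

# Commuting times in minutes; the 30 stands for a hypothetical traffic jam.
commute = [10.0, 10.0, 30.0, 10.0]
path = adaptive_expectations(commute, initial_guess=10.0, lam=0.5)
print(path)
```

The adaptive forecaster revises the expected commute only after the jam has been experienced, and the revision then fades gradually. A rational forecaster in the sense described above would instead use the news report to adjust the forecast before leaving.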

