Panel Data Analysis Tutorial - Appendix

Created by Steve Hoover, Modified on Thu, Dec 12, 2024 at 10:34 AM by Steve Hoover

Technical Notes

Panel data analytics is a vast topic that could (and should) be a course in its own right. For our purposes, we can think of it as a more sophisticated version of the linear regression model. Let $y_{it}$ be the observed value of the dependent variable (e.g., sales or number of conversions) for entity $i$ at time (or replication) $t$. Using standard notation, the panel model estimated by Enginius is:

$$y_{it} = X'_{it}\beta + c_i + \varepsilon_{it}$$

where $X_{it}$ denotes the set of independent variables observed for entity $i$ in context or replication $t$, $\beta$ is the vector of coefficients that capture the effects of the $X$ variables on $y$, $c_i$ is an entity-specific additive effect on $y$, and $\varepsilon_{it}$ is the error term. Note that $c_i$ does not vary with $t$, and that $\beta$ is the same for all entities $i$ and replications $t$.

The main difference between this model and a standard regression model is that $c_i$ represents an entity-specific unobserved characteristic (e.g., the hidden characteristic of a keyword, or the hidden talent or personality of an individual), which makes $c_i$ a random variable. If we can obtain statistically valid estimates of $c_i$, we not only learn about those entity-specific effects but also ensure that the estimated effects of the $X$ variables (i.e., $\beta$) are statistically valid. This is the primary benefit of panel data models over the standard regression model: the ability to account for, and measure, entity-specific effects that a standard regression cannot detect.

The potential complexities in estimating the panel regression model above come from the assumptions we make about the nature of the errors ($\varepsilon_{it}$). For our purposes, we consider three possibilities for estimating $c_i$ under the usual simplifying assumptions about $\varepsilon_{it}$. For further technical details, especially for advanced users, see Greene (2017, Chapter 11), Pesaran (2015, Chapter 26), or other econometrics textbooks.

  1. Pooled regression model: Here $c_i$ has the same value for all entities (i.e., $c_i = c$ for all entities $i$), and the error term satisfies all the requirements of the Ordinary Least Squares (OLS) model for each entity. In this case, the estimation results will be comparable to those obtained from OLS. Any differences are likely to be minor and arise from the more robust handling of the errors that the panel structure of the data makes possible.
  2. Fixed effects model: This model allows $c_i$ to be correlated with $X_{it}$ and still provides statistically valid estimates of $c_i$ and $\beta$. This is a critical feature of the fixed effects model: it allows for the possibility that the unobserved characteristics of an entity influence the observed characteristics $X_{it}$, and that the two together jointly determine the dependent variable $y_{it}$. The key downside of this model is that it cannot estimate the effects of any observed entity-specific characteristics $X_i$ that do not vary across $t$.
  3. Random effects model: This model requires $c_i$ to be uncorrelated with $X_{it}$ and, when that requirement is met, provides statistically valid estimates of $c_i$ and $\beta$. It then offers two advantages over the fixed effects model: (i) it allows estimation of the effects of observed entity-specific characteristics $X_i$ that do not vary across $t$, and (ii) the estimates have higher statistical efficiency (i.e., they are estimated with greater precision than in an equivalent fixed effects model).
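To make the contrast between the pooled and fixed effects models concrete, here is a minimal sketch in plain Python with a single regressor. It is not the Enginius or plm implementation. The toy data, the entity names, and all function names are hypothetical; the data are built without noise as y = 2x + c_i, with c_i deliberately correlated with x. The fixed effects slope is obtained by the within transformation (subtracting each entity's own means before running OLS), and the entity effects are then centered so that they sum to zero, which is one common normalization and an assumption of this sketch.

```python
from statistics import mean

# Hypothetical toy panel: y = TRUE_BETA * x + c_i with no noise.
# Entities with larger c_i also have larger x, so c_i is correlated with x.
TRUE_BETA = 2.0
effects = {"A": 1.0, "B": 5.0, "C": 9.0}
x_values = {"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0], "C": [7.0, 8.0, 9.0]}
panel = {
    entity: [(x, TRUE_BETA * x + effects[entity]) for x in xs]
    for entity, xs in x_values.items()
}


def ols_slope(pairs):
    """Slope of a simple OLS regression of y on x (with intercept)."""
    xbar = mean(x for x, _ in pairs)
    ybar = mean(y for _, y in pairs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in pairs)
    sxx = sum((x - xbar) ** 2 for x, _ in pairs)
    return sxy / sxx


def pooled_slope(panel):
    """Ignore the panel structure and regress on all observations at once."""
    return ols_slope([pair for obs in panel.values() for pair in obs])


def within_slope(panel):
    """Fixed effects slope: demean x and y within each entity, then run OLS."""
    demeaned = []
    for obs in panel.values():
        xbar = mean(x for x, _ in obs)
        ybar = mean(y for _, y in obs)
        demeaned.extend((x - xbar, y - ybar) for x, y in obs)
    return ols_slope(demeaned)


def fixed_effects(panel, beta):
    """Entity effects implied by beta, centered so that they sum to zero."""
    raw = {
        entity: mean(y for _, y in obs) - beta * mean(x for x, _ in obs)
        for entity, obs in panel.items()
    }
    center = mean(raw.values())
    return {entity: value - center for entity, value in raw.items()}


beta_fe = within_slope(panel)
print("pooled slope:", pooled_slope(panel))
print("within slope:", beta_fe)
print("centered fixed effects:", fixed_effects(panel, beta_fe))
```

On this toy data the pooled slope comes out near 3.2, well above the true value of 2, because it attributes the entity effects to x. The within slope recovers 2, and the centered effects preserve the differences between the c_i values used to build the data.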

Enginius offers an option that automatically determines which of the three models is most appropriate for a given data set. Here is one way to think about the choice. The simplest and most parsimonious option is the pooled regression model (equivalent to ordinary least squares regression), provided the appropriate statistical test indicates that it applies to the data set. The next most parsimonious option is the random effects model, again as determined by the structure of the data set and the appropriate statistical test. An advantage of the random effects model is that it can estimate the effects of observed independent variables that do not vary across time or replications (e.g., race, education level, risk tolerance, or the state or country where a firm is incorporated). The most comprehensive option is the fixed effects model, if the structure of the data and the appropriate statistical tests indicate that it is suitable. With the fixed effects model, however, we cannot estimate the effects of independent variables that do not vary across time or replications, because those effects are absorbed into the fixed effects $c_i$.
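The absorption of time-invariant variables can be seen in a small sketch. Assuming the fixed effects model is estimated by demeaning each variable within each entity (the within transformation), a variable that never changes for an entity has no within-entity variation left to estimate from. The panel, the variable names, and the function names below are hypothetical.

```python
from statistics import mean

# Hypothetical panel: price varies over time, region is constant per entity.
panel = {
    "firm_1": {"price": [10.0, 12.0, 11.0], "region": [1.0, 1.0, 1.0]},
    "firm_2": {"price": [20.0, 19.0, 24.0], "region": [0.0, 0.0, 0.0]},
}


def demean_within_entity(values):
    """Subtract the entity's own mean from each of its observations."""
    m = mean(values)
    return [v - m for v in values]


def within_variation(panel, variable):
    """Sum of squared within-entity deviations for one variable."""
    return sum(
        d ** 2
        for obs in panel.values()
        for d in demean_within_entity(obs[variable])
    )


print("within variation of price:", within_variation(panel, "price"))
print("within variation of region:", within_variation(panel, "region"))
```

The within variation of region is zero, so no coefficient can be estimated for it after demeaning, while price retains variation and remains estimable.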


Technical Details about the Panel Data Analytics used in Enginius

The main outputs of panel data analysis are generated using the plm package available in R. The Automatic option first runs a pooling test of the null hypothesis that the pooled OLS model is adequate for the data. If the poolability hypothesis is rejected, the Automatic option then applies the Hausman test to determine whether a random effects model is appropriate. If the null hypothesis of the Hausman test is rejected, the fixed effects model is recommended; otherwise, the Automatic option recommends the random effects model. In the fixed effects model, the fixed effects are obtained via the fixef function; in the random effects model, the random effects are obtained via the ranef function. For the fixed effects model, the sum of the fixed effects is set to 0 for estimation.
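The order of tests described above can be sketched as a small decision function. This is an illustration only, not the Enginius code: the function name is hypothetical, the two p-values are assumed to come from a pooling test and a Hausman test run elsewhere, and the 0.05 significance level is an assumption, since the threshold Enginius uses is not stated here.

```python
def choose_panel_model(pooling_p_value, hausman_p_value, alpha=0.05):
    """Pick 'pooled', 'random', or 'fixed' from two test p-values.

    alpha is an assumed significance level, not a documented setting.
    """
    for p in (pooling_p_value, hausman_p_value):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p-values must lie between 0 and 1")

    # Step 1: poolability not rejected, so keep the pooled model.
    if pooling_p_value >= alpha:
        return "pooled"

    # Step 2: Hausman null rejected, so recommend the fixed effects model.
    if hausman_p_value < alpha:
        return "fixed"

    # Otherwise the random effects model is recommended.
    return "random"


print(choose_panel_model(0.40, 0.01))   # poolability not rejected
print(choose_panel_model(0.001, 0.01))  # both nulls rejected
print(choose_panel_model(0.001, 0.60))  # only poolability rejected
```

Note that the Hausman p-value is only consulted once poolability has been rejected, which mirrors the sequence in the text.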

 

References

Greene, William H. (2017). Econometric Analysis (8th ed.). New York: Pearson.

Pesaran, M. Hashem (2015). Time Series and Panel Data Econometrics. Oxford: Oxford University Press.
