Is Logistic Regression an OLS? Exploring the Differences and Related Concepts

Is Logistic Regression an OLS?

Logistic regression is not an Ordinary Least Squares (OLS) regression. While they share some superficial similarities, these two methods are fundamentally different and are suited for different types of problems. Let's explore the key differences and some advanced topics in logistic regression.

Key Differences

Nature of the Dependent Variable

OLS is used when the dependent variable is continuous, so it predicts a continuous outcome. In contrast, logistic regression is used when the dependent variable is binary (or, in its multinomial extensions, categorical), and it predicts the probability of a certain class or event occurring, such as success versus failure.

Modeling Approach

OLS minimizes the sum of the squared differences between observed and predicted values, adhering to the least squares criterion. Logistic regression, on the other hand, uses the logistic function to model the probability that a given input point belongs to a certain category. It maximizes the likelihood of observing the given data under the model.
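The contrast between the two fitting criteria can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the eight data points are made up, the helper names (`fit_ols`, `fit_logistic`, `log_likelihood`) are hypothetical, and the learning rate and step count for the gradient ascent are assumptions chosen for this toy data set.

```python
import math

def fit_ols(xs, ys):
    """Closed-form least squares for y = a + b*x (minimizes squared error)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(a, b, xs, ys):
    """Bernoulli log-likelihood of 0/1 outcomes under p = sigmoid(a + b*x)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(a + b * x)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total

def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Plain gradient ascent on the mean log-likelihood (assumed settings)."""
    a = b = 0.0
    n = len(xs)
    for _ in range(steps):
        residuals = [y - sigmoid(a + b * x) for x, y in zip(xs, ys)]
        a += lr * sum(residuals) / n
        b += lr * sum(r * x for r, x in zip(residuals, xs)) / n
    return a, b

# Made-up binary data; the classes overlap, so the likelihood has a finite maximum.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

ols_a, ols_b = fit_ols(xs, ys)        # least squares criterion
log_a, log_b = fit_logistic(xs, ys)   # maximum likelihood criterion
```

Both fits use the same data, but `fit_ols` has a closed-form solution while `fit_logistic` has to iterate, because the likelihood equations have no closed-form solution in general.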

Output

The output of OLS can be any real number, while logistic regression produces probabilities that range between 0 and 1, which can be converted into class labels.
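A small sketch of this difference in output range, using hypothetical coefficients that were chosen for illustration rather than estimated from any data:

```python
import math

def linear_prediction(a, b, x):
    """OLS-style prediction: unbounded, can be any real number."""
    return a + b * x

def logistic_probability(a, b, x):
    """Logistic prediction: always strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

def to_label(p, threshold=0.5):
    """Convert a probability into a class label (the 0.5 cutoff is an assumption)."""
    return 1 if p >= threshold else 0

inputs = [-2.0, 0.0, 3.0, 6.0]

# Hypothetical coefficients for each model.
linear_outputs = [linear_prediction(-0.25, 0.4, x) for x in inputs]
probabilities = [logistic_probability(-4.0, 1.6, x) for x in inputs]
labels = [to_label(p) for p in probabilities]
```

Here the linear predictions fall below 0 and above 1 at the extreme inputs, while every logistic output stays inside the unit interval and can be thresholded into a label.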

Yes and Almost Yes

While logistic regression and OLS are fundamentally different, there are certain circumstances where OLS can be used to estimate logistic regression parameters. This is a topic of interest in advanced econometrics, especially in situations where the dependent variable is binary or almost binary.

Log-odds and OLS

Yes, because the log-odds in a logistic regression are linear in the parameters. The logistic regression equation can be rewritten in terms of the log-odds, \( Z = \log\frac{p}{1-p} = X'B \), so whenever the log-odds can be computed from the data (for example, from observed proportions in grouped data), the parameter estimates \( B_{est} \) can be found using the OLS method. However, this approach becomes unstable when the probabilities are very low or very high: the log-odds become very large in magnitude (and are undefined at exactly 0 or 1), which can cause \( B_{est} \) to overshoot.
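The log-odds route can be sketched as follows, assuming grouped data in which each value of the predictor has many trials, so that an observed proportion (and therefore a log-odds value) exists for each group. The counts are made up, the helper names are hypothetical, and the regression is deliberately unweighted to keep the sketch short.

```python
import math

def logit(p):
    """Log-odds of a proportion; fails for p equal to 0 or 1."""
    return math.log(p / (1.0 - p))

def fit_ols(xs, ys):
    """Closed-form least squares for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

# Made-up grouped data: 100 trials at each level of x.
levels = [1.0, 2.0, 3.0, 4.0, 5.0]
successes = [10, 22, 50, 76, 91]
trials = 100

proportions = [s / trials for s in successes]
log_odds = [logit(p) for p in proportions]

# Ordinary least squares on the transformed outcome.
intercept, slope = fit_ols(levels, log_odds)

# A group with 0% or 100% successes has no finite log-odds.
try:
    logit(1.0)
    extreme_is_defined = True
except (ValueError, ZeroDivisionError):
    extreme_is_defined = False
```

The failure at a proportion of exactly 1 is the limiting case of the instability described above: proportions near 0 or 1 produce very large log-odds that can dominate the least squares fit.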

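Almost yes, because you can first form a linear model \( Y = X'B + \mu \), as in OLS, and then pass \( Y \) through the logistic function \( S(y) = 1/(1 + e^{-y}) \) to convert it to a probability, giving \( Z = S(Y) = S(X'B + \mu) \). (Here \( Z \) denotes the resulting probability, not the log-odds.) In this sense, logistic regression and OLS share a common base: both build on the same linear predictor, and OLS is itself the maximum-likelihood estimator when the errors are normally distributed.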

No, Not in the Usual Form

In its usual form, logistic regression predicts a 0 or 1 outcome from the independent variables, and it is estimated by choosing the parameter values that maximize the log-likelihood function of the data. This is maximum-likelihood estimation, not OLS. However, there are related models that can be estimated using OLS.

Economics and Market Share

In economics, we often want to estimate models that predict the market share of various products, depending on the product characteristics and market-level features like advertising. The consumer behavior in these models looks a lot like a logistic regression, with a twist. For instance, in a car market, each consumer buys at most one car at a time, so the decision is a discrete choice among the available products (or buying nothing) rather than a single yes/no outcome.

An important 1994 paper by Yale Professor Steve Berry showed that such market share models can be estimated using OLS after a clever transformation of the observed market shares. This estimation technique is convenient and allows the use of instruments, which are essential in demand estimation. However, such models do not perfectly capture realistic consumer switching patterns and individual differences in opinion.
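In the simple logit demand model, this technique amounts to using log(s_j) - log(s_0) as the dependent variable, where s_j is the market share of product j and s_0 is the share of consumers who buy nothing (the "outside good"). That difference is linear in the product characteristics, so it can be regressed on them. The sketch below is a noiseless toy example under stated assumptions: a single made-up characteristic called `quality`, hypothetical coefficients, no price variable, and no unobserved product quality. With real data, price is typically endogenous and instruments would replace plain OLS.

```python
import math

def logit_shares(mean_utilities):
    """Market shares implied by a simple logit model with an outside good."""
    denom = 1.0 + sum(math.exp(d) for d in mean_utilities)
    return [math.exp(d) / denom for d in mean_utilities]

def invert_shares(shares):
    """Recover mean utilities as log(s_j) - log(s_0)."""
    outside_share = 1.0 - sum(shares)
    return [math.log(s) - math.log(outside_share) for s in shares]

def fit_ols(xs, ys):
    """Closed-form least squares for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

# Hypothetical market: four products that differ in one characteristic.
quality = [1.0, 2.0, 3.0, 4.0]
true_a, true_b = -3.0, 0.5

# Simulate the shares the model would generate, then estimate from shares alone.
shares = logit_shares([true_a + true_b * q for q in quality])
dependent = invert_shares(shares)
est_a, est_b = fit_ols(quality, dependent)
```

Because the simulated data contain no noise, the regression recovers the assumed coefficients up to floating-point error, which makes the linearity of the transformed shares easy to verify.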

Berry, along with Levinsohn and Pakes, developed a more general model in a later paper (1995) that addresses these issues. The estimation of this model is more complicated, but implementations are now available for Stata, a popular statistical software package.

Conclusion

While logistic regression is not an OLS regression, there are situations where OLS methods can be applied to logistic regression problems. These scenarios often involve market share models and demand estimation in economics. Understanding these nuances is crucial for real-world applications and advanced statistical modeling.