Difference between logit and probit models

  • What is the difference between Logit and Probit model?

    I'm more interested here in knowing when to use logistic regression, and when to use Probit.

    If there is any literature which defines it using R, that would be helpful as well.

    There is hardly any difference between the results of the two (see Paap & Franses 2000).

    I once had an extensive (bioassay) dataset where we could see probit fitted marginally better, but it made no difference for conclusions.

    The logit curve approaches 0 slowly, while the probit curve approaches it quickly.

    @Alyas Shah: and that explains why probit fitted (marginally) better with my data: above a certain dose mortality is 100%, and below some threshold mortality is 0%, so we don't see the slow approach of the logit!

    For real data, as opposed to data generated from either a logit or a probit model, a considered approach to the issue would be to run a model comparison. In my experience, the data rarely lean towards one of the two models.

    I've heard that the practical use of the logistic distribution originates from its similarity to the normal distribution combined with its much simpler cumulative distribution function. Indeed, the normal CDF contains an integral that must be evaluated numerically, which I guess was computationally costly back in the day.

    @kjetilbhalvorsen: Maybe a cardinal logistic with three or more levels would be a best fit?

  • vinux

    vinux Correct answer

    9 years ago

    They mainly differ in the link function.

    In Logit: $\Pr(Y=1 \mid X) = [1 + e^{-X'\beta}]^{-1} $

    In Probit: $\Pr(Y=1 \mid X) = \Phi(X'\beta)$, where $\Phi$ is the standard normal CDF

    Put another way, the logistic has slightly flatter (heavier) tails, i.e. the probit curve approaches the axes more quickly than the logit curve.
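    The tail behaviour can be checked numerically. The sketch below is not from the thread: it assumes the conventional rescaling constant 1.702 (so that both curves have a similar slope near the centre) and uses only the Python standard library. The helper names `logit_prob` and `probit_prob` are hypothetical.

```python
import math
from statistics import NormalDist


def logit_prob(eta: float) -> float:
    """Inverse logit link: 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))


def probit_prob(eta: float) -> float:
    """Inverse probit link: standard normal CDF."""
    return NormalDist().cdf(eta)


# Assumption: 1.702 is the usual constant for putting the logistic curve
# on roughly the same scale as the standard normal curve.
SCALE = 1.702

rows = []
for z in (0.0, -1.0, -2.0, -3.0, -4.0):
    p_probit = probit_prob(z)
    p_logit = logit_prob(SCALE * z)
    # Ratio > 1 means the logit curve is still further from 0 than probit.
    rows.append((z, p_probit, p_logit, p_logit / p_probit))

for z, p_probit, p_logit, ratio in rows:
    print(f"z={z:5.1f}  probit={p_probit:.6f}  logit={p_logit:.6f}  ratio={ratio:.2f}")
```

    Near the centre the two probabilities are almost identical; in the lower tail the ratio grows steadily, which is the "slow approach" of the logit mentioned in the comments.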

    Logit is easier to interpret than probit. Logistic regression models the log odds, so exponentiated coefficients are odds ratios (e.g. those who smoke >25 cigarettes a day have 6 times the odds of dying before 65 years of age). People usually start modelling with logit. You could use the likelihood value of each model to decide between logit and probit.
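    To make the log-odds interpretation concrete, here is a small sketch. The baseline probability of 0.10 is a made-up number for illustration, and the coefficient is simply set to $\log 6$ so that it corresponds to the odds ratio of 6 in the smoking example.

```python
import math


def prob_from_log_odds(log_odds: float) -> float:
    """Convert log odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical values, chosen only for illustration.
baseline_prob = 0.10          # assumed probability for non-smokers
beta_smoker = math.log(6.0)   # logit coefficient implying an odds ratio of 6

baseline_odds = baseline_prob / (1.0 - baseline_prob)
smoker_prob = prob_from_log_odds(math.log(baseline_odds) + beta_smoker)
smoker_odds = smoker_prob / (1.0 - smoker_prob)

odds_ratio = smoker_odds / baseline_odds   # equals exp(beta_smoker)
risk_ratio = smoker_prob / baseline_prob   # generally NOT equal to the odds ratio

print(f"smoker probability: {smoker_prob:.3f}")
print(f"odds ratio: {odds_ratio:.2f}, risk ratio: {risk_ratio:.2f}")
```

    With these assumed numbers the odds are multiplied by 6, but the probability only rises from 0.10 to 0.40, so "6 times the odds" should not be read as "6 times as likely".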

    Thanks for your answer, Vinux. But I also want to know when to use logit and when to use probit. I know logit is more popular than probit, and in the majority of cases we use logit regression. But there are some cases where probit models are more useful. Can you please tell me what those cases are, and how to distinguish them from the regular cases?

    When you are concerned with the tails of the curve, the choice between logit and probit sometimes matters. There is no exact rule for choosing between them. You can select a model by comparing the likelihood (or log-likelihood) or the AIC.
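    As a rough illustration of such a comparison, the sketch below fits both links to the same grouped dose-response data by Fisher scoring and reports the log-likelihood and AIC. Everything here is an assumption made for illustration: the data are invented, the function names are hypothetical, and the fitting routine is a bare-bones standard-library implementation rather than what a statistics package does internally.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

# Invented grouped data: (dose on a centred scale, subjects, deaths).
data = [(-3, 40, 1), (-2, 40, 4), (-1, 40, 10), (0, 40, 21),
        (1, 40, 30), (2, 40, 37), (3, 40, 39)]


def logit_link(eta):
    """Return (mean, d mean / d eta) for the logit link."""
    mu = 1.0 / (1.0 + math.exp(-eta))
    return mu, mu * (1.0 - mu)


def probit_link(eta):
    """Return (mean, d mean / d eta) for the probit link."""
    return _STD_NORMAL.cdf(eta), _STD_NORMAL.pdf(eta)


def _clip(p, eps=1e-12):
    return min(max(p, eps), 1.0 - eps)


def log_likelihood(beta, rows, inv_link):
    """Binomial log-likelihood, omitting the constant binomial coefficients."""
    b0, b1 = beta
    total = 0.0
    for x, n, k in rows:
        mu = _clip(inv_link(b0 + b1 * x)[0])
        total += k * math.log(mu) + (n - k) * math.log(1.0 - mu)
    return total


def fit(rows, inv_link, max_iter=100, tol=1e-10):
    """Fisher scoring for intercept + slope with an arbitrary inverse link."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        s0 = s1 = i00 = i01 = i11 = 0.0
        for x, n, k in rows:
            mu, dmu = inv_link(b0 + b1 * x)
            mu = _clip(mu)
            var = mu * (1.0 - mu)
            r = dmu * (k - n * mu) / var   # score contribution
            w = n * dmu * dmu / var        # Fisher information weight
            s0 += r
            s1 += r * x
            i00 += w
            i01 += w * x
            i11 += w * x * x
        det = i00 * i11 - i01 * i01
        d0 = (i11 * s0 - i01 * s1) / det
        d1 = (i00 * s1 - i01 * s0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


results = {}
for name, link in (("logit", logit_link), ("probit", probit_link)):
    beta = fit(data, link)
    ll = log_likelihood(beta, data, link)
    aic = 2 * len(beta) - 2 * ll   # AIC = 2k - 2 log L
    results[name] = (beta, ll, aic)
    print(f"{name:6s} beta={beta[0]:.3f}, {beta[1]:.3f}  logLik={ll:.3f}  AIC={aic:.3f}")
```

    Because both models have the same number of parameters, the AIC comparison here reduces to comparing the log-likelihoods: the model with the higher log-likelihood (lower AIC) fits better, and a small difference means the data do not clearly favour either link. In R, the same two fits would usually come from `glm()` with `family = binomial(link = "logit")` and `family = binomial(link = "probit")`, compared with `AIC()`.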

    Thanks for the advice! Can you elaborate on how to select between logit and probit? In particular: (1) How do I tell when I should be concerned with the tail part of the curve? (2) How do I select a model by looking at the likelihood, log-likelihood, or AIC? What specifically should I look at, and how should this influence my decision about which model to use?

    Well, could you give examples in which logit fails compared to probit? I cannot find the ones you have in mind.

    What is the relationship between $X$ and $X'$?

    Yes, but is there any theory about which one fits binomial data better?

    @flies Here $X'$ denotes the transpose of the matrix $X$.

Licensed under CC-BY-SA with attribution

Content dated before 6/26/2020 9:53 AM