What is the difference between a "nested" and a "non-nested" model?
In the literature on hierarchical/multilevel models I have often read about "nested models" and "non-nested models", but what does this mean? Could anyone maybe give me some examples or tell me about the mathematical implications of this phrasing?
You asked about the difference between nested and nonnested models. See:

- Cox, D. R.: Tests of separate families of hypotheses. Proceedings 4th Berkeley Symposium in Mathematical Statistics and Probability, 1, 105–123 (1961). University of California Press.
- Cox, D. R.: Further results on tests of separate families of hypotheses. Journal of the Royal Statistical Society B, 406–424 (1962).

Where the subject of nonnested or separate models was treated fo
Nested versus non-nested can mean a whole lot of things. You have nested designs versus crossed designs (see e.g. this explanation). You also have nested models in model comparison. Nested here means that all terms of the smaller model occur in the larger model. This is a necessary condition for using most model comparison tests, such as likelihood ratio tests.
In the context of multilevel models I think it's better to speak of nested and non-nested factors. The difference is in how the different factors are related to one another. In a nested design, the levels of one factor only make sense within the levels of another factor.
Say you want to measure the oxygen production of leaves. You sample a number of tree species, and on every tree you sample some leaves on the bottom, in the middle and on top of the tree. This is a nested design. The difference for leaves in a different position only makes sense within one tree species. So comparing bottom leaves, middle leaves and top leaves over all trees is senseless. Or said differently: leaf position should not be modelled as a main effect.
Non-nested (crossed) factors are a combination of two factors that are not related. Say you study patients, and are interested in the effects of age and gender. So you have a factor ageclass and a factor gender that are not related. You should model both age and gender as main effects, and you can take a look at the interaction if necessary.
The difference is not always that clear. If in my first example the tree species are closely related in form and physiology, you could consider leaf position also as a valid main effect. In many cases, the choice for a nested design versus a non-nested design is more a decision of the researcher than a true fact.
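One way to look at this in a data set is to check the labels: a factor is nested in another if each of its levels occurs together with only one level of the other factor. The sketch below is only an illustration of that check. The function name `is_nested`, the column names and the toy rows (individual trees labelled per species, and the patient example) are assumptions made up for this example, and, as noted above, the labels do not settle the modelling decision.

```python
from collections import defaultdict

def is_nested(rows, inner, outer):
    """Return True if every level of `inner` co-occurs with exactly one level of `outer`."""
    outer_levels_seen = defaultdict(set)
    for row in rows:
        outer_levels_seen[row[inner]].add(row[outer])
    return all(len(levels) == 1 for levels in outer_levels_seen.values())

# Hypothetical data: each individual tree belongs to exactly one species.
trees = [
    {"species": "oak", "tree_id": "oak_1"},
    {"species": "oak", "tree_id": "oak_2"},
    {"species": "beech", "tree_id": "beech_1"},
    {"species": "beech", "tree_id": "beech_2"},
]

# Hypothetical data: every age class occurs with every gender (crossed factors).
patients = [
    {"ageclass": "young", "gender": "F"},
    {"ageclass": "young", "gender": "M"},
    {"ageclass": "old", "gender": "F"},
    {"ageclass": "old", "gender": "M"},
]

trees_nested = is_nested(trees, "tree_id", "species")
patients_nested = is_nested(patients, "ageclass", "gender")
print("tree_id nested in species:", trees_nested)
print("ageclass nested in gender:", patients_nested)
```

Here the tree identifiers come out as nested in species, while ageclass and gender come out as crossed, matching the patient example.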
Nested vs non-nested models also come up in conjoint analysis, in connection with the independence of irrelevant alternatives (IIA) assumption. Consider the "red bus/blue bus problem". You have a population where 50% of people take a car to work and the other 50% take the red bus. What happens if you add a blue bus, which has the same specifications as the red bus, to the equation? A multinomial logit model will predict a 33% share for each of the three modes. We intuitively know this is not correct, as the red bus and the blue bus are more similar to one another than to the car and will thus take share from one another before taking share from the car. That is where a nesting structure comes in, which is typically specified as a lambda coefficient on the similar alternatives.
Ben-Akiva has put together a nice set of slides outlining the theory on this here. He begins talking about nested logit around slide 23.
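The red bus/blue bus numbers can be reproduced with a small sketch. It assumes all three alternatives have equal utility (set to 0 here) and uses the standard two-level nested logit formula with one lambda per nest. The function names, the nest labels and the chosen lambda values are illustrative assumptions, not taken from the slides.

```python
import math

def mnl_shares(utilities):
    """Multinomial logit shares: softmax over the utilities."""
    exps = {alt: math.exp(v) for alt, v in utilities.items()}
    total = sum(exps.values())
    return {alt: e / total for alt, e in exps.items()}

def nested_logit_shares(utilities, nests, lam):
    """Two-level nested logit.

    nests: nest name -> list of alternatives in that nest
    lam:   nest name -> lambda in (0, 1]; lambda = 1 reduces to multinomial logit
    """
    # Sum of exp(V / lambda) inside each nest; its log is the inclusive value.
    within = {
        nest: sum(math.exp(utilities[a] / lam[nest]) for a in alts)
        for nest, alts in nests.items()
    }
    nest_weight = {nest: math.exp(lam[nest] * math.log(within[nest])) for nest in nests}
    denom = sum(nest_weight.values())
    shares = {}
    for nest, alts in nests.items():
        p_nest = nest_weight[nest] / denom
        for a in alts:
            # P(alternative) = P(nest) * P(alternative | nest)
            shares[a] = p_nest * math.exp(utilities[a] / lam[nest]) / within[nest]
    return shares

utilities = {"car": 0.0, "red_bus": 0.0, "blue_bus": 0.0}
nests = {"auto": ["car"], "bus": ["red_bus", "blue_bus"]}

mnl = mnl_shares(utilities)
nl_independent = nested_logit_shares(utilities, nests, {"auto": 1.0, "bus": 1.0})
nl_similar = nested_logit_shares(utilities, nests, {"auto": 1.0, "bus": 0.01})

print("multinomial logit:", mnl)
print("nested, lambda = 1:", nl_independent)
print("nested, lambda = 0.01:", nl_similar)
```

With lambda equal to 1 the nested model gives the same one-third shares as the multinomial logit. As lambda for the bus nest approaches 0, the car share moves towards one half and the two buses split the remainder, which is the intuitive answer.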
One model is nested in another if you can always obtain the first model by constraining some of the parameters of the second model. For example, the linear model $ y = a x + c $ is nested within the second-degree polynomial $ y = ax + bx^2 + c $, because by setting $ b = 0 $, the second-degree polynomial becomes identical to the linear form. In other words, a line is a special case of a polynomial, and so the two are nested.
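As a minimal sketch of that constraint, the two model forms can be written as functions, and the quadratic with `b` fixed at 0 can be checked against the line. The parameter values and the grid of `x` values are arbitrary choices for illustration.

```python
import math

def linear(x, a, c):
    """Smaller model: y = a*x + c."""
    return a * x + c

def quadratic(x, a, b, c):
    """Larger model: y = a*x + b*x^2 + c."""
    return a * x + b * x**2 + c

xs = [-2.0, -0.5, 0.0, 1.0, 3.5]
a, c = 1.7, -0.4  # arbitrary illustrative parameter values

# Constraining b = 0 in the larger model recovers the smaller model exactly.
constrained_matches = all(
    math.isclose(quadratic(x, a, 0.0, c), linear(x, a, c)) for x in xs
)

# With b != 0 the larger model can describe curves the line cannot.
unconstrained_differs = any(
    not math.isclose(quadratic(x, a, 0.3, c), linear(x, a, c)) for x in xs
)

print("b = 0 reproduces the line:", constrained_matches)
print("b = 0.3 differs from the line:", unconstrained_differs)
```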
The main implication of two models being nested is that it is relatively easy to compare them statistically. Simply put, with nested models you can consider the more complex one as being constructed by adding something to a simpler "null model". To select the better of the two, therefore, you simply have to find out whether that added something explains a significant amount of additional variance in the data. This scenario is closely related to fitting the simple model first, removing its predicted variance from the data, and then fitting the additional component of the more complex model to the residuals from the first fit (with least squares estimation the two are exactly equivalent when the additional component is orthogonal to the predictors of the simple model).
Non-nested models may explain entirely different portions of variance in the data. A complex model may even explain less variance than a simple one, if the complex one doesn't include the "right stuff" that the simple one does have. So in that case it is a bit more difficult to predict what would happen under the null hypothesis that both models explain the data equally well.
More to the point, under the null hypothesis (and given certain moderate assumptions), the difference in goodness-of-fit between two nested models follows a known distribution, the shape of which depends only on the difference in degrees of freedom between the two models. This is not true for non-nested models.
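For the line-versus-quadratic example, this comparison can be sketched as a likelihood ratio test. The sketch assumes Gaussian errors, so the statistic is $ n \log(RSS_{small} / RSS_{large}) $, and it relies on the asymptotic chi-squared approximation with 1 degree of freedom (one extra parameter). The data are synthetic with a deterministic stand-in for noise, and the helper names are made up for this illustration.

```python
import math

def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def residual_sum_of_squares(columns, y):
    """Least-squares fit of y on the given predictor columns; returns the RSS."""
    p = len(columns)
    xtx = [[sum(u * v for u, v in zip(columns[i], columns[j])) for j in range(p)]
           for i in range(p)]
    xty = [sum(u * v for u, v in zip(columns[i], y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(beta[k] * columns[k][i] for k in range(p)) for i in range(len(y))]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

# Synthetic data with real curvature; the sine term is a deterministic stand-in for noise.
n = 20
x = [i / 2 for i in range(n)]
y = [1.0 + 2.0 * xi + 0.5 * xi**2 + 0.3 * math.sin(7 * i) for i, xi in enumerate(x)]

ones = [1.0] * n
x_squared = [xi**2 for xi in x]

rss_linear = residual_sum_of_squares([ones, x], y)                # smaller, nested model
rss_quadratic = residual_sum_of_squares([ones, x, x_squared], y)  # larger model

# Likelihood ratio statistic for Gaussian errors; clamp tiny negative rounding errors.
lr_stat = max(0.0, n * math.log(rss_linear / rss_quadratic))

# Upper tail of a chi-squared distribution with 1 degree of freedom.
p_value = math.erfc(math.sqrt(lr_stat / 2))

print("RSS linear:", rss_linear)
print("RSS quadratic:", rss_quadratic)
print("LR statistic:", lr_stat, "p-value:", p_value)
```

Because the data were generated with a quadratic term, the p-value is very small and the larger model is preferred. No analogous reference distribution is available off the shelf when neither model is a special case of the other.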
Two models are non-nested, or separate, if one model cannot be obtained as a limit of the other (or one model is not a particular case of the other).
Can you clarify what you mean by 'limit of the other'? A nested model can be seen as one having some restriction on the parameter space compared to another, but I'm not sure if this is what you intended to write.
See a simpler answer in this pdf. Essentially, a nested model is a model with fewer variables than a full model. One intention is to look for more parsimonious answers.
Unfortunately, this is a simpler answer only because it is describing a different type of "nested model" than the type the OP is asking about. The OP asks instead about nested models in the context of *hierarchical / multilevel models*. That is, this answer, while correct in its own terms, is incorrect in the context of this thread.