Assessment of the Predictive Power of Selected Sociologically Relevant Models
Statistical models are commonly used in the social sciences to describe and explain relationships between variables. Attention usually centers on the estimated coefficients, which are interpreted by their sign or tested for a significant difference from zero. Whether the model fits the data receives less interest, and whether it predicts well is typically not addressed at all. Instead, it is assumed that a model that fits the data well is also suitable for predicting unobserved outcomes.
This thesis examines whether the best statistical models from two time periods, 1975 to 1985 and 2010 to 2020, are suitable for making predictions. It analyzes models of fertility, marriage, divorce, and regional migration. The best models are selected by the citation metrics of the journals in which they were published, on the assumption that authors seek to publish in the best possible journal and that journals with better content are cited more often.
To evaluate predictive power, the estimation dataset and the model are reproduced according to the descriptions in the articles. Predictive power is then assessed by cross-validation, using metrics identified beforehand in a literature review.
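The cross-validation step can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the thesis: the data are synthetic, the model is a logistic regression with a single predictor, the number of folds is five, and the Brier score stands in for the metrics chosen in the literature review. The function names `fit_logistic`, `predict`, `brier_score`, and `k_fold_brier` are hypothetical.

```python
import math
import random


def fit_logistic(xs, ys, lr=0.5, epochs=500):
    """Fit p(y=1|x) = sigmoid(b0 + b1*x) by batch gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += p - y          # gradient of the log-loss w.r.t. b0
            g1 += (p - y) * x    # gradient of the log-loss w.r.t. b1
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


def predict(params, x):
    """Predicted probability of the event for one observation."""
    b0, b1 = params
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def brier_score(probs, ys):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, ys)) / len(ys)


def k_fold_brier(xs, ys, k=5, seed=0):
    """Average out-of-sample Brier score over k folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Estimate the model on the training folds only.
        params = fit_logistic([xs[i] for i in train], [ys[i] for i in train])
        # Score it on observations it has not seen.
        probs = [predict(params, xs[i]) for i in held_out]
        scores.append(brier_score(probs, [ys[i] for i in held_out]))
    return sum(scores) / k


# Synthetic data (assumption): one predictor with a genuine effect on the event.
rng = random.Random(42)
xs = [rng.gauss(0, 1) for _ in range(200)]
ys = [1 if rng.random() < 1 / (1 + math.exp(-1.5 * x)) else 0 for x in xs]

cv_brier = k_fold_brier(xs, ys)

# Reference point: always predicting the overall event rate.
base_rate = sum(ys) / len(ys)
baseline_brier = brier_score([base_rate] * len(ys), ys)

print(f"cross-validated Brier score: {cv_brier:.3f}")
print(f"base-rate Brier score:       {baseline_brier:.3f}")
```

A model with predictive value should score below the base-rate reference on held-out data. A model that merely fits its estimation sample may not.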
The study shows that the datasets used to estimate the models are difficult to reproduce because it is not always clear from the articles examined which version of a dataset was used, how variables were coded, and how missing values were handled. As a result, descriptive statistics and marginal distributions could not always be accurately reproduced. The same applies to the size of the model coefficients.
The cross-validation shows that most of the models are not suitable for prediction. In many cases, the predicted probabilities for cases in which an event, such as the birth of a child, occurred do not differ from those for cases in which it did not. Moreover, the predictive power of the models does not improve over time, either within or across topics. Only the methods used to estimate the models show technical progress. Finally, the results suggest that the code of the data analysis should be published to ensure the traceability and reproducibility of the results.
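The observation that predicted probabilities do not differ between occurrence and non-occurrence of an event can be made concrete with a small sketch. It computes the difference between the mean predicted probability among observed events and among non-events, a quantity known as the discrimination slope or Tjur's coefficient of discrimination. Whether the thesis uses this particular metric is not stated here, so the choice is an assumption, as are the function name `discrimination_slope` and the example values.

```python
def discrimination_slope(probs, outcomes):
    """Mean predicted probability for events minus that for non-events."""
    events = [p for p, y in zip(probs, outcomes) if y == 1]
    non_events = [p for p, y in zip(probs, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("both events and non-events are required")
    return sum(events) / len(events) - sum(non_events) / len(non_events)


# Hypothetical outcomes: 1 = event occurred (e.g. a birth), 0 = it did not.
outcomes = [1, 0, 1, 0]

# A model that assigns the same probability to everyone cannot separate the groups.
uninformative = [0.3, 0.3, 0.3, 0.3]

# A model whose predictions track the outcomes separates them clearly.
informative = [0.8, 0.2, 0.9, 0.1]

slope_uninformative = discrimination_slope(uninformative, outcomes)
slope_informative = discrimination_slope(informative, outcomes)

print(slope_uninformative)
print(slope_informative)
```

A value near zero means the model assigns similar probabilities to both groups. Larger positive values indicate that the predictions distinguish events from non-events.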