Species distribution modelling of stream macroinvertebrates at the catchment scale
This thesis investigated the usefulness of broad-scale variables as predictors for modelling the distributions of single macroinvertebrate species, the spatial transferability of the distribution models, and the application of field validation. Species distribution models (SDMs) allow the spatially explicit prediction of the presence and absence of species based on environmental predictors that reflect the species’ habitat requirements. Thus, SDMs constitute a useful tool to predict the distribution of occurrences or abundances across a landscape, sometimes requiring extrapolation in space and time. SDMs therefore provide an appropriate way to gain information on species occurrences within a catchment without the need for catchment-wide sampling data. Hence, the use of SDMs is a cost- and time-effective alternative to catchment-wide sampling. Riverine systems place specific demands on SDMs, as they are heterogeneous, directional, highly structured networks that are connected laterally, longitudinally and vertically. In the context of the Water Framework Directive (WFD, European Commission 2000), SDMs may provide useful tools to determine the actual distribution of species, to predict potential donor populations for the recolonization of a restored section, or to estimate potential changes in ecological status due to restoration. Local environmental variables are mostly monitored at single river sections, but the construction of SDMs and their application as a prediction tool require variables that are available continuously along entire river networks. This points to a dilemma in data availability. In this study, broad-scale variables were used as surrogate predictors of macroinvertebrate distribution to train and validate SDMs in a mountainous river catchment. The presence/absence of macroinvertebrate species was extensively surveyed in two field campaigns in two consecutive years (2010 and 2011).
Besides gaining detailed insights into the species’ actual distribution and determining their ecological requirements (chapter 3), the entire data set of 225 sampling sites was split spatially (Lenne watershed and upper Ruhr watershed) to develop SDMs based on broad-scale predictors and to gain first insights into the transferability of SDMs between adjacent watersheds (chapter 4). In a next step, the data set was split temporally (2010 and 2011) to validate the models using field data of equal sampling design and to compare the results with internal cross-validation and independent survey data (chapter 5). Overall, reliable performance and predictive power were found for models of Dinocras cephalotes in both watersheds. Models of several other species performed fairly well in the river Ruhr (Leuctra geniculata, Silo piceus, Siphlonurus lacustris) but not in the Lenne system, or vice versa (Hydropsyche instabilis). Broad-scale SDMs included predictors of physical habitat quality and of riparian land use to a similar extent. For five out of eleven species, the SDMs including fine-scale predictors (e.g. physico-chemistry, microhabitat distribution) outperformed those models using broad-scale predictors only. I suggest that species distributed specifically in upstream reaches respond explicitly to fine-scale variables because their occurrences depend more strongly on local conditions. Model transferability from one watershed to another was low (transferability index < 0.60), thus revealing SDMs to be not only species-specific but also variable across adjacent watersheds. Transferability is suggested to be limited not only by actual environmental differences between both watersheds, but also by legacy land use effects that may continue to affect the current distribution of macroinvertebrates. Overall, SDMs showed acceptable predictive performance for the stonefly Dinocras cephalotes and the caddisflies Silo piceus and Silo pallipes.
The models’ performance was neither positively nor linearly correlated with predictive accuracy. The comparison of the three validation approaches revealed that cross-validation over-estimated the discriminatory power of the models relative to field validation and independent validation. SDM predictive performance (AUC and TSS) consistently decreased from cross-validation to field validation to independent validation. This highlights the intermediate position of field validation between overly optimistic cross-validation and independent validation, which tends to underestimate performance. In addition, species prevalence (ranging from 8 to 50%) affected the models’ predictive performance: SDMs of less prevalent species tend to over-predict species absences rather than presences. These findings show that an SDM’s goodness-of-fit is decoupled from its predictive performance. The comparison of validation approaches suggests the use of new field data (instead of training data or survey data based on differing sampling methods), which provide a more reliable basis for SDM quality assessment and a benchmark for comparisons with other methods, such as cross-validation. The results of this thesis contribute to a better understanding of the habitat preferences and distribution of macroinvertebrate species and of the requirements and limitations of SDMs. Macroinvertebrate SDMs based on broad-scale predictors yielded moderate to weak species-environment relationships. This implies that major factors controlling riverine species distribution remained undetected. In general, this study confirms the lack of generality of species distributions, which results in poor transferability of models between adjacent areas, and points to the need for further research in this field.
The study supports other authors who caution against possible bias in (over-)estimates of model predictions resulting from cross-validation, because the models are optimized to deal with the ‘noise’ in the data and might consequently lose generality outside the original data. Field validation constitutes an alternative that avoids splitting small data sets or using independent survey data that may have been collected for different research purposes. Furthermore, low prevalence needs to be considered when defining SDM quality, as rarity is usually a property of specialist and/or endangered species, which are typically the focus of ecology and conservation management. Thus, assessing the predictive performance of SDMs of rare freshwater species poses problems in presence-absence modelling that developers and users of SDMs should bear in mind.
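
The two predictive performance measures named in this summary, AUC and TSS, can be illustrated with a minimal Python sketch. It is not the code used in the thesis. It uses the standard definitions (TSS = sensitivity + specificity - 1; AUC as the probability that a randomly chosen presence site receives a higher predicted score than a randomly chosen absence site). The function names, the 0.5 classification threshold and the example values are hypothetical and chosen for illustration only.

```python
def tss(observed, predicted):
    """True skill statistic for binary presence (1) / absence (0) data."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)  # correctly predicted presences
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)  # missed presences
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)  # correctly predicted absences
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)  # falsely predicted presences
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


def auc(observed, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    presences = [s for o, s in zip(observed, scores) if o == 1]
    absences = [s for o, s in zip(observed, scores) if o == 0]
    # Each presence/absence pair counts 1 if the presence site scores
    # higher, 0.5 for a tie and 0 otherwise.
    wins = sum(
        1.0 if p > a else 0.5 if p == a else 0.0
        for p in presences
        for a in absences
    )
    return wins / (len(presences) * len(absences))


# Hypothetical validation data: observed presence/absence at eight sites
# and the occurrence probabilities predicted by an SDM (illustrative values).
observed = [1, 1, 0, 0, 0, 1, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.2, 0.3, 0.1, 0.5]

# Assumed threshold of 0.5 to convert probabilities into presence/absence.
predicted = [1 if s >= 0.5 else 0 for s in scores]

print("TSS:", round(tss(observed, predicted), 3))
print("AUC:", round(auc(observed, scores), 3))
```

AUC is computed from the continuous scores, whereas TSS depends on the threshold used to convert scores into presences and absences. The choice of threshold therefore matters when TSS values of species with different prevalence are compared.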