Evaluating the effectiveness of content-oriented XML retrieval

The INEX initiative is a collaborative effort to build an infrastructure for evaluating the effectiveness of content-oriented XML retrieval. In this paper, we show that evaluation methods developed for standard test collections must be modified to deal with the retrieval of structured documents. Specifically, the size and overlap of document components must be taken into account. For this purpose, we use coverage in addition to relevance as an evaluation criterion, with multi-valued scales for both. We develop a new quality metric based on the notion of concept spaces and compare its results with those of the metric used in the first round of INEX in 2002.


