SeqFeatR for the Discovery of Feature-Sequence Associations
Specific selection pressures often lead to specifically mutated genomes. The open source software SeqFeatR has been developed to identify associations between mutation patterns in biological sequences and specific selection pressures (“features”). For instance, SeqFeatR has been used to discover in viral protein sequences new T cell epitopes for hosts of given HLA types. SeqFeatR supports frequentist and Bayesian methods for the discovery of statistical sequence-feature associations. Moreover, it offers novel ways to visualize results of the statistical analyses and to relate them to further properties. In this article we demonstrate various functions of SeqFeatR with real data. The most frequently used set of functions is also provided by a web server. SeqFeatR is implemented as R package and freely available from the R archive CRAN (http://cran.r-project.org/web/packages/SeqFeatR/index.html). The package includes a tutorial vignette. The software is distributed under the GNU General Public License (version 3 or later). The web server URL is https://seqfeatr.zmb.uni-due.de.