Machine learning analysis of the T cell receptor repertoire identifies sequence features of self-reactivity

Johannes Textor*, Franka Buytenhuijs, Dakota Rogers, Ève Mallet Gauthier, Shabaz Sultan, Inge M.N. Wortel, Kathrin Kalies, Anke Fähnrich, René Pagel, Heather J. Melichar, Jürgen Westermann, Judith N. Mandl*

*Corresponding author for this work


The T cell receptor (TCR) determines specificity and affinity for both foreign and self-peptides presented by the major histocompatibility complex (MHC). Although the strength of TCR interactions with self-pMHC impacts T cell function, it has been challenging to identify TCR sequence features that predict T cell fate. To discern patterns distinguishing TCRs from naive CD4+ T cells with low versus high self-reactivity, we used data from 42 mice to train a machine learning (ML) algorithm that identifies population-level differences between TCRβ sequence sets. This approach revealed that weakly self-reactive T cell populations were enriched for longer CDR3β regions and acidic amino acids. We tested our ML predictions of self-reactivity using retrogenic mice with fixed TCRβ sequences. Extrapolating our analyses to independent datasets, we predicted high self-reactivity for regulatory T cells and slightly reduced self-reactivity for T cells responding to chronic infections. Our analyses suggest a potential trade-off between TCR repertoire diversity and self-reactivity. A record of this paper's transparent peer review process is included in the supplemental information.

Journal: Cell Systems
Pages (from - to): 1059-1073.e5
Publication status: Published - 20.12.2023

Strategic research areas and centers

  • Research focus: Infection and Inflammation - Center for Infection and Inflammation Research Lübeck (ZIEL)


  • 205-33 Anatomy
  • 204-05 Immunology