High-throughput single cell analysis of B-cell receptor usage among autoantigen specific
plasma cells in celiac disease

Bishnudeo Roy, Ralf S. Neumann†, Omri Snir, Rasmus Iversen, Knut E.A. Lundin†,‡, and
Ludvig M. Sollid,†

Centre for Immune Regulation and Department of Immunology, University of Oslo and Oslo
University Hospital, 0372, Oslo, Norway
†KG Jebsen Coeliac Disease Research Centre and Department of Immunology, University of
Oslo and Oslo University Hospital, 0372, Oslo, Norway.
‡KG Jebsen Coeliac Disease Research Centre and Department of Gastroenterology, Oslo
University Hospital-Rikshospitalet, Oslo, Norway

Corresponding author – Ludvig M. Sollid, Department of Immunology, Oslo University
Hospital-Rikshospitalet, 0372 Oslo, Norway. Phone: +47 230 73 811; Fax: +47 230 73 510;
Email: l.m.sollid@medisin.uio.no

Running title – TG2-reactive autoantibody repertoire

Abstract
Characterization of antigen-specific BCR repertoires is essential for understanding disease
mechanisms involving humoral immunity. This is optimally done by interrogation of paired
heavy and light chain variable region (VH and VL) sequences of individual and antigen-specific
B cells. By applying single cell high-throughput sequencing on gut lesion plasma
cells (PCs), we have analyzed the transglutaminase 2 (TG2)-specific VH:VL autoantibody
repertoire of celiac disease (CD) patients. Autoantibodies against TG2 are a hallmark of CD,
and anti-TG2 IgA-producing gut PCs accumulate in patients upon gluten ingestion.
Altogether, we analyzed paired VH and VL sequences of 1482 TG2-specific and 1421 non-TG2-specific
gut PCs from 10 CD patients. Among TG2-specific PCs, we observed a striking
bias in IGHV and IGKV/IGLV gene usage as well as pairing preferences with a particular
presence of the IGHV5-51:IGKV1-5 pair. Selective and biased VH:VL pairing was
particularly evident among expanded clones. In general, TG2-specific PCs had lower numbers
of mutations both in VH and VL genes than non-TG2-specific PCs. TG2-specific PCs using
IGHV5-51 had particularly few mutations. Importantly, VL segments paired with IGHV5-51
displayed proportionally low mutation numbers suggesting that the low mutation rate among
IGHV5-51 PCs is dictated by the BCR specificity. Finally, we observed selective amino acid
changes in VH and VL and striking CDR3 length and J segment selection among TG2-specific
IGHV5-51:IGKV1-5 pairs. Hence, this study reveals features of a disease- and
antigen-specific autoantibody repertoire with preferred VH:VL usage and pairings, limited
mutations, clonal dominance and selection of particular CDR3 sequences.

Introduction
Autoimmune diseases are typically characterized by the presence of specific autoantibodies.
Antibodies are soluble Igs consisting of heavy and light chains that are produced by plasma
cells (PCs) as terminally differentiated B cells. Cell-surface Ig together with accessory
molecules makes up the BCR that allows B cells to specifically recognize antigens. IgG-producing
PCs are devoid of cell-surface Ig, whereas IgA- and IgM-producing PCs retain a
functional BCR (1, 2). Recognition of autoantigen by BCRs and antibodies is considered a key
event in adaptive immune responses that can lead to the development of autoimmune disease.
Upon recognition of antigen, and typically with the help of T cells, B cells proliferate and
undergo affinity maturation by accumulation of somatic mutations in Ig genes. B-cell
responses to foreign or self-antigens are characterized by the activation of multiple reactive
B-cell clones. During the response, there is selection of B cells that are particularly fit to
recognize antigen. Interrogation of an autoantibody response is ideally done by large-scale
characterization of the BCR repertoire of antigen-specific cells at a single cell level. This is
now feasible with the arrival of high throughput sequencing (HTS) technologies. Recently,
analysis of thousands of naïve and antigen-experienced single B cells in healthy subjects was
reported (3), but so far no studies have been done with knowledge of the antigen involved.
Celiac disease (CD) is an ideal disease in which to pioneer this type of approach in relation to
autoimmunity. CD is an autoimmune disorder driven by exposure to dietary gluten proteins
that is characterized by highly disease-specific antibodies reactive with the enzyme
transglutaminase 2 (TG2) and selective killing of enterocytes mediated by immune cells (4).
Presence of serum IgA anti-TG2 antibodies at high titer is now considered diagnostic in
children, and the IgA anti-TG2 antibodies are among the autoantibodies with the highest
specificity and sensitivity for any autoimmune disease (5). In the celiac lesion of the proximal
small intestine there is accumulation of TG2-specific IgA PCs, which on average account for
10% of the local PCs (6). These cells express cell surface Ig, allowing isolation of antigen-specific
cells from gut biopsies of individual patients by use of labeled TG2. In previous
studies, we reported generation of a panel of 63 anti-TG2 mAbs (6) as well as bulk HTS of
IGHV genes (7) from TG2-specific PCs. In this study, we have developed a high-throughput
protocol for sequencing of heavy and light chain variable regions (VH and VL) of single
antigen-specific cells, and used it to characterize the anti-TG2 IgA response in CD patients.
The study gives detailed insight into important aspects of an autoimmune B-cell response,
including the nature of clonal expansions and the restricted usage and strong pairing
preference of particular VH and VL gene segments.

Materials and methods

Subjects and cells
Biopsies used for preparing the single cell suspension were collected from a total of 10 CD
patients: 8 untreated patients consuming a normal diet and 2 treated patients consuming a gluten-free diet.
Diagnosis of all the subjects was done according to the guidelines of the British Society of
Gastroenterology (8). Clinicopathological details of all the subjects are given in Table I. TG2-specific
serum IgA Ab levels were determined using the Celikey Tissue Transglutaminase IgA kit
(Thermo Fisher Scientific). Prior to collecting biopsies, informed consent was obtained from
all patients and the study had been approved by the Regional Ethics Committee of South-Eastern
Norway (REK 2010/2720). Duodenal biopsies obtained by endoscopy were collected
in ice-cold RPMI 1640 (Sigma-Aldrich). To obtain lamina propria lymphocytes, tissues were
digested with collagenase (1 mg/ml; Sigma-Aldrich) in RPMI with 3% FCS at 37°C for 1
hour under constant rotation. Tissue debris was removed by passing the digest through a 40
µm cell strainer, followed by centrifugation and washing 2 times with RPMI. The single-cell
suspension was cryopreserved in RPMI containing 50% FCS and 10% DMSO until used.

Sorting of single plasma cells
Recombinant human TG2 produced in Sf9 insect cells with an N-terminal BirA biotinylation
site was used for sorting of TG2-specific PCs. Biotinylated TG2 and APC-conjugated
streptamers (IBA) were mixed at a 4:1 molar ratio in PBS supplemented with 3% FCS and
incubated for 1 hour on ice to generate TG2-multimers. For staining, single-cell suspensions
from gut biopsies were incubated with TG2-streptamer-APC conjugate, α-IgA-FITC
(Southern Biotech), α-CD3-BV570 (BioLegend), α-CD14-BV570 (BioLegend), α-CD19-PB
(BioLegend), α-CD27-PE-Cy7 (eBioscience) and α-Ig-PerCPCy5.5 (BioLegend) for 45
minutes on ice. TG2-specific (CD3-CD14-CD27+CD19+/-IgA+TG2+) and non-TG2-specific
(CD3-CD14-CD27+CD19+/-IgA+TG2-) single PCs were sorted using a FACS ARIA II (BD)
into 96-well plates. Each well contained 5 µl of catch buffer composed of RNase-free H2O, 1x first-strand
buffer (Invitrogen), 800 nM STRT-T30 primer (9) (alone or in combination with 800 nM
CACH1 primer (10)), 800 nM barcoded (well-specific) TSO primers (9), 5 mM DTT
(Invitrogen), 0.8 U/µl RNAsin (Promega) and 0.02% Tween 20. Immediately
after sorting, plates were sealed using plate sealers, centrifuged at 2500 rpm for one minute
and stored at -70°C until cDNA synthesis.

Single cell RT-PCR and cDNA purification
Prior to cDNA synthesis, plates containing single cells were incubated at 72°C for 3 minutes,
centrifuged and immediately placed on ice. To each well, 5 µl of RT mix was added,
containing RNase-free H2O, 1x first-strand buffer, 2 mM dNTPs (Thermo Scientific), 1.6 M
betaine (Sigma-Aldrich), 12 mM MgCl2 (Sigma-Aldrich), 0.8 U/µl RNAsin and 4 U/µl
SuperScript II Reverse Transcriptase (Invitrogen). For cDNA synthesis, plates were incubated
at 42°C for 70 min and then at 70°C for 10 min. After synthesis, cDNA was purified using Agencourt
RNAClean XP beads (Beckman Coulter) and stored at -20°C until further use.

PCR sequencing library preparation
VH, Vκ and Vλ genes were amplified from cDNA in two rounds of PCR. For the 1st round, a
PCR reaction mix containing H2O, 1X KAPA HiFi HotStart ReadyMix (Kapa Biosystems),
0.25 µM of each primer (STRT For 2, CaCH1-2 (10), IgK GSP1 (11), IgLC Rev (12);
Supplemental Table I), 0.05 U/µl USER Enzyme (New England Biolabs) and cDNA in a total
volume of 20 µl was subjected to the following conditions: 37°C for 15 min (to facilitate the
action of USER enzyme), 95°C for 2 min, 1x (98°C for 15 sec, 70°C for 30 sec, 72°C for 40
sec), 1x (98°C for 15 sec, 67°C for 30 sec, 72°C for 40 sec), 23x (98°C for 15 sec, 60°C for
30 sec, 72°C for 40 sec), and 72°C for 5 min. For the 2nd round, a reaction mixture containing
H2O, 1X KAPA HiFi HotStart ReadyMix, 0.25 µM forward primer (R2-STRT), 0.25 µM of
barcoded primers corresponding to IGHJ, IGKC and IGLC (Supplemental Table I) and 1st
round PCR product was subjected to the following conditions: 95°C for 2 min, 1x (98°C for 15
sec, 70°C for 30 sec, 72°C for 40 sec), 1x (98°C for 15 sec, 67°C for 30 sec, 72°C for 40 sec),
13x (98°C for 15 sec, 60°C for 30 sec, 72°C for 40 sec), and 72°C for 5 min. Illumina MiSeq
adapter sequences were introduced at both ends of the 2nd round PCR products in a 3rd round of
PCR, using the Qiagen Multiplex PCR Kit (Qiagen) under the following conditions: 95°C for 15 min, 10x
(95°C for 30 sec, 60°C for 45 sec, 72°C for 90 sec) and 72°C for 10 min. Final amplicon
libraries were first concentrated using Agencourt AMPure XP beads (Beckman Coulter), then
extracted from agarose gel using the QIAquick Gel Extraction Kit (Qiagen) and further purified using
the QIAquick PCR Purification Kit (Qiagen). Paired-end sequencing of 300 base pairs
was then performed using Illumina MiSeq at the Norwegian Sequencing Centre, Oslo, Norway
(http://www.sequencing.uio.no).

Processing of raw sequencing data
Quality evaluation of raw reads was first done using FastQC
(http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and raw data were further
processed using pRESTO (repertoire sequencing toolkit) (13). In short, reads with mean
Phred quality <30 were removed, followed by removal of reads that did not contain valid
primers. Reads were then assembled to a full sequence. Sequences were then marked
according to gene-specific primers (IGH/IGK/IGL), well and plate indices, and identical
sequences belonging to the same well and plate were collapsed to remove duplicate sequences.
The number of sequences collapsing as one sequence was denoted as “dupcount”. Only
sequences with dupcount ≥2 were used for further analysis. The collapsed VH and VL
sequences were annotated using IMGT HighV-Quest (14). Further data analysis and
preparation were done using an in-house developed sequence analysis pipeline, the Immune
Receptor Information System (IRIS). Only sequences with dupcount ≥5 were
included for the data preparation. In cases where sequences for more than one VH:VL
pair appeared for a cell, the VH/VL sequence with a dupcount ≥5-fold higher than the others was selected for
analysis; otherwise, the corresponding well was assumed to have contained more than one cell and was
discarded from the analysis. If a well contained both IGKV and IGLV sequences then the
corresponding cell was assumed to express both chains. Highly similar (<5 nucleotide
difference) IGHV/IGKV/IGLV gene segments belonging to the same subgroup were
commonly named as IGH/K/L-VX-p (e.g. IGKV1-39-p for IGKV1-39 and IGKV1D-39) to
avoid ambiguous IGHV/IGKV/IGLV gene assignments by IMGT. In case ambiguity with
regard to IGHV/IGKV/IGLV gene and IGHJ/IGKJ/IGLJ gene assignments still remained,
such sequences were not included in the calculation of the respective gene usage frequency.
Circos plots depicting the VH:VL pairing frequency were generated using Circos Table
Viewer (15). Plots showing amino acid changes were generated using Seq2Logo (16).
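
The dupcount threshold and the dominance rule described above can be illustrated with a minimal sketch. IRIS is an in-house pipeline whose code is not shown here, so the function name `select_chain`, the input layout and the return convention are hypothetical. Only the two thresholds (dupcount of at least 5, and at least 5-fold dominance over competing sequences) come from the text. The sketch also assumes that the fold comparison is made among sequences that already passed the dupcount threshold.

```python
from typing import List, Optional, Tuple

MIN_DUPCOUNT = 5  # threshold stated in the text
MIN_FOLD = 5      # dominance factor stated in the text


def select_chain(candidates: List[Tuple[str, int]]) -> Optional[str]:
    """Pick one sequence for one chain type in one well (hypothetical helper).

    candidates: (sequence_id, dupcount) pairs from the same well and chain type.
    Returns the selected sequence_id, or None when no sequence passes the
    dupcount threshold or when no sequence clearly dominates (the well is then
    treated as having held more than one cell).
    """
    kept = sorted(
        (c for c in candidates if c[1] >= MIN_DUPCOUNT),
        key=lambda c: c[1],
        reverse=True,
    )
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0][0]
    top, runner_up = kept[0], kept[1]
    # Assumption: dominance is judged against the strongest competitor.
    if top[1] >= MIN_FOLD * runner_up[1]:
        return top[0]
    return None
```

In this sketch a well with dupcounts 50 and 6 keeps the first sequence, whereas a well with dupcounts 20 and 6 is rejected.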

Clonal assignment between different subpopulations
Functional VH and VL sequences from sorted TG2-specific and non-TG2-specific PCs were
assigned into clonal groups using Change-O, a repertoire clonal assignment toolkit (17).
Clones were defined as having the same IGHV/IGKV/IGLV and IGHJ/IGKJ/IGLJ assignments and the same
junction length, using the hs5f substitution model and a distance threshold of 0.15. First, clonal
assignment was done separately for VH and VL chain sequences. Then a VL sequence was
assumed to be part of a clonal group only if the corresponding VH sequence also belonged to
the same group.
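
The grouping criteria above can be sketched as follows. This is not the Change-O implementation: the hs5f substitution model is replaced here by a plain length-normalized Hamming distance, and single-linkage clustering is assumed, so the numbers it produces will differ from those of the actual toolkit. The record layout (`id`, `v`, `j`, `junction`) and both function names are hypothetical, and `paired_clones` encodes one reading of the VH/VL rule, namely that two cells share a clone only when both their VH and VL labels agree.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def hamming_fraction(a: str, b: str) -> float:
    """Fraction of mismatching positions between two equal-length junctions."""
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


def assign_clones(records: List[dict], threshold: float = 0.15) -> Dict[str, int]:
    """Map each record id to a clone number for one chain type."""
    # Sequences can share a clone only if V gene, J gene and junction length agree.
    buckets = defaultdict(list)
    for rec in records:
        buckets[(rec["v"], rec["j"], len(rec["junction"]))].append(rec)

    clone_of: Dict[str, int] = {}
    next_clone = 0
    for group in buckets.values():
        parent = list(range(len(group)))  # union-find forest for single linkage

        def find(i: int) -> int:
            while parent[i] != i:
                parent[i] = parent[parent[i]]
                i = parent[i]
            return i

        for i in range(len(group)):
            for j in range(i + 1, len(group)):
                d = hamming_fraction(group[i]["junction"], group[j]["junction"])
                if d <= threshold:
                    parent[find(i)] = find(j)

        labels: Dict[int, int] = {}
        for i, rec in enumerate(group):
            root = find(i)
            if root not in labels:
                labels[root] = next_clone
                next_clone += 1
            clone_of[rec["id"]] = labels[root]
    return clone_of


def paired_clones(heavy: Dict[str, int], light: Dict[str, int]) -> Dict[str, Tuple[int, int]]:
    """Label each cell by its (VH clone, VL clone) pair; cells match only if both agree."""
    return {cell: (heavy[cell], light[cell]) for cell in heavy if cell in light}
```

Here the record `id` is taken to be a cell identifier shared by the VH and VL records of the same well, which is what allows the two assignments to be combined.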

Generation of lineage trees
Lineage trees for different clonal groups were generated using the Alakazam tool (17) after
assigning a full-length IMGT-numbered germline sequence to each clone using the IMGT
reference database of IGHV/IGKV/IGLV and IGHJ/IGKJ/IGLJ gene sequences with a
masked (replaced with Ns) D region. Maximum parsimony lineages were inferred using the
dnapars application of PHYLIP (18) using the germline sequence as the outgroup. This was
followed by recursively replacing inferred ancestors in each tree with descendants having a
Hamming distance of zero from their inferred parent. Branch lengths were assigned as the
Hamming distance between sequences, i.e. the number of unambiguous mutation events.
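
As an illustration of the branch length definition above, the following minimal sketch counts mismatches between two aligned sequences while skipping ambiguous positions. The function name `branch_length` is hypothetical, and treating `N`, `.` and `-` as ambiguous is an assumption made for this example rather than a description of what Alakazam does internally.

```python
def branch_length(parent: str, child: str, ambiguous: str = "N.-") -> int:
    """Hamming distance between aligned sequences, ignoring ambiguous positions."""
    if len(parent) != len(child):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for p, c in zip(parent, child)
        if p != c and p not in ambiguous and c not in ambiguous
    )
```

With a masked D region, positions carrying `N` in the germline therefore contribute nothing to the branch leading from the germline root.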

Generation of monoclonal antibodies
TG2-specific mAbs were expressed as human IgG1 in 293-F cells and purified on Protein G
as previously described (19). To obtain the plasmids for expression of lambda-containing
mAbs, synthetic DNA (GenScript) encoding the VH and VL regions of selected plasma cells
was subcloned into immunoglobulin expression vectors between the AgeI and SalI (VH) or
the AgeI and XhoI (VL) sites (20). Reversion of IGKV1-5 residue 106 to germline
configuration was done by overlap-extension PCR using anti-TG2 mAb 679-14-E06 as
template (6). The resulting kappa chain sequence was cloned into the expression vector
between the AgeI and HindIII sites.

ELISAs
Recombinant human TG2 produced in insect cells (Phadia) was coated overnight in TBS at 3
µg/ml. For comparison of wild-type and mutant versions of TG2, proteins were produced in
Escherichia coli as previously described (19, 21). mAbs were added in various concentrations
in TBS containing 0.1% (v/v) Tween 20 (TBST) followed by incubation for 1 h at 37°C. After
washing with TBST, bound mAbs were detected using alkaline phosphatase (AP)-conjugated
rabbit anti-human IgG (Abcam). For competitive ELISAs, coated TG2 was incubated with
various concentrations of lambda-containing IgG1 mAbs in 100 µl 3% (w/v) BSA/TBST for
30 min at 37°C. Without removing the IgG1 mAbs, different kappa-containing anti-TG2 IgA1
mAbs were added in 10 µl buffer at 0.2 µg/ml followed by continued incubation for 1 h. After
washing with TBST, bound IgA1 was detected using AP-conjugated goat anti-human IgA
(Sigma). Measured OD values were compared to the signals obtained in the absence of
competing IgG1 mAbs.

Results

Paired analysis of TG2-specific VH and VL repertoire from single cells
Difficulties in isolating antigen-specific PCs have impeded large-scale analysis of
authentically paired VH and VL genes. We took advantage of CD and the disease-associated
expansion of TG2-specific PCs to generate a large-scale data set of paired VH:VL genes
using a single-cell HTS approach. Duodenal biopsies from 10 CD patients, most of whom
were untreated, were used for single-cell flow cytometry sorting (Table I). cDNA was synthesized
using oligo-dT or a combination of oligo-dT and IgA-specific primers and unbiased template
switching (9, 22). VH and VL PCR amplicons were generated and sequenced on the Illumina
MiSeq platform. The gene segment usage of each cell could be assigned by use of cell- and
sample-specific barcodes. For VL sequencing, the success rate was around 85% of the input
cells. The inclusion of IgA-specific primers for cDNA synthesis improved VH sequencing
efficiency from 40% with oligo-dT primers alone to 75%, resulting in paired sequence
information for around 60% of the input cells with this protocol. Only cells with paired VH
and VL sequence information were considered for further analysis. A total of 1482 TG2-specific
and 1421 non-TG2-specific PCs were analyzed (Table II). Consistent with earlier
findings (7), TG2-specific PC populations contained many clonally related sequences and
showed less heterogeneity than non-TG2-specific PCs, indicating a restricted repertoire
(Table II).

Certain L chain V gene segments are overrepresented among TG2-specific PCs
Similar to biases observed in BCR V gene usage in several autoimmune conditions (23, 24),
VH gene usage among TG2-specific PCs from CD patients was previously found to show a
strong bias towards a few IGHV gene segments (6, 7). In the current study, the analysis was
extended to TG2-specific VL sequences. Usage frequency was calculated by taking only
unique clonotypes into account to exclude the effect of clonal expansion. In a variety of
autoantibodies, including rheumatoid factor and antibodies with specificity for dsDNA,
phospholipids, histone A2 and laminin, λ L chain was more frequent than κ L chain (23-26).
In contrast, TG2-specific PCs expressed κ at higher percentages than non-TG2-specific PCs
(Fig. 1A, 1B). In a previous study, done with a small number of sequences, only 5% of TG2-specific
PCs were estimated to use λ light chains (6), whereas here, 20% of them were λ-expressing.
The reason for this difference could reflect patient-to-patient variation or the use
of different antigen preparations for staining of PCs in the two studies. To test if the λ-expressing
PCs are truly TG2-reactive and to verify the specificity of our staining, we selected antibody
sequences of two such PCs and expressed them recombinantly as human IgG1. Both mAbs
were TG2-specific and found to target epitopes located in the N-terminal or core domain of
the enzyme (Supplemental Fig. 1A, 1B, 1C), which is in agreement with the location of
epitopes previously assigned to anti-TG2 mAbs (19, 21). One of the mAbs used the IGHV3-48
gene segment that was previously found to be overrepresented among TG2-specific mAbs
together with κ light chains (6). Most of the mAbs using IGHV3-48 were found to target the
same epitope located around residue Arg19 (21, 27). Importantly, the λ-containing IGHV3-48
mAb was not reactive with a TG2 R19S mutant (Supplemental Fig. 1B), suggesting that the
same epitopes can be targeted by κ- and λ-using antibodies.

The VL gene segments preferentially used by TG2-specific PCs belonged primarily to the
IGKV1 gene family. In particular, IGKV1-39 (18.4%) and IGKV1-5 (14.4%) were frequently
used (Fig. 1C, 1D). No noticeable difference was observed in the frequency of JH or JL gene
usage between TG2-specific and non-TG2-specific PC populations (Fig. 1E).
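
Counting each clonotype once, as described above for the usage frequencies, can be sketched as follows. The function name `usage_by_clonotype` and the input layout (one `(clone_id, gene)` pair per cell) are assumptions made for illustration and are not taken from the IRIS pipeline.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def usage_by_clonotype(cells: Iterable[Tuple[int, str]]) -> Dict[str, float]:
    """Return gene -> percentage of unique clonotypes using that gene.

    cells: one (clone_id, gene) pair per cell. All cells of a clone carry the
    same gene by definition, so keeping one entry per clone_id is enough.
    """
    gene_of_clone = dict(cells)  # collapses expanded clones to a single entry
    counts = Counter(gene_of_clone.values())
    total = sum(counts.values())
    return {gene: 100.0 * n / total for gene, n in counts.items()}
```

In this sketch a clone of three IGKV1-5 cells weighs the same as a single-cell IGKV3-20 clone, which is the intended effect of excluding clonal expansion from the frequency.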

IGHV5-51 and IGKV1-5 preferentially pair among TG2-specific PCs
To gain insight into the dependence of antibody specificity on VH:VL pairing and to find TG2-specific
signature VH:VL combinations, the frequency of each VH:VL pair was calculated.
Only unique clonotypes were taken into account in order to exclude the influence of clonal
expansion on observed pairing frequencies. In general, each VH segment paired only with a
fraction of VL segments (Fig. 2A, 2B). A number of VH:VL pairs, including IGHV5-51:IGKV1-5,
IGHV1-69:IGKV1-17 and IGHV3-48:IGLV5-45, were found at very prominent
frequencies among TG2-specific PCs, and in some cases they were completely absent among
non-TG2-specific PCs (Fig. 2A, 2B, 2C). This is in agreement with a recent study which
showed that, relative to the naïve repertoire, some VH:VL pairs were increased or decreased
among antigen-experienced B cells (3). Among TG2-specific PCs, the IGHV5-51:IGKV1-5
pair was the most frequent (8.3%) (Fig. 2C). On average, of all TG2-specific PCs having
IGHV5-51, 45.2% contained IGKV1-5 (Fig. 2D, 2E), indicating that the specificity of these
antibodies depends on both chains. This is consistent with a previous study showing that
some TG2-specific antibodies lose binding when native VH:VL pairing is changed (19).
Besides IGKV1-5, although at lower frequencies, IGHV5-51 also associated with IGKV1-39
and IGKV3-20 (Fig. 2C, 2D, 2E). Though most of the highly frequent VH:VL pairs featured
κ L chains, some pairs featuring λ L chains (IGHV3-48:IGLV5-45, IGHV3-48:IGLV1-47 and
IGHV3-74:IGLV1-44) were found exclusively among TG2-specific PCs (Fig. 2C).

Analysis of BCR-repertoire similarity using Morisita-Horn indices demonstrated that TG2-specific
VH:VL sequences from different CD patients clustered closely together and were more
closely related than non-TG2-specific VH:VL sequences (Supplemental Fig. 2). This confirms
the presence of stereotypic sequences in TG2-specific antibody responses among different
individuals.
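
For readers unfamiliar with the Morisita-Horn index, a minimal sketch of its standard definition is given here. For two samples with counts x_i and y_i over the same categories and totals X and Y, the index is 2·Σ x_i·y_i / ((Σ x_i²/X² + Σ y_i²/Y²)·X·Y); it is 1 for identical relative abundances and 0 when no category is shared. The text does not state which features were counted, so using VH:VL pair labels as categories is an assumption, and `morisita_horn` is a hypothetical helper name.

```python
from typing import Mapping


def morisita_horn(x: Mapping[str, float], y: Mapping[str, float]) -> float:
    """Morisita-Horn similarity between two count tables keyed by category."""
    total_x = sum(x.values())
    total_y = sum(y.values())
    if total_x == 0 or total_y == 0:
        raise ValueError("both samples must contain at least one count")
    # Cross term over all categories; a category missing from one sample counts as 0.
    cross = sum(x.get(k, 0) * y.get(k, 0) for k in set(x) | set(y))
    # Simpson-type dominance term of each sample.
    dom_x = sum(v * v for v in x.values()) / (total_x * total_x)
    dom_y = sum(v * v for v in y.values()) / (total_y * total_y)
    return 2 * cross / ((dom_x + dom_y) * total_x * total_y)
```

Because the index depends only on relative abundances, two patients with different numbers of sorted cells can be compared directly.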

Certain TG2-specific VH:VL pairs undergo frequent clonal expansion
After antigen encounter, B-cell clones with high affinity to specific antigens get selected and
undergo clonal expansion. To get an indication of whether frequently found TG2-specific
VH:VL pairs are selected for specificity and affinity, we analyzed the extent of clonal
expansion among TG2-specific PCs and the propensity with which certain VH:VL pairs
expanded in individual CD patients. For establishing true clonality, both VH and VL
sequences were taken into account. As would be expected for an antigen-restricted immune
reaction, the cell-per-clonotype ratio for all the CD patients was higher than 1, thus reflecting
clonal expansion among TG2-specific PCs (Fig. 3A). Large fractions of the TG2-specific
clones were found to be expanded, whereas this was not seen for non-TG2-specific clones
(Fig. 3B). Clonal families with 3 or more cells were also detected at noticeably high levels
among TG2-specific PCs (Fig. 3C), indicative of a restricted repertoire. Certain VH:VL pairs
from TG2-specific PCs showed a tendency to expand, and most of them contained IGKV1-39
L chains (Fig. 3D). Among all the clonally expanded TG2-specific VH:VL pairs, the IGHV5-51:IGKV1-5
pair was present in most patients (Fig. 3E, 3F). Though this VH:VL pair could
also be detected among the non-TG2-specific PC population in 3 out of 10 patients, these
cells did not show any expansion (Fig. 3F). The VH:VL pair associated with the most expanded
clones differed from patient to patient (Fig. 3G). Interestingly, most of these largest clonal
families consisted of IGKV1-39 associated with different VH gene segments (Fig. 3H). This is
in agreement with a recent study in which the IGKV1-39:IGKJ2 rearrangement was found to
be very promiscuous among human peripheral blood memory cells (28). In comparison to
IGKV1-39, IGKV1-5 did not show a tendency to pair with as diverse a set of VH gene segments
among expanded clones (Fig. 3D, 3E), thus indicating a restricted specificity. Next to IGKV1-39,
IGKV3-20 also paired with many different VH gene segments among expanded TG2-specific
PCs (Fig. 3D, 3F). This might reflect the observation that IGKV3-20 is among the
most used VL gene segments in humans (29).
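
The expansion measures used in this section can be sketched as follows. The cell-per-clonotype ratio and the threshold of 3 or more cells for clonal families come from the text; defining an expanded clone as one with at least 2 cells is an assumption, and `expansion_summary` is a hypothetical helper name.

```python
from collections import Counter
from typing import Iterable, Tuple


def expansion_summary(clone_ids: Iterable[int], min_family: int = 3) -> Tuple[float, float, float]:
    """Summarize clonal expansion from one clone id per cell.

    Returns (cells per clonotype, fraction of clones with >= 2 cells,
    fraction of clones with >= min_family cells).
    """
    sizes = Counter(clone_ids)
    n_clones = len(sizes)
    if n_clones == 0:
        raise ValueError("no cells given")
    n_cells = sum(sizes.values())
    ratio = n_cells / n_clones
    expanded = sum(1 for s in sizes.values() if s >= 2) / n_clones
    large = sum(1 for s in sizes.values() if s >= min_family) / n_clones
    return ratio, expanded, large
```

A ratio of exactly 1 means that every cell belongs to its own clonotype, so any value above 1 indicates that at least one clone was sampled more than once.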

TG2-specific VH and VL carry few mutations
We next analyzed the phylogenetic relationship between cells belonging to individual
clonotypes and the extent of mutation among TG2-specific VH and VL sequences.
Phylogenetic analysis demonstrated that clones with a particular mutation expanded more
than others (Fig. 4), indicating that certain introduced mutations have increased the affinity
for TG2. Interestingly, though most of the clones were mutated, a number of them had low
mutation levels and a few had only acquired a single mutation (Fig. 4). We also
observed that mutation levels in the TG2-specific IGHV and IGKV/IGLV gene segments were
significantly lower than those of non-TG2-specific PCs, and the decrease in number of
mutations in H and L chains was proportionate (Fig. 5A, 5B). The global effect on mutational
activity, affecting both H and L chain, suggests that TG2-specific B cells undergo limited
somatic hypermutation and possibly spend little or no time in germinal centers prior to PC
differentiation.

We further examined mutation frequencies among TG2-specific PCs using IGHV5-51 and
found that they contain fewer mutations than TG2-specific PCs using other IGHV gene
segments (Fig. 5C) (6). It has been proposed that a low mutation rate in IGHV5-51 might be a
characteristic feature of this gene segment (30). If so, TG2-specific VL paired to IGHV5-51
would be expected to have mutation levels comparable to VL paired to other TG2-specific
IGHV genes. Our analysis revealed that this is not the case, as the VL gene segments paired
with IGHV5-51 had significantly lower mutation levels than VL gene segments paired with
other IGHV genes (Fig. 5D). The numbers of mutations in IGHV5-51 and associated VL gene
segments were linearly correlated (Fig. 5E). This suggests that the low mutation rate among
IGHV5-51 is not inherent to this IGHV, but rather an antigen-specific phenomenon.

Amino acid changes in VH and VL are selective
Given the strong selection for IGHV5-51:IGKV1-5 and assuming that such antibodies dock
onto the same epitope of TG2, it was of particular interest to analyze the mutational pattern in
this VH:VL pair. Previously, molecular dynamics simulation and mutational analysis of the
interaction between TG2 and the Fab fragment of the CD-derived mAb 679-14-E06, which
carries IGHV5-51:IGKV1-5, indicated involvement of VH residues D62, D64, K82 and S83
and VL residue K56 (19). Consistent with these results, we observed either no substitutions or only
conservative substitutions at these positions in the HTS data, except for position 64 of VH (Fig.
6A, Supplemental Fig. 3). The two AA changes observed for the VL of 679-14-E06 (K45R
and Q106H) (19) were observed at a frequency of 4.2% and 49.5%, respectively, among
TG2-specific IGKV1-5:IGHV5-51 BCRs (Fig. 6A, Supplemental Fig. 3). The Q106H
substitution was significantly more frequent when IGKV1-5 was paired with IGHV5-51 than
when it was paired with other IGHV gene segments or when it was used by non-TG2-specific
PCs (Fig. 6A, 6B, 6C,
6D). In a previous study, reversion of two mutations (K45R and Q106H) in VL of 679-14-E06
to germline led to significant reduction in binding to TG2 (6). Here we assessed reversion of
Q106H alone and observed a slight but significant reduction in binding to TG2 (Fig. 6E),
indicating that this mutation, at a position within the CDR3 region, gives a modest but
possibly decisive increase in antibody affinity.

TG2-specific IGHV5-51:IGKV1-5 pairs display length bias in their CDR3 sequences
We also analyzed CDR3 length and observed an uneven distribution with striking
overrepresentation of lengths of 14 AA and 16 AA for the H chain and of 11 AA for the L
chain when TG2-specific PCs were compared to non-TG2-specific PCs (Fig. 7A). This
skewed distribution was rooted in the bias for IGHV5-51 and IGKV1-5 among TG2-specific
PCs (Fig. 7B, 7C). Notably, PCs carrying the IGHV5-51:IGKV1-5 pair had a strong
dominance of 14 AA long CDR3s in the H chain and 11 AA long CDR3s in the L chain (Fig.
7A, 7B, 7C). Moreover, 11 AA long L chain CDR3s were biased toward usage of the IGKJ2
gene segment (Fig. 7D). The observed biases are not due to clonal expansion as the frequency
calculation was based on the number of unique clonotypes. The patterns are striking and hardly
coincidental, and they are likely dictated by the epitope interaction. However, a meaningful
interpretation will require detailed X-ray crystal structures of prototype antibodies in complex
with the TG2 antigen.

Discussion
In this study, we describe isolation and large scale single-cell sequencing of immunoglobulin
VH and VL genes of TG2-specific PCs from the CD gut lesion. The results reveal common
patterns across patients with a striking preferential VH:VL pairing, dominance of certain
VH:VL pairs and significant clonal expansions with the largest expansions most often seen
among PCs with the predominant VH:VL pair. Still, in some individuals, more unique VH
and VL gene segments and VH:VL pairs were observed among the most abundant clones.
Thus, the autoantibody response in CD appears strongly stereotypic, although some individual
variation is also seen.

The basis for this study is the observation that single TG2-specific IgA- and IgM-expressing
PCs can be isolated with TG2-antigen as bait by taking advantage of the expression of surface
Ig by PCs (1, 2). Expression-cloning and testing of mAbs from 63 single PCs revealed a
specificity of 90% in the selection of TG2-specific PCs (6). This high selection efficiency is
an important prerequisite for the analysis undertaken in the present study.

Technological advances now allow paired sequencing of Ig VH and VL genes from large
populations of single B cells (3, 11, 28, 31, 32). Many of the truly high-throughput methods,
like emulsion droplet based processing (3, 28), however, only succeed in the analysis of a
fraction of the input cells. In settings with analysis of antigen-specific cells the number of
cells available for analysis is usually scarce, and it is important to have methods that
successfully report paired Ig VH and VL sequences for a high fraction of the cells. With a
paired sequencing efficiency of around 60%, our method meets this requirement
reasonably well.
One of the most striking features of the anti-TG2 BCRs that emerge from our analysis is the
stereotypic nature of the response in VH and VL gene usage as well as in VH:VL pairing.
The stereotypic VH gene usage in the anti-TG2 response is already established (6, 7, 33).
Some evidence was obtained for a stereotypic VL response as well, but the finding was based
on analysis of a limited number of cells chiefly from one subject, thus limiting the strength of
the observation (6). The current study establishes without doubt that there is also a clear
stereotypic response in VL gene usage. Our findings also demonstrate a remarkable pairing
preference of certain VH and VL genes. The mAbs generated from single PCs of CD patients
recognize four major epitopes clustered in the N-terminal region of TG2 (epitopes 1-4), and
there is a strong correlation between VH gene usage and the epitope specificity of the
antibodies (27). A previous detailed analysis of the reactivity of a single IGHV5-51:IGKV1-5
encoded mAb recognizing epitope 1 revealed that the epitope recognition involved both VH
and VL residues (19). Of the VH:VL pairs we observed in this study, IGHV5-51:IGKV1-5
was by far the most strongly preferred. The pairing preference thus likely relates to
recognition of TG2 where both VH and VL residues are involved. It could also relate to
overall protein stability of antibodies, as certain IGHV and IGKV/IGLV segments have
superior fitness for each other (34). B cells expressing such stable VH:VL pairs will have a
competitive advantage during their development, which could explain the dominance of
certain VH:VL pairs in the repertoire of naïve B cells. As reference population we used
non-TG2-specific PCs, which are not fully representative of the naïve repertoire. However,
since IGHV5-51:IGKV1-5 and many of the other preferred pairs were not prevalent among
non-TG2-specific PCs, it is unlikely that the bias we observe is dictated to a large extent by
protein stability.

The stereotypic anti-TG2 response is further underscored by the calculated Morisita-Horn
indices. When analyzing VH and VL sequences and comparing different donors as well as
TG2-specific and non-specific PCs, we found that TG2-specific PCs of different individuals
showed the highest Morisita-Horn indices, indicating closest relatedness.

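The Morisita-Horn index is not written out in the text. As a minimal sketch, assuming the standard form C = 2 Σ x_i y_i / ((Σ x_i²/X² + Σ y_i²/Y²) · X · Y) applied to gene-segment counts from two PC populations, the calculation could look as follows. The function name and the example gene lists are hypothetical and are not taken from the authors' pipeline; which sequence feature was actually used as the category is an assumption here.

```python
from collections import Counter


def morisita_horn(sample_a, sample_b):
    """Morisita-Horn overlap between two lists of category labels.

    Each element is one cell's label (assumed here: a V gene segment name).
    Returns 1.0 for identical relative usage and 0.0 for no shared labels.
    """
    counts_a = Counter(sample_a)
    counts_b = Counter(sample_b)
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("both samples must contain at least one label")

    categories = set(counts_a) | set(counts_b)
    # Counter returns 0 for labels that are absent from a sample.
    cross = sum(counts_a[c] * counts_b[c] for c in categories)
    simpson_a = sum(n * n for n in counts_a.values()) / total_a ** 2
    simpson_b = sum(n * n for n in counts_b.values()) / total_b ** 2
    return 2.0 * cross / ((simpson_a + simpson_b) * total_a * total_b)


# Hypothetical usage: VH gene labels from two donors.
donor_1 = ["IGHV5-51"] * 3 + ["IGHV3-48"]
donor_2 = ["IGHV5-51"] * 6 + ["IGHV3-48"] * 2
print(morisita_horn(donor_1, donor_2))
```

Because the index depends only on relative frequencies, two donors with the same proportional gene usage but different cell numbers yield the maximal value, which is why it suits comparisons between samples of unequal size.
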
Another striking feature we observed is the presence of clonal expansions among
TG2-specific PCs. Indications of clonal expansion among TG2-specific cells were previously
observed based on HTS analysis of pools of cells (7). The present study extends this
observation as it provides detailed information from single cells. Clonal expansions were
typically seen among cells with the preferred VH:VL pairs. For instance, the
IGHV5-51:IGKV1-5 pair was frequent among the expanded clones in 80% of the patients.
Genealogical analysis revealed that some of the clonal trees were fairly large with some PCs
acquiring a significant number of mutations, but there were also smaller trees that typically
had nodes with several replicates of identical BCRs among the single-cell sorted PCs.
Importantly, we observed that the degree of mutation in the H chain was paralleled by the
degree of mutation in the L chain. This was the case also for IGHV5-51, suggesting that the
low mutation rate of IGHV5-51 is related to the developmental fate of the B cells rather than
to an inherent propensity of this heavy chain gene segment to acquire few mutations.

Comparing the mutational pattern in mAbs that dock on the same epitope should provide
insights into critical epitope-paratope interactions. Because it was the most frequent among
the collected BCRs, we chose to investigate the IGHV5-51:IGKV1-5 pair in more detail.
Reactivity of the celiac mAb 679-14-E06, which uses this VH:VL pair, was previously studied
in detail by SAXS, dynamic modeling and mutational analysis (19). The 679-14-E06 mAb is
specific for epitope 1, and several key epitope-paratope interaction residues were identified.
The current analysis of mutations confirms and extends the previous analysis. Residue 106 of
the light chain, despite carrying a mutation from Q to H in the 679-14-E06 mAb, was not
identified in the previous study. A uniform mutation pattern with half of the BCRs carrying H
strongly suggests that H at this position makes contact with TG2. Another striking feature we
observed is the conservative pattern of mutations. In general, there were few mutations, and
when a mutation occurred, this was usually to a residue with similar physicochemical
properties. This suggests a good fit for the epitope in germline configuration by
IGHV5-51:IGKV1-5 with little drive to mutate away from the mode of interactions enabled by
the germline-encoded residues.

Taken together, the present study of paired VH and VL sequences derived from single
antigen-specific PCs of a disease lesion unravels key features of a highly disease-specific
autoantibody response in a human condition. These results motivate similar analyses in
other human autoimmune diseases.

ACKNOWLEDGEMENTS. We thank Vikas K. Sarna for providing clinicopathological
details of investigated celiac disease patients.

References
1. Pinto, D., E. Montani, M. Bolli, G. Garavaglia, F. Sallusto, A. Lanzavecchia, and D.
Jarrossay. 2013. A functional BCR in human IgA and IgM plasma cells. Blood 121:
4110-4114.
2. Di Niro, R., L. Mesin, M. Raki, N. Y. Zheng, F. Lund-Johansen, K. E. Lundin, A.
Charpilienne, D. Poncet, P. C. Wilson, and L. M. Sollid. 2010. Rapid generation of
rotavirus-specific human monoclonal antibodies from small-intestinal mucosa. J
Immunol 185: 5377-5383.
3. DeKosky, B. J., O. I. Lungu, D. Park, E. L. Johnson, W. Charab, C. Chrysostomou, D.
Kuroda, A. D. Ellington, G. C. Ippolito, J. J. Gray, and G. Georgiou. 2016. Large-scale
sequence and structural comparisons of human naive and antigen-experienced
antibody repertoires. Proc Natl Acad Sci U S A 113: E2636-2645.
4. Sollid, L. M., and B. Jabri. 2013. Triggers and drivers of autoimmunity: lessons from
coeliac disease. Nat Rev Immunol 13: 294-302.
5. Rostom, A., C. Dube, A. Cranney, N. Saloojee, R. Sy, C. Garritty, M. Sampson, L.
Zhang, F. Yazdi, V. Mamaladze, I. Pan, J. MacNeil, D. Mack, D. Patel, and D. Moher.
2005. The diagnostic accuracy of serologic tests for celiac disease: a systematic review.
Gastroenterology 128: S38-46.
6. Di Niro, R., L. Mesin, N. Y. Zheng, J. Stamnaes, M. Morrissey, J. H. Lee, M. Huang,
R. Iversen, M. F. du Pre, S. W. Qiao, K. E. Lundin, P. C. Wilson, and L. M. Sollid.
2012. High abundance of plasma cells secreting transglutaminase 2-specific IgA
autoantibodies with limited somatic hypermutation in celiac disease intestinal lesions.
Nat Med 18: 441-445.
7. Snir, O., L. Mesin, M. Gidoni, K. E. Lundin, G. Yaari, and L. M. Sollid. 2015.
Analysis of celiac disease autoreactive gut plasma cells and their corresponding
memory compartment in peripheral blood using high-throughput sequencing. J
Immunol 194: 5703-5712.
8. Ludvigsson, J. F., J. C. Bai, F. Biagi, T. R. Card, C. Ciacci, P. J. Ciclitira, P. H. Green,
M. Hadjivassiliou, A. Holdoway, D. A. van Heel, K. Kaukinen, D. A. Leffler, J. N.
Leonard, K. E. Lundin, N. McGough, M. Davidson, J. A. Murray, G. L. Swift, M. M.
Walker, F. Zingone, D. S. Sanders, BSG Coeliac Disease Guidelines Development Group,
and British Society of Gastroenterology. 2014. Diagnosis and management of adult
coeliac disease: guidelines from the British Society of Gastroenterology. Gut 63: 1210-1228.
9. Islam, S., U. Kjallquist, A. Moliner, P. Zajac, J. B. Fan, P. Lonnerberg, and S.
Linnarsson. 2012. Highly multiplexed and strand-specific single-cell RNA 5' end
sequencing. Nat Protoc 7: 813-828.
10. Benckert, J., N. Schmolka, C. Kreschel, M. J. Zoller, A. Sturm, B. Wiedenmann, and
H. Wardemann. 2011. The majority of intestinal IgA+ and IgG+ plasmablasts in the
human gut are antigen-specific. J Clin Invest 121: 1946-1955.
11. Tan, Y. C., L. K. Blum, S. Kongpachith, C. H. Ju, X. Cai, T. M. Lindstrom, J.
Sokolove, and W. H. Robinson. 2014. High-throughput sequencing of natively paired
antibody chains provides evidence for original antigenic sin shaping the antibody
response to influenza vaccination. Clin Immunol 151: 55-65.
12. Tiller, T., E. Meffre, S. Yurasov, M. Tsuiji, M. C. Nussenzweig, and H. Wardemann.
2008. Efficient generation of monoclonal antibodies from single human B cells by
single cell RT-PCR and expression vector cloning. J Immunol Methods 329: 112-124.
13. Vander Heiden, J. A., G. Yaari, M. Uduman, J. N. Stern, K. C. O'Connor, D. A. Hafler,
F. Vigneault, and S. H. Kleinstein. 2014. pRESTO: a toolkit for processing
high-throughput sequencing raw reads of lymphocyte receptor repertoires. Bioinformatics
30: 1930-1932.
14. Alamyar, E., P. Duroux, M. P. Lefranc, and V. Giudicelli. 2012. IMGT® tools for
the nucleotide analysis of immunoglobulin (IG) and T cell receptor (TR) V-(D)-J
repertoires, polymorphisms, and IG mutations: IMGT/V-QUEST and
IMGT/HighV-QUEST for NGS. Methods Mol Biol 882: 569-604.
15. Krzywinski, M., J. Schein, I. Birol, J. Connors, R. Gascoyne, D. Horsman, S. J. Jones,
and M. A. Marra. 2009. Circos: an information aesthetic for comparative genomics.
Genome Res 19: 1639-1645.
16. Thomsen, M. C., and M. Nielsen. 2012. Seq2Logo: a method for construction and
visualization of amino acid binding motifs and sequence profiles including sequence
weighting, pseudo counts and two-sided representation of amino acid enrichment and
depletion. Nucleic Acids Res 40: W281-287.
17. Gupta, N. T., J. A. Vander Heiden, M. Uduman, D. Gadala-Maria, G. Yaari, and S. H.
Kleinstein. 2015. Change-O: a toolkit for analyzing large-scale B cell immunoglobulin
repertoire sequencing data. Bioinformatics 31: 3356-3358.
18. Willis, L. G., M. L. Winston, and B. M. Honda. 1992. Phylogenetic relationships in
the honeybee (genus Apis) as determined by the sequence of the cytochrome oxidase
II region of mitochondrial DNA. Mol Phylogenet Evol 1: 169-178.
19. Chen, X., K. Hnida, M. A. Graewert, J. T. Andersen, R. Iversen, A. Tuukkanen, D.
Svergun, and L. M. Sollid. 2015. Structural basis for antigen recognition by
transglutaminase 2-specific autoantibodies in celiac disease. J Biol Chem 290:
21365-21375.
20. Smith, K., L. Garman, J. Wrammert, N. Y. Zheng, J. D. Capra, R. Ahmed, and P. C.
Wilson. 2009. Rapid generation of fully human monoclonal antibodies specific to a
vaccinating antigen. Nat Protoc 4: 372-384.
21. Iversen, R., S. Mysling, K. Hnida, T. J. Jorgensen, and L. M. Sollid. 2014.
Activity-regulating structural changes and autoantibody epitopes in transglutaminase 2
assessed by hydrogen/deuterium exchange. Proc Natl Acad Sci U S A 111: 17146-17151.
22. Picelli, S., O. R. Faridani, A. K. Bjorklund, G. Winberg, S. Sagasser, and R. Sandberg.
2014. Full-length RNA-seq from single cells using Smart-seq2. Nat Protoc 9: 171-181.
23. Dorner, T., and P. E. Lipsky. 2005. Molecular basis of immunoglobulin variable
region gene usage in systemic autoimmunity. Clin Exp Med 4: 159-169.
24. Foreman, A. L., J. Van de Water, M. L. Gougeon, and M. E. Gershwin. 2007. B cells
in autoimmune diseases: insights from analyses of immunoglobulin variable (Ig V)
gene usage. Autoimmun Rev 6: 387-401.
25. Roben, P., S. M. Barbas, L. Sandoval, J. M. Lecerf, B. D. Stollar, A. Solomon, and G.
J. Silverman. 1996. Repertoire cloning of lupus anti-DNA autoantibodies. J Clin
Invest 98: 2827-2837.
26. Van Esch, W. J., C. C. Reparon-Schuijt, H. J. Hamstra, C. Van Kooten, T. Logtenberg,
F. C. Breedveld, and C. L. Verweij. 2003. Human IgG Fc-binding phage antibodies
constructed from synovial fluid CD38+ B cells of patients with rheumatoid arthritis
show the imprints of an antigen-dependent process of somatic hypermutation and
clonal selection. Clin Exp Immunol 131: 364-376.
27. Iversen, R., R. Di Niro, J. Stamnaes, K. E. Lundin, P. C. Wilson, and L. M. Sollid.
2013. Transglutaminase 2-specific autoantibodies in celiac disease target clustered,
N-terminal epitopes not displayed on the surface of cells. J Immunol 190: 5981-5991.
28. DeKosky, B. J., T. Kojima, A. Rodin, W. Charab, G. C. Ippolito, A. D. Ellington, and
G. Georgiou. 2015. In-depth determination and analysis of the human paired heavy-
and light-chain antibody repertoire. Nat Med 21: 86-91.
29. Jackson, K. J., M. J. Kidd, Y. Wang, and A. M. Collins. 2013. The shape of the
lymphocyte receptor repertoire: lessons from the B cell receptor. Front Immunol 4:
263.
30. Boursier, L., D. K. Dunn-Walters, and J. Spencer. 1999. Characteristics of IgVH genes
used by human intestinal plasma cells from childhood. Immunology 97: 558-564.
31. DeKosky, B. J., G. C. Ippolito, R. P. Deschner, J. J. Lavinder, Y. Wine, B. M.
Rawlings, N. Varadarajan, C. Giesecke, T. Dorner, S. F. Andrews, P. C. Wilson, S. P.
Hunicke-Smith, C. G. Willson, A. D. Ellington, and G. Georgiou. 2013.
High-throughput sequencing of the paired human immunoglobulin heavy and light chain
repertoire. Nat Biotechnol 31: 166-169.
32. Busse, C. E., I. Czogiel, P. Braun, P. F. Arndt, and H. Wardemann. 2014. Single-cell
based high-throughput sequencing of full-length immunoglobulin heavy and light
chain genes. Eur J Immunol 44: 597-603.
33. Marzari, R., D. Sblattero, F. Florian, E. Tongiorgi, T. Not, A. Tommasini, A. Ventura,
and A. Bradbury. 2001. Molecular dissection of the tissue transglutaminase
autoantibody response in celiac disease. J Immunol 166: 4170-4176.
34. Worn, A., and A. Pluckthun. 2001. Stability engineering of antibody single-chain Fv
fragments. J Mol Biol 305: 989-1010.

Footnotes:
This work was supported by the European Commission (grant ERC-2010-Ad-268541), by the
Research Council of Norway through its Centres of Excellence funding scheme, project
number 179573/V40, by Stiftelsen KG Jebsen and by grants from the South-Eastern
Norway Regional Health Authority.

Abbreviations used in this article: VH, heavy chain variable region; VL, light chain variable
region; PCs, plasma cells; TG2, transglutaminase 2; CD, celiac disease; HTS, high throughput
sequencing.

Figure legends
FIGURE 1. Selective usage of VL gene segments among TG2-specific PCs. (A and B)
Averaged percentages of Igκ- and Igλ-expressing PCs determined by sequencing (A) or flow
cytometry (B). Each bar graph represents average ± SD. (C) VL usage frequency for a
representative untreated CD patient (CD1322). (D) Average VL usage frequency for a total of
10 patients. Number of total unique clonotypes denoted by ‘n’ (C) was taken into account for
the usage frequency calculation (C and D). (E) Usage frequency of JH and JL gene segments
amongst TG2+ and TG2- PCs is comparable. Frequency has been calculated from the
sequence data combined for all patients (n=10). TG2+, TG2-specific; TG2-, non-TG2-specific.
Significance was determined using a two-tailed t test. n.s., not significant, *p < 0.05,
***p < 0.001.

FIGURE 2. TG2-specific BCRs show strong VH:VL pairing preferences. (A) Circos plot
depicting the VH:VL pairings for a representative CD patient (CD1256). Pairing between VH
(red arcs) and VL (dark blue arcs) gene segments is shown by the connecting lines inside the
circle with thickness corresponding to pairing frequencies. (B) Heat map showing the relative
VH:VL pairing frequencies. The color intensity index for each pair was obtained by dividing
the difference in frequency between TG2+ and TG2- PCs by the highest difference value.
Averages of frequency values from 10 patients were used. (C) Most frequently used (≥ 1%)
VH:VL pairs amongst TG2+ PCs. (D) Pairing frequencies of IGHV5-51 with different light
chains. (E) VL gene segments that paired most frequently with IGHV5-51. Each group of bar
graphs shows the pairing frequency for the indicated CD patient. Each bar graph shows
average ± SD calculated from the indicated number of patients (C and D).

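The color intensity index in (B) is a difference in pairing frequency scaled by the largest difference. A minimal sketch of that normalization follows, assuming the per-pair frequencies are available as dictionaries and that "highest difference value" means the largest absolute difference, so that the index falls between -1 and 1. The function name and the example frequencies are hypothetical, and the choice of absolute value is an assumption not stated in the legend.

```python
def intensity_index(freq_tg2_pos, freq_tg2_neg):
    """Scale per-pair frequency differences by the largest absolute difference.

    Both arguments map a VH:VL pair label to its average frequency.
    Pairs missing from one population are treated as frequency 0.
    """
    pairs = set(freq_tg2_pos) | set(freq_tg2_neg)
    diffs = {
        pair: freq_tg2_pos.get(pair, 0.0) - freq_tg2_neg.get(pair, 0.0)
        for pair in pairs
    }
    # Assumption: normalize by the largest absolute difference.
    scale = max((abs(d) for d in diffs.values()), default=0.0)
    if scale == 0.0:
        return {pair: 0.0 for pair in diffs}
    return {pair: d / scale for pair, d in diffs.items()}


# Hypothetical average pairing frequencies (fractions of all pairs).
tg2_pos = {"IGHV5-51:IGKV1-5": 0.30, "IGHV3-23:IGKV3-20": 0.10}
tg2_neg = {"IGHV5-51:IGKV1-5": 0.10, "IGHV3-23:IGKV3-20": 0.20}
print(intensity_index(tg2_pos, tg2_neg))
```

With this scaling, the pair most enriched among TG2+ PCs receives the maximal index, and pairs more common among TG2- PCs receive negative values.
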
FIGURE 3. Clonality amongst TG2-specific PCs is fairly widespread. (A) Ratio of total
sequences (assumed to be cells) to number of clonotypes for individual CD patients. (B)
Percentage of expanded clonotypes. Each plot corresponds to the indicated CD patient. (C)
Absolute frequency of differently sized clonal families. (D) Heat map indicating the VH:VL
pairs with strong propensity for clonal expansion. The color intensity index for each pair was
obtained by dividing the difference in average (n = 10) fold expansion between TG2+ and
TG2- PCs by the highest difference value. (E) Percentage of analyzed patients where the
plotted VH:VL combination showed expansion. (F) Observed fold expansion for
IGHV5-51:IGKV1-5 bearing PCs in individual CD patients. (G) VH:VL pair expressed by the
most expanded clones in each individual. The numbers in parentheses show the size of the
clonal group (numerator) vs. the total number of cells (denominator).

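The quantities in panels (A) to (C) can be derived from a per-cell list of clonotype assignments. The sketch below assumes that each sequence has already been assigned a clonotype identifier and that a clonotype counts as expanded when it contains more than one cell. Both the function name and that threshold are assumptions made for illustration; the legend does not define them.

```python
from collections import Counter


def clonality_summary(clonotype_ids):
    """Summarize clonality from one clonotype identifier per sequenced cell."""
    sizes = Counter(clonotype_ids)  # clonotype -> number of cells
    n_cells = sum(sizes.values())
    n_clonotypes = len(sizes)
    if n_clonotypes == 0:
        raise ValueError("at least one sequence is required")

    # Assumption: "expanded" means a clonotype seen in more than one cell.
    n_expanded = sum(1 for size in sizes.values() if size > 1)
    return {
        "cells_per_clonotype": n_cells / n_clonotypes,           # panel A
        "percent_expanded": 100.0 * n_expanded / n_clonotypes,   # panel B
        "family_size_counts": dict(Counter(sizes.values())),     # panel C
    }


# Hypothetical patient with six cells in three clonotypes.
print(clonality_summary(["c1", "c1", "c1", "c2", "c3", "c3"]))
```
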
FIGURE 4. TG2-specific PCs acquire mutations as they clonally expand. Phylogenetic trees
show the clonal relationship between TG2-specific PCs belonging to the indicated CD patients.
Associated VH and VL gene segments for each clonal family are indicated. The
numbers in brackets close to each circle (representing a clone) indicate the number of total
and non-silent mutations in the respective VH region. The size of each circle corresponds to
the number of clones it contains. Text inside a circle names the clones it represents, with
individual clones separated by commas.

FIGURE 5. TG2-specific PCs acquire few mutations. (A) Total number of mutations in
IGHV, IGKV and IGLV genes. (B) Correlation between mutation numbers in paired IGHV and
IGKV/IGLV. (C) Number of total mutations in TG2-specific IGHV5-51 and other (all except
IGHV5-51) IGHV gene segments. (D) Number of total mutations in TG2-specific IGKV/IGLV
gene segments either paired with IGHV5-51 or other (all except IGHV5-51) IGHV gene
segments. (E) Number of mutations in TG2-specific IGHV5-51 and of paired IGKV/IGLV.
Horizontal bars (A, C and D) show the average value. Number of sequences (from 10
patients) is denoted by ‘n’. Significance was determined using unpaired t test. ****p < 0.0001.

FIGURE 6. Amino acid (AA) changes in TG2-specific IGHV5-51 and IGKV1-5 are strongly
selective. Graphs show the frequency of AA changes (Y-axis) for TG2-specific IGHV5-51
and IGKV1-5 paired to each other (A) at positions (IMGT numbering shown on X-axis)
undergoing frequent AA changes. (B) Frequency of AA changes for TG2-specific IGKV1-5
paired to VH gene segments other than IGHV5-51 and non-TG2-specific IGKV1-5. (C)
Height of each letter (AA code) corresponds to relative frequency, and AA residues with
similar physicochemical properties are depicted in the same colors. Number of unique
sequences subjected to this analysis is denoted by ‘n’. Letters below the numbers on the
X-axis show the corresponding germline AA residues. According to IMGT numbering, positions
27-38, 56-65 and 105 correspond to CDR1, CDR2 and the first residue of CDR3, respectively.
Underlined AA residues correspond to ones that were predicted to be engaged in binding
between TG2 and the Fab fragment of anti-TG2 mAb 679-14-E06 (19). The second tier of
letters below the X-axis shows the AA changes observed for 679-14-E06 (19). (D) Frequency
of Q to H replacement mutation at position 106 of IGKV1-5 when paired to IGHV5-51 or VH
gene segments other than IGHV5-51 (Others). Each dot (number of unique sequences
‘n’ = 4-18) represents one individual CD patient except for non-TG2-specific IGKV1-5 paired
to IGHV5-51 where an average of all is shown (number of unique sequences ‘n’ = 5). (E) Role
of IGKV1-5 residue 106 in TG2 binding. The prototype anti-TG2 mAb 679-14-E06 using the
IGHV5-51:IGKV1-5 pair was expressed either in its native form (106H) or with VL residue
106 reverted to the germline configuration (106Q). Binding of the mAbs to TG2 was assessed
by ELISA. EC50 values were obtained by non-linear regression analysis and the 95%
confidence interval is given in parentheses. Significance was determined using one-way
ANOVA. n.s., not significant, ***p < 0.001, ****p < 0.0001.

FIGURE 7. Bias in CDR3 lengths among TG2-specific PCs. (A) VH and VL CDR3 length
distribution amongst TG2+ and TG2- PCs as calculated from the sequence data combined for