family 2 member D (CLEC2D) located in close proximity to KLRB1 within the natural killer gene complex on chromosome 12 (Kirkham & Carlyle, 2014). LLT1, similarly to CD161, is a type II transmembrane protein with an N-terminal 38-amino acid cytoplasmic tail, a 21-amino acid transmembrane segment and a C-terminal 132-amino acid extracellular domain (Boles et al., 1999; Skálová et al., 2014). The cytoplasmic tail of LLT1 lacks an ITIM motif (Boles et al., 1999). LLT1 forms a disulfide-linked homodimer composed of two subunits with a predicted molecular mass of 22 kDa each. The C-terminal part of LLT1 encompasses a C-type lectin-like domain that is involved in the recognition of and interaction with CD161. At least six different splice variants of CLEC2D have been found. Variant 1 encodes membrane-bound LLT1 and only this isoform can bind to CD161 (Germain et al., 2010).

CD161 binds LLT1 with low affinity and fast kinetics, which is typical of cell-to-cell recognition receptors (Kamishikiryo et al., 2011). The CD161-LLT1 receptor-ligand pair is involved in the regulation of both adaptive and innate immune responses (Skálová et al., 2014). Although the biophysical characteristics of LLT1 are well understood, its exact cellular distribution is unclear. It is generally accepted that LLT1 is not present on resting lymphocytes, but is upregulated upon cellular activation. LLT1 has been found on the surface of activated B cells, T cells, NK cells, monocytes and DCs (Llibre et al., 2016a; Rosen et al., 2008; Germain et al., 2011).

simultaneously decreasing the costs of sequencing itself. These developments enabled more detailed and accurate sequencing that led to novel insights and interesting prospects for future research (Heather & Chain, 2016).

Overall, DNA sequencing technologies can be classified into three major groups that are traditionally arranged in the order of their emergence, namely first-generation sequencing, second-generation sequencing and third-generation sequencing. Of note, the nomenclature and classification of sequencing technologies is unfortunately vague, and second-generation as well as third-generation sequencing methods are sometimes collectively described as next-generation sequencing. First-generation sequencing, exemplified by Sanger’s dideoxy chain-termination method, allows short fragments of DNA to be sequenced.

However, first-generation sequencing approaches are expensive, low-throughput and time-consuming, thus not suitable to study full genomes or transcriptomes.

Indeed, the Human Genome Project, which utilised such methods, took approximately 15 years to complete at a cost of 3 billion USD. Second-generation sequencing represents a substantial improvement over first-generation sequencing, allowing genomes to be sequenced and transcriptomes to be investigated at a more affordable scale.

Second-generation sequencing instruments are based on the engineering concept of massively parallel sequencing of uniquely barcoded samples. Second-generation sequencing began with 454 pyrosequencing, which is still in operation in some laboratories although technical support from the distributing company is no longer available. In recent years, Illumina has dominated the DNA sequencing market, although alternative solutions can be used, most notably the sequencing platforms developed by Ion Torrent Systems. Nonetheless, second-generation sequencing suffers from some shortcomings, including the requirement for DNA amplification, which can introduce biases due to the unevenness of this reaction. Third-generation sequencing aims to overcome potential problems linked to the amplification step by directly sequencing single non-amplified DNA molecules. Third-generation sequencing instruments from Pacific Biosciences and Oxford Nanopore Technologies are able to generate very long reads and are therefore particularly useful for de novo genome assembly. Although potentially revolutionary, current third-generation sequencing systems are reportedly affected by substantial error rates (Bleidorn, 2015).

Single-cell RNA-sequencing

Global analysis of entire transcriptomes using RNA-sequencing (RNA-seq) became possible with the introduction of high-throughput sequencing technologies. Bulk RNA-seq experiments are typically performed on samples containing thousands of cells. Therefore, these assays provide only an average estimate of gene expression across the population and give the impression that phenotypically similar cells are also transcriptionally similar. However, bulk RNA-seq is known to mask cellular heterogeneity because signals from transcriptionally active cells can be averaged out by signals from transcriptionally inactive cells. Biological heterogeneity at the level of gene expression and its consequences are not completely understood, and such variation might have multiple origins; for example, it may arise from subtle differences in the local microenvironment. In contrast to bulk RNA-seq, single-cell RNA-sequencing (scRNA-seq) makes it possible to dissect and investigate the complex composition of large cell populations at the level of single cells (Kolodziejczyk & Lönnberg, 2018). Since its introduction over 10 years ago, scRNA-seq has proven extremely useful in many areas of modern biological sciences. At the same time, novel scRNA-seq technologies have created a huge demand for upgraded and refined bioinformatic tools, which are indispensable to any scRNA-seq experiment (Svensson et al., 2018). In the field of immunology, the unbiased investigation of transcriptomes at the single-cell level can be used to study key aspects of the human immune system, including the identification of novel and/or rare cell types as well as in-depth examination of immune receptors located on immune cells (Sandberg, 2014).

In general, scRNA-seq protocols can be classified as microplate-based or droplet-based according to how single cells are isolated, and as full-length or partial-length according to how much of each transcript is sequenced.

Microplate-based protocols, where single cells are isolated into individual wells of a microplate, are labour-intensive and suitable only for a limited number of cells.

However, these approaches can be combined with messenger RNA (mRNA) spike-ins, for example the commercially available set of External RNA Control Consortium (ERCC) control mRNAs, which are used to estimate the level of technical variation between cells. In contrast, in droplet-based protocols individual cells are encapsulated in single droplets, and such approaches can capture thousands of cells in one experiment. Moreover, droplet-based techniques allow the incorporation of unique molecular identifiers (UMIs), which are attached to each mRNA molecule within a cell during reverse transcription and significantly improve the subsequent bioinformatic quantification of transcripts. Full-length approaches cover entire transcripts and are suitable for quantifying gene isoforms as well as for monitoring single-nucleotide polymorphisms (SNPs). On the other hand, partial-length methods focus on sequencing the 3' or 5' ends of transcripts (Kolodziejczyk et al., 2015). Currently available scRNA-seq methods were developed for different purposes, and various protocols frequently combine multiple characteristics. For example, Smart-seq2 is a microplate-based method that is optimised for full-length coverage; it is ideal for studying rare cells as well as for the reconstruction of full-length TCRs or BCRs. Moreover, scRNA-seq libraries can be generated with Smart-seq2 using mostly off-the-shelf reagents, which significantly reduces the cost of experiments (Picelli et al., 2013; Picelli et al., 2014a). Drop-seq is a droplet-based method that enables sequencing of the 3' ends of transcripts. This method is well suited to scRNA-seq analysis of large numbers of cells, which can accelerate the discovery of novel cell subtypes as well as help to understand the composition of complex body tissues (Macosko et al., 2015).
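The way UMIs improve transcript quantification can be illustrated with a minimal Python sketch. It assumes that reads have already been assigned to a cell and a gene; the input format, the barcode strings and the function name count_umis are hypothetical and serve illustration only. Real pipelines additionally correct sequencing errors within UMIs, which is not attempted here.

```python
from collections import defaultdict


def count_umis(reads):
    """Collapse reads into counts of distinct UMIs per cell and gene.

    reads: iterable of (cell_barcode, gene, umi) tuples (hypothetical format).
    Returns {cell_barcode: {gene: number_of_distinct_umis}}.
    """
    seen = defaultdict(set)
    for cell, gene, umi in reads:
        # Reads sharing cell, gene and UMI are treated as PCR duplicates
        # of the same original mRNA molecule.
        seen[(cell, gene)].add(umi)

    counts = defaultdict(dict)
    for (cell, gene), umis in seen.items():
        counts[cell][gene] = len(umis)
    return dict(counts)


# Invented example data: the second read is a PCR duplicate of the first.
example_reads = [
    ("cellA", "KLRB1", "AACGT"),
    ("cellA", "KLRB1", "AACGT"),
    ("cellA", "KLRB1", "GGTCA"),
    ("cellB", "KLRB1", "AACGT"),
]
umi_counts = count_umis(example_reads)
```

In this toy example three reads from cellA collapse to two molecules, because amplification duplicates carry the same UMI and are counted only once.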

Each scRNA-seq protocol can be divided into a number of fundamental steps including (1) isolation of single cells, (2) cell lysis, (3) reverse transcription, (4) amplification of complementary DNA (cDNA) and (5) preparation of sequencing libraries. After sequencing and bioinformatic quantification of gene expression levels, the results should be validated with other methods such as real-time quantitative polymerase chain reaction (RT-qPCR) or flow cytometry. Each stage of scRNA-seq library preparation is equally important and has to be performed with care and precision. Isolation of mononuclear cells from blood is a relatively straightforward procedure that does not impose much stress on the cells and consequently does not diminish the quality of their transcriptome. One popular way of acquiring peripheral blood mononuclear cells (PBMCs) is density gradient centrifugation using reagents such as Lymphoprep or Ficoll. In comparison, isolation of mononuclear cells from body tissues is more problematic, as the enzymatic or mechanical approaches used to dissociate the tissue may affect the viability and integrity of the dissociated cells, which may severely diminish the quality of cellular RNA. Once the single-cell suspension is prepared, further sorting methods can be applied. Fluorescence-activated cell sorting (FACS) is frequently used to sort a small number of single cells with a specific combination of surface markers into a microwell plate. An alternative strategy is to randomly capture many individual cells in nanolitre droplet emulsions. Cells are usually captured in a lysis buffer that disrupts the cell membrane and facilitates the release of RNA. Proper cell lysis is therefore important for the subsequent reverse transcription, in which mRNA molecules are reverse transcribed into cDNA. A major consideration in the reverse transcription reaction is to avoid transcribing ribosomal RNA and transfer RNA.
Both RNA species are highly abundant and can dominate over the less numerous mRNA. Therefore, most available protocols utilise oligo-dT primers that bind to the poly(A) tail of mRNA molecules. Of note, RNA purification is sometimes performed before the reverse transcription reaction. In this step, cell extracts are mixed with solid-phase reversible immobilization (SPRI) beads, for example RNAClean XP beads, in order to remove genomic DNA that could interfere with downstream processes. A single cell contains on average around 10 pg of mRNA which, when reverse transcribed into cDNA, is not sufficient for library preparation. For that reason, cDNA has to be amplified either by polymerase chain reaction (PCR) or by in vitro transcription.

PCR is more prevalent among scRNA-seq library protocols, but exponential amplification is known to distort the original composition of mRNAs in the cell, especially for low-abundance transcripts. Individual samples are also uniquely indexed prior to sequencing. This strategy permits multiplexing of samples, thus increasing the throughput of each scRNA-seq experiment. In addition, various controls are adopted to determine the quality of the scRNA-seq library. The gold standard is to analyse its purity and size distribution. Contamination with organic compounds can be assessed with a NanoDrop spectrophotometer, whereas the size distribution can be visualized by chip-based capillary electrophoresis using a Bioanalyzer. scRNA-seq libraries should be kept at -20 °C until sequencing to minimise degradation (Kolodziejczyk & Lönnberg, 2018).

Smart-seq2

The original Smart-seq protocol was designed to generate read coverage across full transcripts from single cells containing picogram amounts of mRNA.

Theoretically, Smart-seq makes it possible both to quantify gene expression and to detect alternative transcript variants. However, it has shown substantial limitations for transcripts expressed at very low levels that might code for important, yet infrequent proteins (Ramsköld et al., 2012). Smart-seq was therefore further optimized in order to increase sensitivity, accuracy and full-length coverage across the transcriptome.

As a result, the improved Smart-seq2 protocol showed a substantial increase in the detection of gene expression and lower technical variability than the original Smart-seq. The detailed step-by-step Smart-seq2 protocol has been described elsewhere (Picelli et al., 2014a) and is only briefly summarised here.

Smart-seq2 (Figure 9) starts with the isolation of single cells into a lysis buffer containing an RNase inhibitor that protects and stabilises fragile mRNA molecules. For the subsequent reverse transcription reaction, a buffer containing deoxynucleoside triphosphates (dNTPs) and oligo-dT primers is used. Additionally, the reverse transcription buffer contains betaine and magnesium chloride, which display protein thermal-stabilisation properties and help to maximise cDNA yield by increasing the efficiency of the reverse transcription reaction. Smart-seq2 exploits SMART technology (Switching Mechanism At the 5' end of the RNA Transcript), which utilizes two intrinsic properties of Moloney Murine Leukemia Virus (MMLV) reverse transcriptase, namely terminal transferase activity and template-switching activity.

The MMLV enzyme reverse transcribes mRNA and adds a few extra cytosines to the 3' end of the newly synthesized cDNA molecule. This extended region works as a docking site for the Template Switch Oligonucleotide (TSO), which consists of modified guanosines that anneal to the newly added non-templated cytosines and a primer sequence for downstream PCR. MMLV reverse transcriptase is then able to switch templates and synthesize a sequence complementary to the TSO. As a result, the entire transcriptome can be amplified in a single PCR reaction that is referred to as “Preamplification” (Picelli et al., 2013; Picelli et al., 2014a; Picelli, 2019).

After the first-strand reaction and preamplification, the single-cell full-length cDNA libraries should display an average fragment size of around 2000 base pairs (bp) on a Bioanalyzer trace, and no spikes below 400 bp should be visible. Next, the amplified full-length fragments are subjected to a near-random tagmentation reaction by a hyperactive variant of Tn5 transposase that cuts double-stranded DNA and ligates synthetic oligonucleotides, containing a primer sequence, to both ends.

The tagmentation reaction is completed by an enrichment PCR that adds the P5 and P7 sequences required for binding to the Illumina flow cell as well as a unique combination of Illumina Nextera S5xx and N7xx indexes necessary for multiplexing. The dual-indexed tagmented single-cell libraries are then pooled together, and the final product should exhibit a size distribution spanning 200 bp to 1000 bp on a Bioanalyzer trace, with no visible spikes below 200 bp (Picelli et al., 2014b; Picelli, 2019).
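The principle of dual indexing can be sketched in a few lines of Python. The sketch assumes that each cell simply receives one unique combination of two indexes; the index names below are placeholders rather than real Nextera index identifiers, and the function name assign_index_pairs is hypothetical.

```python
from itertools import product


def assign_index_pairs(i7_indexes, i5_indexes, cell_ids):
    """Give every cell its own (i7, i5) index pair.

    Combining the two index sets yields len(i7) * len(i5) unique pairs,
    which is why few oligonucleotides suffice to label many cells.
    """
    pairs = list(product(i7_indexes, i5_indexes))
    if len(cell_ids) > len(pairs):
        raise ValueError("not enough index combinations for all cells")
    return dict(zip(cell_ids, pairs))


# Placeholder index names (assumptions, not real index IDs).
i7 = ["N7_a", "N7_b", "N7_c"]
i5 = ["S5_a", "S5_b"]
layout = assign_index_pairs(i7, i5, ["cell1", "cell2", "cell3", "cell4"])

# Demultiplexing is the reverse lookup: an index pair identifies its cell.
demultiplex = {pair: cell for cell, pair in layout.items()}
```

With three indexes of one kind and two of the other, up to six libraries could be pooled in this toy layout while remaining distinguishable after sequencing.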

Figure 9. Smart-seq2 library preparation flowchart. Single cells are isolated directly into a lysis buffer and their mRNA is subsequently reverse transcribed from an oligo-dT primer. When the reverse transcriptase reaches the end of the RNA template, a few cytosines are added to the newly synthesized cDNA strand. After annealing of the TSO to the cytosines, the reverse transcriptase extends the cDNA using the TSO as a template. Following PCR amplification, the cDNA is fragmented in a tagmentation reaction using Tn5 transposase. Simultaneously with fragmentation, Tn5 ligates primers to both ends of each cDNA fragment. Another round of PCR amplification introduces Illumina-compatible sequences as well as unique index sequences for sample multiplexing. Adapted from Picelli, 2019 with permission.

Bioinformatic analysis of scRNA-seq data

The investigation of the global transcriptional landscape at the single-cell level offers an unprecedented opportunity for discoveries that were previously unobtainable with bulk RNA-seq technologies. In order to exploit scRNA-seq data fully, it is important to apply appropriate bioinformatic approaches. Undoubtedly, the biggest challenge for all scRNA-seq studies is to determine which differences in the expression of particular genes are biologically relevant and which are the result of technical or biological noise. As computational analysis is an integral part of any experiment utilizing modern sequencing technology, the choice of bioinformatic tools might have a profound impact on the subsequent biological interpretation and should therefore be made carefully (Stegle et al., 2015). Bioinformatics is probably the fastest developing area of science at the moment and a full overview of the field is unfortunately beyond the scope of this chapter. Here, I briefly introduce a typical bioinformatic workflow that is often used to analyse scRNA-seq data from T cells generated by the Smart-seq2 protocol or any similar full-length approach that allows gene expression to be investigated and TCR sequences to be reconstructed.

The computational analysis can be divided into a number of fundamental steps (Figure 10). The quality of scRNA-seq data can be affected by many variables such as cell isolation and capture, PCR amplification and library storage, not to mention the process of sequencing itself. Therefore, the first step of the bioinformatic analysis is quality control of the raw scRNA-seq data, in which low-quality reads are filtered out. The remaining reads are then ready for alignment to the reference genome. After the alignment step, the data need to be combined into expression matrices, which are required for further quality control in which low-quality cells and/or non-informative genes are removed. Normalization is the process that removes systematic variation in the sequencing data caused by technical effects, for instance mRNA capture efficiency or sequencing depth, while preserving true biological variation. Another important step is to account for batch effects, which can be understood as systematic errors arising from the experimental design; for example, separately generated scRNA-seq data sets could show vastly different gene expression profiles in one batch versus another (Van Der Byl et al., 2019).
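Cell-level quality control and depth normalization can be sketched as follows. This is a minimal illustration under stated assumptions, not the workflow of any particular tool: the expression matrix is a plain dictionary, the threshold min_genes is arbitrary, counts-per-million scaling with a log2 transform is only one of several possible normalization schemes, and all counts are invented.

```python
import math


def filter_cells(matrix, min_genes=2):
    """Keep cells in which at least min_genes genes were detected (count > 0)."""
    return {
        cell: counts
        for cell, counts in matrix.items()
        if sum(1 for c in counts.values() if c > 0) >= min_genes
    }


def log_cpm(matrix):
    """Scale each cell to counts per million and apply log2(x + 1)."""
    normalised = {}
    for cell, counts in matrix.items():
        total = sum(counts.values())
        if total == 0:
            continue  # a cell without any reads cannot be normalised
        normalised[cell] = {
            gene: math.log2(c / total * 1e6 + 1) for gene, c in counts.items()
        }
    return normalised


# Invented raw counts: cell3 was sequenced ten times deeper than cell1,
# cell2 is a low-quality cell with a single detected gene.
raw = {
    "cell1": {"GAPDH": 50, "KLRB1": 50, "IFNG": 0},
    "cell2": {"GAPDH": 1, "KLRB1": 0, "IFNG": 0},
    "cell3": {"GAPDH": 500, "KLRB1": 500, "IFNG": 0},
}
kept = filter_cells(raw, min_genes=2)
norm = log_cpm(kept)
```

After normalization, cell1 and cell3 have identical values despite the tenfold difference in sequencing depth, which is the technical effect this step is meant to remove.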

The main analysis is usually focused on four components. First, the data are transformed using a dimensionality reduction technique such as principal component analysis (PCA), t-distributed stochastic neighbour embedding (tSNE) or uniform manifold approximation and projection (UMAP). Next, cells are grouped using a clustering technique, which is based on quantifying the distance between cells as a function of gene expression. Differential expression is a key analysis that is used to identify the genes with the greatest variation in expression between cell subsets. Lastly, pathway enrichment analysis, for example Gene Set Enrichment Analysis (GSEA), is performed in order to investigate whether differentially expressed genes (DEGs) are associated with particular functions, phenotypes or diseases (Van Der Byl et al., 2019).
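Two of these components, the distance between cells and the difference in expression between cell subsets, can be illustrated with a small Python sketch. It assumes log-scale expression values stored in dictionaries and uses Euclidean distance and a simple difference of group means; real analyses rely on dedicated clustering algorithms and statistical tests. The function names and all values are invented for illustration.

```python
import math


def euclidean_distance(cell_a, cell_b):
    """Distance between two cells across all genes seen in either cell."""
    genes = set(cell_a) | set(cell_b)
    return math.sqrt(
        sum((cell_a.get(g, 0.0) - cell_b.get(g, 0.0)) ** 2 for g in genes)
    )


def mean_log_difference(group_1, group_2, gene):
    """Difference of mean log-expression, a rough proxy for log fold change."""
    mean_1 = sum(cell.get(gene, 0.0) for cell in group_1) / len(group_1)
    mean_2 = sum(cell.get(gene, 0.0) for cell in group_2) / len(group_2)
    return mean_1 - mean_2


# Invented log-scale expression values for three cells.
t1 = {"KLRB1": 3.0, "GAPDH": 4.0}
t2 = {"KLRB1": 3.0, "GAPDH": 4.0}
c1 = {"KLRB1": 0.0, "GAPDH": 4.0}

within = euclidean_distance(t1, t2)    # cells of the same subset
between = euclidean_distance(t1, c1)   # cells of different subsets
klrb1_shift = mean_log_difference([t1, t2], [c1], "KLRB1")
```

Cells with similar expression profiles lie close together and would fall into the same cluster, whereas a gene such as KLRB1 in this toy data set stands out because its mean expression differs between the two groups.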

Full-length TCR sequences can be assembled from Smart-seq2 data using methods such as TraCeR. Additionally, TraCeR makes it possible to link paired TCR sequences with transcriptional profiles in individual cells. Furthermore, the gene expression data for a limited number of targets may also be correlated with the fluorescence intensity values recorded during FACS sorting. As a result of this integration, an even more complete picture of a T cell’s functional profile can be obtained (Van Der Byl et al., 2019).

Figure 10. Bioinformatic analysis of scRNA-seq data. A brief description of each step is given in the text above. Adapted from Van Der Byl et al., 2019 with permission.

Aims of thesis

In this research project we studied gluten-reactive CD4+ T cells from UCeD patients at the single-cell level using scRNA-seq methods. Our goal was to expand the knowledge of gluten-reactive CD4+ T cells in CeD by using novel cutting-edge techniques. In the first stage of the project, we applied Smart-seq2 in order to (i) investigate the transcriptional heterogeneity of gluten-reactive CD4+ T cells and (ii) reconstruct full-length TCR sequences using the TraCeR tool. In the second stage of the project, we verified the results obtained during the first stage and performed functional assays on a selected target using expanded gluten-reactive CD4+ T cells. As we found KLRB1 to be differentially expressed in gluten-reactive CD4+ T cells compared to control effector memory CD4+ T cells, we decided to further study the role of the protein encoded by KLRB1, namely CD161. Our aims were to (i) discover the role of CD161 in gluten-reactive CD4+ T cells and (ii) characterise the broader significance of CD161 in CD4+ T cells.

The specific aims of my PhD project were to:

• Utilize the Smart-seq2 protocol to build scRNA-seq libraries from gluten-reactive CD4+ T cells isolated from peripheral blood and duodenal biopsies of UCeD patients,

• Conduct functional assays on CD161 using gluten-reactive CD4+ T-cell clones (TCCs),

• Summarise the current knowledge on CD161 expression and role in CD4+ T cells.

Summary of papers

Paper I

Yao Y, Zia A, Wyrożemski Ł, et al. 2018. Exploiting antigen receptor information to quantify index switching in single-cell transcriptome sequencing experiments. PLoS One. 13: e0208484.

When samples are prepared for scRNA-seq, individual indexes (also known as barcodes) are added to the genetic material of each single cell during library preparation. Subsequently, a large number of uniquely indexed libraries can be pooled together (multiplexing) and sequenced simultaneously in a single sequencing run. Multiplexed libraries are bioinformatically separated (demultiplexing) and sequences are assigned to individual cells based on their unique indexes. However, some sequences may be misallocated due to index switching (sometimes referred to as index hopping or index misassignment), a rare phenomenon in which a proportion of reads from the expected indexes are attached to incorrect indexes. In this study we isolated CD4+ T cells and plasma cells from peripheral blood of UCeD patients. scRNA-seq libraries were prepared using the Smart-seq2 protocol and were subsequently sequenced on Illumina HiSeq 3000 or HiSeq 4000 platforms. The unique expression of TCRs by T cells and BCRs by plasma cells was used to quantify the rate of index switching. We found that the median percentage of incorrectly assigned reads was 3.9%. No major differences in the level of index switching were observed between the HiSeq 3000 and HiSeq 4000 platforms.

All indexes used to prepare Smart-seq2 libraries were equally inclined for switching.
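The idea of using unique antigen receptor sequences to estimate index switching can be sketched in Python. This is a simplified illustration and not the procedure used in paper I: it assumes that each cell expresses one known receptor and that every read carrying a different receptor was misassigned. The receptor labels, read counts and the function name misassigned_percentage are invented.

```python
from statistics import median


def misassigned_percentage(read_counts, own_receptor):
    """Percentage of receptor reads in a cell that do not match its own receptor.

    read_counts: {receptor_id: number_of_reads} observed under the cell's index.
    own_receptor: the receptor the cell is assumed to express.
    """
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("cell has no receptor reads")
    foreign = total - read_counts.get(own_receptor, 0)
    return 100.0 * foreign / total


# Invented data: (observed receptor reads, receptor assumed to be the cell's own).
cells = {
    "cell1": ({"TCR_1": 960, "TCR_2": 40}, "TCR_1"),
    "cell2": ({"TCR_2": 980, "TCR_1": 20}, "TCR_2"),
    "cell3": ({"BCR_1": 450, "TCR_1": 50}, "BCR_1"),
}
rates = [misassigned_percentage(counts, own) for counts, own in cells.values()]
median_rate = median(rates)
```

Because each lymphocyte clone carries its own receptor sequence, reads of a foreign receptor found under a cell's index serve as a direct readout of misassignment in this sketch, and the median summarises the per-cell rates.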

Paper II

Yao Y, Wyrożemski Ł, Lundin KEA, et al. 2021. Differential expression profile of gluten-specific T cells identified by single-cell RNA-seq. PLoS One. 16: e0258029.

Smart-seq2 is currently the technique of choice for scRNA-seq analysis of rare cell types. This protocol profiles the entire length of transcripts and enables the assembly of full-length TCR sequences. We utilized Smart-seq2 to examine gluten-reactive CD4+ T cells and control effector memory CD4+ T cells from the peripheral blood of four UCeD patients. Gluten-reactive CD4+ T cells were sorted as effector memory cells (CD62L-CD45RA-) that bound HLA-DQ2.5:gluten tetramers presenting four immunodominant gluten epitopes, while control CD4+ T cells were sorted as tetramer-negative effector memory cells (CD62L-CD45RA-). Our analysis showed that gluten-reactive CD4+ T cells display distinct transcriptional profiles together with an activated state as well as features of either TH1 or TFH cells. In total, 237 genes were differentially expressed between gluten-reactive CD4+ T cells and control effector memory CD4+ T cells. The top up-regulated genes were associated with TCR signalling, downstream TCR signalling and TCR-dependent co-stimulation as well as TCR-dependent cytokine production.

In contrast, the top down-regulated genes were connected with the process of protein translation. Furthermore, 51 clonotypes comprising at least two T cells were detected and the vast majority of clonally expanded T cells belonged to the pool of HLA-DQ2.5:gluten tetramer-sorted reactive CD4+ T cells. Lastly, gluten-reactive CD4+ T cells that shared the same clonal origin were transcriptionally more similar than clonally unrelated gluten-reactive CD4+ T cells.

Paper III

Wyrożemski Ł, Sollid LM, Qiao SW. 2021. C-type lectin-like CD161 is not a co-signalling receptor in gluten-reactive CD4+ T cells. Scand J Immunol. 93: e13016.

As presented in paper II, expression of KLRB1, a gene encoding C-type lectin-like receptor CD161, was found to be upregulated in gluten-reactive CD4+ T cells.

CD161 is a surface receptor that has been suggested to act as a co-signalling molecule in T cells. In line with Christophersen et al., 2019a, we found that CD161 was present on gluten-reactive CD4+ T cells isolated from peripheral blood and gut of UCeD patients. Moreover, CD161 was expressed on many gluten-reactive CD4+ TCCs derived from gut samples of TCeD patients. We hypothesized that in gluten-reactive CD4+ T cells, CD161 might modulate TCR-dependent production of IFN-γ and IL-21 as well as TCR-dependent proliferation. To address this question, we combined T-cell stimulation and T-cell proliferation assays with ligation of CD161 by anti-CD161 monoclonal antibodies (mAbs). By utilising expanded CD161-positive gluten-reactive CD4+ TCCs, we found that ligation of CD161 influenced neither TCR-dependent cytokine secretion nor TCR-dependent cell proliferation, suggesting that CD161 does not act as a co-signalling receptor on gluten-reactive CD4+ T cells. Incidentally, we observed that surface CD161 was downregulated following ligation with anti-CD161 mAbs. Although some degree of steric hindrance between the unlabelled anti-CD161 mAbs used for ligation and the fluorescently labelled anti-CD161 mAbs used for staining was detected, the downregulation of CD161 was similar to that elicited by LLT1 as demonstrated by Fergusson et al., 2014.

Paper IV

Wyrożemski Ł, Qiao SW. 2021. Immunobiology and conflicting roles of the human CD161 receptor in T cells. Scand J Immunol. 94: e13090.

C-type lectin-like CD161 is a type II transmembrane protein expressed on the surface of NK cells and T cells. Besides being present in the circulation, CD161-positive T cells are also enriched in various tissues including the gastrointestinal tract. CD161 has also been used as a marker of innate-like T cells and IL-17-producing T cells. While the inhibitory role of CD161 on NK cells is well established, the co-signalling function of CD161 on T cells is less clear. In this review article we summarised the current state of knowledge regarding CD161 expression in T cells. In particular, we focused on describing the limitations associated with the use of CD161 as a marker of innate-like T cells and IL-17-producing T cells. Moreover, we discussed published studies that had examined CD161 as a co-signalling molecule in T cells. In recent years, a few review articles discussing different aspects of CD161 have been published (Fergusson et al., 2011; Kirkham & Carlyle, 2014; Llibre et al., 2016b; Bialoszewska & Malejczyk, 2018), yet this review is a first attempt to bring together the existing conflicting and contradictory studies on the role of CD161 expression in T cells.

Methodological considerations

Coeliac disease patients (paper I, II and III)

The PhD project was approved by the regional ethics committee (REK ID: 6544, project leader Knut E. A. Lundin). Peripheral blood and duodenal biopsy specimens were collected from UCeD patients during clinical assessment at Oslo University Hospital.

All patients had a biopsy-confirmed diagnosis of CeD according to the established guidelines (Ludvigsson et al., 2013). UCeD patients included in paper I, paper II and paper III were HLA-DQ2.5-positive. TCeD patients had previously been recruited for other purposes and samples obtained from these individuals were stored in a liquid nitrogen tank. In paper III we used samples from TCeD patients who were either HLA-DQ2.5-positive or HLA-DQ8-positive.

In paper I we explored scRNA-seq libraries from three UCeD patients, while scRNA-seq libraries from four UCeD individuals (CD1507, CD1517, CD1615 and CD1982) were used in paper II. In addition, the scRNA-seq data from two healthy donors (GSM4138162 and GSM4138163) were included in paper II. In paper III we showed flow cytometry staining of gluten-reactive CD4+ T cells from peripheral blood and gut of one UCeD patient (CD1982). Although only a single individual was included, the staining provides representative information on CD161 surface expression that was in concordance with paper II and with the examination of gluten-reactive CD4+ T cells by Christophersen et al., 2019a. Moreover, in paper III four in vitro expanded gluten-reactive CD4+ TCCs from TCeD patients were utilized for the functional investigation of CD161. The TCCs had previously been generated from duodenal biopsy samples of TCeD patients using a protocol that is described in detail elsewhere (Molberg et al., 2000). TCCs were either HLA-DQ2.5-restricted (TCC387.3 and TCC411.3.39) or HLA-DQ8-restricted (TCC544.1.3.1 and TCC548.1.8.5).

Cell isolation, tetramer staining and cell enrichment (paper I, II and III)

Gluten-reactive CD4+ T cells in paper I, paper II and paper III were obtained from UCeD patients. Initially, PBMCs were isolated from whole blood by density gradient centrifugation using Lymphoprep. Subsequently, PBMCs were stained with HLA-DQ2.5:gluten tetramers in order to visualise gluten-reactive CD4+ T cells. In general, tetramer staining is a method of visualising and characterising antigen-specific T cells. Tetramers typically consist of four MHC molecules that are complexed with the antigenic peptides of interest. Each MHC molecule has a biotin attached to its tail that binds to streptavidin, which operates as an anchor. In this way MHC-peptide molecules can be multimerised in order to overcome the low affinity of the TCR interaction with a single MHC-peptide molecule.

Moreover, streptavidin is often labeled with a fluorochrome so that tetramer-stained T cells can be analysed by flow cytometry or sorted using FACS. Although tetramer staining is a very powerful method, it still suffers from some shortcomings such as variable staining intensity and some degree of unspecific binding (Christophersen, 2020).

HLA-DQ2.5:gluten tetramers used in paper I, paper II and paper III represented four immunodominant gluten T-cell epitopes, namely DQ2.5-glia-α1a, DQ2.5-glia-α2, DQ2.5-glia-ω1 and DQ2.5-glia-ω2. These tetramers were expressed recombinantly in a baculovirus-insect cell expression system (Quarsten et al., 2001). During the preparation phase the molecules were biotinylated and tetramerized on phycoerythrin-labelled streptavidin. Tetramer staining was performed at room temperature in the dark and took up to 40 minutes. Although we used tetramers carrying the most frequently found gluten epitopes, gluten-reactive CD4+ T cells recognising other important gluten epitopes were omitted. Tetramer-stained T cells were subsequently subjected to bead enrichment with anti-phycoerythrin-conjugated magnetic microbeads. This step was necessary to compensate for the low frequency of gluten-reactive CD4+ T cells in peripheral blood (Christophersen et al., 2014).

Flow cytometry and Fluorescence Activated Cell Sorting (paper I, II and III)

A flow cytometry-based approach was used to analyse and sort gluten-reactive CD4+ T cells. In paper I, paper II and paper III, tetramer-stained T cells were further stained with a panel of mAbs, which made it possible to specifically search for tetramer-stained T cells with the desired phenotype while excluding unwanted lymphocytes.

Flow cytometry is a method that enables the examination of single cells suspended in a solution. Each cell is evaluated for visible light scatter as well as one or more fluorescence parameters. Modern flow cytometers are equipped with multiple lasers that make it possible to measure up to 18 different parameters at the same time.

Flow cytometry is a very popular method for immunophenotyping and is frequently used to detect and characterise rare or infrequent cell populations.

However, flow cytometry is a challenging method and many variables have to be taken into consideration when designing and performing experiments. The most important aspects to consider are instrument compensation and proper controls, which collectively make it possible to distinguish background noise from actual biological signals. Compensation is performed using single-stained samples for each antibody in the experimental panel, which is necessary to ensure that signal spillover between different fluorescence channels is corrected. One of the most important controls is the fluorescence minus one control, which is often used to set up proper gating for a given parameter. On the other hand, FACS sorting is a modified type of flow cytometry in which cells can be isolated and used for further experiments. In short, desired cell populations, that is negative or positive for the