

5.1.5 Subtype-specific transcription factors

For the ATAC-seq data, TFBS enrichment was performed using both UniBind and HOMER. HOMER was also used to find enriched TFBS motifs from the RNA-seq gene signatures. The latter analysis covered promoter regions only, so the TFBS enrichment for the ATAC-seq data gave a fuller picture by also covering possible enhancer regions. In addition, the cluster assignment differed slightly between the data sets (Figure S5). Therefore, the results from the two data sets were not expected to fully overlap.
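To make the notion of TFBS enrichment concrete, the following is a minimal sketch of an over-representation test based on the hypergeometric distribution, which is one common way of asking whether a motif occurs in a set of target regions more often than expected from a background. It is not the exact procedure run by HOMER or UniBind in this analysis; the function name and all counts are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_universe_hits, n_target, n_target_hits):
    """Upper-tail hypergeometric p-value, P(X >= n_target_hits).

    n_universe      : total number of regions (target plus background)
    n_universe_hits : regions in the universe that contain the motif
    n_target        : number of target regions (e.g. cluster-specific peaks)
    n_target_hits   : target regions that contain the motif
    """
    max_hits = min(n_universe_hits, n_target)
    total = comb(n_universe, n_target)
    tail = sum(
        comb(n_universe_hits, k)
        * comb(n_universe - n_universe_hits, n_target - k)
        for k in range(n_target_hits, max_hits + 1)
    )
    return tail / total


# Hypothetical counts: 1000 regions in total, 100 of which contain the motif;
# 50 target regions, 15 of which contain the motif (5 expected by chance).
p_value = hypergeom_enrichment_p(1000, 100, 50, 15)
print(f"enrichment p-value: {p_value:.3g}")
```

In practice such a test is repeated for every motif or TFBS set, so the resulting p-values need to be corrected for multiple testing before TFs are called enriched.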

LumA/B

The results of the TFBS enrichment analyses for the ATAC-seq data show that FOXA1, FOXA2 and GATA(3/2) are enriched for the LumA/B cluster with both TFBS enrichment tools (Figure 4.8). The roles of FOXA1 and GATA3 in Luminal breast cancer have been widely documented in multiple studies, while FOXA2 and GATA2 have received less attention. However, some studies have shown that FOXA2 acts to prevent metastasis in breast cancer (Zhang et al., 2015). In addition, multiple sets of TFBSs for ERα (shown as "ESR1" in the plot) were found to be enriched using UniBind. ERα is known to be a main driver of ER+ subtypes, and was therefore expected to be enriched. It is unclear why this TF was not enriched using HOMER for the ATAC-seq data. The TFBS enrichment for the LumA/B cluster from the RNA-seq data showed an enrichment for FOXA1 and FOXA2, suggesting that these TFs could upregulate some of the most highly expressed genes in the LumA/B peak signature through the promoter regions (Figure 4.5).
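The comparison between tools and data sets described for the LumA/B cluster amounts to simple set operations on the lists of enriched TFs. The sketch below illustrates this using only the TFs named in the paragraph above, not the complete output of either tool; reading "GATA(3/2)" as both GATA3 and GATA2 is an assumption, and the variable and function names are hypothetical.

```python
# TFs named in the text for the LumA/B cluster (illustrative subsets only).
unibind_atac = {"FOXA1", "FOXA2", "GATA3", "GATA2", "ESR1"}
homer_atac = {"FOXA1", "FOXA2", "GATA3", "GATA2"}
homer_rna_promoters = {"FOXA1", "FOXA2"}


def compare_tf_sets(first, second):
    """Return the TFs shared by two result sets and those unique to each."""
    return {
        "shared": sorted(first & second),
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
    }


# Agreement between the two tools on the ATAC-seq data.
tools = compare_tf_sets(unibind_atac, homer_atac)

# TFs supported by either tool on ATAC-seq versus the RNA-seq promoters.
atac_any = unibind_atac | homer_atac
data_sets = compare_tf_sets(atac_any, homer_rna_promoters)

print("UniBind vs HOMER (ATAC-seq):", tools)
print("ATAC-seq vs RNA-seq promoters:", data_sets)
```

The same comparison can be repeated per cluster to separate the TFs supported by both tools from those reported by one tool only.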

Basal

For the Basal cluster of the ATAC-seq data, which only contained Basal-like samples, SOX10, SOX2, TEAD4 and GRHL2 were found to be enriched, regardless of the TFBS enrichment tool (Figure 4.9). All of these TFs have previously been found to be enriched in the Basal-like (or triple negative) subtype (Cimino-Mathews et al., 2013; Rodriguez-Pinilla et al., 2007; Wang et al., 2015). In addition, STAT3 and MYC, which have also been proposed as potential Basal drivers (Zhu et al., 2020; Xu et al., 2010), were found to be enriched using UniBind, but not HOMER.

The TFBS enrichment for the Basal cluster from the RNA-seq data showed an enrichment for GRHL2 and OCT4-SOX2-TCF-NANOG (Figure 4.6). As these overlap with some of the TFs found in the ATAC-seq data, this strengthens the hypothesis that SOX TFs and GRHL2 are important drivers of Basal-like breast cancer. Moreover, because the Basal-like samples also clustered strongly in the RNA-seq data, it is likely that these TFs are involved in regulating some of the genes in the Basal gene signature.

The results for the following clusters are shown in the Attachments.

LumA/Normal

The LumA/Normal cluster differed from the LumA/B cluster, as the peaks that characterized this cluster were enriched for CEBP and members of the STAT family, including STAT5 (Figure 6.1). STAT5 has previously been associated with good prognosis in ER/PR+ breast cancers (Barash, 2012), indicating that this cluster might represent the group with the best prognosis. These TFs were, however, not found in the promoter regions from the RNA-seq data, which contained very few enriched TFBSs (Figure ??). Previous research has shown that both CEBP and STAT bind to enhancer regions (Ramji & Foka, 2002; Vahedi et al., 2012), which may explain why these were not found in the promoter regions of the highly expressed genes.

Her2

The results for the Her2 cluster showed no overlap between HOMER and UniBind (Figure 6.2). However, using UniBind, TFAP2C and YY1 were found to be enriched. Both of these TFs have previously been found to be important for the Her2 subtype (Begon et al., 2005; Powe et al., 2009; Woodfield et al., 2010). Here too, UniBind is more consistent with previous research. The TFs found to be enriched in the promoter regions of the top genes for the Her2mix cluster (Figure ??) did not correspond to the TFs found for the Her2 cluster from the ATAC-seq data. Here, FOXA1 and FOXA2, among others, were found to be enriched. These are normally associated with ER+ subtypes, such as Luminal A and Luminal B. The enrichment seen for these TFs is likely because the Her2mix cluster contains a mix of five Her2 samples, three Luminal A samples and two Luminal B samples, and some of the highly specific genes for this group might be related to the Luminal subtypes.

LumA

The LumA cluster from the ATAC-seq data contained a large number of Luminal A samples and some Luminal B samples. Although the locations of the most open regions differ from those of the LumA/B cluster, they involve many of the same TFs (Figure 6.3). These include FOXA1, FOXA2 and GATA3/GATA2. The overlap indicates that the tumors in this cluster might behave in a similar manner to those in the LumA/B cluster. This cluster had no corresponding cluster in the RNA-seq data, as the RNA-seq data only contained two Luminal clusters instead of three. Therefore, most of the samples in this group were distributed between the LumA/B and LumA/Normal RNA-seq clusters. This complicated the analysis of the ER+ subtypes (Luminal A, Luminal B and Normal-like) considerably, especially when making connections between the data sets. It is therefore difficult to determine from this analysis which TFs of the ATAC-seq cluster regulate which genes of the RNA-seq clusters. However, the TFs inferred from the ATAC-seq data are largely supported by the literature, and give valuable information on their own.