Dissecting the transactivation domain of the transcription factor c-Myb

A joint project for three MSc students

Jan Ove Storesund

30 credits

Department of Biosciences

Faculty of Mathematics and Natural Sciences

UNIVERSITY OF OSLO

June 2019




Dissecting the transactivation domain of the transcription factor c-Myb Jan Ove Storesund

http://www.duo.uio.no

Print: Reprosentralen, University of Oslo


Acknowledgements

The work presented in this thesis was conducted at the Department of Biosciences, Faculty of Mathematics and Natural Sciences at the University of Oslo, from January 2019 to June 2019.

First, I want to give all my thanks to my supervisor Odd Stokke Gabrielsen for having me in his group and giving me this opportunity. His guidance and positive outlook whenever I hit a wall and nothing seemed to work has given me the strength needed to pull through.

I am truly grateful for the guidance received in the lab from my co-supervisor Marit Ledsaak. She always had time to answer questions, and even after the fourth visit to her office in 30 minutes, the answers were all positive and encouraging. Her experience has been invaluable and I would not have wanted anyone else to guide me through the technicalities.

A thank you goes to all my friends who have been working next to me, for giving me sound advice along the way and making this study into a pleasant experience. An extra thanks to Signe Värv for her constant curiosity around my work, and Kirill Jefimov for making me think outside of the box to find the solutions to my problems.

This combined study would not have been possible without Priyanga Dina Udayakumar and Guro Næs. The inspiring work ethic of Guro and the contagious happiness of Dina will always remind me of how nice it is to work in the Gabrielsen Lab. Thank you both for helping me through these past six months!

I want to thank my parents Anne Berit and Tor Arvid for helping me through not only the master study, but all the past years by giving me the support needed along the way. You have always been there when I need it, from lonely moments to just saying hi.

Thank you, everyone who made it possible for me to succeed!

Oslo, June 2019

- Jan Ove Storesund


Abstract

Comprehending the transcription factor (TF) and master regulator c-Myb and its function in regulating gene expression is important for understanding hematopoietic homeostasis. The molecular mechanisms by which the transactivation domain (tAD) of c-Myb functions are not yet fully understood, including questions relating to interactions with a vast network of cofactors.

The human c-Myb (hcM) has a tAD hypothesised to contain short linear motifs (SLMs) able to recruit cofactors. Inspired by the studies of Staller et al.[1] and Boija et al.[2], potential SLMs were studied by a mutational approach where a few amino acids are altered at a time. Amino acid regions of acidic, basic and hydrophobic nature assumed to affect transactivation properties and recruitment of cofactors were mutated. The goal is to find the model best suited to explain the function of the c-Myb tAD. Through a joint project with three MSc students, we addressed these questions using a jointly generated set of mutants. Each of the students tested these mutants in a separate system.

The current part of this project used several effector plasmids with altered versions of hcM, combined with a reporter plasmid capable of expressing a dual reporter gene. Saccharomyces cerevisiae (S. cerevisiae) was used as a model organism because it was expected to reveal to what degree the tAD of c-Myb is able to operate in a distant organism. This would reveal common basic mechanisms, and to what degree the tAD depends on recruited cofactors that are of mammalian nature. The results are compared to the two parallel studies done with identical mutants in separate mammalian systems.

When c-Myb was expressed in yeast, it was clearly able to activate the reporter gene driven by Myb-responsive elements. This shows that some basic features of the transactivation process are conserved throughout eukaryotic evolution, making a human TF able to turn on genes in yeast.

In contrast, mutants observed to affect the c-Myb activity severely in mammalian systems showed little or no variation when studied in yeast. Although no quantitative reporter data could be presented for the yeast system at this stage, a stamp test revealed all mutants as active with small deviations compared to the wild-type (WT). The findings suggest that the yeast system is more promiscuous and less sensitive to the tAD composition than the mammalian systems. It is possible that c-Myb has evolved to become specialized, where interacting with mammalian cofactors has become critical for the tAD to function.


1 Introduction

Transactivation domains (tAD) are regions of transcription factors (TFs), which in combination with the DNA binding domain (DBD) can activate transcription from a promoter by contacting the transcriptional machinery either directly or through other proteins known as coactivators.

tADs have not been studied extensively, and information about them is therefore incomplete. Several models of the molecular mechanism of tADs exist, but there is still disagreement among scientists.

The first chapter introduces the theoretical basis of transcription and epigenetics. It begins with basic knowledge of the eukaryotic genome, epigenetic regulation and the transcription process. Subsequently, the main topic of this master project, the tAD of the proto-oncogenic transcription factor c-Myb, is reviewed. Finally, the aims of the study are presented at the end of the chapter.

This chapter is identical in the three MSc theses, and is written by all three students in collaboration.

1.1 The eukaryotic genome

The function of the genome is to store the genetic information of an organism. The linear double-helix structure of eukaryotic, genomic DNA is packaged into a chromatin structure to adapt to the size of the nucleus. The smallest unit of chromatin is the nucleosome, a DNA-histone protein complex, formed by wrapping DNA around a complex of eight histone proteins. The octamer contains two copies each of the core histones H2A, H2B, H3 and H4 [3].

Also present in most nuclei, the linker histone H1 associates with linker DNA, which provides partial nuclease protection for up to 20 bp of linker DNA [4]. The nucleosomes are further coiled to form higher-order structures like chromatin loops and fibers, ultimately yielding the chromosome structure [3, 4].

It is more difficult to access the DNA strands when the double-helix is packed into a chromatin structure. Regulation of accessibility must therefore be provided. This applies to transcription, replication and DNA repair alike. To regulate access to the DNA, the chromatin must have a dynamic structure. The flexibility can be altered for example by eviction of histones from DNA by ATP-dependent chromatin remodeling enzymes and covalent modifications of histones [5].

1.1.1 The epigenome

Epigenetics is the study of heritable changes in gene expression or phenotype that are stable between cell divisions, but do not involve changes in the primary nucleotide sequence. The combination of histone and DNA post-translational modifications and the related interacting proteins results in the epigenome, which helps define the transcriptional program in a given cell [3]. The epigenetic modifications are important markers for interpreting the genome and inducing local changes in chromatin, which lead to either permissive or suppressive effects on gene expression and other processes.

Several molecular mechanisms contribute to epigenetic gene regulation. These include the ATP-dependent chromatin remodeling enzymes and the histone modifier enzymes [3]. The ATP-dependent chromatin remodeling enzymes use ATP hydrolysis to disrupt histone-DNA interactions and the histone modifier enzymes modify nucleosomal histones [6].

1.1.2 Chromatin structure and function

Chromatin is the fibrous material in which the DNA of a cell, with a total length of about 2 meters, is packed in the nucleus. The structure is accomplished when the negatively charged DNA is tightly compacted with the help of the positively charged histone proteins. Chromatin is also the physiological template of all eukaryotic genetic information and is subject to a diverse array of post-translational modifications [7].

Specific post-translational modifications of histones are associated with an open or closed chromatin state. For instance, histone acetylation contributes actively to gene transcription by weakening the interactions between histones and DNA, which results in an open chromatin state. Histone phosphorylation adds a negative charge to the histone, which results in a loosening of the nucleosome structure [8].


1.1.3 Transcription

The expression of genetic information of a cell starts with transcription. This process is tightly regulated to ensure that genetic programs are adapted to cell requirements. If the transcription is deregulated this can lead to serious diseases, including cancer [9].

Transcription is the synthesis of ribonucleic acid (RNA) from a complementary DNA template strand, and it proceeds through three steps: initiation, elongation and termination. One of the RNA products is messenger RNA (mRNA), a single-stranded nucleotide sequence complementary to the DNA template strand. Transcription is followed by translation, where protein is the final product. Transcription is catalyzed by RNA polymerase enzymes along with general and sequence-specific TFs, transcriptional repressors, coactivators, corepressors, histone-modifying enzymes, and chromatin remodeling complexes [9, 10]. In eukaryotes, the process starts when the preinitiation complex (PIC) assembles at the core promoter [11]. The PIC includes RNA polymerase II (Pol II), the general TFs TFIIA, -B, -D, -E, -F, and -H, and additional coactivators and corepressors. Pol II reads the DNA sequence of protein-coding genes and synthesizes complementary mRNA [10, 12].
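The complementarity between the DNA template strand and the mRNA can be illustrated with a minimal sketch. The function name `transcribe` and the example template are hypothetical and chosen only for illustration; the sketch covers base pairing alone and ignores promoters, RNA processing and all regulation described in this chapter.

```python
# Watson-Crick pairing from a DNA template base to the RNA base it specifies.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA (5'->3') complementary to a DNA template read 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


# Hypothetical template strand, written 3'->5' (not taken from any real gene).
print(transcribe("TACGGCATT"))
```

Because the template is read 3'->5' while the RNA grows 5'->3', no reversal is needed in this representation.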

1.1.4 Transcription factors

There are two types of TFs, general and sequence-specific. Most TFs have two domains with different functions [13]. TFs are DNA-binding proteins that influence cell fate by interpreting the regulatory DNA within a genome. The different TFs recognize DNA in a specific manner, and their role is to recruit the factors needed for transcription to start. They bind to promoter regions in the proximity of genes or at more distant enhancers, and thereby regulate their target genes. Depending on modifications and interaction partners, the TFs can either activate or repress gene expression. Transcriptional repressors are divided into two classes: general and gene-specific. The different repressors might block the ability of Pol II to interact with the coding DNA, and influence DNA compaction and thus the accessibility of chromatin. The repressors can also recruit histone deacetylases that make the chromatin more compact, which reduces the accessibility [10, 14]. Post-translational modifications can regulate TFs both rapidly and reversibly by affecting subcellular localization, stability and interactions with other proteins [15].


1.2 Transactivation domains

The general practice has for several years been to distinguish between four classical models defining different classes of tADs [16]. These models focus on the amino acid composition, and the placement of the amino acids in relation to each other within the tAD. More recently, new models have been published, and both Staller et al. and Boija et al. showed interesting findings supporting these models in their articles from last year [1, 2]. Since different types of tADs exist, it is natural to think that transcriptional activation is mediated by several different mechanisms [17].

This section describes the different models of how the tADs operate and what they look like.

The tAD of the c-Myb oncoprotein used as a model in this thesis will be presented later on.

1.2.1 Model 1: transactivation domains as acidic domains

This model of tADs being essentially acidic domains states that these domains tend to be rich in D and E amino acids in the center of the domain. The acidic domains are also called acid blobs or negative noodles, based on the formation and action of the transcriptional PIC. The PIC forms a convoluted loop that brings the tAD into contact with Pol II and its promoter binding proteins [18]. It is thought that the negative noodles attach through their DNA-binding domains to the appropriate cis-activating sequences. There are stabilizing interactions between the carboxylates of the noodle and the hydroxyl groups of the CT7n, an appendage in the PIC [18]. Acidic domains not anchored to the DNA may be able to form a stable but inactive complex with some essential component of the general transcriptional apparatus [19].

There have been several studies of the yeast GAL4 system and its tAD. Gill et al. did a mutational study on this domain, which showed that there is a correlation between the strength of activation and the preponderance of negative charges [20]. The VP16 system has also been studied in some detail. The VP16 is a herpes simplex virus protein. Sadowski et al. showed that the hybrid protein GAL4-VP16 activates transcription remarkably efficiently in mammalian cells when bound close to, or at large distances from the gene [21].

The important role of the tAD was shown in a study where various lengths of the transactivation region in a specific yeast Gcn4 construct were deleted. The deletions resulted in a higher loss of transcription activity compared to the wild-type (WT), where the loss corresponded to the size of the deleted activation region. If these findings are analyzed in the light of the acidic blob model, it can be assumed that the deletion has removed critical acidic amino acid residues essential for activation [22].

Ness stated that the acidic residues are important, and that as long as the residue is acidic it will give transcriptional activity [23]. If an acidic stretch is replaced by an acidic stretch from another tAD, VP16 in this case, the activity of c-Myb does not change substantially.

1.2.2 Model 2: transactivation domains as specific residue-rich domains

Glutamine-rich domains

The human transcription factor Sp1 utilizes glutamine-rich tADs and binds to GC-rich sequence elements. Courey et al. found that high glutamine content might be an important feature of the tADs, but it is agreed that random glutamine-rich protein segments cannot serve as tADs on their own [19].

Glutamine-rich and acidic domains appear to act by different mechanisms, given that the Sp1 activation region can super-activate transcription, whereas the isolated acidic tAD inhibits transcription [19, 24]. It was proposed that glutamine-rich domains may only interact with the general transcriptional machinery when anchored to the DNA [19].

Proline-rich domains

The human CTF/NF-1 consists of a family of CCAAT box binding proteins that activate both transcription and DNA replication [17]. The CTF C-terminal region includes an unusual type of tAD containing around 25% proline residues. This tAD activates the heterologous promoter SV40 when fused to the DNA binding domain of Sp1. The proline-rich region in the tAD is needed for specific interactions with other factors that play a role in the initiation of transcription. There is a possibility that the domain interacts directly with components of the general transcriptional complex such as TFIIA, -B, -D, -E, or -F, the subunits of Pol II, or other ancillary factors that participate in the formation of an initiation complex [17]. There is also a possibility that proline domains fold into a unique structure that forms protein-protein contacts with the transcription machinery.


Isoleucine-rich domains

The Drosophila tissue-specific transcription factor NTF-1, also known as Elf-1, binds specifically to promoters of several developmentally regulated Drosophila genes [25]. In contrast to other factors, NTF-1 has a single tAD, which has a high percentage of isoleucines. The isoleucines were found to be important for the function, since changing as few as two of the isoleucines to alanine significantly disrupted its activity [25].

It was also found that NTF-1 is likely to be activating transcription via different mechanisms in yeast and Drosophila. The tAD in NTF-1 might therefore be an example of species-specific tADs or even tissue-specific domains that function only in specific Drosophila cell types [25].
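The residue-rich classes in Model 2 are defined by amino acid composition, for example around 25% proline in the CTF tAD. A minimal sketch of how such a fraction could be computed for a candidate domain is given here. The function name `residue_fraction` and the toy sequence are hypothetical, and no threshold for calling a domain "rich" in a residue is implied.

```python
from collections import Counter


def residue_fraction(sequence: str, residues: str) -> float:
    """Fraction of positions in `sequence` occupied by any of `residues`."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return sum(counts[r] for r in set(residues.upper())) / len(seq)


# Toy 10-residue sequence, invented for illustration only.
toy = "PPQQIIAADE"
print(residue_fraction(toy, "P"))   # proline content
print(residue_fraction(toy, "DE"))  # acidic content (Model 1)
```

The same function covers the acidic model by passing `"DE"`, which makes it easy to compare the composition-based classes on one sequence.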

1.2.3 Model 3: transactivation domains as short linear motifs

Short linear motifs (SLMs) mediate molecular interactions and may be involved in recruitment of cofactors and thus enhance transcription. SLMs are hydrophobic and conserved sequence-specific motifs, some of which create powerful tADs as they bind proteins via a “fuzzy” complex [26]. Warfield et al. focused on the central tAD of the yeast factor Gcn4. It appeared to be intrinsically disordered, binding the Gal11 activator-binding domain (ABD) 1 as a helix in this “fuzzy” complex. The complex has a purely hydrophobic protein-protein interface, allowing the Gcn4 helix to bind Gal11 in multiple different orientations [26]. The SLM presented by Warfield et al. is the WxxLF-motif and they focused on the mediator subunit Gal11/Med15, which contains three activator-binding domains for the yeast TF Gcn4 [26]. The different orientations induced by the “fuzzy” protein-protein interaction explain how different tADs can bind to coactivators.

Brzovic et al. also looked at the “fuzzy” complex of Gcn4-Gal11, and found that this is a low-affinity interaction rather than a high-specificity interaction [27]. The ABD of Gal11 contains a hydrophobic cleft where the hydrophobic motif of Gcn4 can bind. This interaction also induces a helical formation that may facilitate activity [27].

The sequence LxxLL was identified in RIP-140, SRC-1 and CBP [28], and was later found in the tAD of c-Myb [29]. Heery et al. suggested that the motif is dependent on hydrophobic residues in helix formation in order to interact with nuclear receptors [28].
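Motifs such as WxxLF and LxxLL, where x denotes any amino acid, translate directly into regular expressions. The sketch here shows one way to scan a protein sequence for both motifs. The function name `find_motifs` and the toy sequence are hypothetical; the sequence is invented and is not the c-Myb or Gcn4 tAD.

```python
import re

# Motif name -> regular expression; "." stands for the variable position "x".
MOTIFS = {"LxxLL": "L..LL", "WxxLF": "W..LF"}


def find_motifs(sequence: str, motifs: dict = MOTIFS) -> list:
    """Return (motif name, 1-based start, matched residues), sorted by position."""
    seq = sequence.upper()
    hits = []
    for name, pattern in motifs.items():
        # The lookahead lets overlapping occurrences be reported as well.
        for m in re.finditer(f"(?=({pattern}))", seq):
            hits.append((name, m.start() + 1, m.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


# Toy sequence, invented for illustration only.
print(find_motifs("DEELKELLSDWAALFEE"))
# -> [('LxxLL', 4, 'LKELL'), ('WxxLF', 11, 'WAALF')]
```

A sequence match only flags a candidate motif; whether it is functional depends on context such as helix formation and exposure, as discussed in this section.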


1.2.4 Model 4: transactivation domains as SLMs embedded in intrinsically disordered acidic domains

There is general agreement on the acidic domain model, but it has been quite unclear why tADs are acidic. Staller et al. uncovered a role for the acidic residues based on the classic model of acidic tADs. They presented a tAD model with the presence of a specific SLM embedded in disordered regions, with acidic residues providing exposure to binding partners. They mainly focused on the WxxLF motif as an SLM of the yeast TF Gcn4 [1]. Other scientists have been hinting at the same model in earlier years, such as Lu et al. and Shen et al. [30-32].

Staller et al. used a rational mutagenesis scheme that deconvolved the function of four tAD sequence features, namely acidity, hydrophobicity, SLMs, and intrinsically disordered regions (IDRs). They did this by quantifying the activity of thousands of variants in vivo and simulating their conformational ensembles using an all-atom Monte Carlo approach [1].

Their model explains why acidic tADs are acidic, and why mutating hydrophobic residues has the largest influence on activity. The α-helix exposes key hydrophobic residues, and is therefore convenient but not essential. The distribution of charge was also shown to have a large impact on activity [1]. Their results reconcile existing observations into a modified model of tAD function: the intrinsic disorder and acidic residues keep two hydrophobic motifs from driving collapse. The most active variants keep their aromatic residues exposed to the solvent [1]. The results can also be explained by electrostatic interactions, as the hydrophobic binding cleft on Gal11 is flanked by positively charged residues, which enhances Gcn4-Gal11 binding and thereby activity [27].

This model is a combination of models 1 and 3 regarding acidic patches and SLMs in the tAD. These apply to c-Myb as it has the motif LxxLL, which is also surrounded by acidic residues, making this an excellent experimental tAD to test this model.
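Model 4 combines a motif with the charge of its surroundings, which suggests a simple descriptor: the net charge of the residues flanking a motif. The sketch here is a rough illustration under stated assumptions: D and E count as -1, K and R as +1, histidine and all other residues as neutral, and the flank width is an arbitrary choice. The names `net_charge` and `flank_charge` and the toy sequence are hypothetical.

```python
import re

# Simplified side-chain charges near neutral pH (assumption: histidine neutral).
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}


def net_charge(peptide: str) -> int:
    """Sum of the simplified side-chain charges in `peptide`."""
    return sum(CHARGE.get(aa, 0) for aa in peptide.upper())


def flank_charge(sequence: str, motif_regex: str, flank: int = 5):
    """Net charge of up to `flank` residues on each side of the first motif match.

    Returns None when the motif does not occur in the sequence.
    """
    seq = sequence.upper()
    match = re.search(motif_regex, seq)
    if match is None:
        return None
    left = seq[max(0, match.start() - flank):match.start()]
    right = seq[match.end():match.end() + flank]
    return net_charge(left) + net_charge(right)


# Toy sequence, invented for illustration only.
print(flank_charge("DEELKELLSDWAALFEE", "L..LL", flank=3))  # -> -4
```

A negative value indicates acidic flanks, the arrangement the model proposes keeps the hydrophobic motif exposed. Residues inside the motif are deliberately excluded from the sum.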


1.2.5 Model 5: transactivation domains as domains inducing liquid-liquid phase-transition

The liquid-liquid phase transition appears to be a fundamental mechanism for organizing intracellular space. Membraneless organelles adopt round morphologies and coalesce into a single droplet upon contact with one another. In this droplet, the organelles exhibit dynamic exchange with the surrounding nucleoplasm and cytoplasm [33]. The first membraneless compartments were observed in the nucleus, and then later in the cytoplasm and on the membranes of eukaryotic cells [34].

The latest model of the tAD is that it forms phase-separated condensates with the Mediator to activate expression. Boija et al. recently studied the tADs of diverse TFs, such as OCT4, GCN4 and the estrogen receptor (ER) [2]. The dynamic interactions between proteins are typical of the IDR-IDR interactions that facilitate the formation of phase-separated biomolecular condensates [2]. Transcriptional control has recently been proposed to be driven by the formation of phase-separated condensates [35], and in addition, MED1 and BRD4 have been shown to form phase-separated condensates at super-enhancers [36]. Boija et al. proposed a model whereby TFs interact with the Mediator and activate genes through the capacity of their tADs to form phase-separated condensates. In addition, they found that the tAD amino acids required for phase separation with the Mediator condensates, for both OCT4 and GCN4, were also required for gene activation in vivo [2]. They also observed that by recruiting a disordered protein to the chromatin, diverse coactivators might form phase-separated condensates to drive oncogene expression [2].

1.2.6 The transactivation domain of c-Myb

The tAD of c-Myb has been located in the middle of the protein, but it lacks a systematic functional characterization [37]. The domain consists of clusters of acidic amino acids and a hydrophobic region [38, 39], similar to tADs found in other transcription factors (reviewed by Ptashne [40]). The domain in c-Myb has been defined as a stretch of 52 amino acids, specifically amino acid 275-327 [41]. Both p300 and the histone acetyltransferase (HAT) CREB-binding protein (CBP) bind to the tAD through their kinase-inducible domain interacting domain (KIX) [42-44]. Part of the c-Myb tAD has a constant intrinsically helical secondary structure that binds constitutively, i.e. it does not change its shape or form in order to interact with its target [45].

Molvaersmyr et al. found that c-Myb has two activator functions (AFs). There is one AF in the central tAD, which acts in a constitutive fashion, and a second one in the C-terminal regulatory domain (CRD) [46]. This double AF may help make c-Myb a more potent transactivator.

In this project, the tAD of c-Myb was studied by creating a set of mutations in the central tAD.

Figure 1 shows the sequence of the tAD in c-Myb, where the acidic and basic amino acid residues are marked in red and blue respectively. Some known and hypothesized interaction partners are also included. The sequence in this figure includes more amino acid residues than depicted for the tAD in the c-Myb overview due to the mutations performed during this project.

Amino acid residues between 267 and 361 were mutated.

Figure 1 - A closer look at the transactivation domain of c-Myb. The different basic (blue) and acidic (red) patches as well as their possible interaction partners are included.


1.3 The transcription factor c-Myb

The myb oncogene is the transforming gene of the Avian myeloblastosis virus (AMV) and E26 [41, 47, 48]. There are three closely related Myb genes that are present in vertebrate animals, A-Myb, B-Myb, and c-Myb [41]. In humans these genes are referred to as MYBL1, MYBL2, and MYB. They all share similarities, but are expressed in different tissues [41]. A-Myb is required for spermatogenesis and mammary gland proliferation, while B-Myb is required in early embryonic development [41, 49]. A-Myb and B-Myb are not oncogenic and do not have transforming activity [50]. The biological functions of c-Myb are further discussed in section 1.3.2.

c-Myb was originally identified as the homologue of the v-Myb oncogenes, which can transform undeveloped hematopoietic cells in tissue culture and cause acute leukemias in animals [51]. The c-Myb protein is 75 kDa. The oncogenes v-MybAMV and v-MybE26 are both altered versions of c-Myb, with sizes of 45 kDa and 135 kDa respectively [41]. While v-MybAMV contains several amino acid substitutions, v-MybE26 has a viral gag N-terminally and another transcription factor (ETS) which is fused C-terminally [52].

1.3.1 The domains of c-Myb

The proto-oncogene c-Myb encodes a protein that consists of three structural and functional domains, see Figure 2. In addition to the mentioned tAD, c-Myb contains the highly conserved N-terminal DNA binding domain (DBD) and a C-terminal regulatory domain (CRD) [53].

These domains are all involved in regulating the activity of c-Myb and contain interaction sites for DNA and other proteins [53].

Figure 2 - Structural and functional domains of c-Myb. The c-Myb protein consists of 640 amino acid residues and the weight is 75 kDa. The DNA binding domain is located N-terminally and is shown here in orange with three repetitive elements: R1, R2 and R3. The transactivating domain is located in the center of the c-Myb, shown here in blue. In the C-terminal end the regulatory domain is located, shown in green, with its three subdomains: FAETL/LZ, TP and EVES.

The N-terminal DNA binding domain (DBD)

The N-terminal DBD consists of three tandem direct imperfect repeats, R1, R2 and R3 [54], all three being tryptophan-rich 51 or 52-residue repeats [55]. Howe et al. showed that the R2 and R3-MYB repeats are absolutely required for complex formation, and the R1 repeat is dispensable [56]. However, it is found that R1 increases the stability of the Myb-DNA complex [55, 57]. v-Myb and variations of Myb lacking R1 can possibly affect many more genes, as R2R3 without R1 will have a lower specificity [23]. Each repeat gives rise to a helix-turn-helix-related motif with unconventional turns. The tryptophan residues in the repeats form a hydrophobic core, which maintains the structure of the motif [58].

The functional DNA binding domain recognizes the consensus sequence 5’-(T/C)AAC(G/T)G(A/C/T)(A/C/T)-3’, referred to as the MYB recognition element (MRE) [54, 59, 60]. The MREs have a bipartite structure, where R3 binds to the first half-site and R2 binds to the second half-site [54, 55].
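The MRE consensus given above maps onto a regular expression in which each bracketed alternative becomes a character class. The sketch here scans both strands of a DNA sequence for the consensus. The function names and the toy sequence are hypothetical, and a consensus match only marks a candidate site; it says nothing about binding in vivo.

```python
import re

# (T/C)AAC(G/T)G(A/C/T)(A/C/T) written as character classes; the lookahead
# allows overlapping candidate sites to be reported.
MRE = re.compile(r"(?=([TC]AAC[GT]G[ACT][ACT]))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def find_mre(dna: str) -> list:
    """Return (strand, 1-based start, site) for every consensus match.

    Positions on the "-" strand are counted along the reverse complement.
    """
    seq = dna.upper()
    hits = [("+", m.start() + 1, m.group(1)) for m in MRE.finditer(seq)]
    hits += [("-", m.start() + 1, m.group(1))
             for m in MRE.finditer(reverse_complement(seq))]
    return hits


# Toy sequence, invented for illustration only.
print(find_mre("GGTAACGGTCGG"))  # -> [('+', 3, 'TAACGGTC')]
```

Scanning the reverse complement matters because the consensus is not palindromic, so a site on the opposite strand would otherwise be missed.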

The DBD is also an important site for protein-protein interactions and is involved in chromatin remodeling. Mo et al. showed three repeated domains in the DBD that have a structure similar to the SANT domain. The DBD binds to the tails of histone H3 and H3.3, and thereby facilitates histone tail acetylation [61]. Recently, our laboratory studied this feature in more detail and found that c-Myb acts as a pioneer factor and that specific histone modifications, including H3K27ac, prevent binding of c-Myb to histone tails. This might represent a mechanism for controlling the dynamics of pioneer factor binding to chromatin [62, 63].

[Figure 2 schematic labels: DNA binding domain (R1, R2, R3), transactivation domain, C-terminal regulatory domain (FAETL/LZ, TP, EVES); residue positions marked from the N- to the C-terminus: 1, 37, 193, 275, 325, 401, 566, 640.]

The C-terminal regulatory domain (CRD)

The CRD was originally referred to as the negative regulatory domain (NRD), since carboxy-terminal sequences were found to have a negative effect on transactivation and a negative regulatory function on c-Myb activity. It was observed that after deletion of C-terminal regions, c-Myb obtained higher transactivational activity and increased transformation capacity [38, 64]. The CRD contains three subdomains (see Figure 2), which function independently of each other.

The FAETL subdomain, which is located N-terminally of the CRD, is named after the region EFAETLQLID (aa 321 to 330) [65]. This domain is required for transactivation of c-Myb and oncogenic transformation by v-Myb [66]. The FAETL region contains a leucine rich region, which was found to be critical for negative regulation of c-Myb [48].

The TP subdomain is a region (aa 443 to 514) with the highly conserved threonine- and proline-rich motif TPTPFK. This domain is also implicated in negative regulation, and may mediate folding and protein interaction [23].

The EVES subdomain is located C-terminally of the CRD, and has highly conserved amino acids [67]. The interaction is thought to be regulated by post-translational modifications, and might also affect the accessibility of the leucine zipper region on the FAETL subdomain [68, 69]. The two lysine residues, K503 and K527, are placed in the EVES subdomain and these are modified by SUMOylation [70]. It has been shown that SUMOylation regulates the transcription of c-Myb negatively [70, 71]. When SUMOylation is abolished by mutation, the negative effect of the domain disappears and the region turns into a tAD. Hence, the CRD also harbors an AF along with the tAD [46]. The AF in the CRD is SUMO-regulated (SRAF), which can be activated upon deSUMOylation of c-Myb resulting in a highly active transcription factor.


1.3.2 Target genes and biological functions of c-Myb

MYB targets over 80 genes, most of which are positively regulated while a few are repressed. Cooperation with other factors, for instance C/EBP and CBP/p300, is often required [41]. The target genes can be classified into three functional groups [52]:

1. Housekeeping genes, which are required for the maintenance of basic cellular functions. They are stably expressed in all cells and throughout the developmental stages [72].

2. Genes involved in specific functions in specific cell types or lineages. This includes the Myb-induced myeloid protein 1 (mim-1).

3. Genes linked to oncogenicity. This includes genes that are involved in proliferation, survival and differentiation.

c-Myb plays several roles in hematopoiesis, both in progenitor cells and during differentiation [73]. In addition to having a key role in blood cell production and intestinal maintenance in adults, c-Myb has also been reported to be expressed in the respiratory tract, skin, and retina [74]. Any disturbances in the expression of c-Myb might lead to diseases such as congenital disorders and hematologic malignancies [75]. Overexpression of c-Myb has been seen in several types of human cancers, such as breast cancer, colorectal cancer and different types of leukemia [76-79].

As mentioned, c-Myb is involved in proliferation and differentiation, and has also been shown to be involved in apoptosis [80].

Proliferation

Antisense inhibition of c-Myb has been employed to study how c-Myb functions in cellular proliferation. Inhibition of c-Myb causes blocking of cell cycle progression in late G1 phase and early S phase, and thus the proliferation of hematopoietic cells [41]. Our laboratory recently published a study where c-Myb were knocked-down using siRNA to block endogenous MYB mRNA. The findings show that WT c-Myb when rescued from knockdown rescued 766 affected genes, while cells with the c-Myb mutant D152V lost the expression of 104 genes [81].

When Fuglerud et al. studied the subset of genes that the mutant c-Myb was unable to regulate, they found that these genes were involved in proliferation, growth and development of the cells. The set of genes regulated by both mutant and WT c-Myb was enriched for genes involved in metabolism [81].

Differentiation

c-Myb is highly expressed in the progenitor stages of hematopoietic cells and is down-regulated when cell differentiation begins. c-Myb is also down-regulated when the differentiation of myeloid or erythroid leukemia cells is induced by cytokines or chemicals [82].

Apoptosis

c-Myb is also reported to prevent apoptosis by activating the bcl-2 gene, which protects the cancer cells from apoptosis [83].

1.3.3 Interaction partners of c-Myb

c-Myb activity is modulated by post-translational modifications and interactions with other nuclear proteins. The interaction partners of c-Myb regulate transcription via activating regions that interact with specific targets in the Pol II machinery [44]. The interaction partners enable Pol II to gain access to the promoter of a gene and initiate RNA synthesis at the transcription start site (TSS). The productive elongating transcription complex is generated, and a full-length RNA transcript will be produced [84].

Several cofactors have been identified, such as UBC9 and PIAS1 [70, 85], Mi-2 (CHD3) [86], FLASH [87], HIPK1 [88] and TIP60 [89]. This section will focus on the known and possible protein-protein interactions most relevant for this thesis.

CBP and p300

CBP is a homologue of p300, and together they constitute a distinct family of HATs. When c-Myb is acetylated by CBP and p300, an increase in transcriptional activity can be observed [53]. Both contain the KIX domain, a kinase-inducible domain (KID) interacting domain that is essential for transcriptional activity. This domain binds to c-Myb through the NR-box LxxLL motif in the tAD, and possibly also through the CRD [53, 90]. The KIX domain in CBP/p300 is predicted to function as a bridge between the transcription factor and the transcriptional machinery [91]. The hydrophobic residues of the single helix of the c-Myb tAD interact with the hydrophobic docking site of KIX. More precisely, Leu302 of c-Myb is inserted deeply into the hydrophobic groove of KIX, and has a major effect on the interaction between the KIX domain of CBP and c-Myb [90].

Leu302 is part of the LxxLL motif studied in this thesis. Heery et al. found that proteins containing this motif have a key role in the regulation of nuclear receptors by coactivators or corepressors, where CBP/p300 is one of the coactivators [28]. Studies have shown that mutations in critical residues of the tAD that are essential for CBP/p300 binding decrease the transforming ability of c-Myb [92].

c-Myb also participates in chromatin remodeling by binding to the N-terminal tails of histone H3 and H3.3, which facilitates histone tail acetylation. c-Myb thus has a twofold role: it is activated by acetylation catalyzed by CBP/p300, and it activates transcription by recruiting CBP and p300 to chromatin to modify the histone tails [39, 61]. Our lab recently found strong evidence that c-Myb is able to affect chromatin remodeling [63]. Based on the D152V mutant of c-Myb, the authors suggested that this is the first pioneer factor where this function is impaired without affecting the DBD. Another of our more recent studies suggests a model where c-Myb acts as a pioneer factor, binding to chromatin where it recruits CBP/p300, followed by detachment and re-engagement at c-Myb recognition sites [62]. Again, the D152V mutant is taken into account, under the assumption that it would bind to chromatin without being able to induce acetylation due to its weakened DNA binding.

SUMO

A small ubiquitin-like modifier (SUMO) protein is covalently attached to a protein through SUMOylation, mentioned in section 1.3.4. SUMO regulates cellular processes and is a major repressive agent of transcriptional activity [93]. It can also interact non-covalently with proteins through a SUMO-interacting motif (SIM), which is defined by the amino acid sequence motif V/I-X-V/I-V/I. This SUMO-binding motif exists in nearly all proteins known to be involved in SUMO-dependent processes, and SUMO binds in a parallel or an anti-parallel manner [94, 95].

The sequence can be seen in c-Myb in Figure 1 as LHVNIVNV. This sequence has been mutated by Sæther et al. in a study that showed an activation of c-Myb more than 13-fold compared to the WT [93].
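
As an illustration of how the SIM consensus V/I-X-V/I-V/I can be located in a peptide, the following minimal Python sketch scans a sequence with a regular expression. The function name `find_sim_candidates` is hypothetical and not part of the thesis workflow; a match only indicates a candidate motif, not a demonstrated SUMO interaction.

```python
import re

# Consensus from the text: V/I - X - V/I - V/I, where X is any residue.
# The lookahead makes the scan report overlapping candidates as well.
SIM_PATTERN = re.compile(r"(?=([VI].[VI][VI]))")


def find_sim_candidates(sequence):
    """Return (0-based start, matched residues) for each SIM-like stretch."""
    return [(m.start(), m.group(1)) for m in SIM_PATTERN.finditer(sequence.upper())]


# The c-Myb stretch quoted in the text
print(find_sim_candidates("LHVNIVNV"))  # [(2, 'VNIV')]
```

In the quoted stretch LHVNIVNV, the residues VNIV are the only four that fit the consensus as written.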

TAF12

TAF12 is a subunit of the general transcription factor TFIID and interacts with MYB. This interaction has been shown to potentiate a malignant gene expression program in acute myeloid leukemia (AML). Depletion of TAF12 also facilitates the proteasomal degradation of MYB, which results in impaired TFIID recruitment to MYB target genes [96, 97]. Another subunit of TFIID, TAF4, contains a single histone-fold domain (HFD) that dimerizes with the HFD of TAF12, forming a “handshake”. The dimerization was further used to study a mechanism called “squelching”, which is a form of inhibition of transcription [24]. Squelching of TAF12 with a non-functional TAF4 peptide can block the association between MYB and TAF12 and the rest of the TFIID complex, and phenocopy the effects of TAF12 depletion [96, 97].

TAF12 is an attractive therapeutic target in MYB-addicted malignancies, where MYB is uniquely impaired upon depleting TAF12. This may explain why many normal tissues can persist in a TAF12-suppressed state [97]. In the c-Myb protein, TAF12 might interact around the sequence AAAAIQRHYNDED in the tAD (see Figure 1), though the actual linear binding motif of this cofactor is unknown.

TFIIF

Subunit 1 of the general transcription factor TFIIF is recruited by a motif in the tAD of the androgen receptor (AR) that contributes to transcriptional activity. The AR is a transcription factor that has a key role in the development of prostate cancer, and its protein-protein interactions are therefore of potential therapeutic interest [98].

The AR has a hydrophobic motif at positions i/i+3/i+4 (W433, L436 and F437) of the tAD, while the surface of the TFIIF subunit contains a hydrophobic cleft. The interaction between the proteins is driven by hydrophobic interactions, with a significant influence of electrostatic interactions. The relative position of hydrophobic residues in the AR motif is common in tADs, which indicates that there might be a generic mechanism by which tADs recruit their binding partners. This highlights the general importance of regulatory mechanisms to provide specificity [98].

The sequence SSWHTLFTAEEGQLYG in the tAD of the AR resembles the sequence SYPGWHSTTIADHTRPH, found in the tAD of c-Myb (see Figure 1). Based on this, it might be interesting to test whether subunit 1 of TFIIF binds to this sequence in c-Myb.


1.3.4 Post-translational modifications in c-Myb

Post-translational modifications can affect the activity of c-Myb. They are defined as covalent modifications that alter protein function in a rapid and energetically inexpensive manner [99]. Post-translational modifications can modulate the activity of transcription factors through different mechanisms, such as altering their cellular localization, DNA-binding affinity, interaction partners and protein stability [100]. Phosphorylation, SUMOylation and ubiquitination generally inhibit c-Myb activity, while acetylation enhances it [73].

Acetylation

The lysine residues K442, K445, K471, K480 and K485 are located in the CRD. They are modified by acetylation in c-Myb, and the modifications result in a higher binding affinity of c-Myb to DNA and coactivators. For instance, CBP and p300 function as acetyltransferases, and their C/H2 domain interacts directly with the CRD of c-Myb. In addition to the tAD, the CRD therefore also contributes to recruiting CBP/p300. CBP/p300 might thus function in a synergistic manner to enhance the transactivating capacity of c-Myb [43, 101].

Phosphorylation

Several amino acid residues are modified by phosphorylation in c-Myb. For instance, phosphorylation of serine-528, located near the CRD, regulates c-Myb negatively [67]. Serine-11 and serine-12, located in the DBD, are phosphorylated by casein kinase II (CK-II) in vitro, resulting in decreased DNA binding of c-Myb [102]. Serine-532, located in the CRD, is a phosphorylation site for the 42 kDa mitogen-activated protein kinase (p42 MAPK). When this site is mutated by substitution, the transcriptional activity of c-Myb increases [67, 103]. Phosphorylation of serine-116 by protein kinase A destabilizes a subtype of c-Myb-DNA complexes, which results in reduced expression of target genes [104]. c-Myb is also phosphorylated in the CRD by the nuclear kinase HIPK1. This represses the ability of c-Myb to activate the chromatin-embedded target gene mim-1 [88].


SUMOylation

There are two SUMOylation sites in the CRD of c-Myb: K527 is the principal one and K503 a secondary one. Mutating these sites into arginine residues (the 2KR mutant) gives a large enhancement of c-Myb-dependent transactivation. IKQE, found in the EVES sub-domain of the CRD, is the core sequence motif of these sites [70]. The CRD has a SUMO-regulated activation function (SRAF), which is turned off by SUMO conjugation. SUMO thereby affects the recruitment of cofactors such as CBP/p300, leading to a weak activation [46]. The 2KR mutation is used in the plasmid constructs for this study.

Ubiquitination

The 26S proteasome is a large complex that carries out the major pathway for degradation of WT c-Myb. c-Myb is marked for proteasomal degradation by post-translational ubiquitin modification of unknown lysine residues in the CRD [105, 106].


1.4 Aims of the study

The transactivation domain (tAD) of transcription factors (TFs) is in general poorly understood compared to the DNA-binding domain, despite the tAD being responsible for an essential function in gene activation. In this study, the tAD of c-Myb was dissected in order to better understand its function. Several different models of the tAD have been proposed, based on the composition of amino acid residues and the structure of the tAD, as summarized above. A recently published article reported a highly interesting study of a canonical activation domain from the Saccharomyces cerevisiae TF Gcn4. The authors reported that intrinsic disorder and acidic residues keep two hydrophobic motifs from driving collapse and causing inactivation, and that the most active variants keep their aromatic residues exposed to the solvent [1]. This study of the c-Myb tAD is inspired by that article, while also addressing the classical models. The overall approach has been to create a set of mutations in the tAD of c-Myb, followed by measuring their transactivation potential in different systems. The design of the mutations is based on the model that the tAD is an assembly of linear motifs kept open by acidic residues and intrinsic disorder. The questions specified below will be evaluated on the basis of the observed effects of the mutants, to determine the most appropriate model for our findings.

Through mutagenesis of amino acid residues believed to contribute to transcriptional activity, the tAD can be dissected by revealing the effect specific residues have on gene expression. By mutating known or hypothesized short linear motifs (SLMs) such as the well-known LxxLL motif, potential cofactor recruiting sequences can be uncovered. Another topic of interest will be whether the specific order of amino acids in the tAD is essential for activation of transcription, or if a shuffled version is sufficient, as suggested by classical “acid blob” models.

The effect of the mutants on gene expression will be studied in three separate systems to investigate potential differences in how the mutants affect transcriptional activity.


The results and discussion will address the following questions:

1. Which specific amino acid residues in c-Myb tAD affect the transcriptional activity?

2. How does the SLM LxxLL affect transactivation function? Is the LxxLL motif sufficient to activate transcription at the same level as the WT tAD of c-Myb?

3. Can we by mutagenesis find evidence for novel SLMs in the tAD of c-Myb, not previously characterized?

4. Does the order of the amino acid residues in the tAD have an impact on the transcriptional activity, or is it only the actual content of amino acids that matters, as suggested by some classical models?

5. Does the WT tAD sequence give a maximal activation effect, or do some mutants increase, rather than decrease, its activation potential?

6. Which of the many models for tAD functions matches our results best?

7. How does the difference in chromatinization affect the activity of the c-Myb tAD?

8. How conserved are the mechanisms giving the c-Myb tAD its activity? Will the same mutants affect the tAD similarly when expressed in a mammalian and a yeast system?

These are general questions of interest jointly addressed by all three students working together. Some questions will be weighted more heavily depending on the system studied and on each student's preference regarding their work.


2 Methods

2.1 Cell techniques

The bacterial strain used in this thesis is the DH5α strain of E. coli. The strain is specifically engineered to maximize transformation efficiency, and is defined by several mutations. The recA1 mutation disables recombinase activity and inactivates homologous recombination. The endA1 mutation stops the cell from producing an intracellular endonuclease, which prevents the inserted plasmid from being degraded. The strain also carries the lacZΔM15 allele, which enables blue-white screening, although this was not used in this project.

The host organism used is Saccharomyces cerevisiae (S. cerevisiae), strain INVSc1. This strain is auxotrophic for histidine, leucine, tryptophan and uracil, and will not grow in medium or on plates lacking these nutrients. With the effector plasmid pYES2 carrying our mutated c-Myb, selection can be done for uracil, as pYES2 has a uracil selection marker. It also has ampicillin resistance for cultivation in bacteria.

The reporter plasmid pGADT7-HLfus-3xGG has a selection marker for leucine, and has ampicillin resistance. Most importantly, it has the 3xGG MRE, allowing the c-Myb protein to bind through its DBD. Together, these two plasmids allow double selection on plates lacking uracil and leucine.

2.1.1 Bacterial cells – Storage, growth and transformation

Stock cultures are made in order to take care of relevant plasmids. Cells stored at -80°C can survive for long periods, and can be thawed and cultivated in LB-medium.

Creating stock cultures

Stock cultures are cells kept at -80°C for future extraction of plasmids. When creating the stock cultures, the risk of creating ice crystals that could potentially damage the cultures at low temperatures is avoided by adding glycerol to the stock culture. 430µL glycerol (50%) is added to 1mL of overnight culture. The final concentration of glycerol is 15%.
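
The stated final glycerol concentration follows from a simple dilution calculation. The Python sketch below reproduces it. The function names are hypothetical, and the calculation assumes that the volumes are additive.

```python
def final_percent(stock_percent, stock_volume_ul, culture_volume_ul):
    """Final concentration (%) after mixing a stock with a culture."""
    total_volume_ul = stock_volume_ul + culture_volume_ul
    return stock_percent * stock_volume_ul / total_volume_ul


def stock_volume_for_target(stock_percent, target_percent, culture_volume_ul):
    """Stock volume (µL) needed to reach a target concentration (%)."""
    if target_percent >= stock_percent:
        raise ValueError("target must be lower than the stock concentration")
    return target_percent * culture_volume_ul / (stock_percent - target_percent)


# 430 µL of 50% glycerol added to 1 mL (1000 µL) of overnight culture
print(round(final_percent(50, 430, 1000), 2))           # 15.03
print(round(stock_volume_for_target(50, 15, 1000), 1))  # 428.6
```

The second function solves the same relation for the stock volume, which shows that 430µL is the rounded value for a 15% target.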


Growth conditions

The DH5α cells are grown in LB medium. To amplify a plasmid, overnight cultures are made. Bacteria are added to LB medium, usually 3mL for extraction through miniprep, with antibiotics (100µg/mL ampicillin) to select for positive transformants. The DH5α cells grow at 37°C for 16-18 hours, with shaking at 200-250rpm. For extracting larger quantities, 200mL of LB medium with antibiotics can be used to cultivate the cells for a midiprep.

Transformation of DH5α

Transformation is the process where a plasmid is introduced into bacterial cells. DH5α cells are made competent for transformation through a series of steps before being frozen at -80°C. This has been done by our lab supervisor in advance.

All plasmids introduced need to have an origin of replication in order for the bacteria to replicate the DNA together with their own during cell division. The process includes thawing competent DH5α cells on ice and adding plasmid to the cooled bacteria. The cold temperature makes the DNA stick to the surface of the bacteria, and a heat shock at 42°C, which is not lethal for a short period, induces pores in the bacterial membrane. The pores allow DNA to enter the cells. Plasmids usually contain a selection marker, for bacteria a form of antibiotic resistance, to allow for selection of only transformed bacteria. The cells are spread onto LB agar plates supplemented with antibiotics.

Procedure:

1. Thaw 50 µl competent DH5α cells on ice.

2. Add 1-5 µl plasmid DNA (1 ng/µl) to the bacteria.

3. Keep on ice for 20 minutes.

4. Incubate cell solution at 42°C for 90 seconds.

5. Place cells on ice for 2 minutes.

6. Plate on LB agar plates supplemented with antibiotics, and culture at 37°C for 16-20 hours.


2.1.2 Yeast cell techniques

S. cerevisiae, yeast from here on out, is a eukaryotic organism used because of its resemblance to mammalian cells in cell structure and mechanisms. This includes compartmentalization, transcription and translation, and to a certain degree epigenetics, though with different proteins that might share homology.

Yeast is used as a model organism that provides a framework for studying a larger class of living beings. That said, the same processes may be performed by different, homologous proteins in different organisms. Up to 30% of the genes correlated with diseases in humans have homologs in the yeast proteome [107]. For this study, yeast will serve as a comparison to mammalian cells with regard to how the tAD of c-Myb recruits or interacts with cofactors.

Growth

Yeast cells are first plated onto YPD (or YEPD: Yeast Extract Peptone Dextrose) plates to obtain plenty of colonies for making overnight cultures. It takes 2-4 days at 30°C for the cells to grow. The YPD plates do not select for anything, and the INVSc1 yeast cells grow well on them. Overnight cultures are made by suspending separate yeast colonies in YPD medium, followed by incubation at 30°C with shaking. When selecting for transformants, yeast cells are plated onto synthetic complete (SC) plates lacking the nutrients corresponding to the selection markers.

Yeast transformation by use of the One-Step protocol

This protocol is quick and easy to use and can be used if the goal is only to get a plasmid into the yeast, where the number of transformed cells is not that important.

1. Blend 100µL One-Step buffer with 10 µL 1M sterile DTT (NB: The quality of DTT is important. Do not use DTT that has been thawed several times or stored for a long time in the fridge).

2. Suspend a yeast colony from a plate into the mix and vortex well.

3. Add DNA of up to 10µL from a miniprep, or up to 1µL from a maxiprep due to higher concentrations.


4. Vortex.

5. Incubate at 45°C for 30 minutes.

6. Spread the cells on a plate with selection for the inserted plasmid(s). No more than 50µL for each plate.

7. Incubate at 30°C until the colonies appear (Should be visible after 2-4 days).

Yeast transformation by YEAST1 transformation kit from Sigma-Aldrich

Preparation of competent cells:

1. Inoculate yeast from YPD plate into 20mL of YPD medium in a 100mL sterile flask, or 10mL in two 50mL tubes.

2. Grow overnight with shaking (200-250rpm) at 30°C. Culture should reach stationary phase (OD600 > 2).

3. Dilute culture into 100mL of YPD medium in a 500mL sterile flask, or 50mL in two 50mL tubes, so that the OD600 ~ 0.3. Grow with shaking at 30°C for 3-6 hours. The OD600 should double at least once and should not pass 1.5.

4. Harvest cells by centrifugation at room temperature for 5 minutes at 5000rpm in a GSA or equivalent rotor.

5. Discard supernatant and resuspend cells in 50mL of sterile water, or 25mL each in two tubes.

6. Repeat centrifugation as in step 4.

7. Discard supernatant and resuspend in 1mL of transformation buffer (Catalog Number T0809). If the cells do not stick to the wall while removing supernatant, leave 700µL in both tubes and transfer to an Eppendorf tube. Repeat the centrifugation as in step 4. The supernatant should now be easy to remove without removing cells.

Note: Cells are now ready for transformation. It is best to use cells immediately, but they can be kept at 4°C for 1 week with gradual decrease in competence, or glycerol can be added to 15% and cells kept at -70°C.
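
The dilution in step 3 of the preparation of competent cells can be calculated from the measured OD600 of the overnight culture. The Python sketch below shows the arithmetic. The function name and the example reading of 2.4 are hypothetical, and the calculation assumes that OD600 scales linearly with cell density, which in practice requires measuring a diluted sample of a dense culture.

```python
def culture_volume_for_target_od(measured_od, target_od, final_volume_ml):
    """Volume (mL) of culture to dilute to final_volume_ml at target_od."""
    if measured_od <= target_od:
        raise ValueError("culture is already at or below the target OD600")
    return target_od * final_volume_ml / measured_od


# Hypothetical overnight culture at OD600 = 2.4, diluted to OD600 = 0.3 in 100 mL
culture_ml = culture_volume_for_target_od(2.4, 0.3, 100)
medium_ml = 100 - culture_ml
print(culture_ml, medium_ml)  # 12.5 87.5
```

The same relation applies wherever a culture is diluted to a defined starting OD600.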

Plasmid transformation

1. Set up the required number of sterile 1.5mL microcentrifuge tubes, one for each transformation.

2. Aliquot 10µL of 10mg/mL salmon testes DNA into each tube.

3. Add 0.1µg of the yeast plasmid DNA to be studied.

4. Add 100µL of competent cells and vortex.

5. Add 600µL of Plate Buffer and vortex.

6. Incubate 30 minutes at 30°C with shaking.

7. Optional: Add DMSO to 10%. This is not required, but will increase transformation efficiency 2 to 5-fold.

8. Heat shock for 15 minutes in a 42°C heat block or water bath.

9. Spin 10 seconds (Protocol says 3) in microcentrifuge and remove the supernatant. Do not spin too long, as it will become hard to resuspend the cells. Time in microcentrifuge depends on the rpm of the lab's equipment.

10. Resuspend the cells in 500 µL of sterile water.

11. Plate 100µL on appropriate SC selective plates.

12. Incubate at 30°C for 2-3 days until colonies appear.


2.2 Reporter assay

2.2.1 ONPG-test

LacZ, embedded in the reporter plasmid, is a gene encoding β-galactosidase, which splits lactose into galactose and glucose. The ortho-nitrophenyl-β-galactoside (ONPG) test is commonly used for yeast systems with effector and reporter plasmids. The effector plasmid encodes the protein whose activity is to be measured, while the reporter plasmid includes a recognition site for the protein and the LacZ gene. After cultivation and lysis of the yeast cells, ONPG is added. The amount of β-galactosidase in the cells is reflected by the intensity of the yellow colour that develops over time.

1. Pick one colony for each measurement (3-6 colonies per construct) and make an overnight culture in 5mL SC medium with selection for the plasmids of interest. Note: Let it grow in a medium with 0.1% glucose and 2% galactose in the case of a galactose-inducible promoter.

2. The following day, measure OD600 for each overnight culture and calculate the amount needed to create an OD600=0.1, in 5mL of the same SC medium as in step 1. Let the cells grow at 30°C with shaking to an OD600 value between 0.2-0.5.

(Reason for 5mL: 1mL of cells for the ONPG test, 1mL of cells for the OD- measurement, and the rest for extra OD-measurements if needed).

3. Add 1mL of culture to 100µL 10x Z-buffer with sarcosyl and DTT. (NB: Sarcosyl is poisonous! Please refer to the hazard statements for 2% sarcosyl in buffer). Incubate for 30 minutes at room temperature.

4. While incubating, measure OD600 per mL culture of the leftovers not taken aside for permeabilization (Step 3). This can also be measured at the end, but the cultures need to be kept on ice in the meantime.

5. After 30 minutes of permeabilization, add 200µL of 4mg/mL ONPG and shake the tube quickly by hand. Add ONPG to the next tube after an interval of exactly 15 seconds, as this makes it easier to calculate when to stop each reaction. After 10-60 minutes, when a clear yellow colour has appeared in the WT, add 500µL 1.5M Na2CO3. Note the time of addition (ADD).

6. Spin down the cells for 2 minutes at max speed, and measure the OD420. If the binding to the reporter is efficient and inducibility is high, the OD420 values for β-galactosidase activity might be too high. If this is the case, dilute the sample and take this into account in the final calculation.

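7. Calculate the number of β-galactosidase units by the formula:

$$\text{β-galactosidase units} = \frac{A_{420} \cdot 1000}{A_{600} \cdot t_{\mathrm{ADD}}} \tag{2.1}$$

where t_ADD is the time elapsed before the addition of Na2CO3 (ADD).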
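
Formula 2.1 can be written as a small helper function. The Python sketch below is a minimal illustration. The function name, the example readings and the `dilution_factor` argument are assumptions. The dilution factor is one possible way of taking the dilution mentioned in step 6 into account, and time is assumed to be given in minutes.

```python
def beta_gal_units(a420, a600, minutes, dilution_factor=1.0):
    """β-galactosidase units according to formula 2.1.

    dilution_factor is an assumed correction for samples that were
    diluted before the A420 reading (1.0 means undiluted).
    """
    if a600 <= 0 or minutes <= 0:
        raise ValueError("A600 and reaction time must be positive")
    return a420 * dilution_factor * 1000 / (a600 * minutes)


# Hypothetical readings: A420 = 0.6, A600 = 0.4, reaction stopped after 30 minutes
units = beta_gal_units(0.6, 0.4, 30)
print(round(units, 1))  # 50.0
```

Averaging the units over the 3-6 colonies measured per construct would then give one value per mutant.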


2.3 DNA techniques

2.3.1 Polymerase chain reaction

The polymerase chain reaction (PCR) method uses DNA polymerases, primers and nucleotides in order to synthesize a strand or a fragment of a strand complementary to the template DNA. The primers are added in order to give the DNA polymerase a 3’-OH group from which it can elongate the DNA sequence. They are designed as short single-stranded sequences matching an area of DNA desirable to amplify. The primer sequences, also called oligos, are used in this study to amplify a fragment, and for site-directed mutagenesis, explained in 2.3.2.

The PCR reaction consists of three heat-regulated steps:

1. Denaturing, where the hydrogen bonds between the two strands of the double-stranded DNA (dsDNA) are broken.

2. Annealing, where the reaction is cooled to an optimal temperature for the primers to anneal. The temperature and time required for annealing depend on the composition of the bases cytosine (C), guanine (G), adenine (A) and thymine (T). Cytosine bound to guanine has three hydrogen bonds, requiring higher temperatures than adenine and thymine with two.

3. Elongation, where the DNA polymerase extends from the primers and attaches complementary deoxynucleotides (dNTPs) in a 5’ to 3’ direction.
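
The dependence of the annealing temperature on base composition can be illustrated with the Wallace rule, a common rule of thumb for short oligos (roughly 14-20 nucleotides): Tm ≈ 2·(A+T) + 4·(G+C) °C. The Python sketch below is only an illustration of the G/C effect. It is not the method used to design the primers in this study, the two example primers are hypothetical, and the rule is not accurate for long mutagenesis oligos.

```python
def wallace_tm(primer):
    """Estimate the melting temperature (°C) of a short oligo (Wallace rule)."""
    seq = primer.upper()
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    if at_count + gc_count != len(seq):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 2 * at_count + 4 * gc_count


# Two hypothetical 18-mers with opposite base composition
print(wallace_tm("ATATATATATATATATAT"))  # 36
print(wallace_tm("GCGCGCGCGCGCGCGCGC"))  # 72
```

For the same length, the G/C-rich primer gets twice the estimated melting temperature, reflecting the three hydrogen bonds per G-C pair.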

Repeating these three steps results in an exponential amplification of the template. Primer sequences are listed in appendix E. All PCR reactions are mixed in 0.2mL PCR-tubes, and the PCR machine used is the 2720 Thermal Cycler from Applied Biosystems™.
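
The exponential amplification can be made concrete with the idealized relation N = N0 · (1 + E)^n, where N0 is the number of starting template copies, n the number of cycles and E the per-cycle efficiency (1.0 for perfect doubling). The Python sketch below evaluates this relation. The function name is hypothetical, and real reactions plateau as primers and dNTPs are consumed, so the numbers are an upper bound.

```python
def amplified_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after PCR, N = N0 * (1 + E) ** n."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1 + efficiency) ** cycles


# One template molecule after 30 cycles of perfect doubling
print(amplified_copies(1, 30))  # 1073741824.0
```

A lower efficiency, for example `amplified_copies(1, 30, efficiency=0.9)`, gives a markedly lower yield, which is why the cycle number is chosen with some margin.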

PCR of the HLfus-3xGG fragment of pYHLfus-3xGG

In order to obtain the whole HLfus-3xGG segment for creating the plasmid designed for the ONPG assay, a PCR was set up to amplify the gene fragment. The oligos can be found in appendix E.

Table 1: PCR mix for synthesis using Q5 HF DNA Polymerase.

Component                                   Amount
Q5 reaction buffer (5x)                     10 µL
Template (1ng/µL)                           5 µL
Forward primer (10 µM)                      2.5 µL
Reverse primer (10 µM)                      2.5 µL
dNTP (5mM)                                  2 µL
Q5 High Fidelity DNA polymerase (2U/µL)     0.5 µL
dH2O                                        27.5 µL
Total                                       50 µL
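
The final concentration of each component in the Q5 mix follows from its stock concentration and volume. The Python sketch below checks that the volumes in Table 1 add up to the stated total and derives the final concentrations. The dictionary layout and function names are hypothetical.

```python
# Rows of Table 1: (stock concentration, unit, volume in µL)
Q5_MIX = {
    "Q5 reaction buffer": (5.0, "x", 10.0),
    "Template": (1.0, "ng/µL", 5.0),
    "Forward primer": (10.0, "µM", 2.5),
    "Reverse primer": (10.0, "µM", 2.5),
    "dNTP": (5.0, "mM", 2.0),
    "Q5 High Fidelity DNA polymerase": (2.0, "U/µL", 0.5),
}
WATER_UL = 27.5


def total_volume(mix, water_ul):
    """Sum of all component volumes plus water (µL)."""
    return sum(volume for _, _, volume in mix.values()) + water_ul


def final_concentrations(mix, water_ul):
    """Final concentration of each component: stock * volume / total."""
    total = total_volume(mix, water_ul)
    return {
        name: (stock * volume / total, unit)
        for name, (stock, unit, volume) in mix.items()
    }


print(total_volume(Q5_MIX, WATER_UL))  # 50.0
for name, (conc, unit) in final_concentrations(Q5_MIX, WATER_UL).items():
    print(f"{name}: {conc:g} {unit}")
```

With these values the buffer ends up at 1x, each primer at 0.5 µM and the dNTPs at 0.2 mM.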


Table 2: PCR mix for synthesis using PfuUltra™ DNA polymerase.

Component                                   Amount
PfuUltra™ buffer (10x)                      5 µL
Forward primer (10 µM)                      5 µL
Reverse primer (10 µM)                      5 µL
dNTP (5mM)                                  2 µL
BSA                                         0.5 µL
PfuUltra™ DNA polymerase (1U/µL)            1 µL
dH2O                                        26.5 µL
Total                                       50 µL

Table 3: PCR program for synthesis using Q5 HF DNA polymerase and PfuUltra™ DNA polymerase

Step     Temperature    Duration       Purpose
Step 1   98°C           300 seconds    Denaturing
Step 2   98°C           30 seconds     Denaturing
Step 3   55°C                          Annealing
Step 4   72°C           250 seconds    Synthesis
Step 5   Go to Step 2 29 times
Step 6   72°C           600 seconds    Synthesis
Step 7   4°C            Forever        Storage


2.3.2 Site-directed DNA mutagenesis

For this study, oligos were used to create different mutants affecting the tAD of c-Myb. The oligos are 40-50 nucleotides long, with specific mutations chosen to disrupt potential SLMs or proximal amino acids. All mutants with their respective oligos can be found in Appendix D and E. The reaction uses PfuUltra™ DNA Polymerase from Stratagene, which has a very high fidelity with low error rates.

The template plasmid is extracted from bacteria, which means that this DNA is methylated at its GATC motifs. Newly synthesized PCR products are not methylated. To remove the template, making sure that no WT DNA is able to transform our DH5α cells, DpnI is added to the solution to digest the methylated GATC motifs. The E. coli cells are transformed directly afterwards and tested for positive clones, and the positive mutant plasmid is amplified for subcloning.
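
DpnI is known to recognise the sequence GATC when the adenine is methylated, as it is in plasmid DNA isolated from dam+ E. coli strains. A quick check of how many such sites a template contains can be done with a few lines of Python. In the sketch below, the function name and the example sequence are hypothetical. The sequence is treated as linear, so a site spanning the junction of a circular plasmid would be missed.

```python
def find_dpni_sites(sequence):
    """Return the 0-based start positions of all GATC sites in a sequence."""
    seq = sequence.upper()
    sites = []
    start = seq.find("GATC")
    while start != -1:
        sites.append(start)
        start = seq.find("GATC", start + 1)
    return sites


# Hypothetical template fragment with two recognition sites
print(find_dpni_sites("AAGATCTTGGATCCA"))  # [2, 9]
```

Because GATC is palindromic, scanning one strand is sufficient to find the sites on both strands.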

Table 4: PCR mix for site-directed DNA mutagenesis

Component                                   Amount
PfuUltra™ buffer (5x)                       5 µL
Forward primer (10 µM)                      2 µL
Reverse primer (10 µM)                      2 µL
dNTP (5mM)                                  1 µL
PfuUltra™ DNA polymerase (2U/µL)            1 µL
dH2O                                        36 µL
Total                                       50 µL


Table 5: PCR program for mutagenesis

Step     Temperature    Duration       Purpose
Step 1   95°C           120 seconds    Denaturing
Step 2   95°C           30 seconds     Denaturing
Step 3   55°C           60 seconds     Annealing
Step 4   68°C           600 seconds    Elongation
Step 5   Go to Step 2 17 times
Step 6   68°C           600 seconds    Elongation
Step 7   4°C            Forever        Storage
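
The run time of a cycling program can be estimated by summing its steps. The Python sketch below does this for the mutagenesis program in Table 5. The function name is hypothetical, "Go to Step 2 17 times" is assumed to mean 17 repeats after the first pass (18 cycles in total), and ramping time between temperatures is ignored.

```python
def program_seconds(initial_s, cycle_steps_s, go_to_repeats, final_s):
    """Total hold time of a PCR program, excluding ramping and storage."""
    cycles = go_to_repeats + 1  # the first pass plus the repeats
    return initial_s + cycles * sum(cycle_steps_s) + final_s


# Table 5: 120 s, then cycles of 30 s + 60 s + 600 s, then 600 s
total = program_seconds(120, [30, 60, 600], 17, 600)
hours, remainder = divmod(total, 3600)
print(f"{total} s = {hours} h {remainder // 60} min")  # 13140 s = 3 h 39 min
```

The long elongation step dominates the run time, as the polymerase has to copy the whole plasmid in every cycle.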

2.3.3 Agarose gel electrophoresis

Gel electrophoresis is frequently used to separate DNA molecules by applying an electric field. Agarose is a polysaccharide that, when cast as a gel, separates DNA by size because longer fragments move more slowly through the agarose polymers. The typical concentration of agarose is 0.5-2%. Because of the negative charge of the DNA phosphate backbone, the DNA migrates towards the positive electrode (anode) when a voltage is applied.

To verify the lengths of the separated DNA fragments, a ladder should always be included to determine the band sizes. The molecular-weight size marker 1 kb plus ladder from Invitrogen™ is used in this study, as it is a good marker when working with molecular sizes below 5-6kb. For larger fragments, the bands become hard, but not impossible, to decipher.

In order to see the bands, one drop of ethidium bromide (EtBr) is added to the gel, which allows for fluorescence under UV light. EtBr is a DNA intercalator, and inserts itself into the DNA fragments as they migrate through the gel. Tris-acetate-EDTA (TAE) buffer provides ions to carry the current, and is used as running buffer.

Preparation of the gel

1. Weigh out an appropriate amount of agarose powder to create the gel you want. 1g agarose in 100mL TAE gives a 1% agarose gel.

2. Add 1x TAE buffer and heat in the microwave until the agarose melts and the liquid is homogenous.

3. Let it cool to below 50°C, preferably under cool running water from the tap.

4. Add 1 drop of EtBr (428µg/mL) to the solution, which gives an approximate final concentration of 0.5µg/mL.

5. Pour the solution into a tray and use the comb to remove any bubbles before adding the comb itself. The comb creates wells to load the samples.

6. Let the gel solidify, this step should take about 20-30 minutes. Do not let the gel stay too long as it should not dry out.
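
The amount of agarose in step 1 follows from the definition of a weight/volume percentage (grams per 100mL). The Python sketch below shows the calculation for other gel volumes and concentrations. The function name and the second example are hypothetical.

```python
def agarose_grams(percent_w_v, gel_volume_ml):
    """Mass of agarose (g) for a gel of the given % (w/v) and volume (mL)."""
    if percent_w_v <= 0 or gel_volume_ml <= 0:
        raise ValueError("percentage and volume must be positive")
    return percent_w_v * gel_volume_ml / 100


print(agarose_grams(1.0, 100))           # 1.0
print(round(agarose_grams(0.8, 50), 2))  # 0.4
```

The first call reproduces the example in step 1, where 1g agarose in 100mL TAE gives a 1% gel.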

Loading procedure

1. Move the tray over to an electrophoresis chamber with 1x TAE buffer.

2. Load the samples supplemented with gel loading buffer to the wells.

3. Set voltage to 100V and let the gel run for 30-60 minutes depending on the expected sizes of your fragments. Larger fragments need longer time to separate.

4. Place the gel under UV light in the vWR® Smart to visualize and take pictures of your DNA molecules.

Purification of DNA from agarose gel

The kit NucleoSpin® Gel and PCR clean-up from Macherey-Nagel is used to purify the DNA from the agarose gel.
