Molecular basis for the dominant inhibition of RUNX1-dependent transcription by CBFβ-SMMHC

Identification of dual functional domains in CBFβ-SMMHC

It has been shown that the phenotype of heterozygous Cbfb-MYH11 knockin mice is very similar to that of Runx1 knockout mice (Castilla et al., 1996). The results imply that CBFβ-SMMHC inactivates the function of Runx1 nearly completely despite the presence of residual normal Cbfβ/PEBP2β. For this to occur, CBFβ-SMMHC would first have to outcompete Cbfβ at the step of heterodimerization with Runx1; subsequently, some mechanism(s) would have to bring Runx1 into a functionally incompetent state (repression). If either of these steps were lacking or impaired, CBFβ-SMMHC would fail to elicit any substantial dominant-negative effect regardless of how well the other step might work.

Until recently, however, most functional studies of CBFβ-SMMHC have centered around the mechanism of repression, paying relatively little attention to the first heterodimerization step. Nevertheless, evidence suggestive of its enhanced ability for heterodimerization (hyper-heterodimerization) has come from our previous finding that CBFβ-SMMHC is predominantly localized in the cytoplasm in association with the actin cytoskeleton, and simultaneously capable of sequestering RUNX proteins to the cytoplasm in a manner over-riding the intrinsic or artificially modulated ability of Runx to localize in the nucleus (Lu et al., 1995; Adya et al., 1998; Kanno et al., 1998). In sharp contrast, Cbfβ tends to localize in the cytoplasm in a diffuse pattern by itself, and can be partly, although not completely, translocated into the nucleus only through heterodimerization with Runx protein. Furthermore, CBFβ-SMMHC can stabilize RUNX1 against intracellular proteasome-mediated degradation much more strongly than Cbfβ (Huang et al., 2001a).

To characterize the hyper-heterodimerization activity of CBFβ-SMMHC more directly, we conducted extensive functional analyses using a series of CBFβ-SMMHC deletion constructs truncated exon by exon, either C-terminally or internally (Huang et al., 2003). In vitro binding experiments (co-immunoprecipitation and GST-pulldown assays), together with an intracellular hyperprotection assay, confirmed that CBFβ-SMMHC can heterodimerize with RUNX1 at an affinity higher than that of CBFβ by one order of magnitude or more. Parallel analyses with RUNX1 deletions revealed that the region of RUNX1 required for hyper-heterodimerization is the Runt domain. The minimum region of the myosin tail responsible for hyper-heterodimerization was mapped to exons 33–36 (residues 166–363). The myosin tail alone, of course, did not show any detectable binding to the Runt domain. However, appreciable heterodimerization activities became detectable when the myosin tail was fused to PEBP2β proteins made heterodimerization-defective by double point mutations (residues 64 and 104) (Tang et al., 2000) or by a short N-terminal truncation (Δ2-11) (Adya et al., 1998). Presumably, the myosin tail and a part of the PEBP2β protein cooperate physically or conformationally to create a new RUNX-binding interface that functions independently of, and in synergy with, the original heterodimerization interface on CBFβ. A minimum region of PEBP2β involved in this second putative heterodimerization interface was narrowed down to residues 134–165.
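
To illustrate what an affinity advantage of about one order of magnitude could mean for competition between the fusion protein and normal CBFβ, the following minimal sketch computes how RUNX1 would partition between two competing partners at equilibrium. It assumes simple 1:1 binding and that both partners are in large excess over RUNX1, so that their free concentrations approximate their total concentrations. The function name, the concentrations, and the dissociation constants are hypothetical illustration values, not measurements from the studies cited.

```python
def runx1_partition(conc_a, kd_a, conc_b, kd_b):
    """Equilibrium partition of a limiting protein between two competitors.

    Assumes 1:1 binding and that both competitors are in large excess,
    so free competitor concentration ~ total concentration.
    Returns (fraction bound to A, fraction bound to B, fraction free).
    """
    ratio_a = conc_a / kd_a  # occupancy weight of competitor A
    ratio_b = conc_b / kd_b  # occupancy weight of competitor B
    denom = 1.0 + ratio_a + ratio_b
    return ratio_a / denom, ratio_b / denom, 1.0 / denom


# Hypothetical values (arbitrary but consistent units): equimolar partners,
# with the fusion protein given a tenfold lower dissociation constant.
frac_fusion, frac_wt, frac_free = runx1_partition(
    conc_a=100.0, kd_a=1.0,   # fusion protein (assumed)
    conc_b=100.0, kd_b=10.0,  # normal CBFbeta (assumed)
)
print(f"bound to fusion: {frac_fusion:.3f}")
print(f"bound to normal: {frac_wt:.3f}")
print(f"free:            {frac_free:.3f}")
```

Under these assumptions, the ratio of the two bound fractions equals the ratio of the affinities, so equimolar expression alone would already place most RUNX1 in complex with the higher-affinity partner. The sketch ignores multimerization, subcellular compartmentation, and protein turnover.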

We next investigated whether, and to what extent, the hyper-heterodimerization domain contributes to the dominant repression of RUNX1-mediated transcription by CBFβ-SMMHC against PEBP2β. In a transcription assay using an M-CSFR promoter-based reporter system, coexpression of CBFβ-SMMHC and PEBP2β at equimolar nonsaturating doses resulted in a strong repression of transcription to a level several-fold lower than the control obtained by transfection of PEBP2β alone. When the hyperdimerization domain (exons 33–36) was removed from CBFβ-SMMHC by internal deletion, this repression was considerably weakened to twofold or less. This confirmed that the hyperdimerization domain does have an important positive impact in augmenting repression. Curiously, however, a converse deletion construct retaining the hyper-heterodimerization domain but lacking the rest of the C-terminal region caused a moderate stimulation of transcription, rather than repression. A simple explanation for these seemingly paradoxical effects may be that the hyperdimerization domain itself has little or no inhibitory influence on RUNX1-mediated transactivation, and that another functional domain responsible for repression (repression domain) resides within the C-terminal proximal region centering around exons 39–40. Consistent with this explanation, other groups previously reported that C-terminal segments of SMMHC overlapping exons 39–42 showed repressive activities when artificially fused to the Gal4 DNA-binding domain and assayed using a Gal4-TK luc reporter system (Lutterbach et al., 1999). With this bipartite functional domain model (Figure 1), we can readily predict that CBFβ-SMMHC lacking the hyperdimerization domain should no longer be able to compete dominantly with the normal CBFβ and hence would fail to display its maximal possible repression potential.

Figure 1

Diagrammatic summary of functional domain analyses. The top diagram represents CBFβ/PEBP2β-MYH11. Red-colored segments (βE1–βE5) and blue-colored segments (E33–42) represent exons encoding Cbfβ/PEBP2β and SMMHC, respectively. Double-headed arrows above and below the diagram indicate the mapped locations of respective functional and structural domains as annotated. ACD and ACD2: assembly competence domains identified by Sohn et al. (1997) and Ikebe et al. (2001), respectively. The dotted red line represents an intermediate zone that was suggested to be unpaired upon deletion of the C-terminally flanking coiled-coil region, but could be integrated at least partly into the coiled-coil domain in the intact CBFβ/PEBP2β-MYH11. The blue double-headed arrow represents the terminal nonhelical tail that is implicated in the regulation of multimerization (Ikebe et al., 2001).

Molecular conformation and higher order assembly of CBFβ-SMMHC

Ever since the discovery of CBFβ-SMMHC (Liu et al., 1993, 1995), it has implicitly been assumed that the myosin tail is able to associate into a coiled-coil homodimer, which in turn extensively multimerizes into filaments. Furthermore, the multimerization process is known to depend on the assembly competence domain (ACD) located within exon 40 (Sohn et al., 1997; Ikebe et al., 2001; Kummalue et al., 2002). During the preceding functional analyses, however, we came upon evidence suggesting that the coiled-coil rod is partially unpaired around the hyper-heterodimerization domain (Huang et al., 2003). In EMSA experiments, the intact CBFβ-SMMHC mixed with RUNX1 formed large DNA-bound complexes that were unable to penetrate into a polyacrylamide gel. When CBFβ-SMMHC was deleted from the C-terminus past exon 39 and further beyond, the resulting DNA–protein complexes started to migrate into the gel in increasing fractions and at accelerated mobilities. Judging from their mobilities, these complexes represented heterodimers consisting of single RUNX1 and CBFβ-SMMHC molecules. On the other hand, internally truncated constructs lacking the hyperdimerization domain, with or without additional C-terminally extending deletions up to exon 38, still produced low-mobility complexes in much higher proportions than did pure C-terminal deletion constructs. Chemical cross-linking experiments using glutaraldehyde also confirmed that CBFβ-SMMHC constructs C-terminally truncated beyond exon 39 tended to dissociate into monomers in proportion to the extent of deletion. Previous glutaraldehyde cross-linking studies with fewer varieties of C-terminal deletions also indicated similar trends (Adya et al., 1998; Cao et al., 1998). In contrast, the above-noted internal deletions largely remained as dimers and multimers.

On the basis of these results, we proposed a new structural model for CBFβ-SMMHC that comprises a Y-shaped dimer with unpaired N-terminal halves followed by a coiled-coil homodimer in the C-terminal region (Huang et al., 2003) (Figure 2). The hyperdimerization domain and the repression domain coincidentally fall within the unpaired region and the coiled-coil region, respectively. It may be puzzling why the hyper-heterodimerization domain requires such a long stretch of the myosin tail, nearly 200 aa, for interaction with the Runt domain, which is known to have a compact globular structure. We think that the hyper-heterodimerization domain takes on a somewhat folded structure, part of which makes direct contacts with the Runt domain while the rest serves to keep the hyper-heterodimerization domain in a proper conformation. Here, a question may immediately arise as to how the coiled-coil structure could be so fragile. In fact, the coiled-coil rod of SMMHC, unlike that of its skeletal counterparts, is notoriously flexible and unstable at physiological salt concentrations: it readily transforms into a folded hairpin structure (Trybus and Lowey, 1984) or dissociates into monomers upon truncation (Trybus et al., 1997). As one illustrative example, truncated coiled-coil tails of SMMHC containing more than 100 residues were shown to be less stable than a 28-aa GCN4-derived leucine zipper (Trybus et al., 1997). Moreover, Liu et al. (1994) demonstrated early on that a substantial part of the DNA–protein complexes produced in the presence of RUNX1 and in vitro-translated CBFβ-SMMHC migrated into polyacrylamide gels in gel-shift assays, suggesting that even the coiled coil of intact CBFβ-SMMHC could substantially dissociate into monomers at a sufficiently low concentration.

Figure 2

A new bipartite model for the structure of CBFβ/PEBP2β-MYH11 complexed with RUNX1. (a) A model previously proposed by Lukasik et al. (2002). (b) A new model proposed by Huang et al. (2003). Indicated are the RUNX1 : CBFβ/PEBP2β-MYH11 complex at three different stages of molecular assembly. The filamentous multimer is modeled after the side-polar structure proposed for smooth muscle heavy chain by Xu et al. (1996). The red-colored globular pieces and the gray-colored ovals represent RUNX1 and CBFβ/PEBP2β, respectively. In the filamentous multimer, the dimeric unit is depicted in a simplified form. NES: presumptive nuclear export signal, ACD: assembly competence domain, and NHT: nonhelical tail. See the text for other details.

Recently, Lukasik et al. (2002) reported that a truncated CBFβ-SMMHC with a 47-amino acid portion of SMMHC ending within exon 33 was sufficient to support not only its homodimerization into a coiled coil but also its enhanced heterodimerization with the Runt domain. NMR studies identified interactions with Runx1 in the CBFβ portion of the molecule, as well as in the SMMHC portion. Moreover, they reported that the enhanced heterodimerization of β/SMMHC47 was hindered in the presence of DNA, which disagrees with our current results (Huang et al., 2003). The discrepancy is probably attributable to differences in the analytical methods and experimental conditions used. Perhaps one critical factor is the concentration of proteins used. In our in vivo and in vitro binding assays, the Runt domain and β/SMMHC were roughly estimated to be present in nanomolar ranges. On the other hand, Lukasik et al. employed isothermal titration calorimetry (ITC) as their chief analytical tool, using very high protein concentrations in the range of 20–340 μM. Their ITC measurements yielded a stoichiometry indicating that two Runt domain proteins bind to each β/SMMHC47. This unusual stoichiometry might reflect the homodimeric interaction of the Runt domain previously identified by X-ray crystallographic analysis of the Runt domain–CBFβ binary complex (Warren et al., 2000). Of note, this homodimeric interface seems to be unrecognizable in the Runt domain–CBFβ–DNA ternary complex in subsequent X-ray crystallographic studies reported by the same authors’ group (Bravo et al., 2001) as well as by a few other groups (Tahirov et al., 2001; Backstrom et al., 2002). If these observations are taken into account, it is conceivable that an extra molecule of the Runt domain might well bind to each heterodimeric unit of the Runt domain and CBFβ-SMMHC in the absence of DNA, when the Runt domain is present in large molar excess at a sufficiently high concentration. Apart from such mechanistic details, it is open to question whether the 2 : 1 stoichiometry complex could be formed under in vivo conditions to any meaningful extent.

Mechanisms of repression of RUNX1-mediated transcription by CBFβ-SMMHC

In the mid-1990s, Liu et al. (1994, 1995) first proposed two alternative basic mechanisms by which CBFβ-SMMHC could inhibit Runx1-mediated transactivation. In one mechanism, the formation of multimeric aggregates by CBFβ-SMMHC would sequester a certain amount of RUNX1, making it unavailable to bind its targets in the DNA (a sequestration model). The second possible mechanism is that the RUNX1/CBFβ-SMMHC complex could alter the assembly of sequence-specific transcription factors on adjacent sites in the enhancers of certain target genes, either by causing steric hindrance or by participating in novel interactions with other proteins (an interference model). More recently, Lutterbach et al. (1999) proposed a specialized variation of the second mechanism, asserting that β/SMMHC actively represses in the nucleus through its ability to interact with various corepressor proteins (a corepressor-recruiting model). The results of our preceding functional analyses provide fresh angles from which to re-evaluate the relative importance of these apparently conflicting mechanisms (Figure 3).

Figure 3

Multifarious mechanisms of transdominant repression by CBFβ/PEBP2β-MYH11. A diagrammatic scheme of how CBFβ/PEBP2β-MYH11 may function in the cell to cause transdominant repression of RUNX1-promoted transcription, in competition with CBFβ/PEBP2β, through the various putative mechanisms described in the section ‘Molecular basis for the dominant inhibition of RUNX1-dependent transcription by CBFβ-SMMHC’. Dotted double line, nuclear envelope; blue-colored crescent, corepressor; thick winding line, chromosomal DNA entrapped by the filamentous CBFβ/PEBP2β-MYH11 multimer; and TGYGGT, the consensus RUNX-binding site. The other symbols are the same as in Figure 2.

The sequestration model subsequently gained impetus from the observations that CBFβ-SMMHC can cause altered subcellular localizations of Runx1, and also Runx2, in a state closely colocalized with CBFβ-SMMHC either in the nucleus as multimerized complexes, which sometimes form large rod-like inclusion bodies (intranuclear sequestration) (Wijmenga et al., 1996), or in the cytoplasm as deposits on cytoskeletal filaments or aggregates (cytoplasmic sequestration) (Lu et al., 1995; Adya et al., 1998). In the former case, RUNX proteins physically confined within those multimeric complexes have only limited access to cognate-binding sites on chromosomes, even though they are in the nucleus. In addition to these mechanisms, a still different type of sequestration not involving multimerization was found to occur with CBFβ-SMMHC lacking the C-terminal 95 amino acids that contain the ACD motif (β/SMMHCΔ95) (Adya et al., 1998). β/SMMHCΔ95 was exclusively localized together with RUNX1 in the cytoplasm in a diffuse pattern with the nucleus left as a complete void, suggesting that this protein could be actively exported out of the nucleus, possibly by an NES-dependent mechanism (see further discussion below). In our analysis, a C-terminal deletion lacking exons 40–42, which is nearly equivalent to β/SMMHCΔ95, displayed a strong repression (fourfold), only slightly less than that of the intact CBFβ-SMMHC (sixfold) (Huang et al., 2003). Taken together, these observations indicate that the sequestration mechanism alone can support efficient repression regardless of which of the three variations noted above is at work.

Given the view reached above, it becomes more important to know how and with what efficiency each mode of sequestration is working in various cells, particularly inv(16) leukemic cells. The relative distributions of CBFβ-SMMHC/RUNX1 complexes between the nucleus and the cytoplasm were shown to depend on the level of expression of CBFβ-SMMHC and also on the kind of cells studied. In NIH 3T3 cells, CBFβ-SMMHC/RUNX1 complexes were predominantly localized in the cytoplasm at a low expression level, but tended to move into the nucleus in increasing proportions with increasing expression levels (Adya et al., 1998). Of note, these changes were accompanied by progressive deformations of the cytoskeletal structure into scattered speckles. Thus, it seems likely that the cytoskeletal actin filaments function to accommodate CBFβ-SMMHC as a reservoir that is prone to disintegrate upon saturation. Consistent with this speculation, ectopically overexpressed CBFβ-SMMHC was preferentially localized in the nucleus in a lymphoid cell line, BaF3, that retains a minimal cytoplasmic space with supposedly meager cytoskeletal actin (Cao et al., 1998; Kummalue et al., 2002). In this cell line, the nuclear accumulation of CBFβ-SMMHC and the resultant inhibition of RUNX1's activities were both shown to be tightly correlated with its ACD-dependent multimerization (Kummalue et al., 2002). The observed correlation between the nuclear accumulation of CBFβ-SMMHC and the ACD closely echoes the augmented nuclear export of the ACD-less mutant, β/SMMHCΔ95, as noted before. Accordingly, the results of Kummalue et al. (2002) may well be taken as a case of enhanced intranuclear sequestration (see below for a different interpretation). In this sequestration-centered view, the augmenting effect of the ACD on the nuclear import of CBFβ-SMMHC, per se, could be an indirect consequence of multimerization. Judging from its enhanced nuclear export as noted before, β/SMMHCΔ95 might well harbor an NES-like sequence that would be sterically masked in the intact CBFβ-SMMHC undergoing extensive ACD-dependent multimerization. Lastly, what is the situation in leukemic cells with inv(16)? A few previous reports consistently suggested approximately equal distributions between the nucleus and the cytoplasm (Liu et al., 1996; Kanto et al., 2000).

The corepressor-recruiting model was first suggested by Lutterbach et al. (1999) based on their findings that β/SMMHC formed ternary complexes with RUNX1 and the repressor protein mSin3A, and also that the C-terminal 163-amino acid region of the myosin tail (approximately corresponding to exons 38–42) acted as a transcriptional repressor when it was fused to the Gal4 protein and assayed using a Gal4-TK reporter system. Subsequently, Durst et al. (2003) demonstrated that the putative repressor domain interacts with the corepressor mSin3A and histone deacetylase 8 (HDAC8). mSin3A itself was further shown to interact with HDAC1 and HDAC2. As additional support for this model, the β/SMMHC-mediated repression in the Gal4-TK reporter system was impaired by an HDAC inhibitor, TSA. Quite interestingly, the ACD was again implicated in a critical role: its internal deletion resulted in concurrent impairments of the repression by β/SMMHC and of its ability to associate with both mSin3A and HDAC8. Taken together, these observations have two rather unexpected implications. Contrary to our preceding interpretation focusing on the sequestration model, RUNX1 bound in multimerized RUNX1–β/SMMHC complexes could make productive interactions with DNA carrying cognate-binding sites unless HDACs are allowed to intervene. In addition, multimerization of the coiled-coil domain could facilitate, rather than hinder, its association with repressor proteins. How can these events happen? Previous electron microscopic studies have shown that SMMHC, in its native state, has a side-polar structure in which the coiled-coil myosin tail is first dimerized in an antiparallel fashion and then the resultant dimeric units are consecutively stuck on their lateral sides in a manner staggered to each other, thereby forming an elongated single-layered thin strip that is laced with paired myosin head pieces on both sides (Xu et al., 1996). Within this molecular assemblage, the coiled-coil domain is supposed to have two exposed faces on the top and bottom of the plane of the strip. CBFβ-SMMHC multimers associated with RUNX1 are also likely to adopt the same side-polar structure, and hence could interact relatively freely with any ligand molecule coming into contact, either DNA or protein. Thus, the ACD-mediated multimerization could possibly play a more active role in supporting the CBFβ-SMMHC-mediated repression by converting RUNX1 into a dedicated transcriptional repressor, rather than simply keeping the heteromeric assemblage in a functionally inert state. Since the ACD is required for multimerization, it is expected that β/SMMHCΔ95 binds corepressors with lower affinity and represses transcription less efficiently, even though it sequesters Runx1 with higher efficiency, as described before. It is also worth noting that the HDAC dependency of the CBFβ-SMMHC-mediated repression has thus far been demonstrated only in the artificial experimental setting employing the Gal4-based reporter. An analogous experiment using an authentic RUNX1-dependent transcription system has been hampered, and deemed virtually impossible, because RUNX1 itself can associate with, and accordingly undergo inhibition by, mSin3A (Lutterbach et al., 1999; Imai et al., 2004).

Perspectives

Taken together, the foregoing considerations point to a unifying view that sequestration and repression both contribute to the function of CBFβ-SMMHC, and that the relative importance of each depends on the cell type and on experimental or physiological conditions. These dual inhibitory mechanisms, combined with its hyper-heterodimerization activity, would make CBFβ-SMMHC an all the more efficient dominant inhibitor of RUNX1 function.

Puzzlingly, however, there have also been various lines of evidence suggesting that the CBFβ-SMMHC-mediated inhibition is not always complete. Ectopic expression of CBFB-MYH11 in ES cells retaining one or two normal Cbfb alleles did not inhibit definitive hematopoiesis in an in vitro colony-forming assay (Miller et al., 2001). Ectopic expression of CBFB-MYH11 in 32Dcl3 premyeloid cells inhibited neither their G-CSF-induced differentiation to neutrophils nor their expression of endogenous myeloperoxidase, whose promoter is known to be RUNX1 dependent, although their growth was slowed at the G1 to S cell cycle transition (Cao et al., 1997). Similarly, induction of CBFB-MYH11 in Ba/F3 lymphoid cells did not inhibit, but rather stimulated, expression of mRNA from the endogenous p21WAF1 promoter (Cao et al., 1997), whereas the same promoter on a plasmid vector was, on the contrary, inhibited (Lutterbach et al., 1999). Moreover, dominant-negative Runx2 may cooperate with CBFβ-SMMHC for leukemogenesis in our mouse model (see details in the section Genes cooperating with CBFB-MYH11 for the pathogenesis of AML). One potential explanation for these observations is chromosomal context: whether a target gene is on the cellular chromosome or on a plasmid may make a difference. Coincidentally, all the studies evaluating the inhibitory potential of CBFβ-SMMHC referred to in the section Mechanisms of repression of RUNX1-mediated transcription by CBFβ-SMMHC employed plasmid-borne reporter systems. It is tempting to speculate that highly multimerized RUNX1 : CBFβ-SMMHC complexes, because of their large size, might kinetically and probabilistically be able to make only infrequent contacts with individual endogenous target genes embedded within giant chromosomal bodies. On the other hand, the same multimeric complexes could readily capture more diffuse plasmid-borne targets.

Nevertheless, the RUNX1 : CBFB-MYH11 multimer might have potentially hazardous consequences for cellular gene regulation in a more global context. Since the multimer has a dense array of RUNX1 attached to its elongated filamentous backbone, there would be a high probability that some segment(s) of it would randomly hit one or more of the numerous cognate-binding sites scattered all over the chromosomes. On hitting such a site(s), it would progressively expand its zone of chromosomal contacts from there on. Its eventual chromosomal engagement could be tenacious and long lasting, so that it might well interfere with various processes operating on the chromosome, including not only transcription but also replication, recombination, pairing, and segregation. To make matters possibly worse, open chromosomal regions that are actively functioning would serve as preferential targets. This speculative scenario, to be called ‘a chromosomal entrapment model’, may provide useful clues to various hitherto poorly addressed issues regarding the function of CBFβ-SMMHC, particularly its subtle dependencies on cell lineages, differentiation stages, gene contexts, and physiological or pathological conditions. A straightforward initial test for this newly suggested model would be to examine whether chromosomal DNA actually adheres to the CBFβ-SMMHC-generated nuclear inclusion bodies.

Gene expression changes in inv(16)+AML

As discussed above, even though it is not completely clear how CBFβ-SMMHC contributes to leukemogenesis, it is generally assumed that the fusion protein dominantly suppresses RUNX function in transcription regulation. It is then of interest to know which target genes have their expression affected by CBFβ-SMMHC, and which of these are critical for leukemogenesis. Unfortunately, we still do not have a definitive answer to this question. However, recently published studies on gene expression profiling with gene chips or cDNA microarrays have shown that AML cases with inv(16) have a pattern of gene expression that is distinct from other subtypes of AML.

Molecular classification and characterization of acute myeloid leukemia (AML)

Genome-wide expression profiling has been shown to be useful for the classifications of many types of cancer (Chung et al., 2002), and recently several groups demonstrated the feasibility of classifying AMLs and acute lymphoid leukemia (ALL) by expression profiling. AML subtypes can also be distinguished solely by gene expression profiling using microarray technology.

Golub et al. (1999) were able to distinguish ALL and AML based exclusively on gene expression profile in bone marrow samples of patients. Only 50 of the 6817 analysed genes were necessary to divide the 27 ALL and 11 AML patients into the different leukemia types. Their 50-gene predictor was able to diagnose 29 of the 34 independent samples correctly.
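
As a small worked example of the figures quoted above, the sketch below computes the fraction of independent samples diagnosed correctly (29 of 34) together with an approximate 95% Wilson score interval, to show how much uncertainty a test set of this size carries. The interval is an illustrative addition and not a statistic reported in the study; the helper name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


correct, total = 29, 34  # counts quoted in the text
accuracy = correct / total
low, high = wilson_interval(correct, total)
print(f"accuracy = {accuracy:.1%}")
print(f"approx. 95% interval = {low:.1%} to {high:.1%}")
```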

Schoch et al. (2002) specified the gene expression classification for the distinction of three different AML subtypes: t(8;21)(q22;q22), inv(16)(p13q22), and t(15;17)(q22;q12). Using two independent methodologies for class prediction in AML, Schoch et al. identified 36 genes to divide the three AML subtypes, but they were also able to make the discrimination with a minimal set of 13 genes. The most informative genes for inv(16) classification were PRKAR1B (downregulated), MYH11, and HOXB2 (upregulated). It is interesting that inv(16) and t(8;21) AML cases have distinct gene expression patterns. MYH11 and HOXB2 are again upregulated in inv(16) samples. In addition, a hypothetical gene DKFZP586N1922 is upregulated in inv(16) samples relative to t(8;21) cases. On the other hand, PRKAR1B expression is not significantly different between the two groups.

In a follow-up study, Kohlmann et al. (2003) expanded their study to discriminate eight different types of acute leukemia (AML and ALL). In this study, they included the AML expression data from their previous study (Schoch et al., 2002) and added AML samples with t(11q23)/MLL, with a normal karyotype, with a complex aberrant karyotype, or with trisomy 8. Only a set of 25 genes was sufficient to subclassify the AML types. In this study, MYH11 was again the most informative gene for the inv(16) classification. In addition, CD81 and LAMB2 were differentially expressed between t(11q23)/MLL and inv(16) cases; ECM1 was differentially expressed between inv(16) and cases with normal karyotype, complex karyotype, or trisomy 8; SELL and HLA-DPA1 were differentially expressed between inv(16) and t(15;17); and CDK2AP1 was differentially expressed between inv(16) and t(8;21).

Debernardi et al. (2003) studied five different AML subtypes based on their karyotype: t(8;21), t(15;17), inv(16), 11q23, and normal karyotype. With a set of 145 genes, they grouped the samples according to their cytogenetics. MYH11, SLC7A7, LRP1, PTPRM, SDR1, NCF2, and TGFB1 were found to be overexpressed in inv(16) samples. HOXA9, RUNX3, RAD52, and HOXA4 were found to be underexpressed in inv(16) samples. However, only a few genes overlap between the ‘most significant’ gene-sets of Debernardi et al., Schoch et al., and Kohlmann et al., primarily TGFB1 and MYH11 (Schoch et al., 2002; Debernardi et al., 2003; Kohlmann et al., 2003). Not surprisingly, in all studies, increased expression of MYH11 in inv(16) cases was most informative in acute leukemia classification. The level of expression of CBFB-MYH11 transcripts seems to have predictive value for treatment outcome (Schnittger et al., 2003).

Presumably, more overlaps would be seen if more genes were included in the predictive gene-set. Schoch et al. (2002) initially obtained 1000 differentially expressed genes to specify and identify the AML subtypes, but eventually used only 36 genes. It would be interesting to combine these data sets to confirm the predictive values of these genes.
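
The kind of comparison suggested here can be sketched with simple set operations. The example below uses only the genes that this review names for each study, so the lists are partial and the dictionary keys are hypothetical labels; the computed overlaps therefore illustrate the procedure and are not a reanalysis of the published gene-sets.

```python
from itertools import combinations

# Genes attributed to each study in this review (partial lists only;
# the published predictor sets contain many more genes).
gene_sets = {
    "Schoch2002": {"PRKAR1B", "MYH11", "HOXB2", "TGFB1"},
    "Kohlmann2003": {"MYH11", "CD81", "LAMB2", "ECM1", "SELL",
                     "HLA-DPA1", "CDK2AP1"},
    "Debernardi2003": {"MYH11", "SLC7A7", "LRP1", "PTPRM", "SDR1", "NCF2",
                       "TGFB1", "HOXA9", "RUNX3", "RAD52", "HOXA4", "HOXB2"},
}

# Genes named for every study.
common = set.intersection(*gene_sets.values())

# Genes shared by each pair of studies.
pairwise = {
    (a, b): gene_sets[a] & gene_sets[b]
    for a, b in combinations(gene_sets, 2)
}

print("shared by all:", sorted(common))
for pair, shared in pairwise.items():
    print(pair, sorted(shared))
```

With full gene lists from each study, the same operations would show directly how the overlap grows as the predictive gene-sets are enlarged.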

Candidates in the pathogenesis of AML associated with inv(16)

As mentioned above, only very few genes were in common among the published studies to have distinctive changes in inv(16) AML cases. These are discussed briefly below. It is not clear if any of the reported genes with altered expression are real targets of CBFβ-SMMHC. Further studies are needed to determine if CBFβ-SMMHC directly regulates their expression, and if their expression change plays any role in leukemogenesis.

One of the interesting genes, which could be relevant in the pathogenesis of AML M4eo, is the HOXA9 gene. In two studies (Golub et al., 1999; Debernardi et al., 2003), the expression difference of HOXA9 was used to discriminate the acute leukemia types. It is remarkable that HOXA9 expression was reduced in all inv(16) blasts in one study (Debernardi et al., 2003), whereas it was increased in AML compared with ALL cases in the other (Golub et al., 1999). It is also interesting that the expression level of HOXA9 was shown to be associated with the effectiveness of AML treatment.

Another HOX gene, HOXB2 gene, was elevated in inv(16) samples compared with the other subtypes of myeloid leukemia (Schoch et al., 2002; Debernardi et al., 2003). Besides the importance of the HOXB2 gene in embryogenesis, especially in the development of hindbrain (Sham et al., 1993), expression of the HOXB2 gene has been seen in different hematopoietic progenitors (Vieille-Grosjean and Huber, 1995). The exact function of the HOXB2 gene in hematopoiesis and its potential role in AML M4eo remain to be revealed.

Higher expression of the transforming growth factor β1 gene (TGFB1) was observed in inv(16) blasts (Schoch et al., 2002; Debernardi et al., 2003). Previous studies showed the pleiotropic effect of TGFB1 on cells from various tissues. TGFB1 has been shown to be a tumor suppressor, since it can inhibit cell growth and induce apoptosis (Coffey et al., 1988; Perlman et al., 2001). On the other hand, expression of TGFB1 seems to enhance migration and invasion of tumor cells (Akhurst and Derynck, 2001). Since TGFB1 plays an important role in differentiation and antiproliferation during hematopoiesis (Fortunel et al., 2000), TGFB1 and the genes in the TGFB1 signaling pathway (e.g. Smad family genes) are fascinating candidates as important players in the pathogenesis of AML M4eo and AML in general.

In inv(16) blasts, downregulation of one of the three RUNX family genes, RUNX3, was observed (Debernardi et al., 2003). Previous studies have shown that RUNX3 has an important role in the development of dorsal root ganglia and in cell proliferation of gastric epithelium (Levanon et al., 2002; Li et al., 2002; Inoue et al., 2003). However, RUNX3 also appears to have a role as a tumor suppressor gene in various cancers, including gastric, bile duct, and pancreatic cancers (Guo et al., 2002; Li et al., 2002; Wada et al., 2004). The role of RUNX3 in tumorigenesis, the expression of RUNX3 in hematopoietic cells, and the recent observation of the possible role of RUNX3 in hematopoiesis in zebrafish (Kalev-Zylinska et al., 2003) make RUNX3 a very interesting gene in leukemogenesis. However, no mutations in RUNX3 have been associated with AML (Otto et al., 2003).
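The limited overlap between studies can be made concrete with a simple tally. The following is a minimal sketch, assuming only the gene-to-study attributions cited in this section (not the complete published gene lists); the function names are hypothetical. It ignores the direction of change and the comparison group, so HOXA9 counts as shared even though the two studies report it in different contexts.

```python
from collections import defaultdict

# Gene/study attributions as cited in this section only.
# These are NOT the full gene lists of the original publications.
reported = {
    "Golub et al., 1999": {"HOXA9"},
    "Schoch et al., 2002": {"HOXB2", "TGFB1"},
    "Debernardi et al., 2003": {"HOXA9", "HOXB2", "TGFB1", "RUNX3"},
}


def genes_by_support(study_genes):
    """Map each gene to the sorted list of studies that report it."""
    support = defaultdict(list)
    for study, genes in study_genes.items():
        for gene in genes:
            support[gene].append(study)
    return {gene: sorted(studies) for gene, studies in support.items()}


def shared_genes(study_genes, min_studies=2):
    """Return genes reported by at least `min_studies` studies, sorted by name."""
    support = genes_by_support(study_genes)
    return sorted(g for g, s in support.items() if len(s) >= min_studies)


print(shared_genes(reported))  # ['HOXA9', 'HOXB2', 'TGFB1']
```

A real comparison would also need to track the direction of the expression change and the reference group used in each study before treating a gene as concordant.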

Summary

As shown in the recent reports described above, only a small set of genes is needed to distinguish subtype M4eo from other subtypes of acute leukemia. Not surprisingly, overexpression of MYH11, which is part of the fusion gene CBFB-MYH11, is very informative for AML classification. The importance and relevance of the other genes in the predictive gene sets for AML M4eo pathogenesis remain to be characterized.

It is likely that true target genes are yet to be identified. As noted above, there was very little overlap in gene expression profiling data generated by different groups. This lack of consistency has probably resulted from the heterogeneous nature of clinical samples (age, sex, stage of disease, percent of blasts in the sample, other chromosomal abnormalities, etc.) as well as from technical reasons. Different technologies, for example, SAGE vs microarray hybridization, may result in differences in data. Within a given technology, different probe sets (e.g. different versions of Affymetrix GeneChips) can lead to different results as well (Kohlmann et al., 2003).

Expanding the data sets by collecting more samples and combining the data sets will probably improve the reliability and reproducibility of gene expression profiling data from clinical samples. Alternatively, it would be useful to develop model systems that provide more homogeneous conditions. Cell culture systems simulating hematopoiesis in the presence or absence of CBFB-MYH11 expression, regulated at different stages of hematopoiesis, are potentially useful to identify direct targets of CBFB-MYH11. Animal models established to simulate leukemia development in the presence of CBFB-MYH11 will be useful to identify gene expression changes critical for leukemogenesis.

Genes cooperating with CBFB-MYH11 for the pathogenesis of AML

Cbfb-MYH11 knockin mouse model

We have previously generated a mouse model to study the function of CBFB-MYH11 by inserting the CBFB-MYH11 fusion gene into the mouse Cbfb locus in embryonic stem (ES) cells (Castilla et al., 1996). We demonstrated that CBFB-MYH11 dominantly inhibits Runx1/Cbfb functions, since embryos heterozygous for the Cbfb-MYH11 allele exhibited a phenotype similar to that of Runx1- or Cbfb-null embryos (Castilla et al., 1996; Okuda et al., 1996; Wang et al., 1996). In addition, we demonstrated that Cbfb-MYH11 blocks normal hematopoiesis. In the embryos, definitive hematopoiesis is blocked at the stem cell level, since c-kit+ colony-forming cells are nearly undetectable in the fetal livers of E11 embryos (Castilla et al., 1996; Kundu et al., 2002). In the adult chimeric mice, Cbfb-MYH11-containing ES cells contributed to the hematopoietic stem cells but not to the myeloid and lymphoid lineages, indicating a blockage of differentiation by Cbfb-MYH11 (Castilla et al., 1999; Kundu et al., 2002).

Cbfb-MYH11 chimeras did not develop myeloid leukemia, even though after prolonged latency they developed T-cell lymphoma at a frequency higher than wild-type mice (Castilla and Liu, unpublished data). No leukemia developed in another mouse model for Cbfb-MYH11, in which Cbfb-MYH11 was driven by a myeloid-specific promoter, MRP8 (Kogan et al., 1998). Therefore, it seemed likely that Cbfb-MYH11 by itself is not sufficient to initiate leukemia, at least in the mouse. To test this hypothesis, we treated the Cbfb-MYH11 knockin chimeras with ENU to induce random point mutations in the genome. AML developed in the Cbfb-MYH11 knockin chimeras 4–7 months after ENU treatment at a high frequency. On the other hand, wild-type mice treated with ENU did not develop myeloid leukemia and only 1/20 developed T-cell lymphoma within 1 year after ENU injection. This result supports the hypothesis that Cbfb-MYH11 predisposes mice to AML, but requires additional genetic events for full transformation (Castilla et al., 1999).

Retroviral insertional mutagenesis

To identify the potential cooperating genes, we treated newborn Cbfb-MYH11 knockin chimeric mice with the retrovirus 4070A to induce changes in the genome by insertional mutagenesis. Amphotropic murine leukemia retrovirus (MLV) 4070A can induce myeloid leukemia in DBA/2N mice that were undergoing an intense chronic inflammatory response, but not in mice without the inflammation (Wolff et al., 1991). 4070A treatment resulted in the development of AML in 63% (27/43) of Cbfb-MYH11 knock-in chimeras but none of the control wild-type mice (n=40) (Castilla et al., 2004). The latency was 3–6 months and the phenotype of the leukemia was similar to that of ENU-induced leukemia. Therefore, leukemia development is dependent on the expression of the Cbfb-MYH11 knock-in gene and independent of the mutagens used.

Viral insertion sites were detected by Southern blot hybridization and identified by inverse PCR and sequencing (Castilla et al., 2004). Southern blot hybridization with a viral probe showed that there were 1–3 insertions per leukemia, indicating that as few as one insertion is enough for Cbfb-MYH11 leukemogenic cooperation. This supposition was supported by the finding that a single insertion was retained in several independent secondary leukemias induced by bone marrow/spleen leukemic cell transplantation to isogenic recipients, which suggested that the leukemia was clonal and that a single insertion was important for the leukemogenesis. Using inverse PCR, the viral insertion sites were cloned and the surrounding sequences were determined. A total of 67 insertion sites were identified in 20 leukemia samples, or 3.3 insertions/leukemia. This number was slightly higher than the estimate from Southern blot hybridization, probably due to technical differences between the two approaches. In addition, viral insertions in the surrounding nonleukemic cells could have been cloned by the inverse PCR approach. Searching the mouse genomic databases revealed that 90% of the insertion sites were located within 10 kb of a gene. More than half of the insertions (34/67) were located either within 10 kb upstream of the transcription initiation site or within the 5′ UTR (part of the transcript but upstream of the ATG), suggesting that transcriptional activation of cellular genes is the most common mechanism of insertional mutagenesis in this study. The second most common location of viral insertion was in the introns of the protein-coding region, accounting for 24 of the total 67 insertions. Such events are predicted to truncate the encoded proteins. In addition, one insertion disrupted a single-exon gene (Edg6), which should also result in protein truncation. Only seven insertions were located more than 10 kb away from the flanking genes, and their effect on the flanking genes is unclear.
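The proportions quoted in this paragraph can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming only the counts stated in the text; the category labels and the function name are hypothetical. The four categories named in the text sum to 66, so one of the 67 insertions is left unclassified here rather than assigned by guesswork.

```python
TOTAL_INSERTIONS = 67
LEUKEMIAS = 20

# Counts as stated in the text; labels are shorthand chosen for this sketch.
counts = {
    "within 10 kb upstream or in 5' UTR": 34,
    "intron of protein-coding region": 24,
    "single-exon gene (Edg6)": 1,
    "more than 10 kb from flanking genes": 7,
}


def summarize(category_counts, total):
    """Return the percentage of all insertions in each category (one decimal)."""
    return {label: round(100 * n / total, 1) for label, n in category_counts.items()}


percentages = summarize(counts, TOTAL_INSERTIONS)
per_leukemia = TOTAL_INSERTIONS / LEUKEMIAS
unclassified = TOTAL_INSERTIONS - sum(counts.values())
near_gene = TOTAL_INSERTIONS - counts["more than 10 kb from flanking genes"]

for label, pct in percentages.items():
    print(f"{label}: {pct}%")
print(f"insertions per leukemia: {per_leukemia:.2f}")
print(f"within 10 kb of a gene: {100 * near_gene / TOTAL_INSERTIONS:.1f}%")
print(f"not classified by the text: {unclassified}")
```

Under these counts, 60 of 67 insertions (89.6%) lie within 10 kb of a gene, which agrees with the rounded 90% given above, and the mean is 3.35 insertions per leukemia.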

Candidate cooperating genes

Recent studies suggest that murine leukemia retroviruses may have a propensity to insert within genes, especially near transcription start sites (Wu et al., 2003). Therefore, some of the integrations identified above may result from random integration and may not contribute to leukemogenesis. Common integration sites, defined here as those with more than one independent insertion in our relatively small panel, would suggest nonrandom integration and biological significance. Six genes were targeted by insertions more than once: Plag1, Plagl2, D6Mm5e, H2T24, Myb, and Runx2 (Castilla et al., 2004).

Plag1 and Plagl2 are especially interesting since they encode highly homologous proteins that belong to a new family of zinc-finger proteins. Plag1 was targeted by insertions eight times and Plagl2 twice. Together, these two genes are the most frequent insertion sites in the panel (15% of total), suggesting that they are bona fide cooperating genes with Cbfb-MYH11 for leukemogenesis. PLAG1 was initially identified at the breakpoint of a chromosome translocation t(3;8)(p21;q12), which is commonly found in human pleomorphic adenoma, a benign tumor of the salivary glands (Kas et al., 1997). The translocation leads to promoter swapping between PLAG1 and CTNNB1, which encodes the constitutively expressed β-catenin. As a result of the translocation, PLAG1 is activated by the CTNNB1 promoter (Kas et al., 1997). Subsequently, it was discovered that PLAG1 can be activated through similar mechanisms by the promoters of LIFR (encoding leukemia inhibitory factor receptor) and the elongation factor SII gene in pleomorphic adenomas (Voz et al., 1998; Astrom et al., 1999). The insertions in the Plag1 and Plagl2 genes in the Cbfb-MYH11 chimeras were all close to the transcription start sites and are likewise predicted to upregulate Plag1 and Plagl2 expression (Castilla et al., 2004).
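The idea of a common integration site, and the 15% figure for Plag1 and Plagl2, can be illustrated as follows. This is a minimal sketch and not the procedure used in the study: the function name, the record format, and the toy records are hypothetical, and it makes the simplifying assumption that repeated hits in the same leukemia do not count as independent insertions.

```python
from collections import defaultdict


def common_integration_sites(records, min_independent=2):
    """Return {gene: n} for genes hit in at least `min_independent` distinct leukemias.

    `records` is an iterable of (leukemia_id, gene) pairs. Hits on the same gene
    within one leukemia are counted once (a simplifying assumption).
    """
    leukemias_by_gene = defaultdict(set)
    for leukemia_id, gene in records:
        leukemias_by_gene[gene].add(leukemia_id)
    return {
        gene: len(ids)
        for gene, ids in leukemias_by_gene.items()
        if len(ids) >= min_independent
    }


# Hypothetical toy records for illustration only; not data from the study.
toy_records = [
    ("L01", "geneA"), ("L02", "geneA"), ("L02", "geneB"),
    ("L03", "geneC"), ("L03", "geneC"),  # two hits in one leukemia: counted once
    ("L04", "geneB"),
]
print(common_integration_sites(toy_records))  # {'geneA': 2, 'geneB': 2}

# Counts stated in the text: Plag1 (8) and Plagl2 (2) out of 67 insertions.
plag_fraction = (8 + 2) / 67
print(f"{plag_fraction:.0%}")  # 15%
```

With only 20 leukemias, a threshold of two independent hits is a low bar, which is why the text treats such sites as candidates that require functional confirmation.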

The Plag1 protein contains seven zinc-fingers near the amino-terminus and a serine-rich region near the carboxy-terminus. Two highly homologous genes have been isolated recently, named Plagl1 and Plagl2 (Kas et al., 1998). The three proteins share 70–80% identity in the zinc-finger region, but only 19–35% identity in the serine-rich region. All three proteins transactivate gene expression in GAL4 fusion protein assays (Kas et al., 1998). Even though their zinc-finger regions are highly conserved, the three proteins have different preferences and binding affinities for target DNA sequences, with Plagl1 differing from Plag1 and Plagl2 (Hensen et al., 2002). Functionally, Plag1 and Plagl2 behave like classical oncogenes in NIH 3T3 transformation assays: they caused the cells to form foci and to grow in an anchorage-independent manner, and the cells expressing either of these two genes formed tumors in nude mice (Hensen et al., 2002). On the other hand, Plagl1 behaves like a tumor suppressor. Also known as Zac1 and Lot1, Plagl1 has been isolated independently as a candidate tumor suppressor gene since it is located on 6q25, a chromosomal region frequently deleted in many solid tumors (Abdollahi et al., 1997). In cell cultures, Plagl1 is able to induce apoptosis and arrest cells at G1, leading to inhibition of tumor cell proliferation (Spengler et al., 1997). Plag1 and Plagl2 upregulate insulin-like growth factor II (Kas et al., 1998; Voz et al., 2000; Hensen et al., 2002), while Plagl1 may serve as a coactivator of p53 (Huang et al., 2001b). Interestingly, we observed multiple insertions in the Plag1 and Plagl2 genes, but none in the Plagl1 gene, consistent with their oncogenic and tumor suppressive roles, respectively. This is the first time that the Plag1 and Plagl2 genes have been linked to leukemogenesis.

It is intriguing that three insertions were found in the Runx2 gene (Castilla et al., 2004). Runx2 is one of the three Runx family members that are obligate heterodimeric partners of Cbfβ. Runx2 plays a key role in osteogenesis and chondrocyte differentiation (Komori et al., 1997; Otto et al., 1997). In a CD2-Myc-transgenic model, Runx2 was found to be a frequent target for MLV insertion to induce T-cell lymphoma (Stewart et al., 1997). The insertions in the T-cell lymphomas were located close to the upstream promoter of the Runx2 gene, resulting in elevated expression (Cameron et al., 2003). Transgenic mice coexpressing Myc and Runx2 under the control of the CD2 promoter rapidly developed T-cell lymphoma, suggesting that Runx2 may serve as an oncogene and cooperate with Myc. The three insertions observed in the leukemic cells from the Cbfb-MYH11 chimeras, however, are located in intron 5 (Castilla et al., 2004). These insertions are predicted to result in the production of a truncated Runx2 protein, retaining the Runt-homology domain for DNA and Cbfβ binding but deleting the transactivation and repression domains located on the C-terminal side of the insertions. Therefore, the truncated protein is predicted to serve as a dominant repressor of Runx function, since it can bind Runx DNA target sequences but does not regulate transcription. If this prediction is verified by demonstrating that coexpression of a truncated Runx2 and Cbfb-MYH11 in transgenic mice cooperates in leukemia development, the result would suggest that Cbfb-MYH11 is a weak inhibitor of the Runx/Cbfb pathway, at least in mice, and that the Runx/Cbfb pathway is antileukemogenic in the myeloid lineage.

Two insertions were located 80 kb upstream of the Myb gene, close to the so-called Mml-3 locus, which is a frequent site for MLV insertions in published murine leukemia models (Haviernik et al., 2002). A recent study suggests that retroviral insertions may have long-range effects on Myb expression (Hanlon et al., 2003). Additionally, these insertions may affect the expression of other genes in the region. The functional consequences of insertions in the D6Mm5e and H2T24 genes are less clear. The function of D6Mm5e is not known, but insertions in the introns of this gene may affect the expression of Dok1, a gene located 18 kb downstream of D6Mm5e. Dok1 encodes P62-Dok, which is an adaptor protein that interacts with SHIP1 and Ras-GAP and is involved in the pathogenesis of CML (Wisniewski et al., 1994; Dunant et al., 2000). H2T24 is part of the MHC class I T-gene cluster and its role in leukemogenesis has not been demonstrated.

Summary and future directions

In recent years, a common theme in the pathogenesis of human acute myeloid leukemia has emerged, which is the cooperation between repressive mutations in transcription factors and gain-of-function mutations in receptor tyrosine kinases (Gilliland, 2002). Most common chromosomal translocations identified in human AML generate fusion proteins involving at least one transcription factor that has some role in the regulation of hematopoietic differentiation (Look, 1997). On the other hand, the most frequent mutations besides chromosomal translocations in AML have been found in genes encoding receptor tyrosine kinases, such as FLT-3 and c-kit (Gilliland and Griffin, 2002; Care et al., 2003). Therefore, it is logical to propose that two steps are required for complete leukemogenesis: one is a block of differentiation as a result of dysfunctional transcription factors, and the other is deregulated proliferation of immature blasts as a result of activated receptor tyrosine kinases (Gilliland, 2002). The retroviral insertion data in Cbfb-MYH11 mice support a model that is a variation of this common theme. Most of the common insertions are close to or within genes with demonstrated roles in oncogenesis, particularly in the regulation of the cell cycle and proliferation. However, these genes do not necessarily encode kinases; some encode transcription factors as well (Figure 4).

Figure 4. Two-step leukemogenesis contributed by CBFB-MYH11 and the candidate cooperating genes identified by retroviral insertional mutagenesis. dnRunx2: dominant-negative Runx2.

In the near future, the functional significance of these candidate cooperating genes close to viral insertion sites needs to be confirmed. Transgenic mice need to be generated that combine the expression of Cbfb-MYH11 with that of the candidate genes. Accelerated leukemia development in these mice would indicate that the candidate genes cooperate with Cbfb-MYH11 in leukemogenesis. Once this is confirmed, it will be important to find out whether the candidate genes also cooperate with CBFB-MYH11 in the pathogenesis of human AML. In addition, the study described above was relatively small in scale, with only 20 leukemia samples. Potential cooperating genes may have been hit only once or not at all and therefore overlooked in this study. Moreover, a different virus may insert in a different set of genes. Therefore, it is important to carry out the retroviral insertional mutagenesis again with a larger number of mice and using more than one strain of retrovirus. Last but not least, it will be interesting to find out whether these candidate genes also cooperate with AML1-ETO, the fusion gene involving the Runx1 gene (Miyoshi et al., 1991; Erickson et al., 1992). If the mechanism of leukemogenesis by Cbfb-MYH11 is through dominant suppression of Runx1, which is shared by AML1-ETO (Yergeau et al., 1997), at least some of the candidate genes should also be able to cooperate with AML1-ETO.