Widespread Enhancer Activity from Core Promoters.

Gene expression in higher eukaryotes is precisely regulated in time and space through the interplay between promoters and gene-distal regulatory regions, known as enhancers. The original definition of enhancers implies the ability to activate gene expression remotely, while promoters entail the capability to locally induce gene expression. Despite the conventional distinction between them, promoters and enhancers share many genomic and epigenomic features. One intriguing finding in the gene regulation field comes from the observation that many core promoter regions display enhancer activity. Recent high-throughput reporter assays along with clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9-related approaches have indicated that this phenomenon is common and might have a strong impact on our global understanding of genome organisation and gene expression regulation.

It is worth noting that many of the early characterised enhancers are located close to, or overlapping with, the promoter region of inducible genes, such as metallothioneins, histones of early cleavage stages, viral immediate-early genes (from some papovaviruses, cytomegaloviruses and retroviruses), heat-shock genes and the antiviral interferon genes [24] (Table 2). A characteristic example is the IFNb enhancer, one of the most thoroughly studied enhancers [26]. Although located immediately upstream of the IFNb gene, it can also function as a classical enhancer element, conferring virus infection-dependent activation of heterologous promoters even when it is placed kilobases away from the targeted promoter [27,28]. Interestingly, the enhancer activity of the IFNb promoter depends on loop formation mediated by critical sequence-specific transcription factors bound to the regulatory sequences [29]. A more recent study reported that a promoter located upstream of the adeno-associated virus type 2 (AAV2) genome also displays liver-specific enhancer activity, a finding that might explain the pathogenic association between AAV2 integration events and human hepatocellular carcinoma through insertional dysregulation of cancer driver genes via enhancer-mediated effects [30].

A common characteristic of most of the aforementioned promoters is that they are associated with inducible genes that have to respond quickly to environmental stress, a response that might take more time or be less efficient with a remote enhancer [24]. These early studies already highlighted that enhancers and promoters are very similar entities, with some gene promoters having the intrinsic properties to work as enhancers, and raised the possibility that enhancer-like promoters could regulate distal genes in their natural context.

II. Promoter-promoter interactions suggest distal regulation by gene promoters
Mammalian genomes are intricately and dynamically organized into higher-order conformations inside the micron-sized nuclear space [31]. Such three-dimensional (3D) organization of the genome is thought to have a role in the mechanisms of transcription regulation and coordination by mediating dynamic looping between distantly located cis-regulatory elements while enabling fine-tuning of gene expression. The development of different molecular methods for capturing the spatial organization of the genome (Box 1), such as Chromosome Conformation Capture (3C) and related techniques, has provided an unprecedented view of the 3D organization of the genome as well as the spatial resolution of interacting regions [31,32].

Besides the expected interactions between distal enhancers and promoters of target genes, several observations have led to the notion that promoters participate in long-range regulation of distal genes through promoter-promoter (P-P) interactions. Different 3C-based methods such as 3C carbon copy (5C) [33], Chromatin Interaction Analysis with Paired-End-Tag sequencing (ChIA-PET) [34-36], promoter capture Hi-C (CHi-C) [37-39] or HiChIP [40] have revealed extensive P-P interactions. In fact, based on promoter capture Hi-C approaches, P-P interactions represent ~30% of all promoter-centered interactions [41], suggesting that this particular type of multigene regulatory network is common in mammalian cells.

In general, promoters contact other promoters with similar expression levels [34, 36, 38], indicating that 3D contacts between promoters are non-random. Therefore, promoter interaction networks may facilitate the coordinated expression control of associated genes and allow for regulatory crosstalk between them. Within this hypothesis, it is plausible that a fraction of these P-P interactions represents a more specific regulatory circuitry, whereby a given promoter might regulate the activity of distal neighbour genes.

High-throughput reporter assays have several intrinsic caveats that might lead to over- or underestimation of the actual number of promoters with enhancer-like activity [2, 42]. These caveats include the size of the tested fragments, the heterologous promoters used in the assays, and the fact that candidate enhancers are studied outside their endogenous chromatin context, which is likely required for their in vivo function.

Another potential concern is that the enhancer activity in the reporter assays actually reflects intrinsic properties of the promoter (e.g. acting as a hotspot for the recruitment of transcription factors), which do not necessarily imply enhancer activity in vivo. That said, an equally valid argument is that episomal reporter assays allow enhancer function to be studied in an unbiased manner, independently of any "perturbing" chromatin or genomic context. In any case, it would be interesting to systematically assess enhancer activity from gene promoters using chromatinized episomal or viral-based high-throughput reporter assays [45, 52-54].

IV. In vivo assessment of distal gene regulation by promoter elements
As mentioned above, the fact that some promoters display enhancer capacity when tested in episomal reporter assays does not necessarily imply that they could influence other promoters in vivo. Therefore, a critical issue is whether gene promoters are able to function as bona fide enhancers by regulating distal gene expression in their endogenous context. A pioneering study showed that one enhancer of the α-globin locus, located within the intron of the Nbl1 gene, harbours intrinsic promoter activity and induces the expression of a non-coding isoform [55]; however, the physiological function of this non-coding transcript remains elusive.

The advent of clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 genome editing methods now makes it possible to systematically study the role of cis-regulatory elements in their endogenous context [56, 57] (Box 3). Several independent studies using CRISPR genome editing demonstrated that some promoters function as enhancers in their endogenous context (Figure 1A; Table 3). Using a CRISPR/Cas9-based promoter deletion strategy, we showed that selected promoters of coding genes with enhancer activity identified in a human STARR-seq reporter assay (i.e. ePromoters) are indeed involved in cis-regulation of distal gene expression in their natural context, therefore functioning as bona fide enhancers [46]. These ePromoters were shown to physically interact with the promoters of the regulated genes, in some cases involving several target genes, implying that in these P-P interactions, one promoter acts as an active regulatory element of the other(s). Interestingly, one of the model promoters retained significant enhancer activity after inversion, suggesting that, like classical distal enhancers, enhancer-like promoters might display orientation-independent enhancer activity.

[...] globin locus mentioned above, it is difficult to ascertain whether the tested regulatory element is a "functional" promoter of the lncRNA or rather a distal enhancer associated with a long eRNA.

The CRISPR/Cas9 approach has been implemented to assess enhancer function within large genomic regions surrounding a given gene of interest [42,56]. In these studies, a reporter gene introduced in place of the target gene is used to monitor gene expression. Then, a tiling single guide RNA (sgRNA) library covering the surrounding genomic regions is screened to identify deleted regions with potential enhancer elements. Interestingly, two independent studies performing such screens of cis-regulatory elements also found that the expression of some genes is controlled, at least partially, by distal gene promoters [60, 61] [...]

The advent of high-throughput sequencing has made it possible to map transcription initiation with unprecedented sensitivity and resolution [5]. This has revealed that cis-regulatory elements are commonly associated with transcriptional initiation sites flanking the regulatory sequences (Figure 2). Promoters can be associated with either unidirectional or bidirectional transcription; in the latter case, the signal intensity is biased towards the sense direction of the gene. Enhancers produce RNAs (eRNAs) in vivo [8,9,11] with an initiation and chromatin architecture similar to that of promoters [7, 10, 12, 64]. In particular, enhancers have been shown to generally produce bidirectional unstable transcripts with no particular orientation bias. While the functional relevance of eRNAs is not fully understood, it is clear that their relative abundance is positively correlated with enhancer activity [7, 12, 64].

In macrophages, promoters highly induced during the immune challenge are characterised by the presence of divergent transcription initiation in which the sense and antisense TSSs are separated by large distances [65]. This in turn correlates with enlarged nucleosome-depleted regions and enhancer-like features such as higher transcription factor occupancy, binding of p300 and high levels of H3K4me1 (Figure 2, middle panel). Thus, the size of the nucleosome-depleted region in bidirectional promoters appears to contribute toward enhancer-like properties. Reminiscent of these findings, promoters with enhancer activity are predominantly associated with bidirectional transcription [46]. Similarly, testing gene promoters for enhancer activity in Drosophila embryos revealed that, when bidirectionally transcribed, promoters could function as enhancers in vivo, while unidirectional promoters generally cannot [64]. Overall, these results point towards a unifying model whereby there is a continuum of cis-regulatory activity, with some elements acting strictly as either enhancer or promoter, others functioning predominantly as an enhancer with weak promoter activity or vice versa, and yet others having both strong promoter and enhancer activities [4-6, 10, 64] (Figure 2). This spectrum of activities might be highly correlated with the directionality of transcription, which likely reflects the underlying sequence properties. In this context, bidirectional transcription at enhancer-like promoters might provide enlarged nucleosome-depleted regions serving as hubs for transcription factor binding and establishment of highly active chromatin to further regulate or enhance proximal and distal gene expression (Figure 2, middle panel). This would be particularly relevant in the case of rapid and coordinated regulation of gene expression in response to environmental or intrinsic cellular stimuli.

Another outstanding question is whether promoter and enhancer activities of enhancer-like promoters are correlated (Figure 1B). Nguyen [...] promoter and enhancer activity, whereas for other ePromoters both activities were anti-correlated. Consistently, integrative analysis of epigenomes across human tissues revealed that a given genomic region could have epigenetic features of an enhancer or a promoter in different tissues, suggesting that the type of regulatory activity (i.e. enhancer or promoter) might be tissue-specific [66]. Therefore, it is plausible that, depending on the locus, enhancer-like promoters might either coordinate the mRNA expression of clusters of genes (for instance, upon stress response signalling) or display context-dependent enhancer or promoter activities (Figure 1B).

As might be expected, enhancer-like promoters interact with the promoters of regulated genes [40, 46, 60]. Moreover, the frequency of P-P interactions is higher when the interaction involves at least one enhancer-like promoter [46]. This suggests that one of the properties defining enhancer-like promoters might be to favour P-P interactions, likely by recruiting key transcription factors such as ZNF143 or YY1, which are two factors involved in looping [67, 68] and enriched at enhancer-like promoters [46]. However, in a given cell type, the number of promoters involved in P-P interactions surpasses the number of enhancer-like promoters that can be found in the same cells [46]. It is therefore likely that not all P-P interactions require an enhancer-like promoter. Alternatively, it is possible that not all enhancer-like promoters are detected by the enhancer reporter assays. Finally, whether enhancer-like promoters represent a hub of interactions with multiple genes needs to be explored in the future.

VI. Promoter-centered transcription factories
The expression of interacting genes within multigene complexes is generally well correlated, suggesting that 3D gene organization contributes to coordination of gene expression programs.

[...] multi-molecular assemblies of protein-nucleic acid complexes providing a general regulatory mechanism to compartmentalize membrane-less nuclear compartments [73]. However, the precise contribution of enhancer-like promoters within these transcription factories is currently unknown.

As mentioned above, the widespread occurrence of P-P interactions suggests that promoter-centered chromatin structure contributes to the 3D organisation of the genome and has provided a structural framework for the postulated transcription factories [34]. Indeed, the P-P interactions appear to define a subset of co-regulated promoters sharing genomic and structural regulatory properties, which may be critical for stabilizing the local 3D interactions and the activity of transcription factories. For instance, compared to the interactions between enhancers and promoters, the P-P interactions form a higher-order chromatin structure involving many loci, have highly coordinated expression, and are [...]

Given the overall contribution of enhancer-like promoters to the regulation of neighbour genes [40, 46, 58, 60], as well as the intrinsic features described in the previous section (frequent involvement in P-P interactions, high density of transcription factor binding, etc.), it is tempting to speculate that this type of promoter might play a key role within the transcription factories (Figure 3B). In this model, the enhancer-like promoters could either facilitate the assembly or maintenance of the transcription factories by tightening the P-P interactions or bring specific transcriptional regulators required for the regulation of the neighbour genes. In any case, it will be essential to investigate the specific contribution of enhancer-like promoters to the functioning of transcription factories.

VII. Genetic variation within promoters influences distal gene regulation
One of the major endeavours in genomic research in the past decade was the advent of Genome Wide Association Studies (GWAS) to identify genetic variants associated with candidate genes for human diseases.

[...] mechanisms. Integrating information about enhancer-like promoters (e.g. using high-throughput reporter assays) along with 3D interaction data, eQTL and disease-associated variants (e.g. GWAS) might lead to the discovery of disease-associated regulation by distal promoters (Figure 1C).

These findings also open up the intriguing possibility that developmental traits or disease-associated variants lying within a subset of promoters might directly impact distal gene expression. While there is still work to be done on understanding the molecular mechanisms that govern the enhancer-like activity of promoters in cell type-specific or response-specific regulatory systems (see Outstanding Questions), the "ePromoters" concept stresses the fact that the identification of regulatory [...]

Interacting genomic regions can be identified by chromosome conformation capture (3C) and its derivative methods, which involve cross-linking distal interacting DNA pieces, proximity ligation and sequencing to map the interactions ([32] and references therein). Variations of 3C can focus on interactions for a small number of genomic bait regions (4C), interactions within specific genomic domains (5C), or analyse the whole set of chromosomal interactions within a cell population (Hi-C).

Since the Hi-C technique requires very high sequencing coverage, alternative methods have been developed that allow exploration of the contacts of a subset of genomic regions with higher resolution at the same cost. Chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) [...]

[...] sorting) or quantitative (based on RNA-seq) and designed to test enhancer or promoter activity. Recent quantitative methods have been developed to characterize enhancers. In particular, two approaches, massively parallel reporter assay (MPRA) and self-transcribing active regulatory region sequencing (STARR-seq), have been widely used in recent years. The MPRA method consists of the generation of a library of reporter constructs based on microarray synthesis of DNA sequences (generally, tested sequences are cloned upstream of a basal promoter) and unique sequence tags or barcodes (placed in the 3' UTR of the reporter gene). To increase sensitivity and reproducibility, several barcodes can be assigned to any given sequence. The reporter library is then transfected into cell lines of interest and RNA sequencing of the barcodes is performed, thus providing a quantitative readout of the regulatory activity of the tested regions. STARR-seq is a massively parallel reporter assay (reviewed in [93]) that aims to identify and quantify transcriptional enhancers directly based on their activity across whole genomes. In brief, a pool of DNA fragments from arbitrary sources is cloned downstream of a core promoter and into the 3' UTR of a GFP reporter gene. Once in the cellular context, active enhancers will activate the promoter and transcribe themselves, resulting in reporter transcripts among cellular RNAs. Thus, each reporter transcript contains the reporter gene and the "barcode" of itself. These reporter transcripts can be isolated separately by targeted PCR and eventually detected by deep sequencing.
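The barcode-based readout described above for MPRA can be illustrated with a minimal sketch. It assumes that barcode counts are available both from the RNA and from the transfected plasmid DNA, and that activity is summarised as a log2 RNA/DNA ratio after pooling all barcodes of a tested sequence; the normalisation to DNA input, the pseudocount and all names (`mpra_activity`, `barcode_to_seq`) are illustrative assumptions and are not taken from the methods reviewed here.

```python
from collections import defaultdict
from math import log2


def mpra_activity(rna_counts, dna_counts, barcode_to_seq, pseudocount=1.0):
    """Return log2(RNA/DNA) per tested sequence, pooling its barcodes.

    rna_counts / dna_counts: dict mapping barcode -> read count (hypothetical input).
    barcode_to_seq: dict mapping barcode -> identifier of the tested sequence.
    pseudocount: added to both totals to avoid division by zero (an assumption).
    """
    rna_total = defaultdict(float)
    dna_total = defaultdict(float)
    for barcode, seq_id in barcode_to_seq.items():
        # Barcodes absent from a library are counted as zero reads.
        rna_total[seq_id] += rna_counts.get(barcode, 0)
        dna_total[seq_id] += dna_counts.get(barcode, 0)
    return {
        seq_id: log2((rna_total[seq_id] + pseudocount)
                     / (dna_total[seq_id] + pseudocount))
        for seq_id in dna_total
    }


# Toy example: two barcodes for a tested promoter, one for a control fragment.
barcode_to_seq = {"AAAC": "promA", "AAAG": "promA", "CCCT": "ctrl"}
rna = {"AAAC": 30, "AAAG": 33, "CCCT": 7}
dna = {"AAAC": 8, "AAAG": 7, "CCCT": 7}
activity = mpra_activity(rna, dna, barcode_to_seq)
print(activity)  # promA -> 2.0, ctrl -> 0.0
```

This sketch omits library-size normalisation and replicate handling, which a real analysis would require; it is only meant to show how several barcodes per sequence can be combined into one quantitative readout.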
The main advantage of STARR-seq over the classical MPRA is that the tested sequence itself is used as a "barcode", substantially simplifying the whole procedure of quantifying enhancer activity. Capture-based approaches can be used to enrich for particular regions of interest. For recent reviews on these methods, see [2,42].

Box 3. CRISPR/Cas9-based approaches to study cis-regulatory elements
Since its discovery, the clustered regularly interspaced short palindromic repeats (CRISPR)-associated protein 9 (Cas9) technology has been widely used for genome editing. This method makes it possible to target genomic DNA using a small RNA fragment (referred to as a single-guide RNA; sgRNA). The Cas9 enzyme recognizes the sgRNA/DNA complex and cuts the DNA, triggering the DNA repair system of the cell. This strategy can help to study cis-regulatory elements in their natural context:
I. Deletion of a cis-regulatory element by non-homologous end joining (NHEJ) repair using two sgRNAs flanking the regulatory region of interest (e.g. [46,58] [...]

Tables
Table 1: Features associated with active promoters and enhancers

| Features (active elements) | Promoter | Enhancer |
| --- | --- | --- |
| Intrinsic property | Induce transcription of a heterologous reporter gene | Activate a distal (heterologous) promoter |
| Transcription initiation | Unidirectional or divergent | Mainly divergent |
| Ratio between sense and antisense transcripts | Biased towards sense transcription | Equilibrated |
| Transcription elongation | Produce long polyadenylated transcripts | Some enhancers can produce low levels of polyadenylated transcripts |
| RNAPII and GTFs | Present | Present |
| CpG islands | Majority | Very rare |

Outstanding Questions
• What are the specific components within the promoter region driving promoter versus enhancer activity?
• Are promoter and enhancer activities correlated across different tissues?
• Do ePromoter-promoter interactions rely on similar mechanisms as previously shown for enhancer-promoter interactions?
• Are enhancer-like promoters a hub of P-P interactions?
• Are enhancer-like promoters involved in particular biological processes?
• Is the enhancer activity of promoters dependent on the genomic context?
• Is the regulation by enhancer-like promoters a specific process or rather an unspecific contribution to gene expression within transcription factories?
• Is enhancer activity from promoters evolutionarily conserved? Could enhancer-like promoters be associated with evolutionarily new genes originating from distal enhancer elements?
• Finally, what are the contributions of enhancer-like activity of promoters to disease?

Highlights
• Promoters and enhancers share architectural and functional properties.
• When tested on episomal reporters, many promoters display enhancer activity.

• In vivo experiments demonstrated that enhancer-like promoters function as bona fide enhancers.
• Genetic variants lying in enhancer-like promoters might impact physiological traits or diseases by altering the expression of distal genes.