1. What information is available in the SEdb?
2. What samples are included in the SEdb?
3. Database content and construction
4. How to use the SEdb?
5. CGI interface (Overlap with the user-submitted genome location)
6. Explanation of the definitions used by the website
7. Frequently Asked Questions
8. Development environment
1. What information is available in the SEdb?

Super-enhancers are important for controlling and defining the expression of cell-specific genes. As research on human disease and biological processes advances, human H3K27ac ChIP-seq datasets are accumulating rapidly, creating an urgent need to collect and process these data comprehensively and efficiently. More importantly, many studies have shown that super-enhancer-associated single nucleotide polymorphisms (SNPs) and transcription factors (TFs) strongly influence human disease and biological processes.

Here, we developed a comprehensive human super-enhancer database (SEdb) that aims to provide a large number of available resources on human super-enhancers. The database is annotated with the potential functions of super-enhancers in gene regulation. Furthermore, SEdb provides detailed genetic and epigenetic annotation of super-enhancers, including common SNPs, motif changes, expression quantitative trait loci (eQTLs), risk SNPs, transcription factor binding sites (TFBSs), CRISPR/Cas9 target sites and DNase I hypersensitivity sites (DHSs), for in-depth analyses of super-enhancers. SEdb will help elucidate super-enhancer-related functions and find potential biological effects.

2. What samples are included in the SEdb?

The current version of SEdb documents a total of 331,601 super-enhancers and 1,992,738 super-enhancer elements over 540 samples, including samples from NCBI GEO/SRA, ENCODE, Roadmap and GGR. In particular, unlike existing super-enhancer databases, we manually curated and classified 410 available H3K27ac samples from more than 2,000 ChIP-seq samples in NCBI GEO/SRA. For all samples, super-enhancers were identified using a unified system environment and software parameters.

For more detailed statistics, please see the "Statistics" page.

3. Database content and construction

SEdb identifies super-enhancers from H3K27ac ChIP-seq data. Genetic and epigenetic annotations were collected or calculated, including common SNPs, eQTLs, risk SNPs, TFBSs, CRISPR/Cas9 target sites, DHSs, enhancers, motif changes and LD SNPs. Users can query super-enhancers using seven options: Data sources, Biosample type, Tissue type, Biosample name, Chromosome, Start position and End position. SEdb also includes analytical tools and a personalized genome browser for discovering potential biological effects of super-enhancers.
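
The seven query options can be pictured as optional filters over a table of super-enhancer records. The following Python sketch is only an illustration: the record layout and field names are hypothetical and do not reflect SEdb's actual schema, and it assumes that the start and end positions select super-enhancers overlapping the queried interval.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class SuperEnhancer:
    # Hypothetical record layout; field names are illustrative only.
    data_source: str
    biosample_type: str
    tissue_type: str
    biosample_name: str
    chromosome: str
    start: int
    end: int


def query(
    records: Iterable[SuperEnhancer],
    *,
    data_source: Optional[str] = None,
    biosample_type: Optional[str] = None,
    tissue_type: Optional[str] = None,
    biosample_name: Optional[str] = None,
    chromosome: Optional[str] = None,
    start: Optional[int] = None,
    end: Optional[int] = None,
) -> List[SuperEnhancer]:
    """Return records matching every option that was supplied."""
    exact = {
        "data_source": data_source,
        "biosample_type": biosample_type,
        "tissue_type": tissue_type,
        "biosample_name": biosample_name,
        "chromosome": chromosome,
    }
    hits = []
    for rec in records:
        # Options left as None are ignored; supplied ones must match exactly.
        if any(v is not None and getattr(rec, k) != v for k, v in exact.items()):
            continue
        # Assumption: positions select records overlapping [start, end].
        if start is not None and rec.end < start:
            continue
        if end is not None and rec.start > end:
            continue
        hits.append(rec)
    return hits


# Made-up sample records for demonstration.
records = [
    SuperEnhancer("ENCODE", "Cell line", "Bone marrow", "K562", "chr1", 1000, 5000),
    SuperEnhancer("Roadmap", "Tissue", "Liver", "Liver", "chr1", 8000, 9000),
]
hits = query(records, data_source="ENCODE", chromosome="chr1", start=4000, end=6000)
print([h.biosample_name for h in hits])
```

Leaving an option unset simply skips that filter, which mirrors how a search form with optional fields usually behaves.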

4. How to use the SEdb?

The 'Data-Browse' page is an interactive table with alphanumeric sorting that lets you quickly search for samples and customize filters through 'Data sources', 'Biosample type', 'Tissue type', and 'Biosample name'. Use the 'Show entries' drop-down menu to change the number of records per page. To view the super-enhancers for a given sample, click its 'Sample ID'.

Gene-SE Analysis: Accurate analysis of Gene-Super-enhancer

Users can submit a gene and search for the super-enhancers associated with it in a specified sample. Gene-super-enhancer relationships are determined under five strategies: ROSE overlap, ROSE proximal, ROSE closest, Lasso and PreSTIGE.

For example, click Gene-SE analysis→Accurate analysis of Gene-Super-enhancer→ENCODE→Cell Line→Haematopoietic and Lymphoid Tissue_Bone Marrow→K562→Closest→FLJ35776; the analysis results for the gene are displayed on the next page.
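
The three ROSE strategies can be pictured as different rules for linking a super-enhancer to genes. The Python sketch below is an illustration under stated assumptions, not the code used by SEdb or ROSE: "overlap" is taken to mean that the gene body intersects the super-enhancer, "proximal" that the gene's transcription start site (TSS) lies within a window around it (the 50 kb default here is an assumption), and "closest" that the gene's TSS is nearest to the super-enhancer's center. All genes are assumed to lie on the same chromosome as the super-enhancer, the gene names are made up, and Lasso and PreSTIGE are not covered.

```python
from typing import List, NamedTuple, Optional


class Gene(NamedTuple):
    name: str
    start: int  # gene body start
    end: int    # gene body end
    tss: int    # transcription start site


def overlap_genes(se_start: int, se_end: int, genes: List[Gene]) -> List[str]:
    """Genes whose body intersects the super-enhancer (assumed definition)."""
    return [g.name for g in genes if g.start <= se_end and g.end >= se_start]


def proximal_genes(
    se_start: int, se_end: int, genes: List[Gene], window: int = 50_000
) -> List[str]:
    """Genes whose TSS lies within `window` bp of the super-enhancer."""
    return [g.name for g in genes if se_start - window <= g.tss <= se_end + window]


def closest_gene(se_start: int, se_end: int, genes: List[Gene]) -> Optional[str]:
    """Gene whose TSS is nearest to the super-enhancer center."""
    if not genes:
        return None
    center = (se_start + se_end) // 2
    return min(genes, key=lambda g: abs(g.tss - center)).name


# Made-up genes on one chromosome and a super-enhancer at 4,000-12,000.
genes = [
    Gene("GENE_A", 1_000, 5_000, 1_000),
    Gene("GENE_B", 60_000, 70_000, 60_000),
    Gene("GENE_C", 200_000, 210_000, 210_000),
]
print(overlap_genes(4_000, 12_000, genes))
print(proximal_genes(4_000, 12_000, genes))
print(closest_gene(4_000, 12_000, genes))
```

In this toy case the overlap rule returns one gene while the proximal rule returns two, which shows why the choice of strategy changes the reported associations.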

Gene-SE Analysis: Fuzzy analysis of Gene-Super-enhancer

Users can submit a gene and search for the super-enhancers associated with it without specifying a sample. Gene-super-enhancer relationships are determined under five strategies: ROSE overlap, ROSE proximal, ROSE closest, Lasso and PreSTIGE.

For example, click Gene-SE analysis→Fuzzy analysis of Gene-Super-enhancer→Closest→FLJ35776; the analysis results for the gene are displayed on the next page.

SNP-SE Analysis: Analyze common SNP in the Super-enhancer regions

Users can submit a common SNP to find the super-enhancers it falls into, the related annotation of those super-enhancers, and the LD SNPs in five populations.

Motif changes are calculated using the R package 'atSNP'. Linkage disequilibrium (LD) SNPs are calculated from Phase 3 of the 1000 Genomes Project.
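
Linkage disequilibrium between two biallelic SNPs is commonly summarized by r², computed from phased haplotypes as r² = D² / (p_A(1 − p_A) p_B(1 − p_B)), where D = p_AB − p_A p_B. The Python sketch below applies that standard formula to toy haplotypes. It is not the pipeline SEdb used, and the 0/1 allele encoding and function name are assumptions made for illustration.

```python
from typing import Sequence


def ld_r2(hap_a: Sequence[int], hap_b: Sequence[int]) -> float:
    """r^2 between two biallelic SNPs from phased haplotypes.

    Each sequence holds one allele (0 or 1) per haplotype, in the same order.
    """
    if len(hap_a) != len(hap_b) or len(hap_a) == 0:
        raise ValueError("haplotype vectors must be non-empty and equal in length")
    n = len(hap_a)
    p_a = sum(hap_a) / n  # frequency of allele 1 at SNP A
    p_b = sum(hap_b) / n  # frequency of allele 1 at SNP B
    # Frequency of haplotypes carrying allele 1 at both SNPs.
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    d = p_ab - p_a * p_b
    return d * d / denom


# Toy data: identical allele patterns are in complete LD.
print(ld_r2([0, 0, 1, 1], [0, 0, 1, 1]))
# Toy data: alleles that co-occur exactly as often as expected by chance.
print(ld_r2([0, 1, 0, 1], [0, 0, 1, 1]))
```

An LD SNP list is then typically obtained by keeping SNP pairs whose r² exceeds a chosen threshold within each population.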

For example, click SNP-SE analysis→rs4881215; the analysis results for the SNP are displayed on the next page.

Overlap Analysis

Users can submit a BED file and identify super-enhancers that overlap the submitted regions, setting the required percentage of overlap.
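
As a rough illustration of percentage-based overlap, the Python sketch below reports each super-enhancer whose overlap with a submitted BED region covers at least a chosen fraction of that region. How SEdb defines the percentage (relative to the submitted region, the super-enhancer, or both) is not stated here, so the denominator used is an assumption, as are the function names and the sample coordinates.

```python
from typing import Iterable, List, Tuple

# (chromosome, start, end); BED coordinates are 0-based and half-open.
Region = Tuple[str, int, int]


def parse_bed(text: str) -> List[Region]:
    """Read the first three columns of BED-formatted text."""
    regions = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.append((chrom, int(start), int(end)))
    return regions


def overlapping_ses(
    regions: Iterable[Region],
    super_enhancers: Iterable[Region],
    min_fraction: float = 0.5,
) -> List[Tuple[Region, Region, float]]:
    """Pair each query region with super-enhancers covering >= min_fraction of it."""
    ses = list(super_enhancers)
    hits = []
    for q_chrom, q_start, q_end in regions:
        q_len = q_end - q_start
        if q_len <= 0:
            continue  # ignore empty or inverted intervals
        for s_chrom, s_start, s_end in ses:
            if s_chrom != q_chrom:
                continue
            shared = min(q_end, s_end) - max(q_start, s_start)
            # Assumption: the percentage is measured against the query region.
            if shared > 0 and shared / q_len >= min_fraction:
                hits.append(
                    ((q_chrom, q_start, q_end), (s_chrom, s_start, s_end), shared / q_len)
                )
    return hits


bed_text = "chr1\t100\t200\nchr2\t0\t50\n"
ses = [("chr1", 150, 400), ("chr1", 190, 300), ("chr2", 100, 200)]
for query_region, se, fraction in overlapping_ses(parse_bed(bed_text), ses, 0.5):
    print(query_region, se, fraction)
```

The nested loop is adequate for small inputs; for genome-scale region sets, a sorted sweep or an interval tree would be the usual choice.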

To help users view the genomic neighborhood of super-enhancers, we developed a personalized genome browser using JBrowse with useful tracks. Users can see the proximity of super-enhancers to nearby genes, genome segments, SNPs, common SNPs, risk SNPs, DHSs, enhancers, TFBS conserved, TFBS by ChIP-seq, and conservation scores.

Super-enhancer and super-enhancer element data for all samples are available on the 'Download' page in both .bed and .csv formats.

5. CGI interface (Overlap with the user-submitted genome location)

To support data sharing, a CGI program was built specifically for SEdb. Users, especially website developers, only need to provide a genomic region and use the SEdb website link to determine which super-enhancers overlap that region. The returned data can be displayed directly on their own platform.

Users can use 'iframe' to get feedback and display super-enhancer data on their platform.

Instructions: number)&start=(Genome start position)&end=(Genome end position)

For Example:
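
A minimal Python sketch of how a developer might assemble such a link and wrap it in an 'iframe'. Only the `start` and `end` parameter names are taken from the instructions above. The base URL (`https://example.org/overlap.cgi`) and the chromosome parameter name (`chr`) are placeholders, not SEdb's actual values, and must be replaced with those of the real link.

```python
from html import escape
from urllib.parse import urlencode


def build_overlap_url(
    base_url: str, chrom_param: str, chrom: str, start: int, end: int
) -> str:
    """Build a query URL for a genomic region.

    base_url and chrom_param are supplied by the caller because their
    real values are not given here.
    """
    if start < 0 or end < start:
        raise ValueError("expected 0 <= start <= end")
    query = urlencode({chrom_param: chrom, "start": start, "end": end})
    return f"{base_url}?{query}"


def iframe_html(url: str, width: int = 800, height: int = 400) -> str:
    """Wrap the URL in an iframe tag, escaping it for use in an HTML attribute."""
    return (
        f'<iframe src="{escape(url, quote=True)}" '
        f'width="{width}" height="{height}"></iframe>'
    )


# Placeholder values only; substitute the real link and parameter name.
url = build_overlap_url("https://example.org/overlap.cgi", "chr", "chr1", 1000, 5000)
print(url)
print(iframe_html(url))
```

Escaping the URL before placing it in the `src` attribute keeps the `&` separators valid in HTML.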

6. Explanation of the definitions used by the website

SE: super-enhancer

GGR: Genomics of Gene Regulation Project

DHS score: Indicates how dark the peak will be displayed

‘Important’ in the context of SE locations: To highlight the importance of this information, we added the label "important" after the super-enhancer's genomic location.

overlap rate: The overlap ratio between a given super-enhancer region and all super-enhancer regions.

super-enhancer element: An enhancer that is a constituent of a super-enhancer.

Data sources: Sources of the samples, including ENCODE, Roadmap, NCBI GEO/SRA and GGR.

Biosample type: Cell type classification of the samples.

Tissue type: Tissue type of the sample.

Biosample name: The biosample name consists of the cell/tissue/cell line name, treatment condition, processing time, etc.

Chromosome: The chromosome on which the super-enhancer is located.

Start position: The start coordinate of the super-enhancer on the chromosome.

End position: The end coordinate of the super-enhancer on the chromosome.

External links: Links to external databases for more information.

7. Frequently Asked Questions
7.1: Why does some information return no results?

Reply: The sample may have no corresponding annotation, or the region may have no results.

7.2: Why are some super-enhancers annotated with FANTOM5 enhancers?

Reply: Because enhancers for the corresponding sample are not yet available, FANTOM5 enhancers are used as a reference for the annotation.

7.3: Why might web pages load slowly?

Reply: SEdb has sufficient storage and bandwidth to load pages quickly for most users. However, a few users may experience slow loading because of their own network conditions.

7.4: The database integrates H3K27ac ChIP-seq, input control sequencing, and DNase I hypersensitivity data, but it is not clear how these datasets are matched for each cell type. Were only samples with all three data types considered?

Reply: To ensure the quality of super-enhancer identification, each H3K27ac sample collected by SEdb must include H3K27ac ChIP-seq data and the corresponding input control sequencing data. A sample, and its cell type, is included in the database if super-enhancers were successfully identified in it using the H3K27ac ChIP-seq and the corresponding input control sequencing data. DNase I hypersensitivity data are used only to annotate the identified super-enhancers. We match DNase I hypersensitivity data to the super-enhancers of a sample/cell type in the database if such data exist for that sample/cell type.

7.5: How are duplicate samples removed, e.g. ENCODE samples contained within GEO/SRA?

Reply: Data in SEdb come from ENCODE, Roadmap, GGR, and GEO/SRA. We first downloaded the data for ENCODE, Roadmap, and GGR from the ENCODE/Roadmap website. These data have already been de-duplicated on the website. When screening NCBI GEO/SRA data, we excluded samples that appeared in ENCODE, Roadmap, or GGR. Finally, all data from ENCODE, Roadmap, GGR, and GEO/SRA were further de-duplicated manually according to the unique GEO/SRA series number.

8. Development environment

The current version of SEdb was developed using MySQL 5.7.17 and runs on a Linux-based Apache web server. We used PHP 7.0 for server-side scripting. We designed and built the interactive interface using Bootstrap v3.3.7 and jQuery v2.1.1 (http://jquery.com). We used ECharts and D3 as the graphical visualization frameworks and JBrowse as the genome browser framework. For the best display, we recommend a modern web browser that supports the HTML5 standard, such as Firefox, Google Chrome, Safari, Opera or IE 9.0+.

The SEdb database is freely available to the research community through its website. Users are not required to register or log in to access features of the database.

Material disclaimer

The materials and frameworks used by SEdb are publicly shared online and do not infringe intellectual property rights. If you believe there is an infringement, please write to us and we will correct it promptly.

Source of material