Super-enhancers are important for controlling and defining the expression of cell-specific genes. As research on human disease and biological processes advances, human H3K27ac ChIP-seq datasets are accumulating rapidly, creating an urgent need to collect and process these data comprehensively and efficiently. More importantly, many studies have shown that super-enhancer-associated single nucleotide polymorphisms (SNPs) and transcription factors (TFs) strongly influence human disease and biological processes.
Here, we developed a comprehensive human super-enhancer database (SEdb, http://www.licpathway.net/sedb) that aims to provide a large number of available resources on human super-enhancers. The database is annotated with the potential functions of super-enhancers in gene regulation. Furthermore, SEdb provides detailed genetic and epigenetic annotation of super-enhancers, including common SNPs, motif changes, expression quantitative trait loci (eQTLs), risk SNPs, transcription factor binding sites (TFBSs), CRISPR/Cas9 target sites and DNase I hypersensitive sites (DHSs), for in-depth analyses of super-enhancers. SEdb will help elucidate super-enhancer-related functions and find potential biological effects.
The current version of SEdb documents a total of 331,601 super-enhancers and 1,992,738 super-enhancer elements across 540 samples, including samples from NCBI GEO/SRA, ENCODE, Roadmap and GGR. Notably, unlike existing super-enhancer databases, we manually curated and classified 410 available H3K27ac samples from more than 2,000 ChIP-seq samples in NCBI GEO/SRA. For all samples, super-enhancers were identified using a unified system environment and software parameters.
For more detailed statistics, please see the "Statistics" page.
SEdb calculated super-enhancers from H3K27ac ChIP-seq data. Genetic and epigenetic annotations were collected or calculated, including common SNPs, eQTLs, risk SNPs, TFBSs, CRISPR/Cas9 target sites, DHSs, enhancers, motif changes and LD SNPs. Users can query super-enhancers using seven options: Data sources, Biosample type, Tissue type, Biosample name, Chromosome, Start position and End position. SEdb also includes analytical tools and a personalized genome browser for exploring the potential biological effects of super-enhancers.
The 'Data-Browse' page is an interactive, alphanumerically sortable table that lets you quickly search for samples and customize filters by 'Data sources', 'Biosample type', 'Tissue type' and 'Biosample name'. Use the 'Show entries' drop-down menu to change the number of records per page. To view the super-enhancers of a given sample, click its 'Sample ID'.
Users define the scope of a super-enhancer query by specifying the sample and the genomic location of interest. Brief information on the search results is displayed in a table on the results page.
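As a rough illustration of this kind of query, the sketch below filters an in-memory list of super-enhancer records by sample and genomic window. The record fields, the sample IDs and the coordinates are hypothetical, not SEdb's actual schema or data. It also assumes that a record matches when it overlaps the requested window; SEdb may instead require full containment.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SuperEnhancer:
    # Hypothetical record layout, not SEdb's actual schema.
    se_id: str
    biosample_name: str
    chrom: str
    start: int
    end: int


def query_super_enhancers(
    records: List[SuperEnhancer],
    biosample_name: Optional[str] = None,
    chrom: Optional[str] = None,
    start: Optional[int] = None,
    end: Optional[int] = None,
) -> List[SuperEnhancer]:
    """Return records matching every filter that is not None.

    A record matches the window when it overlaps [start, end]
    (assumption: overlap, not full containment).
    """
    hits = []
    for rec in records:
        if biosample_name is not None and rec.biosample_name != biosample_name:
            continue
        if chrom is not None and rec.chrom != chrom:
            continue
        if start is not None and rec.end < start:
            continue  # record lies entirely left of the window
        if end is not None and rec.start > end:
            continue  # record lies entirely right of the window
        hits.append(rec)
    return hits
```

Leaving a filter as None skips it, which mirrors a search form in which some fields are left blank.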
Users can click Search→ENCODE→Cell Line→Haematopoietic and Lymphoid Tissue_Bone Marrow→K562; the search results will be displayed on the next page.
The 'Super-enhancer search result' page includes a sample overview, the search parameters, a pie chart of the distribution of super-enhancers across chromosomes, the search results table, sample file usage, annotation information usage and the software parameters used.
Users can click 'SE_01_03900001'; the detailed super-enhancer information will be displayed on the next page.
Overview information includes SE ID, Data source, Biosample type, Tissue type, Biosample name, genome location, Size, Super-enhancer Rank, ChIP Density (Case), ChIP Density (input) and Genome Browser.
This is a detailed display of the 'SE_01_03900001' annotation.
This is a detailed display of the 'SE_01_03900001' associated genes.
This is a detailed display of the 'SE_01_03900001' elements.
Links to other super-enhancers identified in different samples which overlap with 'SE_01_03900001'.
Users can click 'Detail'; the detailed super-enhancer element information will be displayed on the next page.
Users can submit a gene and search for super-enhancers associated with it in specified samples. Associations are obtained under five strategies: ROSE overlap, ROSE proximal, ROSE closest, Lasso and PreSTIGE.
Users can click Gene-SE analysis→Accurate analysis of Gene-Super-enhancer→ENCODE→Cell Line→Haematopoietic and Lymphoid Tissue_Bone Marrow→K562→Closest→FLJ35776; the gene's analysis results will be displayed on the next page.
Users can submit a gene and search for super-enhancers associated with it without specifying a sample. Associations are obtained under the same five strategies: ROSE overlap, ROSE proximal, ROSE closest, Lasso and PreSTIGE.
Users can click Gene-SE analysis→Fuzzy analysis of Gene-Super-enhancer→Closest→FLJ35776; the gene's analysis results will be displayed on the next page.
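To give a feel for what a 'closest' strategy does, here is a minimal sketch that assigns a super-enhancer to the gene whose transcription start site (TSS) is nearest to the super-enhancer midpoint. This is an assumption about the general idea only: ROSE's own implementation and the exact rules SEdb applies may differ. The gene names and coordinates are hypothetical, and all positions are assumed to be on the same chromosome.

```python
from typing import Dict, Optional


def closest_gene(se_start: int, se_end: int, tss_by_gene: Dict[str, int]) -> Optional[str]:
    """Return the gene whose TSS is nearest to the super-enhancer midpoint.

    Assumes all TSS positions are on the same chromosome as the
    super-enhancer. Ties are broken alphabetically by gene name so the
    result is deterministic. Returns None when no genes are given.
    """
    if not tss_by_gene:
        return None
    center = (se_start + se_end) // 2
    return min(tss_by_gene, key=lambda gene: (abs(tss_by_gene[gene] - center), gene))
```

The 'overlap' and 'proximal' strategies would replace the distance ranking with an interval-overlap test or a fixed distance window, respectively.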
Users can submit a common SNP and find the super-enhancers it falls into, the related annotation information for those super-enhancers, and its LD SNPs in five populations.
Motif changes are calculated using the R package 'atSNP'. Linkage disequilibrium (LD) SNPs are calculated using Phase 3 of the 1000 Genomes Project.
Users can click SNP-SE analysis→rs4881215; the SNP's analysis results will be displayed on the next page.
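The tool and threshold used to derive LD SNPs are not stated here. As background, the sketch below computes the standard r² statistic between two biallelic SNPs from phased haplotypes, which is the quantity commonly used to call two SNPs "in LD". The input encoding (0 for the reference allele, 1 for the alternate allele) and the function name are assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def ld_r_squared(haplotypes: List[Tuple[int, int]]) -> Optional[float]:
    """Compute r^2 between two biallelic SNPs from phased haplotypes.

    Each haplotype is a pair (allele_at_snp_a, allele_at_snp_b),
    coded 0 (reference) or 1 (alternate).
    Returns None if either SNP is monomorphic, where r^2 is undefined.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    p_a = sum(a for a, _ in haplotypes) / n            # alt allele frequency at SNP A
    p_b = sum(b for _, b in haplotypes) / n            # alt allele frequency at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    d = p_ab - p_a * p_b                               # disequilibrium coefficient D
    return d * d / denom
```

Computing this separately on the haplotypes of each population would give population-specific LD values.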
Users can submit a BED file and, by setting the percentage of overlap, identify super-enhancers that overlap with the regions in the submitted file.
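The overlap test can be sketched as follows. How SEdb defines the overlap percentage is not specified here, so this example assumes it is the fraction of the submitted region covered by the super-enhancer. Coordinates follow the BED convention (0-based start, exclusive end), and all function names and example regions are hypothetical.

```python
from typing import List, Tuple

Region = Tuple[str, int, int]  # (chromosome, start, end), BED-style half-open


def parse_bed(text: str) -> List[Region]:
    """Parse the first three columns of BED-formatted text."""
    regions = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines, comments and header lines
        fields = line.split()
        regions.append((fields[0], int(fields[1]), int(fields[2])))
    return regions


def overlap_fraction(region: Region, se: Region) -> float:
    """Fraction of `region` covered by `se` (assumed definition)."""
    if region[0] != se[0]:
        return 0.0
    length = region[2] - region[1]
    if length <= 0:
        return 0.0
    overlap = min(region[2], se[2]) - max(region[1], se[1])
    return max(0, overlap) / length


def overlapping_super_enhancers(
    regions: List[Region], super_enhancers: List[Region], min_fraction: float
) -> List[Tuple[Region, Region]]:
    """Return (region, super-enhancer) pairs meeting the overlap threshold."""
    return [
        (region, se)
        for region in regions
        for se in super_enhancers
        if overlap_fraction(region, se) >= min_fraction
    ]
```

This nested loop is quadratic in the number of regions. For large inputs, an interval tree or a sorted sweep would be the usual choice.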
To help users view the genomic neighbourhood of super-enhancers, we developed a personalized genome browser using JBrowse with useful tracks. Users can see the proximity of super-enhancers to nearby genes, genome segments, SNPs, common SNPs, risk SNPs, DHSs, enhancers, conserved TFBSs, TFBSs from ChIP-seq and conservation scores.
For data sharing, a CGI program was built specifically for SEdb. Users, especially website developers, only need to provide a genomic region and use the SEdb link to determine which super-enhancers overlap with that region. The returned data can be displayed directly on their own platform.
Users can use 'iframe' to get feedback and display super-enhancer data on their platform.
Instructions: http://www.licpathway.net/sedb/search/overlap_cgi.php?chr=(Chromosome number)&start=(Genome start position)&end=(Genome end position)
For example: http://www.licpathway.net/sedb/search/overlap_cgi.php?chr=chr18&start=3592913&end=3627821
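A small helper can build these links programmatically. The sketch below constructs the query URL from the three documented parameters (chr, start, end) and wraps it in an 'iframe' tag. The function names and the default width and height are illustrative choices, not part of SEdb. No request is sent, so the format of the returned page is not assumed.

```python
from html import escape
from urllib.parse import urlencode

SEDB_OVERLAP_CGI = "http://www.licpathway.net/sedb/search/overlap_cgi.php"


def build_overlap_url(chrom: str, start: int, end: int) -> str:
    """Build the SEdb overlap query URL for a genomic region."""
    if start < 0 or end < start:
        raise ValueError("expected 0 <= start <= end")
    query = urlencode({"chr": chrom, "start": start, "end": end})
    return f"{SEDB_OVERLAP_CGI}?{query}"


def build_iframe(chrom: str, start: int, end: int, width: int = 800, height: int = 400) -> str:
    """Return an HTML iframe tag embedding the overlap results.

    The width and height defaults are arbitrary illustrative values.
    """
    url = escape(build_overlap_url(chrom, start, end), quote=True)
    return f'<iframe src="{url}" width="{width}" height="{height}"></iframe>'


if __name__ == "__main__":
    print(build_overlap_url("chr18", 3592913, 3627821))
    print(build_iframe("chr18", 3592913, 3627821))
```

Escaping the URL before placing it in the src attribute keeps the '&' separators valid inside HTML.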
SE: super-enhancer
GGR: Genomics of Gene Regulation Project
DHS score: Indicates how dark the peak will be displayed
‘Important’ in the context of SE locations: To highlight the importance of this information, we added the label "important" after the super-enhancer's genomic location.
overlap rate: The overlap ratio between a given super-enhancer region and all super-enhancer regions.
super-enhancer element: An enhancer that is a constituent of a super-enhancer.
Data sources: Sources of sample, including ENCODE, Roadmap, NCBI GEO/SRA and GGR.
Biosample type: Cell type classification of samples.
Tissue type: Tissue type of the sample.
Biosample name: Composed of the cell/tissue/cell line name, treatment condition, processing time, etc.
Chromosome: Chromosome.
Start position: The start coordinate of the super-enhancer on the chromosome.
End position: The end coordinate of the super-enhancer on the chromosome.
External links: Link to an external database for more information
Reply: It may be because there is no corresponding annotation for the sample or the region has no result.
7.2: Why are some super-enhancers annotated with FANTOM5 enhancers?
Reply: When enhancers for the corresponding sample are not currently available, FANTOM5 enhancers are used as a reference for the annotation.
7.3: Why might web pages load slowly?
Reply: SEdb has advanced storage technology and sufficient bandwidth to load pages quickly for most users. However, a few users may still experience slow loading because of network conditions.
7.4: The database integrates H3K27ac ChIP-seq, input control sequencing and DNase I hypersensitivity data, but it is not clear how these datasets are matched for each cell type. Were only samples with all three data types considered?
Reply: To ensure the quality of super-enhancer identification, each H3K27ac sample collected by SEdb must have both H3K27ac ChIP-seq data and the corresponding input control sequencing data. A sample, and its cell type, is included in the database if super-enhancers were successfully identified in it using these two data types. DNase I hypersensitivity data are used only to annotate the identified super-enhancers: we match them to the super-enhancers of a sample/cell type if such data exist for that sample/cell type.
7.5: How are duplicate samples removed, e.g. ENCODE samples that are also contained in GEO/SRA?
Reply: Data in SEdb come from ENCODE, Roadmap, GGR and GEO/SRA. We first downloaded the data for ENCODE, Roadmap and GGR from the ENCODE/Roadmap website (www.encodeproject.org/). These data have already been de-duplicated on the website. When screening NCBI GEO/SRA data, we did not consider samples that appeared in ENCODE, Roadmap or GGR. Finally, all data from ENCODE, Roadmap, GGR and GEO/SRA were further de-duplicated manually according to the unique GEO/SRA series number.
The current version of SEdb was developed using MySQL 5.7.17 (http://www.mysql.com) and runs on a Linux-based Apache Web server (http://www.apache.org). We used PHP 7.0 (http://www.php.net) for server-side scripting. We designed and built the interactive interface using Bootstrap v3.3.7 (https://v3.bootcss.com) and jQuery v2.1.1 (http://jquery.com). We used ECharts (http://echarts.baidu.com) and D3 (https://d3js.org) as the graphical visualization frameworks, and JBrowse (http://jbrowse.org) as the genome browser framework. For the best display, we recommend a modern web browser that supports the HTML5 standard, such as Firefox, Google Chrome, Safari, Opera or IE 9.0+.
The SEdb database is freely available to the research community at http://www.licpathway.net/sedb. Users are not required to register or log in to access features in the database.
Material disclaimer
The materials and frameworks used by SEdb are publicly shared on the web and do not involve intellectual property infringement. If there is any infringement, please write to us and we will promptly make changes.
Source of material
http://image.baidu.com/search/detail?ct=503316480&z=0&ipn=d&word=%E6%84%9F%E8%B0%A2%E5%88%86%E4%BA%AB%E5%9B%BE%E7%89%87&step_word=&hs=2&pn=1&spn=0&di=162644303250&pi=0&rn=1&tn=baiduimagedetail&is=0%2C0&istype=0&ie=utf-8&oe=utf-8&in=&cl=2&lm=-1&st=undefined&cs=96333370%2C2322362863&os=1407405689%2C3128481222&simid=0%2C0&adpicid=0&lpn=0&ln=1830&fr=&fmq=1539419046163_R&fm=&ic=undefined&s=undefined&se=&sme=&tab=0&width=undefined&height=undefined&face=undefined&ist=&jit=&cg=&bdtype=0&oriquery=&objurl=http%3A%2F%2Fpic19.photophoto.cn%2F20110607%2F0013026326607935_b.jpg&fromurl=ippr_z2C%24qAzdH3FAzdH3Fooo_z%26e3Bri5p5ri5p5_z%26e3BvgAzdH3FrtvAzdH3Fabcnc9ab_z%26e3Bip4s&gsm=0&rpstart=0&rpnum=0&islist=&querylist=
http://www.cssmoban.com/