Description

Transcription starts at genomic positions called transcription start sites (TSSs) to produce RNAs, and is mainly regulated by genomic elements and transcription factors that bind around these TSSs. This suggests that TSSs may be a better unit for integrating various data sources related to transcriptional events, including the regulation and production of RNAs. We constructed version 4 of a reference data set of TSSs (refTSS) for the human and mouse genomes by updating refTSS version 3 with additional publicly available TSS sequencing data sets. The data set consists of the genomic coordinates of TSS peaks and their gene annotations for human and mouse. We also developed a new web interface for browsing refTSS (https://reftss.riken.jp/).

Methods

We collected publicly available human and mouse 5'-end sequencing data from public repositories and databases, as described previously (PMID: 31075273). After obtaining the 5'-end sequence data, we applied the following steps:

  1. Reprocessing of 5'-end sequence data
  2. Conversion of the genomic coordinates to the latest genome assembly
  3. Reprocessing of raw sequence reads
  4. Reprocessing of the mapped reads in BAM format
  5. Quality evaluation of TSS sequencing data
  6. Integration of TSS regions with the previous version of refTSS
  7. Assignment of new IDs to the integrated TSS regions

For the update to version 4, we merged all additional TSS data with the refTSS version 3 coordinates. Finally, we annotated refTSS using publicly available annotations.
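
The integration and ID assignment steps can be pictured with a small sketch. It is only an illustration under stated assumptions, not the procedure actually used to build refTSS: it assumes that regions on the same chromosome and strand are merged when they overlap or are directly adjacent, and it uses a hypothetical ID prefix (`example_`). The names `Region`, `merge_regions`, and `assign_ids` are invented for this example, and the coordinates are made up.

```python
from typing import List, NamedTuple, Tuple


class Region(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive (BED convention)
    end: int     # exclusive
    strand: str


def merge_regions(regions: List[Region]) -> List[Region]:
    """Merge same-chromosome, same-strand regions that overlap or touch.

    Assumption: this simple interval union stands in for the real
    integration rule, which is not specified in this document.
    """
    merged: List[Region] = []
    ordered = sorted(regions, key=lambda r: (r.chrom, r.strand, r.start, r.end))
    for region in ordered:
        last = merged[-1] if merged else None
        if (
            last is not None
            and last.chrom == region.chrom
            and last.strand == region.strand
            and region.start <= last.end
        ):
            # Extend the previous region instead of adding a new one.
            merged[-1] = last._replace(end=max(last.end, region.end))
        else:
            merged.append(region)
    return merged


def assign_ids(regions: List[Region], prefix: str = "example_") -> List[Tuple[str, Region]]:
    """Give each region a sequential ID (hypothetical naming scheme)."""
    return [(f"{prefix}{i}", region) for i, region in enumerate(regions, start=1)]


# Made-up coordinates standing in for version 3 regions and additional TSS data.
previous = [Region("chr1", 100, 120, "+"), Region("chr1", 500, 520, "-")]
additional = [
    Region("chr1", 115, 130, "+"),
    Region("chr1", 510, 515, "+"),
    Region("chr2", 10, 20, "+"),
]

merged = merge_regions(previous + additional)
for region_id, region in assign_ids(merged):
    print(region_id, region.chrom, region.start, region.end, region.strand)
```

In this sketch the two overlapping plus-strand regions on chr1 collapse into one, while the minus-strand region stays separate because strand is part of the merge key.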

The individual tracks provide the genomic coordinates of TSS peaks for refTSS and for the processed source 5'-end data sets.

Data files

The data files (BED / text) are available for download from https://reftss.riken.jp/datafiles/.
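
As a rough guide to working with the BED files, the following sketch reads BED records using only the standard BED conventions (tab-delimited columns, 0-based start, exclusive end). The exact columns present in the refTSS files are not described here, so the sketch requires only the first three columns and treats name and strand as optional. `BedRecord`, `read_bed`, and the sample lines are hypothetical and do not come from the actual data files.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class BedRecord(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    name: str    # "." when the column is absent
    strand: str  # "." when the column is absent


def read_bed(handle: TextIO) -> Iterator[BedRecord]:
    """Yield records from a tab-delimited BED stream, skipping header lines."""
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            raise ValueError(f"BED line has fewer than 3 columns: {line!r}")
        name = fields[3] if len(fields) > 3 else "."
        strand = fields[5] if len(fields) > 5 else "."
        yield BedRecord(fields[0], int(fields[1]), int(fields[2]), name, strand)


# Made-up sample content; a downloaded file would be opened with open(path) instead.
SAMPLE = (
    "# hypothetical example lines\n"
    "chr1\t100\t130\texample_1\t0\t+\n"
    "chr2\t10\t20\texample_2\t0\t-\n"
)

records = list(read_bed(io.StringIO(SAMPLE)))
for record in records:
    # Peak width follows directly from the half-open coordinates.
    print(record.name, record.chrom, record.strand, record.end - record.start)
```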

Contact Information

This track hub was prepared by Masaki Suimye Morioka, Laboratory for Large-Scale Biomedical Data Technology, RIKEN IMS.

Please send us any questions regarding this track hub and the underlying data.

References

Abugessaisa I, Noguchi S, Hasegawa A, Kondo A, Kawaji H, Carninci P, Kasukawa T. (2019). "refTSS: A Reference Data Set for Human and Mouse Transcription Start Sites." J Mol Biol. doi: 10.1016/j.jmb.2019.04.045. PMID: 31075273.