PlantTFDB - Plant Transcription Factor Database @ CBI, PKU

Plant Transcription Factor Database

Transcription Factor Information

TF ID

The ID of the transcription factor as collected in PlantTFDB. For species with genome annotation, IDs from the genome annotation were adopted directly as the PlantTFDB ID. For species without genome annotation, a unique TF ID was assigned to each TF, consisting of three characters that represent the species (e.g. Aan represents Artemisia annua) and six digits.
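The assigned ID format described above can be checked with a short pattern match. This is only a minimal sketch: it assumes the three species characters are letters and are followed directly by the six digits with no separator, which the description does not state. The function name and the example ID `Aan000123` are hypothetical.

```python
import re

# Assumption: three letters (species code) immediately followed by six digits.
# The description does not say whether a separator is used.
ASSIGNED_TF_ID = re.compile(r"^([A-Za-z]{3})(\d{6})$")


def parse_assigned_tf_id(tf_id):
    """Return (species_code, serial_number), or None if the ID does not match."""
    match = ASSIGNED_TF_ID.match(tf_id.strip())
    if match is None:
        return None
    species_code, digits = match.groups()
    return species_code, int(digits)


# Hypothetical assigned ID for a species without genome annotation.
print(parse_assigned_tf_id("Aan000123"))
# An ID taken from a genome annotation does not follow this layout.
print(parse_assigned_tf_id("AT1G01010"))
```

IDs adopted from a genome annotation keep their original form, so a `None` result here only means the ID is not in the assigned format, not that it is invalid.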

Taxonomy

The taxonomic ID and lineage for each organism were collected from NCBI Taxonomy.

Common name

The common names of transcription factors were collected from TAIR10, MSU and UniProt.

Gene Model

The gene (data source) coding for this transcription factor.

Gene Model ID

The ID of the gene model, extracted from the original data source. Gene model IDs can be searched on the advanced search page.

Gene Model Type

The type of the gene model. There are three types of gene model in PlantTFDB:

'genome' -- gene models that came from genome annotation;

'PU_ref' -- gene models that came from PlantGDB and UniGene and were selected as the representative of a cluster of PUTs and UniGene entries;

'PU_unref' -- gene models that came from PlantGDB and UniGene but were not selected as the representative of a cluster of PUTs and UniGene entries.
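When processing downloaded records, the three gene model types listed above can be represented as an enumeration. The following is a sketch only: the string values are taken from the list, while the class and helper names are hypothetical and not part of PlantTFDB.

```python
from enum import Enum


class GeneModelType(Enum):
    """The three gene model types listed in the description."""
    GENOME = "genome"      # from genome annotation
    PU_REF = "PU_ref"      # PlantGDB/UniGene, chosen as cluster representative
    PU_UNREF = "PU_unref"  # PlantGDB/UniGene, not chosen as representative


def from_transcript_clusters(model_type):
    """True if the gene model came from PlantGDB and UniGene."""
    return model_type in (GeneModelType.PU_REF, GeneModelType.PU_UNREF)


def is_cluster_representative(model_type):
    """True only for models selected to represent a cluster."""
    return model_type is GeneModelType.PU_REF


# Enum lookup by value raises ValueError for an unknown type string.
model_type = GeneModelType("PU_ref")
print(from_transcript_clusters(model_type), is_cluster_representative(model_type))
```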

Source

The source from which the gene model was obtained.

Signature Domain

The domain used to identify and classify transcription factors.

Protein Features

Domain and other features identified by InterProScan v5.

Plant Ontology

Plant Ontology (PO) annotations were downloaded from TAIR10 for A. thaliana and from the Plant Ontology Consortium for other species.

Nuclear Localization Signal

Nuclear localization signal (NLS) predicted by predictnls.

3D Structure

The best BLAST hit from PDB.

Expression

The expression description (tissue specificity and developmental stage) was collected from UniProt. The best BLAST hits from UniGene, GEO and Genevisible, along with direct links to Expression Atlas, AtGenExpress and ATTED-II, were added.

Function description

Expert-curated functional descriptions were collected from UniProt, TAIR and GeneRIF.

Regulation

Manually curated regulations were collected from ATRM.

Interaction

Protein-promoter and protein-protein interaction data were collected from BioGRID, IntAct, and BIND.

Phenotype

Mutation information was collected from UniProt, T-DNA express, and riceGE.

Annotation

The best BLAST hit from GenBank, RefSeq, SwissProt, TrEMBL and STRING.

Link Out

Links to well-known resources such as Phytozome, wikigene, iHOP, etc.

Publications

Publications related to the corresponding TF were collected from Entrez gene, GeneRIF, UniProt and ATRM.

Multiple Sequence Alignment

Protein alignment

Multiple sequence alignments for full-length transcription factors were inferred using T-Coffee (v9.03).

Domain alignment

Multiple sequence alignments for domains were constructed using a Hidden Markov Model-guided method.

Phylogenetic Trees

Phylogenetic trees for TFs of the same family within a species, and for TFs within the same orthologous group, are inferred using MrBayes (v3.2.6) under the WAG model, run for 50,000 generations. The resulting trees are unrooted.

Phylogenetic trees for TFs of a family across all species are inferred using FastTree (v2.1.9) under the WAG model with 100 bootstrap replicates. The resulting trees are unrooted.

Quick Search

In the quick search box, you can search for a TF by TF ID or common name.

TF Prediction Server

The TF prediction server has been upgraded in this version. The family assignment rules and thresholds, determined by established methods (see details in the supplemental materials), are used to identify transcription factors in the input sequences. When users submit nucleic acid sequences, ESTScan 3.0 is used to identify their CDS regions and translate them into protein sequences. When the GC content of the input sequences is less than 48%, the ESTScan model trained on Arabidopsis thaliana mRNA is used; otherwise, the model trained on Oryza sativa is used. If "Best hit in Arabidopsis thaliana" is checked, links to the best hits in Arabidopsis thaliana are added to the results for predicted transcription factors. The server can be used to identify TFs in multiple sequences.
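The GC-content rule for choosing the ESTScan model can be illustrated with a small sketch. It is not the server's code: the description does not say how GC content is computed (per sequence or over the whole input, or how ambiguous bases are handled), so this version assumes a per-sequence fraction over unambiguous bases only. The function names are hypothetical.

```python
def gc_fraction(sequence):
    """GC fraction over unambiguous bases (A, C, G, T, U); an assumption,
    since the description does not say how ambiguous bases are treated."""
    bases = [base for base in sequence.upper() if base in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no unambiguous nucleotides")
    gc_count = sum(1 for base in bases if base in "GC")
    return gc_count / len(bases)


def choose_estscan_model(sequence, threshold=0.48):
    """Pick the training species by the stated rule: GC content below 48%
    selects Arabidopsis thaliana, anything else selects Oryza sativa."""
    if gc_fraction(sequence) < threshold:
        return "Arabidopsis thaliana"
    return "Oryza sativa"


print(choose_estscan_model("ATATATGC"))  # GC fraction 0.25
print(choose_estscan_model("GCGCGCAT"))  # GC fraction 0.75
```

A sequence at exactly 48% GC falls into the "otherwise" branch, matching the wording of the rule.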