PlantTFDB
PlantRegMap/PlantTFDB v5.0
Plant Transcription Factor Database
Transcription Factor Information
Basic Information
TF ID Thecc1EG045048t1
Common Name TCM_045048
Organism
Taxonomic ID
Taxonomic Lineage
cellular organisms; Eukaryota; Viridiplantae; Streptophyta; Streptophytina; Embryophyta; Tracheophyta; Euphyllophyta; Spermatophyta; Magnoliophyta; Mesangiospermae; eudicotyledons; Gunneridae; Pentapetalae; rosids; malvids; Malvales; Malvaceae; Byttnerioideae; Theobroma
Family Trihelix
Protein Properties Length: 353aa    MW: 38754.9 Da    PI: 10.0265
Description Trihelix family protein
Gene Model
Gene Model ID     Type    Source  Coding Sequence
Thecc1EG045048t1  genome  CGD     View CDS
Signature Domain
No.  Domain    Score  E-value  Start  End  HMM Start  HMM End
1    trihelix  51.5   2.6e-16  57     140  1          86
          trihelix   1 rWtkqevlaLiearremeerlrrgklkkplWeevskkm....rergferspkqCkekwenlnkrykkikegekkrtsessstcpyfdqle 86 
                       +W++  v  L+ea+++++   +r+klk+++We+v++++      ++  ++++qCk+k+e+++kry+ + +++ +      s++p++ +l+
  Thecc1EG045048t1  57 EWSEGAVSSLLEAYENKWVLRNRAKLKGHDWEDVARYVsaraNCTKSPKTQTQCKNKIESMKKRYRSESATADG------SSWPLYPRLD 140
                       5*************************************844444455556679****************99997......4699999986 PP

Protein Features
3D Structure
Database  Entry ID  E-value  Start  End  InterPro ID  Description
Pfam      PF13837   6.9E-20  55     140  No hit       No description
Sequence
Protein Sequence    Length: 353 aa
MDKETNQENP SLLSNNNANS ITKEDCSPKK HPGSTAVIGG GSGGGGGGSN DRLKRDEWSE  60
GAVSSLLEAY ENKWVLRNRA KLKGHDWEDV ARYVSARANC TKSPKTQTQC KNKIESMKKR  120
YRSESATADG SSWPLYPRLD LLLRGSAPPP PQPPLQLQPP SAVPQAPAPL STNPPLTLSE  180
PSMVVVLQHQ QPPPLPPPSI PPQVPGTAQN SHGSNGVDRI PKEDGAGTKL SDHLSDKVAM  240
ETDSSTPALY SDKEKLRSKK LKMKMEKKKR RKKEEWEIAE SIRWLAEVVL KSEQARMETM  300
REIEKMRVEA EAKRGEMDLK RTEILANTQL EIARLFAGSS KGVDSSLRIG RS*
Nuclear Localization Signal
NLS
No.  Start  End  Sequence
1    258    269  KKLKMKMEKKKR
2    259    272  KLKMKMEKKKRRKK
3    266    270  KKKRR
4    266    271  KKKRRK
5    266    272  KKKRRKK
6    267    271  KKRRK
7    267    272  KKRRKK
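The coordinates in the NLS table can be cross-checked against the protein sequence listed in this record. The sketch below is illustrative only: the helper names (clean_sequence, slice_zero_based, slice_one_based) are hypothetical and are not part of PlantTFDB or any of its tools. For this record, the listed motifs line up with the sequence when Start and End are read as 0-based inclusive offsets, whereas the signature domain coordinates (57 to 140) line up with the 1-based numbering used in the sequence block.

```python
import re

# Sequence block copied from this record, including the spacing, the
# position counters and the terminal '*'.
RAW_SEQUENCE = """
MDKETNQENP SLLSNNNANS ITKEDCSPKK HPGSTAVIGG GSGGGGGGSN DRLKRDEWSE  60
GAVSSLLEAY ENKWVLRNRA KLKGHDWEDV ARYVSARANC TKSPKTQTQC KNKIESMKKR  120
YRSESATADG SSWPLYPRLD LLLRGSAPPP PQPPLQLQPP SAVPQAPAPL STNPPLTLSE  180
PSMVVVLQHQ QPPPLPPPSI PPQVPGTAQN SHGSNGVDRI PKEDGAGTKL SDHLSDKVAM  240
ETDSSTPALY SDKEKLRSKK LKMKMEKKKR RKKEEWEIAE SIRWLAEVVL KSEQARMETM  300
REIEKMRVEA EAKRGEMDLK RTEILANTQL EIARLFAGSS KGVDSSLRIG RS*
"""

# (start, end, motif) rows copied from the NLS table.
NLS_ROWS = [
    (258, 269, "KKLKMKMEKKKR"),
    (259, 272, "KLKMKMEKKKRRKK"),
    (266, 270, "KKKRR"),
    (266, 271, "KKKRRK"),
    (266, 272, "KKKRRKK"),
    (267, 271, "KKRRK"),
    (267, 272, "KKRRKK"),
]


def clean_sequence(raw: str) -> str:
    """Keep letters only: drops spaces, position counters and the stop '*'."""
    return re.sub(r"[^A-Za-z]", "", raw).upper()


def slice_zero_based(seq: str, start: int, end: int) -> str:
    """Residues start..end, both read as 0-based inclusive offsets."""
    return seq[start:end + 1]


def slice_one_based(seq: str, start: int, end: int) -> str:
    """Residues start..end, both read as 1-based inclusive positions."""
    return seq[start - 1:end]


SEQ = clean_sequence(RAW_SEQUENCE)

if __name__ == "__main__":
    for start, end, motif in NLS_ROWS:
        zero_ok = slice_zero_based(SEQ, start, end) == motif
        one_ok = slice_one_based(SEQ, start, end) == motif
        print(start, end, motif, "0-based:", zero_ok, "1-based:", one_ok)
```

Running the script prints one line per NLS row with the result of both readings, which makes any off-by-one mismatch visible when these coordinates are combined with the domain table.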
Regulation -- PlantRegMap
Source       Upstream Regulator  Target Gene
PlantRegMap  Retrieve            -
Annotation -- Protein
Source  Hit ID          E-value  Description
Refseq  XP_007010932.1  0.0      PREDICTED: trihelix transcription factor ASIL1
TrEMBL  A0A061FSU0      0.0      A0A061FSU0_THECC; Sequence-specific DNA binding transcription factors
STRING  EOY19742        0.0      (Theobroma cacao)
Orthologous Group
Lineage  Orthologous Group ID  Taxa Number  Gene Number
Malvids  OGEM9859              25           33
Best hit in Arabidopsis thaliana
Hit ID       E-value  Description
AT3G54390.1  1e-102   sequence-specific DNA binding transcription factors
Publications
  1. Motamayor JC, et al.
    The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color.
    Genome Biol., 2013. 14(6): p. r53
    [PMID:23731509]