PlantTFDB
PlantRegMap/PlantTFDB v5.0
Plant Transcription Factor Database
Transcription Factor Information
Basic Information
TF ID Thecc1EG017892t1
Common Name TCM_017892
Organism Theobroma cacao
Taxonomic ID
Taxonomic Lineage
cellular organisms; Eukaryota; Viridiplantae; Streptophyta; Streptophytina; Embryophyta; Tracheophyta; Euphyllophyta; Spermatophyta; Magnoliophyta; Mesangiospermae; eudicotyledons; Gunneridae; Pentapetalae; rosids; malvids; Malvales; Malvaceae; Byttnerioideae; Theobroma
Family HD-ZIP
Protein Properties Length: 675aa    MW: 74995.6 Da    PI: 6.9853
Description HD-ZIP family protein
Gene Model
Gene Model ID     Type    Source  Coding Sequence
Thecc1EG017892t1  genome  CGD     View CDS
Signature Domain
No.  Domain    Score  E-value  Start  End  HMM Start  HMM End
1    Homeobox  63.6   2.9e-20  28     79   5          56
                      SS--HHHHHHHHHHHHHSSS--HHHHHHHHHHCTS-HHHHHHHHHHHHHHHH CS
          Homeobox  5 ttftkeqleeLeelFeknrypsaeereeLAkklgLterqVkvWFqNrRakek 56
                      +++t++q+ +Le++F+++++p+ ++r++L+++lgL+ +q+k+WFqN+R++ k
  Thecc1EG017892t1 28 HRHTAHQISTLEAYFKECPHPDDNQRRQLSNQLGLEPKQIKFWFQNKRTQTK 79
                      5789********************************************9988 PP

2    START     134.2  1.4e-42  196    415  3          206
                       HHHHHHHHHHHHHHC-TT-EEEEEXCCTTEEEEEEESSS............SCEEEEEEEECCSCHHHHHHHHHCCCGGCT-TT-S....EEE CS
             START   3 aeeaaqelvkkalaeepgWvkssesengdevlqkfeeskv...........dsgealrasgvvdmvlallveellddkeqWdetla....kae 80 
                       a +a++el+++ +++ep+W ks+  ++g + +q+++ +k+            ++e +++s++v+m++a+lv  +ld   +W   ++    ka+
  Thecc1EG017892t1 196 AASAMDELIRLLQVNEPLWAKSP--SDGRYAIQRESYQKTfpratrlrspsARIESSKDSALVTMNAAQLVDMFLDAD-KWVDLFPtivtKAK 285
                       6789*******************..6666666666666666666799*9*****************************.99999988899*** PP

                       EEEEECTT......EEEEEEEEXXTTXX-SSX.EEEEEEEEEEE.TTS-EEEEEEEEE-TTS--.-TTSEE-EESSEEEEEEEECTCEEEEEE CS
             START  81 tlevissg......galqlmvaelqalsplvp.RdfvfvRyirqlgagdwvivdvSvdseqkppesssvvRaellpSgiliepksnghskvtw 166
                       t++ +++         lqlm+  +  lsp v+ R+f+f+R+++q + g wv+vdvS + + +++ +s++ +   +pSg++i++++ng skvtw
  Thecc1EG017892t1 286 TIQLLETRmvgnknVSLQLMYERMHILSPFVApREFYFLRHCKQIETGLWVLVDVSYSYCFFKE-TSHSWK---FPSGCMIQEMPNGCSKVTW 374
                       ******99999999***************9999******************************9.899888...******************* PP

                       EE-EE--SSXX.HHHHHHHHHHHHHHHHHHHHHHTXXXXXX CS
             START 167 vehvdlkgrlp.hwllrslvksglaegaktwvatlqrqcek 206
                       vehv++++++  h l+r l+  + a ga +wv tlqr ce+
  Thecc1EG017892t1 375 VEHVEVDDKIHtHRLYRDLICGSSAYGAERWVITLQRMCER 415
                       *********999***************************97 PP

Protein Features
3D Structure
Database         Entry ID           E-value    Start  End  InterPro ID  Description
Gene3D           G3DSA:1.10.10.60   1.4E-20    7      75   IPR009057    Homeodomain-like
SuperFamily      SSF46689           2.57E-18   7      81   IPR009057    Homeodomain-like
PROSITE profile  PS50071            16.665     21     81   IPR001356    Homeobox domain
SMART            SM00389            1.0E-17    23     85   IPR001356    Homeobox domain
CDD              cd00086            4.50E-18   24     82   No hit       No description
Pfam             PF00046            1.0E-17    28     79   IPR001356    Homeobox domain
PROSITE pattern  PS00027            0          56     79   IPR017970    Homeobox, conserved site
PROSITE profile  PS50848            42.54      185    418  IPR002913    START domain
CDD              cd08875            1.07E-102  189    414  No hit       No description
SuperFamily      SSF55961           7.83E-29   192    415  No hit       No description
SMART            SM00234            9.4E-29    194    415  IPR002913    START domain
Pfam             PF01852            8.9E-36    196    415  IPR002913    START domain
Gene3D           G3DSA:3.30.530.20  9.4E-8     247    380  IPR023393    START-like domain
SuperFamily      SSF55961           3.98E-12   434    633  No hit       No description
Gene Ontology
GO Term     GO Category         GO Description
GO:0006355  Biological Process  regulation of transcription, DNA-templated
GO:0005634  Cellular Component  nucleus
GO:0008289  Molecular Function  lipid binding
GO:0043565  Molecular Function  sequence-specific DNA binding
Sequence
Protein Sequence    Length: 675 aa
MDPVMGSSGS SGNEQEASDS GNGKKSFHRH TAHQISTLEA YFKECPHPDD NQRRQLSNQL  60
GLEPKQIKFW FQNKRTQTKS QHERADNSAL RAENERMKCE NFAMLEALKM VICPACGGPP  120
IGEEERQRKA VSQSESLMIS IPTSSLPLPP ANFTTQGMGH LPLYNYLDPK HWDNMALPYQ  180
FNGVTDVEKA LMSETAASAM DELIRLLQVN EPLWAKSPSD GRYAIQRESY QKTFPRATRL  240
RSPSARIESS KDSALVTMNA AQLVDMFLDA DKWVDLFPTI VTKAKTIQLL ETRMVGNKNV  300
SLQLMYERMH ILSPFVAPRE FYFLRHCKQI ETGLWVLVDV SYSYCFFKET SHSWKFPSGC  360
MIQEMPNGCS KVTWVEHVEV DDKIHTHRLY RDLICGSSAY GAERWVITLQ RMCERLSVSN  420
GETEHIHDLG GVLSLPEGRR SIMRLAHRMV KSFCSILNMS GELDFPQLSE ENNSGVRVSV  480
RQSIEPGQPR GMIVSAATSL WLPLPCQSVF NLLNDEKVRF QWDVLCHGNR VKEMANISTG  540
NTPGNCISII TPSVPSGNIL MLQEMSCTES LGSMVVYAPM GIAAMHTAIN SGDSSNIPIL  600
PSGFIISGDG RSELGVGANS TRSSGSLLTV AYQIMACSPS SSMELSVQSV ATVNTLISST  660
VQRIKAVLNT FNLD*
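The Start and End columns of the Signature Domain table are 1-based, inclusive residue positions, so they can be checked directly against the sequence block. The following is a minimal Python sketch, not part of PlantTFDB itself: the helper names `clean_sequence` and `domain_slice` are hypothetical, and only the first two lines of the sequence block are used, which is enough to cover the Homeobox domain at positions 28 to 79.

```python
import re

# First two lines of the record's sequence block, copied verbatim
# (10-residue groups, running position counter at the end of each line).
RAW_BLOCK = """\
MDPVMGSSGS SGNEQEASDS GNGKKSFHRH TAHQISTLEA YFKECPHPDD NQRRQLSNQL  60
GLEPKQIKFW FQNKRTQTKS QHERADNSAL RAENERMKCE NFAMLEALKM VICPACGGPP  120
"""


def clean_sequence(block: str) -> str:
    """Keep only letters: drops spaces, newlines, position counters and '*'."""
    return re.sub(r"[^A-Za-z]", "", block).upper()


def domain_slice(seq: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive), as in the domain table."""
    return seq[start - 1:end]


seq = clean_sequence(RAW_BLOCK)
homeobox = domain_slice(seq, 28, 79)  # Homeobox row: Start 28, End 79
print(len(seq), homeobox)
```

The slice returned for 28 to 79 should match the Thecc1EG017892t1 row of the Homeobox alignment in the Signature Domain section. The same helpers would work on the full sequence block for the START domain at positions 196 to 415.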
Regulation -- PlantRegMap
Source       Upstream Regulator  Target Gene
PlantRegMap  Retrieve            -
Annotation -- Protein
Source  Hit ID          E-value  Description
Refseq  XP_017974480.1  0.0      PREDICTED: homeobox-leucine zipper protein ROC8
TrEMBL  A0A061ELR0      0.0      A0A061ELR0_THECC; Homeodomain GLABROUS 11, putative
STRING  EOY03224        0.0      (Theobroma cacao)
Orthologous Group
Lineage  Orthologous Group ID  Taxa Number  Gene Number
MalvidsOGEM30042665
Best hit in Arabidopsis thaliana
Hit ID       E-value  Description
AT3G03260.1  0.0      homeodomain GLABROUS 8
Publications
  1. Motamayor JC, et al.
    The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color.
    Genome Biol., 2013. 14(6): p. r53
    [PMID:23731509]