PlantTFDB
PlantRegMap/PlantTFDB v5.0
Plant Transcription Factor Database
Transcription Factor Information
Basic Information
TF ID Thecc1EG011330t1
Common Name TCM_011330
Organism
Taxonomic ID
Taxonomic Lineage
cellular organisms; Eukaryota; Viridiplantae; Streptophyta; Streptophytina; Embryophyta; Tracheophyta; Euphyllophyta; Spermatophyta; Magnoliophyta; Mesangiospermae; eudicotyledons; Gunneridae; Pentapetalae; rosids; malvids; Malvales; Malvaceae; Byttnerioideae; Theobroma
Family HD-ZIP
Protein Properties Length: 753aa    MW: 82056 Da    PI: 6.0549
Description HD-ZIP family protein
Gene Model
Gene Model ID     Type    Source  Coding Sequence
Thecc1EG011330t1  genome  CGD     View CDS
Signature Domain
No.  Domain    Score  E-value  Start  End  HMM Start  HMM End
1    Homeobox  62.3   7.1e-20  90     145  1          56
                       TT--SS--HHHHHHHHHHHHHSSS--HHHHHHHHHHCTS-HHHHHHHHHHHHHHHH CS
          Homeobox   1 rrkRttftkeqleeLeelFeknrypsaeereeLAkklgLterqVkvWFqNrRakek 56 
                       +++ +++t++q++e+e++F+++++p+ ++r+eL ++lgL+  qVk+WFqN+R+++k
  Thecc1EG011330t1  90 KKRYHRHTQHQIHEMEAFFKECPHPDDKQRKELGRELGLEPLQVKFWFQNKRTQMK 145
                       688999***********************************************999 PP

2    START     227.7  3.5e-71  273    493  1          206
                       HHHHHHHHHHHHHHHHC-TT-EEEE....EXCCTTEEEEEEESSS......SCEEEEEEEECCSCHHHHHHHHHCCCGGCT-TT-S....EEE CS
             START   1 elaeeaaqelvkkalaeepgWvkss....esengdevlqkfeeskv.....dsgealrasgvvdmvlallveellddkeqWdetla....kae 80 
                       ela +a++el+++a+ +ep+W++s     +++n++e++++f+++ +     ++ ea+r+++vv+m++ +lve+l+d+  qW++ +     ka+
  Thecc1EG011330t1 273 ELAVAAMEELIRMAQMGEPLWMTSLdgttSMLNEEEYIRTFPRGIGpkptgFKCEASRETAVVIMNHINLVEILMDVH-QWSTVFSgivsKAS 364
                       57899**************************************999********************************.************** PP

                       EEEEECTT......EEEEEEEEXXTTXX-SSX.EEEEEEEEEEE.TTS-EEEEEEEEE-TTS--.-TTSEE-EESSEEEEEEEECTCEEEEEE CS
             START  81 tlevissg......galqlmvaelqalsplvp.RdfvfvRyirqlgagdwvivdvSvdseqkppesssvvRaellpSgiliepksnghskvtw 166
                       tl+v+s+g      galq+m+ae+q++splvp R++++vRy++q+ +g+w++vdvS+d+ ++ p+    vR++++pSg+li++++ng+skvtw
  Thecc1EG011330t1 365 TLDVLSTGvagnynGALQVMTAEFQVPSPLVPtRESYYVRYCKQHAEGTWAVVDVSLDNLRPSPT----VRCRRRPSGCLIQEMPNGYSKVTW 453
                       **************************************************************996....************************ PP

                       EE-EE--SSXXHHHHHHHHHHHHHHHHHHHHHHTXXXXXX CS
             START 167 vehvdlkgrlphwllrslvksglaegaktwvatlqrqcek 206
                       vehv++++r +h+l+++lv+sg+a+gak+w atl+rqce+
  Thecc1EG011330t1 454 VEHVEVDDRGVHNLYKQLVSSGHAFGAKRWIATLDRQCER 493
                       **************************************97 PP

Protein Features
3D Structure
Database         Entry ID           E-value    Start  End  InterPro ID  Description
Gene3D           G3DSA:1.10.10.60   9.9E-22    69     145  IPR009057    Homeodomain-like
SuperFamily      SSF46689           3.8E-19    76     147  IPR009057    Homeodomain-like
PROSITE profile  PS50071            16.584     87     147  IPR001356    Homeobox domain
SMART            SM00389            1.3E-19    88     151  IPR001356    Homeobox domain
CDD              cd00086            2.04E-18   90     148  No hit       No description
Pfam             PF00046            1.7E-17    90     145  IPR001356    Homeobox domain
PROSITE profile  PS50848            45.652     264    496  IPR002913    START domain
SuperFamily      SSF55961           3.3E-36    266    495  No hit       No description
CDD              cd08875            9.70E-132  268    492  No hit       No description
SMART            SM00234            3.0E-68    273    493  IPR002913    START domain
Pfam             PF01852            2.2E-60    274    493  IPR002913    START domain
Gene3D           G3DSA:3.30.530.20  4.8E-6     369    479  IPR023393    START-like domain
SuperFamily      SSF55961           5.63E-24   513    744  No hit       No description
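The hits above overlap heavily, since several databases report the same region with slightly different boundaries. As a rough illustration only, the following Python sketch collapses the Start and End columns into non-overlapping regions. The coordinates are transcribed by hand from the table, the function name `merge_regions` is hypothetical, and treating the coordinates as 1-based and inclusive is an assumption rather than something the record states.

```python
from typing import List, Tuple

# (database, start, end); coordinates transcribed from the Protein Features
# table and assumed to be 1-based and inclusive.
Hit = Tuple[str, int, int]

hits: List[Hit] = [
    ("Gene3D", 69, 145), ("SuperFamily", 76, 147), ("PROSITE profile", 87, 147),
    ("SMART", 88, 151), ("CDD", 90, 148), ("Pfam", 90, 145),
    ("PROSITE profile", 264, 496), ("SuperFamily", 266, 495), ("CDD", 268, 492),
    ("SMART", 273, 493), ("Pfam", 274, 493), ("Gene3D", 369, 479),
    ("SuperFamily", 513, 744),
]


def merge_regions(hits: List[Hit]) -> List[Tuple[int, int]]:
    """Merge overlapping (start, end) spans into sorted, disjoint regions."""
    regions: List[Tuple[int, int]] = []
    for _, start, end in sorted(hits, key=lambda h: (h[1], h[2])):
        if regions and start <= regions[-1][1]:
            # Overlaps the previous region: extend its end if needed.
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((start, end))
    return regions


regions = merge_regions(hits)
print(regions)  # [(69, 151), (264, 496), (513, 744)]
```

Under these assumptions the thirteen hits reduce to three regions: one around the homeobox domain, one around the START domain, and a C-terminal SuperFamily hit.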
Gene Ontology
GO Term     GO Category         GO Description
GO:0010090  Biological Process  trichome morphogenesis
GO:0048497  Biological Process  maintenance of floral organ identity
GO:0005634  Cellular Component  nucleus
GO:0003677  Molecular Function  DNA binding
GO:0008289  Molecular Function  lipid binding
Sequence
Protein Sequence    Length: 753 aa
MPAGVMIPAR NMPSMISGNG NVGGFGTSSG LTLGQPNMME GQLHPLEMTQ NTSESEIARM  60
RDEEFDSTTK SGSENHEGAS GDDQDPRPKK KRYHRHTQHQ IHEMEAFFKE CPHPDDKQRK  120
ELGRELGLEP LQVKFWFQNK RTQMKTQHER QENTQLRTEN EKLRADNMRF REALSTASCP  180
NCGGPTAVGQ MSFDEHHLRL ENARLREEID RISAIAAKYV GKPVVNYPLL SSPMPPRPLD  240
FGAQPGTGEM YGAGDLLRSI SAPSEADKPM IIELAVAAME ELIRMAQMGE PLWMTSLDGT  300
TSMLNEEEYI RTFPRGIGPK PTGFKCEASR ETAVVIMNHI NLVEILMDVH QWSTVFSGIV  360
SKASTLDVLS TGVAGNYNGA LQVMTAEFQV PSPLVPTRES YYVRYCKQHA EGTWAVVDVS  420
LDNLRPSPTV RCRRRPSGCL IQEMPNGYSK VTWVEHVEVD DRGVHNLYKQ LVSSGHAFGA  480
KRWIATLDRQ CERLASVMAT NIPTGDVGVI TNQDGRKSML KLAERMVISF CAGVSASTAH  540
TWTTLSGTGA DDVRVMTRKS VDDPGRPPGI VLSAATSFWL PVSPKRVFDF LRDENSRSEW  600
DILSNGGVVQ EMAHIANGRD TGNCVSLLRV NSANSSQSNM LILQESCADP TASFVIYAPV  660
DIVAMNVVLN GGDPDYVALL PSGFAILPDG TTASAGGIGD AGSAGSLLTV AFQILVDSVP  720
TAKLSLGSVA TVNNLIACTV ERIKASLSCE NA*
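The sequence is displayed in groups of ten residues with a running position counter at the end of each line and a trailing `*`. The Python sketch below shows one way to turn such a block into a plain residue string and to pull out a region by the Start and End coordinates used in the Signature Domain table. It is a minimal sketch: the helper names `clean_sequence` and `slice_1based` are hypothetical, only the first three lines of the sequence are included as an excerpt, and the coordinates are assumed to be 1-based and inclusive.

```python
import re


def clean_sequence(block: str) -> str:
    """Drop grouping spaces, position counters, newlines and a trailing '*'."""
    return re.sub(r"[^A-Za-z]", "", block).upper()


def slice_1based(seq: str, start: int, end: int) -> str:
    """Return residues start..end, assuming 1-based inclusive coordinates."""
    return seq[start - 1:end]


# Excerpt: the first three lines of the protein sequence shown above.
excerpt = """\
MPAGVMIPAR NMPSMISGNG NVGGFGTSSG LTLGQPNMME GQLHPLEMTQ NTSESEIARM  60
RDEEFDSTTK SGSENHEGAS GDDQDPRPKK KRYHRHTQHQ IHEMEAFFKE CPHPDDKQRK  120
ELGRELGLEP LQVKFWFQNK RTQMKTQHER QENTQLRTEN EKLRADNMRF REALSTASCP  180
"""

seq = clean_sequence(excerpt)
homeobox = slice_1based(seq, 90, 145)  # Homeobox domain: Start 90, End 145
print(len(seq))
print(homeobox)
```

With the excerpt above, the extracted region matches the query line of the Homeobox alignment in the Signature Domain section, which is consistent with the 1-based inclusive reading of the coordinates.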
Functional Description
Source   Description
UniProt  Probable transcription factor. {ECO:0000250}.
Regulation -- PlantRegMap
Source       Upstream Regulator  Target Gene
PlantRegMap  Retrieve            -
Annotation -- Protein
Source     Hit ID          E-value  Description
Refseq     XP_007045611.1  0.0      PREDICTED: homeobox-leucine zipper protein HDG2 isoform X3
Swissprot  Q94C37          0.0      HDG2_ARATH; Homeobox-leucine zipper protein HDG2
TrEMBL     A0A061E9X1      0.0      A0A061E9X1_THECC; Homeodomain GLABROUS 2 isoform 1
STRING     EOY01443        0.0      (Theobroma cacao)
Orthologous Group
Lineage  Orthologous Group ID  Taxa Number  Gene Number
Malvids  OGEM491               28           149
Best hit in Arabidopsis thaliana
Hit ID       E-value  Description
AT1G05230.4  0.0      homeodomain GLABROUS 2
Publications
  1. Motamayor JC, et al.
    The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color.
    Genome Biol., 2013. 14(6): p. r53
    [PMID:23731509]