PlantTFDB
PlantRegMap/PlantTFDB v5.0
Plant Transcription Factor Database
Transcription Factor Information
Basic Information
TF ID Thecc1EG029886t1
Common Name TCM_029886
Organism Theobroma cacao
Taxonomic ID
Taxonomic Lineage
cellular organisms; Eukaryota; Viridiplantae; Streptophyta; Streptophytina; Embryophyta; Tracheophyta; Euphyllophyta; Spermatophyta; Magnoliophyta; Mesangiospermae; eudicotyledons; Gunneridae; Pentapetalae; rosids; malvids; Malvales; Malvaceae; Byttnerioideae; Theobroma
Family HSF
Protein Properties Length: 680 aa    MW: 77537.9 Da    pI: 8.4376
Description HSF family protein
Gene Model
Gene Model ID     Type    Source
Thecc1EG029886t1  genome  CGD
Signature Domain
No.  Domain        Score  E-value  Start  End  HMM Start  HMM End
1    HSF_DNA-bind  89.2   5.3e-28  19     146  2          102
                       HHHHHHHHHC........TGGGTTTSEESSSSSEEEES-HHHHHHHTHHHHSTT--HHHHHHHHHHTTEEE---SSBTTTT............ CS
      HSF_DNA-bind   2 Flkklyeile........deelkeliswsengnsfvvldeeefakkvLpkyFkhsnfaSFvRQLnmYgFkkvkdeekksks............ 74 
                       Fl k+y +le         ++ k+++sw+++g+ fvv+ + ef++  Lp+yFkh+nf+SF+RQLn+Y    + + +k + +            
  Thecc1EG029886t1  19 FLWKTYALLEegeegaetADDRKKIVSWNAEGTGFVVWSPAEFSELTLPRYFKHNNFSSFIRQLNTYVIGLSMNGQKAEFRcrdlnpglsges 111
                       999*******6666665555999********************************************88887777766555889999999988 PP

                       .......XTTSEEEEESXXXXXXXXXXXXXXXXXX CS
      HSF_DNA-bind  75 .......kekiweFkhksFkkgkkellekikrkks 102
                              ++k weF+h++F++g k++l +i rkk 
  Thecc1EG029886t1 112 rqgfkktSSKRWEFRHEKFQRGCKHMLVEITRKKM 146
                       877765449***********************986 PP

Protein Features
3D Structure
Database         Entry ID          E-value   Start  End  InterPro ID  Description
Gene3D           G3DSA:1.10.10.10  1.5E-30   15     144  IPR011991    Winged helix-turn-helix DNA-binding domain
SMART            SM00415           1.0E-30   15     144  IPR000232    Heat shock factor (HSF)-type, DNA-binding
SuperFamily      SSF46785          4.22E-22  16     144  IPR011991    Winged helix-turn-helix DNA-binding domain
Pfam             PF00447           5.1E-25   19     144  IPR000232    Heat shock factor (HSF)-type, DNA-binding
PROSITE profile  PS51375           5.7       245    279  IPR002885    Pentatricopeptide repeat
Gene3D           G3DSA:1.25.40.10  2.2E-12   303    366  IPR011990    Tetratricopeptide-like helical domain
PROSITE profile  PS51375           9.262     314    348  IPR002885    Pentatricopeptide repeat
TIGRFAMs         TIGR00756         7.6E-4    317    346  IPR002885    Pentatricopeptide repeat
Pfam             PF01535           0.0019    317    346  IPR002885    Pentatricopeptide repeat
SuperFamily      SSF48452          1.7E-5    324    373  IPR011990    Tetratricopeptide-like helical domain
PROSITE profile  PS51375           9.701     349    383  IPR002885    Pentatricopeptide repeat
TIGRFAMs         TIGR00756         5.0E-6    352    384  IPR002885    Pentatricopeptide repeat
Pfam             PF13041           2.6E-8    352    395  IPR002885    Pentatricopeptide repeat
PROSITE profile  PS51375           7.552     384    418  IPR002885    Pentatricopeptide repeat
PROSITE profile  PS51375           5.853     420    450  IPR002885    Pentatricopeptide repeat
Gene3D           G3DSA:1.25.40.10  2.2E-12   423    471  IPR011990    Tetratricopeptide-like helical domain
SuperFamily      SSF48452          1.7E-5    435    472  IPR011990    Tetratricopeptide-like helical domain
PROSITE profile  PS51375           6.051     455    485  IPR002885    Pentatricopeptide repeat
PROSITE profile  PS51375           5.722     490    524  IPR002885    Pentatricopeptide repeat
Pfam             PF01535           0.87      496    521  IPR002885    Pentatricopeptide repeat
Gene3D           G3DSA:1.25.40.10  2.2E-12   507    594  IPR011990    Tetratricopeptide-like helical domain
PROSITE profile  PS51375           6.719     525    559  IPR002885    Pentatricopeptide repeat
PROSITE profile  PS51375           5.503     560    594  IPR002885    Pentatricopeptide repeat
SuperFamily      SSF48452          1.7E-5    567    589  IPR011990    Tetratricopeptide-like helical domain
PROSITE profile  PS51375           5.623     634    670  IPR002885    Pentatricopeptide repeat
Gene Ontology
GO Term     GO Category         GO Description
GO:0006355  Biological Process  regulation of transcription, DNA-templated
GO:0005634  Cellular Component  nucleus
GO:0003700  Molecular Function  transcription factor activity, sequence-specific DNA binding
GO:0005515  Molecular Function  protein binding
GO:0043565  Molecular Function  sequence-specific DNA binding
Sequence
Protein Sequence    Length: 680 aa
MSKITTSSLS PRTKGAAPFL WKTYALLEEG EEGAETADDR KKIVSWNAEG TGFVVWSPAE  60
FSELTLPRYF KHNNFSSFIR QLNTYVIGLS MNGQKAEFRC RDLNPGLSGE SRQGFKKTSS  120
KRWEFRHEKF QRGCKHMLVE ITRKKMEPSV FPAFLKASDE DKANHAEENS CQTLLEENEI  180
LRREKVELQT QIAQFKALEV KLLDSLAQHM GNTNHKERRI SPLGSPDKSV EPELEDWLKN  240
GKKIRVAELQ RIIHDLRKRK RFTQALEVSE WMNKKGICTF SHTEHAVQLD LIGRVRGFLS  300
AESYFDSLKD QDKTDKTYGA LLNCYVRQRQ TDKSLSHLQK MKDLGFTSSP LTYNGIMCLY  360
TNIGQHEKVP DIMREMKENK VSPDNFSYRI CINAYGVMSD LEGMERILKE MESQSHIKMD  420
WNTYAVVANF CIKAGLTERA IDALKKSEQK LDNKDGTAFN HLISLYANLG NKAEVLRLWG  480
LEKASCKRYI NKDFITMLQS LVKLDAFEEA EKVLEEWASS GNCYDFRVPS IIIIGYAEKG  540
LHEKSEAMLE NLMEKGKVTT PNSWGVVAAG YLDKGQVRKA LECMKTALFL TVENKGWRPN  600
LRVITSILEW LGNEGSIQDA EDFVASLRTV IPVDRKMYNA LLKATIRDGK GVDKLLDLMK  660
ADKIDEDEET KTILAMKSS*
3D Structure
Structure
PDB ID  Evalue  Query Start  Query End  Hit Start  Hit End  Description
5i9f_A  5e-19   235          650        5          385      pentatricopeptide repeat protein dPPR-U10
Regulation -- PlantRegMap
Source       Upstream Regulator  Target Gene
PlantRegMap  Retrieve            -
Annotation -- Protein
Source     Hit ID          E-value  Description
Refseq     XP_017978727.1  0.0      PREDICTED: pentatricopeptide repeat-containing protein At4g21705, mitochondrial
Swissprot  Q84JR3          0.0      PP334_ARATH; Pentatricopeptide repeat-containing protein At4g21705, mitochondrial
TrEMBL     A0A061GGI7      0.0      A0A061GGI7_THECC; Tetratricopeptide repeat-like superfamily protein, putative
STRING     EOY28262        0.0      (Theobroma cacao)
Orthologous Group
Lineage Orthologous Group ID Taxa Number Gene Number
MalvidsOGEM546323
Best hit in Arabidopsis thaliana
Hit ID       E-value  Description
AT5G62020.1  4e-27    heat shock transcription factor B2A
Publications
  1. Lurin C, et al.
    Genome-wide analysis of Arabidopsis pentatricopeptide repeat proteins reveals their essential role in organelle biogenesis.
    Plant Cell, 2004. 16(8): p. 2089-103
    [PMID:15269332]
  2. Motamayor JC, et al.
    The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color.
    Genome Biol., 2013. 14(6): p. r53
    [PMID:23731509]