PlantTFDB
PlantRegMap/PlantTFDB v5.0
Plant Transcription Factor Database
Transcription Factor Information
Basic Information
TF ID Thecc1EG000519t1
Common Name TCM_000519
Organism
Taxonomic ID
Taxonomic Lineage
cellular organisms; Eukaryota; Viridiplantae; Streptophyta; Streptophytina; Embryophyta; Tracheophyta; Euphyllophyta; Spermatophyta; Magnoliophyta; Mesangiospermae; eudicotyledons; Gunneridae; Pentapetalae; rosids; malvids; Malvales; Malvaceae; Byttnerioideae; Theobroma
Family B3
Protein Properties Length: 891aa    MW: 100601 Da    PI: 9.4524
Description B3 family protein
Gene Model
Gene Model ID     Type    Source  Coding Sequence
Thecc1EG000519t1  genome  CGD     View CDS
Signature Domain
No.  Domain  Score  E-value  Start  End  HMM Start  HMM End
1    B3      62.4   7.2e-20  23     113  2          99
                       EEE-..-HHHHTT-EE--HHH.HTT---..--SEEEEEETTS-EEEEEE..EEETTEEEE-TTHHHHHHHHT--TT-EEEEEE-SSSEE..EE CS
                B3   2 fkvltpsdvlksgrlvlpkkfaeehggkkeesktltledesgrsWevkliyrkksgryvltkGWkeFvkangLkegDfvvFkldgrsefelvv 94 
                       fkv     +++ g l +p +f++++g +   s  ++le +sg +W+v+l  +k+++r++l+kGW+eF++ n L+ g f+vF+++g+  f  +v
  Thecc1EG000519t1  23 FKVI-LDETFRDGKLGIPTNFVRKYGRQ--LSSPIRLEVPSGAVWQVEL--TKCDERVWLQKGWQEFAEHNSLEYGYFLVFRYEGNAHF--HV 108
                       4544.245677789***********977..5567***************..********************************998888..** PP

                       EEE-S CS
                B3  95 kvfrk 99 
                        +f++
  Thecc1EG000519t1 109 LIFDT 113
                       *9997 PP

2    B3      59     8.3e-19  221    318  1          97
                       EEEE-..-HHHHTT-EE--HHH.HTT---..--SEEEEEETTS-EEEEEE....EEETTEEEE-TTHHHHHHHHT--TT-EEEEEE-SS.SEE CS
                B3   1 ffkvltpsdvlksgrlvlpkkfaeehggkkeesktltledesgrsWevkliy..rkksgryvltkGWkeFvkangLkegDfvvFkldgr.sef 90 
                       f  v+ ps+v+ s r+++p  fa+++ + +++   ++l+ +sg+sW  k+    +    r  l++GW+eF+k n L++gD++vF+l+++  e 
  Thecc1EG000519t1 221 FMVVMQPSYVSFSYRMSVPDGFARKYFKMTQG--NVILRISSGQSWPAKYYCrpNIDNPRAQLRDGWQEFAKHNALEVGDVCVFELTRTsPEI 311
                       778999********************644333..6***************545644444588************************9875666 PP

                       ..EEEEE CS
                B3  91 elvvkvf 97 
                        l+v ++
  Thecc1EG000519t1 312 LLKVVIC 318
                       5666666 PP

3    B3      58.2   1.5e-18  488    584  1          97
                       EEEE-..-HHHHTT-EE--HHH.HTT---..--SEEEEEETTS-EEEEEE..EEETTE.....EEE-TTHHHHHHHHT--TT-EEEEEE-SSS CS
                B3   1 ffkvltpsdvlksgrlvlpkkfaeehggkkeesktltledesgrsWevkliyrkksgr.....yvltkGWkeFvkangLkegDfvvFkldgrs 88 
                       f  v+ ps+vl  g l++p +f++++ +k+ +  ++tl+ ++gr+W v ++    +++      + ++ W+ Fv +n+Lk+gD++vF+l++ +
  Thecc1EG000519t1 488 FTVVMQPSYVLPGGSLSIPSQFVKRYFKKN-G--EVTLRVSDGRTWIVDYNG--EGDGqcpkgKFRSRSWRAFVLDNNLKVGDVCVFELIKAN 575
                       778999********************7552.3..7***************54..44443444578889*********************9877 PP

                       EE..EEEEE CS
                B3  89 efelvvkvf 97 
                       + ++ v +f
  Thecc1EG000519t1 576 GNSFDVVIF 584
                       666666665 PP

4    B3      56.5   5.1e-18  740    834  1          96
                       EEEE-..-HHHHTT-EE--HHH.HTT---..--SEEEEEETTS-EEEEEE......EEETTE...EEE-TTHHHHHHHHT--TT-EEEEEE-S CS
                B3   1 ffkvltpsdvlksgrlvlpkkfaeehggkkeesktltledesgrsWevkliy....rkksgr...yvltkGWkeFvkangLkegDfvvFkldg 86 
                       f+ v  ps+ +++  +++p +fa+++  k++++ + +l  ++g+sW+vk+ y    +    +     +++GWk+F+ +n+L +gD++vF+l +
  Thecc1EG000519t1 740 FLVVIQPSHISRNYKMCIPSNFARKYFTKTHGG-ETVLCLSDGKSWSVKY-YrrgdD----GnprGQFSGGWKKFALDNNLVVGDVCVFELLK 826
                       566777889999999***********7666665.66667779********.624443....335555889********************875 PP

                       SSEE..EEEE CS
                B3  87 rsefelvvkv 96 
                          +   +kv
  Thecc1EG000519t1 827 G--ADISFKV 834
                       3..3334555 PP

Protein Features
3D Structure
Database         Entry ID           E-value   Start  End  InterPro ID  Description
Gene3D           G3DSA:2.40.330.10  3.8E-28   13     114  IPR015300    DNA-binding pseudobarrel domain
SuperFamily      SSF101936          2.55E-27  15     114  IPR015300    DNA-binding pseudobarrel domain
CDD              cd10017            6.46E-23  20     112  No hit       No description
PROSITE profile  PS50863            14.974    21     114  IPR003340    B3 DNA binding domain
SMART            SM01019            7.8E-20   22     114  IPR003340    B3 DNA binding domain
Pfam             PF02362            3.7E-17   24     113  IPR003340    B3 DNA binding domain
Gene3D           G3DSA:2.40.330.10  8.8E-24   214    317  IPR015300    DNA-binding pseudobarrel domain
SuperFamily      SSF101936          4.9E-23   215    315  IPR015300    DNA-binding pseudobarrel domain
CDD              cd10017            3.48E-24  219    319  No hit       No description
SMART            SM01019            5.0E-9    221    321  IPR003340    B3 DNA binding domain
PROSITE profile  PS50863            13.817    221    319  IPR003340    B3 DNA binding domain
Pfam             PF02362            2.1E-17   221    319  IPR003340    B3 DNA binding domain
Gene3D           G3DSA:2.40.330.10  8.2E-23   480    584  IPR015300    DNA-binding pseudobarrel domain
SuperFamily      SSF101936          5.89E-20  481    584  IPR015300    DNA-binding pseudobarrel domain
CDD              cd10017            3.32E-24  486    585  No hit       No description
PROSITE profile  PS50863            13.042    488    587  IPR003340    B3 DNA binding domain
SMART            SM01019            1.1E-16   488    587  IPR003340    B3 DNA binding domain
Pfam             PF02362            3.0E-17   488    583  IPR003340    B3 DNA binding domain
Gene3D           G3DSA:2.40.330.10  7.0E-24   732    835  IPR015300    DNA-binding pseudobarrel domain
SuperFamily      SSF101936          6.28E-24  733    829  IPR015300    DNA-binding pseudobarrel domain
CDD              cd10017            2.42E-26  738    835  No hit       No description
SMART            SM01019            3.9E-17   740    838  IPR003340    B3 DNA binding domain
Pfam             PF02362            1.4E-16   740    829  IPR003340    B3 DNA binding domain
PROSITE profile  PS50863            14.48     740    838  IPR003340    B3 DNA binding domain
Gene Ontology
GO Term     GO Category         GO Description
GO:0006355  Biological Process  regulation of transcription, DNA-templated
GO:0005634  Cellular Component  nucleus
GO:0003677  Molecular Function  DNA binding
Sequence
Protein Sequence    Length: 891 aa
MTSHQRRSDN EHSMFTSKTP HFFKVILDET FRDGKLGIPT NFVRKYGRQL SSPIRLEVPS  60
GAVWQVELTK CDERVWLQKG WQEFAEHNSL EYGYFLVFRY EGNAHFHVLI FDTSASEIEY  120
PHTNTTEEDD GFDNALVCKK SKGKSDIPYP QPHKEMKVDS PNEIGTHLKS KISAPAAMGG  180
GVSGQRSPQI EVLETVGHLT ADEKTKALQK ASGFKTKNPF FMVVMQPSYV SFSYRMSVPD  240
GFARKYFKMT QGNVILRISS GQSWPAKYYC RPNIDNPRAQ LRDGWQEFAK HNALEVGDVC  300
VFELTRTSPE ILLKVVICKR FFEDAIAARP LAGGSIAYRV KKRRLFSDTE TNCLQNQPAI  360
REYRVPKTEQ NENIHTSIEI LDDFPLNQIT KKKLPFPGFQ PCRMMKTNPS QVKGIELGKQ  420
KTSLDFQYST NELGGEFKFS GKDESVGMSG AQRCSKPDFL GRMQPLTTTE KKIALKRAMA  480
FKSANPSFTV VMQPSYVLPG GSLSIPSQFV KRYFKKNGEV TLRVSDGRTW IVDYNGEGDG  540
QCPKGKFRSR SWRAFVLDNN LKVGDVCVFE LIKANGNSFD VVIFPDANIA SCSSSKLDSR  600
YQCKEAEDEG SIEILECTAP CQKTREKSSI QCPRPQKMMK INMINKTEKI LESEYIDPRF  660
RPFCNKACGI KLEEPKGSTS SSCCKQEVGL KPATRTGTST EKGWECPEQA EILRSQKLTA  720
KVKAKTLRIA KAFNSKNPFF LVVIQPSHIS RNYKMCIPSN FARKYFTKTH GGETVLCLSD  780
GKSWSVKYYR RGDDGNPRGQ FSGGWKKFAL DNNLVVGDVC VFELLKGADI SFKVLRDIHA  840
RPRQLLLQTF DSSEESCSRL ICILAVWTYH LSLSRNILSQ IQKCNPSSCE *
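
The sequence block above is laid out in groups of ten residues with a running position counter at the end of each line, and the Signature Domain table gives Start/End coordinates that appear to be 1-based and inclusive (they match the residue numbers shown in the alignments). A minimal Python sketch, assuming that layout, recovers a plain sequence string and slices out the first B3 domain (residues 23 to 113). The helper names `clean_sequence` and `domain_slice` are hypothetical, and only the first two lines of the block are embedded to keep the example short.

```python
import re

# First two lines of the sequence block, copied verbatim (residues 1-120).
RAW_BLOCK = """\
MTSHQRRSDN EHSMFTSKTP HFFKVILDET FRDGKLGIPT NFVRKYGRQL SSPIRLEVPS  60
GAVWQVELTK CDERVWLQKG WQEFAEHNSL EYGYFLVFRY EGNAHFHVLI FDTSASEIEY  120
"""


def clean_sequence(block: str) -> str:
    """Drop position counters, whitespace and a trailing '*' stop marker.

    Everything that is not a letter is removed, so only residues remain.
    """
    return re.sub(r"[^A-Za-z]", "", block).upper()


def domain_slice(seq: str, start: int, end: int) -> str:
    """Return residues start..end, assuming 1-based inclusive coordinates."""
    return seq[start - 1:end]


seq = clean_sequence(RAW_BLOCK)
b3_first = domain_slice(seq, 23, 113)  # Start/End of signature domain no. 1

print(len(seq), len(b3_first))
print(b3_first)
```

The slice begins with FKVILDET and ends with HVLIFDT, which agrees with the ungapped query row of the first B3 alignment. Pasting the full sequence block into `RAW_BLOCK` would let the same helper extract the other three domains.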
3D Structure
Structure
PDB ID  Evalue  Query Start  Query End  Hit Start  Hit End  Description
4i1k_A  7e-20   199          834        23         137      B3 domain-containing transcription factor VRN1
4i1k_B  7e-20   199          834        23         137      B3 domain-containing transcription factor VRN1
Annotation -- Protein
Source  Hit ID      E-value  Description
TrEMBL  A0A061DMT4  0.0      A0A061DMT4_THECC; Uncharacterized protein
STRING  EOX91273    0.0      (Theobroma cacao)
Orthologous Group
Lineage  Orthologous Group ID  Taxa Number  Gene Number
MalvidsOGEM1837549
Best hit in Arabidopsis thaliana
Hit ID       E-value  Description
AT3G18990.1  5e-46    B3 family protein
Publications
  1. Motamayor JC, et al.
    The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color.
    Genome Biol., 2013. 14(6): p. r53
    [PMID:23731509]