PlantTFDB
PlantRegMap/PlantTFDB v5.0
Plant Transcription Factor Database
Transcription Factor Information
Basic Information
TF ID Thecc1EG043101t2
Common Name TCM_043101
Organism
Taxonomic ID
Taxonomic Lineage
cellular organisms; Eukaryota; Viridiplantae; Streptophyta; Streptophytina; Embryophyta; Tracheophyta; Euphyllophyta; Spermatophyta; Magnoliophyta; Mesangiospermae; eudicotyledons; Gunneridae; Pentapetalae; rosids; malvids; Malvales; Malvaceae; Byttnerioideae; Theobroma
Family MYB
Protein Properties Length: 1385 aa    MW: 151772 Da    pI: 5.4528
Description MYB family protein
Gene Model
Gene Model ID     Type    Source  Coding Sequence
Thecc1EG043101t2  genome  CGD     View CDS
Signature Domain
No.  Domain           Score  E-value  Start  End   HMM Start  HMM End
1    Myb_DNA-binding  27.5   7.3e-09  815    856   3          46
                       SS-HHHHHHHHHHHHHTTTT-HHHHHHHHTTTS-HHHHHHHHHH CS
   Myb_DNA-binding   3 rWTteEdellvdavkqlGggtWktIartmgkgRtlkqcksrwqk 46 
                       +WT eE e++ d  + +G++ +++Ia+ +  ++t  +c+++++k
  Thecc1EG043101t2 815 PWTSEEKEIFMDKLAAFGKD-FRKIASFLD-HKTTADCVEFYYK 856
                       8*****************99.*********.***********98 PP

2    Myb_DNA-binding  33.8   7.6e-11  1035   1074  4          45
                        S-HHHHHHHHHHHHHTTTT-HHHHHHHHTTTS-HHHHHHHHH CS
   Myb_DNA-binding    4 WTteEdellvdavkqlGggtWktIartmgkgRtlkqcksrwq 45  
                        WT eE   +++av ++G++ ++ I+r++g +R++ qck ++ 
  Thecc1EG043101t2 1035 WTDEEKSVFIQAVSLYGKD-FAMISRCVG-TRSRDQCKVFFS 1074
                        *****************99.*********.********8776 PP

Protein Features
3D Structure
Database         Entry ID          E-value   Start  End   InterPro ID  Description
SuperFamily      SSF46689          4.86E-14  799    860   IPR009057    Homeodomain-like
PROSITE profile  PS51293           15.02     811    862   IPR017884    SANT domain
SMART            SM00717           6.5E-8    812    860   IPR001005    SANT/Myb domain
Gene3D           G3DSA:1.10.10.60  1.2E-5    812    857   IPR009057    Homeodomain-like
Pfam             PF00249           5.5E-6    814    856   IPR001005    SANT/Myb domain
PROSITE profile  PS51293           11.876    1030   1081  IPR017884    SANT domain
SMART            SM00717           1.0E-8    1031   1079  IPR001005    SANT/Myb domain
Gene3D           G3DSA:1.10.10.60  1.9E-6    1034   1075  IPR009057    Homeodomain-like
SuperFamily      SSF46689          6.94E-11  1034   1081  IPR009057    Homeodomain-like
Pfam             PF00249           2.8E-8    1035   1074  IPR001005    SANT/Myb domain
CDD              cd00167           1.28E-7   1035   1073  No hit       No description
Gene Ontology
GO Term     GO Category         GO Description
GO:0005634  Cellular Component  nucleus
GO:0003677  Molecular Function  DNA binding
Sequence
Protein Sequence    Length: 1385 aa
MPPEPLPWDR KDFYKERKHE RTESQPQQPS TARWRDSSSM SSYQHGSFRE FTRWGSADLR  60
RPPGHGKQGS WHLFAEENGG HGYVPSRSGD KMLDDESCRQ SVSRGDGKYS RNSSRENNRA  120
SYSQRDWRAH SWEMSNGSPN TPGRPHDVNN EQRSVDDMLT YPSHAHSDFV STWDQLHKDQ  180
HDNKTSGVNG LGTGQRCERE NSVGSMDWKP LKWSRSGSLS SRGSGFSHSS SSKSLGGVDS  240
GEGKLELQQK NLTPVQSPSG DAAACVTSAA PSDETMSRKK PRLGWGEGLA KYEKKKVEGP  300
DTSMNRGVAT ISVGNTEPNN SLGSNLAEKS PRVLGFSDCA SPATPSSVAC SSSPGVEEKS  360
FGKAANIDND ISNLCGSPSL GSQNHLEGPS FNLEKLDMNS IINMGSSLVD LLQSDDPSTV  420
DSSFVRSTAM NKLLLWKGDV LKALETTESE IDSLENELKT LKANSGSRYP CPATSSSLPM  480
EENGRACEEL EAISNMIPRP APLKIDPCGD ALEEKVPLCN GDLEEVNADA KDGDIDSPGT  540
ATSKFVEPSS LEKAVSPSDV KLHECSGDLG TVQLTTMGEV NLAPGSSNEG TSVPFSGEGS  600
ALEKIDNDVH GPEPSNSVAD IENIMYDVII ATNKELANSA SKVFNNLLPK DWCSVISEIA  660
NGACWQTDSL IREKIVKRKQ CIRFKERVLM LKFKAFQHAW KEDMRSPLIR KYRAKSQKKY  720
ELSLRSTLGG YQKHRSSIRS RLTSPGNLSL ESNVEMINFV SKLLSDSHVR LYRNALKMPA  780
LFLDEKEKQV SRFISSNGLV EDPCAVEKER ALINPWTSEE KEIFMDKLAA FGKDFRKIAS  840
FLDHKTTADC VEFYYKNHKS ECFEKTKKKL DLSKQGKSTA NTYLLTSGKK WSRELNAASL  900
DVLGEASVIA AHAESGMRNR QTSAGRIFLG GRFDSKTSRV DDSIVERSSS FDVIGNDRET  960
VAADVLAGIC GSLSSEAMSS CITSSADPGE SYQREWKCQK VDSVVKRPST SDVTQNIDDD  1020
TCSDESCGEM DPADWTDEEK SVFIQAVSLY GKDFAMISRC VGTRSRDQCK VFFSKARKCL  1080
GLDLIHPRTR NLGTPMSDDA NGGGSDIEDA CVLESSVVCS DKLGSKVEED LPSTIVSMNV  1140
DESDPTGEVS LQTDLNVSEE NNGRLVDHRD SEAVETMVSD VGQPEPICES GGDMNVENVP  1200
KRSYGFWDGN RIQTGLSSLP DSAILVAKYP AAFVNYPSSS SQMEQQALQT VVRSNERNLN  1260
GVSVYPSREI SSNNGVVDYQ VYRGRDCTKV APFTVDMKQR QEMFSEMQRR NRFDAIPNLQ  1320
QQGRGGMVGM NVVGRGGVLV GGPSISDPVA VLRMQYAKTE QYGGQSGSIV REEESWRGKG  1380
DIGR*
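
The Start and End columns of the Signature Domain table are 1-based, inclusive positions in the protein sequence above. As a minimal sketch (the helper names `clean_block` and `slice_region` are hypothetical, not part of PlantTFDB), the following strips the layout from two sequence lines copied from this record (positions 781 to 900) and extracts the first Myb_DNA-binding domain at 815 to 856:

```python
import re

def clean_block(lines):
    """Join sequence lines, dropping spaces, position counters and any '*' stop symbol."""
    return "".join(re.sub(r"[^A-Za-z]", "", line) for line in lines)

def slice_region(seq, start, end, first_pos=1):
    """Return residues start..end (1-based, inclusive).

    first_pos is the sequence position of seq[0]; it is 1 for a full-length
    sequence and larger when only an excerpt is supplied.
    """
    return seq[start - first_pos:end - first_pos + 1]

# Two lines copied from the record; together they cover positions 781-900.
block = [
    "LFLDEKEKQV SRFISSNGLV EDPCAVEKER ALINPWTSEE KEIFMDKLAA FGKDFRKIAS  840",
    "FLDHKTTADC VEFYYKNHKS ECFEKTKKKL DLSKQGKSTA NTYLLTSGKK WSRELNAASL  900",
]
seq = clean_block(block)

# Domain 1 coordinates (815-856) are taken from the Signature Domain table.
domain1 = slice_region(seq, 815, 856, first_pos=781)
print(domain1)
```

The extracted string should match the query row of the first alignment once its gap characters (`-`) are removed. With the full sequence as input, `first_pos` can be left at its default of 1.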
3D Structure
Structure
PDB ID  Evalue  Query Start  Query End  Hit Start  Hit End  Description
4a69_C  3e-16   773          864        4          94       NUCLEAR RECEPTOR COREPRESSOR 2
4a69_D  3e-16   773          864        4          94       NUCLEAR RECEPTOR COREPRESSOR 2
Regulation -- PlantRegMap
Source       Upstream Regulator  Target Gene
PlantRegMap  Retrieve            -
Annotation -- Nucleotide
Source   Hit ID    E-value  Description
GenBank  JX578805  1e-133   JX578805.1 Gossypium hirsutum clone NBRI_GE10901 microsatellite sequence.
Annotation -- Protein
Source  Hit ID          E-value  Description
Refseq  XP_017984689.1  0.0      PREDICTED: uncharacterized protein LOC18586364 isoform X2
TrEMBL  A0A061FNE6      0.0      A0A061FNE6_THECC; Duplicated homeodomain-like superfamily protein isoform 2
STRING  EOY18596        0.0      (Theobroma cacao)
Orthologous Group
Lineage  Orthologous Group ID  Taxa Number  Gene Number
Malvids  OGEM5260              27           44
Best hit in Arabidopsis thaliana
Hit ID       E-value  Description
AT3G52250.1  0.0      MYB family protein
Publications
  1. Motamayor JC, et al.
    The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color.
    Genome Biol., 2013. 14(6): p. r53
    [PMID:23731509]