PlantRegMap/PlantTFDB v5.0
Plant Transcription Factor Database
Thresholds for domain identification
In PlantTFDB, the bit score, rather than the E-value, is used as the cutoff for each domain. As in Pfam, each domain has two thresholds: a sequence cutoff and a domain cutoff. Thresholds for auxiliary domains (except self-built domains) and forbidden domains were taken directly from Pfam (v27.0). Thresholds for DNA-binding domains and self-built auxiliary domains were determined by the following method.
  1. GO annotations (excluding IEA annotations) were used as the first line of evidence to determine rough thresholds for each DNA-binding domain: a Trusted Cutoff (TC) and a Noise Cutoff (NC). The TC is the lowest score among proteins that possess the domain and have "transcription factor activity". The NC is the highest score among proteins that lack the domain or lack "transcription factor activity".
  2. Because GO annotations did not provide enough information to determine the TC and NC of every domain, the Pfam cutoffs (TC and NC) were used to adjust those TCs and NCs that were poorly supported.
  3. TAIR annotations and UniProt annotations were also used to refine the TCs and NCs.
  4. We manually inspected the HMM alignments to further refine the TCs and NCs, and chose a reasonable score between the TC and the NC as the cutoff.
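
The rough TC/NC step can be illustrated with a small sketch. This is not the PlantTFDB pipeline. The `Hit` record, the function names and the example scores are hypothetical. The single `is_tf` flag is a simplification of the GO evidence described in step 1. Taking the midpoint of TC and NC is only an assumption, since the text says only that a reasonable score between the two was chosen after manual inspection.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Hit:
    """One HMM hit of a domain model against a protein (hypothetical record)."""
    protein_id: str
    bit_score: float
    # Assumption: True if the protein has non-IEA GO support for
    # "transcription factor activity"; False otherwise.
    is_tf: bool


def rough_thresholds(hits: List[Hit]) -> Tuple[Optional[float], Optional[float]]:
    """Return (TC, NC): the lowest score among annotated TFs and the
    highest score among the remaining proteins. None if a group is empty."""
    positives = [h.bit_score for h in hits if h.is_tf]
    negatives = [h.bit_score for h in hits if not h.is_tf]
    tc = min(positives) if positives else None
    nc = max(negatives) if negatives else None
    return tc, nc


def choose_cutoff(tc: Optional[float], nc: Optional[float]) -> Optional[float]:
    """Pick a score between NC and TC. The midpoint is an illustrative
    assumption. Returns None when the two groups are not cleanly separated,
    which is the case that would need the refinement of steps 2 to 4."""
    if tc is None or nc is None or tc <= nc:
        return None
    return (tc + nc) / 2


# Made-up example scores for a single domain model.
hits = [
    Hit("P1", 85.5, True),
    Hit("P2", 40.0, True),
    Hit("P3", 22.0, False),
    Hit("P4", 12.5, False),
]
tc, nc = rough_thresholds(hits)
cutoff = choose_cutoff(tc, nc)
```

When the lowest annotated score does not exceed the highest unannotated score, the sketch returns no cutoff rather than guessing. That corresponds to the poorly supported cases that the method resolves with Pfam cutoffs, TAIR and UniProt annotations, and manual inspection.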