September 9th, 2016. This release corresponds to Release 20.0 of EcoCyc.
New structural properties of two TFs were identified: the NarL receptor domain is able to stimulate gene transcription in a nitrate-responsive manner Katsir G et al. (2015), and in CRP the Thr127 and Ser128 residues provide high cAMP affinity and play a key role in stabilization of the CRP inactive form Gunasekara SM et al. (2015). On the other hand, the response regulators KdpE and RcsB are capable of driving gene expression Narayanan A et al. (2014) and form complexes with other proteins in an unphosphorylated manner Pannen D et al. (2016). Also, under anaerobic and iron-dependent conditions, Fur binds to more sites across the genome, increasing the number of target genes Beauchene NA et al. (2015).
The notes for rrsE, rrsH, rrsD, and rrsB rRNAs Maeda M et al. (2015), ExuR Tutukina MN et al. (2016), UxuR Tutukina MN et al. (2016), BaeR Yao Y et al. (2015), GlpR Vimala A et al. (2016), RpoS Guo M et al. (2015), CsrB Zere TR et al. (2015), CspA, CsgD Soo VW et al. (2013), Dps Lee SY et al. (2015), IHF Lee SY et al. (2015), CueR Szunyogh D et al. (2015), Mlc Bréchemier-Baey D et al. (2015), UvrY Zere TR et al. (2015), and NarL Katsir G et al. (2015) transcriptional regulators, and CsrB small regulatory RNA Zere TR et al. (2015) were updated.
We have now curated the published literature through the end of December 2015.
April 7th, 2016. This release corresponds to Release 19.1 and 19.5 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
NimR (formerly YeaM) confers resistance to 2-nitroimidazole, an antibacterial and antifungal agent, and plays a regulatory role in divergent transcription of the nimT and nimR genes Ogasawara H et al. (2014). Based on Genomic SELEX screening, the two-component system (TCS) YedVW was characterized Urano H et al. (2015). The YedVW and CusSR TCSs form a unique regulation system, where both TCSs recognize the same DNA sequence for binding in hiuH, with YedVW sensing H2O2 and CusSR sensing Cu(II) Urano H et al. (2015). The YdeO regulon plays an important role in survival under both acidic and anaerobic conditions Durban J et al. (2013).
Two transcriptional regulators were identified: YjjQ, a transcriptional repressor of genes required for flagellar synthesis, capsule formation, and other genes related to virulence Wiebe H et al. (2015), and YebK, a transcriptional regulator implicated in the adaptation to the transition from rich medium to cellobiose minimal medium, reducing the length of the lag phase Parisutham V et al. (2015). On the other hand, it was also determined that both YbiB and Dam bind to DNA and could play a role in transcriptional regulation Schneider D et al. (2015), Horton JR et al. (2015).
Summaries for MazE, McbR, CRP, RutR, BolA, FadR, Zur, H-NS, OmpR, LacI, NemR, DksA, SdiA, PhoB, HU, CadC, and SoxR transcriptional regulators were updated.
We have curated the published literature through the end of July 2015.
September 15, 2015. This version of RegulonDB (9.0) uses the same release data from EcoCyc (19.0) as the previous version (8.8).
Updated TF families, position-weight matrices and their grouping in clusters
We have updated the computationally predicted transcription factors, which total 304 (184 with experimental evidence and 120 from computational predictions); we updated our position-weight matrices and have included tools for clustering them in evolutionary families.
Comprehensive semiautomatic curated elementary Gensor Units
We redesigned the Web page for GENSOR units, and this page now contains three sections: the graphical map of the elementary GENSOR unit; its general properties, including the written summary; and a section for the properties of each reaction.
Coexpression distance around the Regulatory Network
We have implemented tools for a full comparison of expression of groups of genes across all conditions. The 'Coexpression' page can be reached directly from the search option. A single query gene or a group of genes are added either manually, based on the set of interest to the user, or are automatically uploaded as a collection of genes defining operons or regulons. In addition, we offer a coexpression overview for two groups of input genes: operons and regulons.
May 5th, 2015. This release corresponds to Release 19.0 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
SutR (formerly YdcN) was identified as a transcriptional dual regulator of genes involved in sulfur utilization Yamamoto K et al. (2014).
The crystal structure of the DinJ-YafQ complex was resolved at 1.8 Å Ruangprasert A et al. (2014). Notes for BaeR, AcrR, Fur, SoxS, PspF, and CpxR were updated Srivastava SK et al. (2014), Lee JO et al. (2014), Méhi O et al. (2014), Molina-Quiroz RC et al. (2014), Darbari VC et al. (2014), Vogt SL et al. (2014).
A new view was added to the display of search results by regulon. When the user selects "regulon search" without giving a term, all the regulons are displayed. The table shows the regulon name, the total of regulated genes, the total of regulated operons, the total of binding sites, and the total of regulatory interactions.
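The columns of this regulon table can be illustrated with a small aggregation. The following is only a sketch under stated assumptions: the record layout, the placeholder names (TF_A, gene1, and so on), and the rule that one record equals one regulatory interaction are all hypothetical and are not taken from the RegulonDB schema.

```python
from collections import defaultdict

# Hypothetical toy records, one per regulatory interaction:
# (regulon, regulated gene, operon, binding site). Placeholder names only.
INTERACTIONS = [
    ("TF_A", "gene1", "operon1", "site1"),
    ("TF_A", "gene2", "operon1", "site1"),
    ("TF_A", "gene3", "operon2", "site2"),
    ("TF_B", "gene3", "operon2", "site3"),
]


def summarize_regulons(interactions):
    """Return one summary row per regulon, sorted by regulon name."""
    genes = defaultdict(set)
    operons = defaultdict(set)
    sites = defaultdict(set)
    counts = defaultdict(int)
    for regulon, gene, operon, site in interactions:
        genes[regulon].add(gene)
        operons[regulon].add(operon)
        sites[regulon].add(site)
        counts[regulon] += 1  # assumption: one record equals one interaction
    return [
        {
            "regulon": name,
            "genes": len(genes[name]),
            "operons": len(operons[name]),
            "binding_sites": len(sites[name]),
            "interactions": counts[name],
        }
        for name in sorted(counts)
    ]


for row in summarize_regulons(INTERACTIONS):
    print(row)
```

Sets are used for genes, operons, and binding sites so that an object shared by several interactions is counted once per regulon.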
A link to download the weight matrices in consensus format was added to the downloads page of the website.
February 2nd, 2015. This release corresponds to Release 18.1 and 18.5 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
MraZ was identified as a transcriptional repressor involved in the control of cell division and cell wall genes Eraso JM et al. (2014). It binds to a region of DNA containing three successive TGGGN direct repeats that are separated by two consecutive 5-nt spacers, close to the mraZp promoter Eraso JM et al. (2014). Also, the summaries for the MraZ, CRP, RcsB-BglJ, HipB, LacI, PhoB, YehT, YpdB, HypT, and MarR transcriptional regulators were updated.
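The MraZ site architecture described above can be written as a simple pattern search. This is a minimal sketch, assuming that the description means TGGGN, 5 nt, TGGGN, 5 nt, TGGGN (25 nt in total); the function name and the toy sequence are hypothetical and are not RegulonDB data.

```python
import re

# Assumption: three TGGGN repeats separated by two 5-nt spacers, read as
# TGGGN + 5 nt + TGGGN + 5 nt + TGGGN. The lookahead allows overlapping hits.
MRAZ_LIKE = re.compile(r"(?=(TGGG[ACGT](?:[ACGT]{5}TGGG[ACGT]){2}))")


def find_direct_repeat_sites(sequence):
    """Return (0-based start, matched text) for every candidate site."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in MRAZ_LIKE.finditer(seq)]


# Hypothetical toy sequence containing one such arrangement.
toy = "ccTGGGAacgtaTGGGCttagcTGGGTaa"
print(find_direct_repeat_sites(toy))
```

An exact-match pattern like this one misses sites with degenerate repeats, so it is only a first filter before any alignment-based analysis.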
On the other hand, YdcI was also identified as a transcriptional repressor involved in the survival, stress response, and cell interactions in Salmonella enterica serovar Typhimurium Solomon L et al. (2014). Based on N- and C-terminal exchange between S. Typhimurium and Escherichia coli, it was also possible to determine that YdcI is a transcriptional repressor in E. coli Solomon L et al. (2014).
The crystal structures of LsrR (with its native signal, phospho-AI-2), SdiA, and DhaR have been determined Ha JH et al. (2013), Kim T et al. (2014), Shi R et al. (2014). Summaries for the OmpR, H-NS, CRP, IHF, and PhoB transcriptional regulators and for σE, the specialized sigma factor that responds to heat shock and misfolded proteins, were updated.
We have curated the published literature through the end of September 2014.
April 11th, 2014. This release corresponds to Release 18.0 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
RclR (formerly YkgD) has been experimentally determined to be a redox-sensitive transcriptional activator of essential genes for survival under reactive chlorine stress Parker BW et al. (2013). In addition, ArcA was shown to utilize its diverse binding site architecture for global control of carbon oxidation pathways Park DM et al. (2013). Also, the summaries for MlrA, ArcA, FeaR, MalT, OmpR, CspA, BaeR, Crp, AraC, H-NS, and FadR transcriptional regulators were updated.
In addition, we have added data from high-throughput experiments to RegulonDB as datasets: data for the LeuO, H-NS, and CRP transcriptional regulators from genomic SELEX analysis, and transcriptional start site mapping based on dRNA-seq. See High-throughput Datasets.
We have curated the published literature through the end of December 2013.
November 28th, 2013. This release corresponds to Release 17.5 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
We added the tetrameric conformation for the transcriptional regulator LsrR. In the presence of phosphorylated autoinducer 2 (AI-2), the tetramer dissociates into dimers, and the interaction of LsrR with DNA is greatly reduced Wu M et al. (2013). We also added its inactive conformation, LsrR-AI-2. Two new conformations, MetJ-MTA and MetJ-adenine, for the MetJ transcriptional regulator were also added. The metabolites 5´-deoxy-5´-(methylthio)adenosine (MTA) and adenine (Ade) bind with high affinity to MetJ, but their biological effects are not known Martí-Arbona R et al. (2012).
The summaries for NemR, RbsR, MarR, NrdR, ArcA, LsrR, PspF, and YpdB transcriptional regulators were updated.
In version 8.5, we made a major change to the main pages of RegulonDB. The pages were reorganized to provide a more structured access to the data, based on the two dominant types of users: those conducting individual search queries and those accessing the data collections.
We also added the option "Gensor Unit Groups" within the Integrated views & Tools menu, which enables display of all Gensor Units so far reviewed in RegulonDB. Currently, we have 53 GUs, and they are grouped into 5 categories.
July 29th, 2013. This release corresponds to Release 17.1 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
As part of our curation on transcriptional regulation, we have finished linking references to their corresponding evidence for 225 promoters in which this relationship did not exist.
We have corrected and relocated the transcription factor binding sites of PuuR. The BSs of PuuR identified by Nemoto et al. consist of 15 nucleotides, with the following recognition sequence: AAAATATAATGAACA Nemoto et al. (2012). Analysis by the curator of the experimental assays and of the sequences identified by Nemoto et al. showed that the binding sites of PuuR may have a length of 20 nucleotides with inverted repeat symmetry (ATGGACAATATATTGACCAT). The consensus sequence identified by Nemoto et al. in 2012 is included in the consensus sequence proposed by the curator and the nucleotides conserved between the two sequences are underlined.
April 22nd, 2013. This release corresponds to Release 17.0 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
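The two PuuR sequences quoted above can be compared with a short script. This is an illustrative sketch only, not the curator's method: it counts the positions at which the 20-nt sequence differs from its own reverse complement, and it finds the ungapped offset at which the 15-nt sequence shares the most positions with the 20-nt sequence. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def palindrome_mismatches(seq):
    """0-based positions where seq differs from its reverse complement."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    return [i for i, (a, b) in enumerate(zip(seq, rc)) if a != b]


def best_ungapped_overlap(short, long):
    """Return (offset, identical positions) for the best ungapped placement."""
    scores = []
    for offset in range(len(long) - len(short) + 1):
        window = long[offset:offset + len(short)]
        scores.append((offset, sum(a == b for a, b in zip(short, window))))
    return max(scores, key=lambda item: item[1])


NEMOTO_SITE = "AAAATATAATGAACA"        # 15 nt, quoted in the text
CURATOR_SITE = "ATGGACAATATATTGACCAT"  # 20 nt, quoted in the text

print(palindrome_mismatches(CURATOR_SITE))
print(best_ungapped_overlap(NEMOTO_SITE, CURATOR_SITE))
```

A perfect inverted repeat would give an empty mismatch list; a short list indicates a near-palindromic site.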
Four new transcription factors have been identified, PgrR, RcdA, YdfH, and YpdB; the functional conformation for IscR has been included and we enriched summaries for nine TFs as detailed below.
PgrR is a repressor of the expression of genes related to peptidoglycan degradation Shimada et al. (2013). RcdA is involved in the regulation of a number of stress response genes, of biofilm formation, and of transcriptional regulator genes Shimada et al. (2012). YdfH, which belongs to the GntR transcription factor family, is a repressor of the rspAB operon, and YpdB is an activator that participates in the carbon control network and may participate in nutrient scavenging before entry into stationary phase Fried et al. (2013).
The new conformation IscR-2Fe-2S for the transcription factor IscR was included in this release. IscR-2Fe-2S represses the transcription of the operon iscRSUA, which encodes genes for the Fe-S cluster biogenesis pathways Giel et al. (2013).
Summaries for FadR, NikR, BluR, LeuO, H-NS, MarA, SoxS, Rob and PspF were enriched. In addition, it was determined that the MqsRA complex does not bind to DNA; instead, it functions to destabilize the MqsA-DNA complex Brown et al. (2013).
We have reclassified the evidence supporting the knowledge in the database as weak, strong, or confirmed Weiss et al. (2013). The level of confidence is assigned in two stages; in stage I we classify single evidence into weak and strong, and in stage II we validate data by integrating multiple evidence items in a process termed "analytical cross-validation," where the result is the confidence of the knowledge (strong, weak, or confirmed), see the page regarding evidence. This process has been automated to report relevant changes in each release.
In the gene page, we have created a new section named "Elements in the selected gene context region unrelated to any object in RegulonDB." In this section are included the biological objects that are not associated with a transcription unit.
In addition, on the same page, the operon section, called operon arrangement, contains links to the operon page. Each promoter field is linked with the corresponding operon page. In the submenu related to the data sets, included in downloads, we have integrated new information related to the transcription start sites (TSSs) experimentally determined in the laboratory of Dr. Morett. The TSSs are included in the file named "High-throughput transcription initiation mapping. Illumina directional RNA-seq experiments where total RNA received different treatments to enrich for 5'-monophosphate or 5'-triphosphate ends." These objects are included in the new section, "Elements in the selected gene context region unrelated to any object in RegulonDB," previously described.
In addition, HTTIM evidence has been removed, and associated promoters with this evidence have been reclassified as follows: 267 as TIM, 42 as ROMA, and 39 as RS (classification based on Weiss HTP evidence; Weiss et al. (2013)).
December 17th, 2012. This release corresponds to Release 16.5 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
We have annotated 35 predictions for TFBSs of 13 regulators. The matrices used for these predictions were constructed by RegulonDB Medina-Rivera et al. (2010). Four TFs (ArgR, AscG, Cra and Rob) have regulatory interactions with weak evidence, and nine regulators have interactions with strong evidence: CRP, EvgA, ExuR, FIS, LexA, NtrC, PhoP, TorR, and UxuR.
A new response regulator of a two-component system, YehT, was curated Kraxenberger et al. (2012).
We have identified that mntS is included within the coding region of the rybA gene. rybA encodes two different functional products, a small RNA (RybA) and a small protein (MntS), both of which are transcribed from the rybA promoter Waters et al. (2011). RybA is rapidly processed at the 5´ end as well as at multiple sites at the 3´ end.
We have completed adding references and evidence codes to 85 transcriptional Regulatory_Interactions. Now every manually curated Regulatory_Interaction has a reference and an evidence code associated with it. We continued to enrich the summaries of ten TFs: AraC, ChbR, IscR, LacI, MarA, NorR, PhoB, RcnR, Rob and SdiA.
October 2nd, 2012.
We next describe two elements of our efforts toward obtaining higher integration levels: (i) GUs and (ii) the organization of multiple TFBSs into regulatory phrases.
Fur, a complex gensor unit
In 2011, we described the new concept of genetic sensory-response units, or "gensor units" (GUs), which are composed of four components: (i) the signal, (ii) the signal-to-effector reactions that end with activation or inactivation of the TF, (iii) the regulatory switch (resulting in activation or repression of transcription of target genes), and (iv) the consequence, i.e., the effects and roles of the regulated genes.
RegulonDB contains 25 completed GUs for local TFs and small regulons. We curated a much larger GU as a first step toward eventually compiling information on GUs of global regulators. Fur regulates transcription initiation of 66 TUs, including 9 TFs, a regulatory small RNA (sRNA), and two sigma factors (σ19 and σ38). Its diagram has more than 200 reactions and close to 300 nodes. In order to facilitate interpretation of this GU, we included a high-level illustration that provides an overview of all classes of genes and functions subject to Fur regulation. Search gensor unit in the main menu in RegulonDB and select Fur overview.
For years we have displayed the collection of sites in upstream regions affecting each promoter, leaving it to the user to decipher how these multiple sites, which bind the same or different TFs, work in a coordinated fashion, or not, to regulate transcription. We have implemented the first version of regulatory phrases, grouping transcription factor binding sites (TFBSs) that work together in a single promoter, as well as grouping all arrangements of the same TF with the same effect in different promoters.
Enriched classifications based on classic and HT evidence
We expanded the assignment of quality to various sources of evidence, particularly for knowledge generated via high-throughput (HT) technology. Based on our analysis of most relevant methods, we defined rules for determining the quality of evidence when multiple independent sources support an entry. See the new page of evidence in "About RegulonDB".
Tracks display of HT data sets and submission forms for HT data sets
We implemented a new tool in the main menu for use of a browser with the option of several tracks, based on GBrowser v.248.
The menu page where users choose which sets to display now contains a variety of data sets, including manually curated RegulonDB collections of objects. We have also included a mechanism that enables the display of "Data Sets" in the GBrowser. On the GBrowser page, a user can proceed to "Select tracks" to see the full set of options currently available, classified by type of object, including operons, TFs, ChIP-seq TFBSs, promoters, HT-mapped TSSs, sRNAs, and TF conformations, among others. An additional category called "Genome regions", for genes as well as untranslated regions at the 5´ and 3´ ends of TUs, is also included.
Submission forms for HT-datasets
Every single data set can be documented as requested when authors submit their experimental data, with specific formats for each type of source (e.g., TSS, ChIP-seq). We implemented a Web form for those interested in submitting their data sets directly online.
Evolutionary conservation of promoters and regulatory interactions
For the first time, we have added the evolutionary evidence for promoters and TFBSs within gammaproteobacteria. These are available from the gene and regulon pages, with graphics showing a summary of the number of genomes where conservation is found and the alignment and conserved sequences available as multiple alignments.
A new Regulon page: addressing user needs and suggestions
Based on comments and suggestions by RegulonDB users, we modified the page displaying information about regulons and simplified the search for all TFBSs of a single TF. The new page includes an icon linking a regulon to the GU, the summary for the TF, followed by a section displaying the functional and nonfunctional conformation(s), a classification of the effector based on its source as internal, external, or dual; a category for the TF based on its connectivity, the target regulated genes, and the operon where the TF gene belongs. Subsequent sections describe functional properties of the regulon, the set of TFBSs and their organization patterns and phrases, logos, PWMs, and additional properties.
August 29th, 2012. This release corresponds to Release 16.0 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
We have updated the lengths and included the consensus sequences of TFBSs for 17 regulators: AgaR, AraC, ArcA, AscG, CaiF, DnaA, FhlDC, IclR, KdpE, LeuO, MalT, MelR, NanR, PrpR, PutA, RhaS, and XylR.
Three new transcription factors have been included: FliZ, MatA, and YjiE.
FliZ is a repressor that contains an α-helix that is similar to helix 3.0 of σS and that represses genes involved in the regulation of the motility system and curli expression. Pesavento et al. in 2012 determined that this regulator binds to regions of σS-dependent promoters, can recognize alternative σS promoter-like sequences, and can also discriminate vegetative promoters Pesavento et al. 2012.
MatA is a transcriptional dual regulator in meningitis isolate E. coli strain IHE 3034, and it interferes with bacterial motility and flagellar synthesis in E. coli K-12 Lehti et al. 2012. Given the high similarity between the two strains, we have added this regulator to the information for E. coli K-12.
QseD, a putative transcriptional LysR-type regulator, was renamed YjiE and is now considered a DNA-binding transcriptional dual regulator. It regulates genes involved in cysteine and methionine biosynthesis, sulfur metabolism, iron acquisition, and homeostasis Gebendorfer et al. 2012. A new function was identified for OxyR, in controlling genes under nitrosative stress during anaerobic respiration (Seth et al. 2012).
March 29th, 2012. This release corresponds to Release 16.0 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
In this release we added the consensus sequences, lengths, and symmetries corresponding to 10 TFs. We updated the binding sites for 4 TFs that belong to the LysR family (ArgP, IlvY, MetR, and NhaR) and 3 response regulators that correspond to two-component systems (BaeR, CitB, and CpxR); DinJ is included in the toxin/antitoxin system, and PurR regulates genes involved in purine/pyrimidine biosynthesis. Finally, PdhR is involved in central metabolic fluxes and, more recently, has been found to be involved in the utilization of glycolate and cell division.
In these cases we used different strategies to identify the characteristics of the TFBSs. We performed alignments of the sequences upstream of genes regulated by these proteins and compared orthologous intergenic regions, and we also used other databases, such as RegPrecise Novichkov et al. 2010. In addition, the binding sites of the regulator MetR were corrected based on comparisons with homologous sequences reported for Salmonella typhimurium. In all cases we also analyzed the available experimental evidence that corresponded to each regulatory interaction.
On the other hand, we are continuing with the annotation of allosteric regulation of the RNAP by ppGpp and DksA. As part of this work, we have expanded the notes for GreB, GreA and DksA. In addition, we have enriched the notes for different transcription factors, such as AidB, ArgP, AtoC, DcuS, DpiB, Fur, H-NS, LacI, MalT, MntR, PaaX, PhoB, PutA and SoxS.
Nov 1st, 2011. This release corresponds to Release 15.1 and 15.5 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
We continue with the effort to update and assign the correct lengths and central positions of the binding sites of TFs. In this release we have analyzed consensus binding sequences for 20 transcription factors (TFs), and as a result we have corrected and generated new regulatory interactions and updated the consensus sequences, lengths, and symmetries of the transcription factor binding sites (TFBSs). In these cases we used different strategies to identify the characteristics of the TFBSs. We performed alignments of the sequences upstream of genes regulated by these proteins and compared orthologous intergenic regions. In all cases we also analyzed the experimental evidence that corresponded to each regulatory interaction.
We corrected and relocated the TFBSs of 7 response regulators of the two-component systems: DcuR, EvgA, NtrC, OmpR, PhoB, PhoP, and RstA. We updated the sites of 5 TFs involved in the acid resistance system: BglJ, GadE, GadX, GadW, and RcsB. We added new consensus sequences for 4 local TFs: SoxR, YqhC, YqjI, and CspA.
The experimentally characterized TFBSs for the transcriptional regulatory components of the HipBA, MqsAR, RelBE, and YefM-YoeB toxin/antitoxin systems have been updated.
On the other hand, we continue with the annotation of other mechanisms of regulation. As part of this work, we have curated mechanisms of regulation that allosterically affect RNA polymerase at transcription initiation. ppGpp is a nucleotide that binds RNA polymerase alone or forms a complex with DksA and affects transcription in either a positive or negative manner. Genes involved in responding to nutrient limitation as well as amino acid biosynthesis were positively affected by ppGpp and DksA. The genes related to rRNA promoters and to the stringent response were negatively controlled by both regulators. Currently, 67 promoter interactions regulated by ppGpp, as well as some regulated by DksA, have been curated.
May 6th, 2011. This release corresponds to Release 14.6 and 15.0 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
We are continuing the analysis of binding sites of different transcription factors (TFs). In this release we have included new consensus sequences for 45 local TFs that have three or fewer binding sites in the database. Most local TFs bind to small sequence motifs (11 to 24 nucleotides) with different symmetries, and these are arranged as inverted repeats (39), direct repeats (2), or asymmetrical (4) sequences with a variable space sequence between them. In these cases we performed alignments of the sequences upstream of the genes regulated by these proteins and evaluated the lengths and symmetries of the consensus sequences. In general, the sequences of unique binding sites are highly conserved, and the length and symmetry are evident.
TFs with inverted repeat symmetry include the following: AcrR, AllR, ArsR, AtoC, BaeR, BirA, BetI, CueR, CusR, EnvR, FabR, GlrR, HcaR, HyfR, KdgR, IdnR, IlvY, LacI, LldR, MalI, MarR, MhpR, MntR, MurR, NadR, NemR, NikR, NorR, PrpR, RbsR, RcnR, TdcA, TreR, UhpA, UidR, YiaJ, YoeB-YefM, ZntR, and Zur.
TFs with direct repeat symmetry: CreB and MngR.
TFs with asymmetric binding sites: ChbR, RhaR, XapR, and ZraR.
Curation of transcription factors (TFs) for this release included updates to the summaries for MalT, UlaR, ArgR, MlrA, McbR, TreR, and YqhC. In addition, GO terms were updated for different TFs. The names of TFs were revised, and the category "DNA-binding" has been added.
On the other hand, a new TF, YqjI, has been added. The local regulator YqjI was reported to act as a repressor of the synthesis of an NADPH-dependent ferric reductase and of its own synthesis (autorepression). Recently, Wang et al. described experimental evidence showing that this regulator maintains iron homeostasis in the presence of high levels of nickel Wang et al. 2011.
In this period we have completed the curation of the new Sensory Response Unit TyrR-L-tyrosine, L-phenylalanine involved in the synthesis and transport of aromatic amino acids.
Our publication concerning the Gensor Units Gama-Castro et al. (2011), corresponding to release 7.0, was chosen by the editors of Nucleic Acids Research to appear on their Featured Articles page: http://www.oxfordjournals.org/our_journals/nar/featured_articles.html. Featured Articles in Nucleic Acids Research represent the top 5% of papers in terms of significance, originality, and scientific excellence.
Our paper contains information for the release corresponding to 2008, 2009, and 2010.
January 26th, 2011. This release corresponds to Release 14.5 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
In this database release version, our main goal was to model the regulatory pathways, including integration of the metabolic pathways with the different objects represented in the database. For this reason, RegulonDB has expanded the biological context, and we now refer to this integration in terms of genetic sensory response units, or Gensor Units (GUs) Gama-Castro et al. (2011).
The inclusion of Gensor Units brings a dramatic change and expansion of RegulonDB, due to the fact that we are adding several new types of interactions, reactions and superreactions that summarize concatenated sets of reactions, linked to the other databases that contain such information.
Gensor Units: An elementary genetic sensory response unit, or Gensor Unit, is formed by four components, all of them concatenated in a loop of information processing that initiates with a signal or stimulus (i), which can be of external or internal origin. The second component is represented by the signal transduction pathway (ii), a concatenated set of reactions that affect gene expression. The third component is represented by the core of regulation or genetic switching (iii) and contains all regulatory elements necessary for modifying gene expression, inducing and/or repressing a collection of regulated genes, and ends with a response (iv) that corresponds to the collection of biological capabilities derived from the affected gene products Gama-Castro et al. (2011).
We have now initiated the curation of five GUs related to the signal transduction of the sigma factors and 21 related to the two-component systems. We have also completed the curation of 15 GUs involved in carbon source utilization and five involved in the metabolism of amino acids.
GUs related to the signal transduction of the sigma factors.
Sigma19 Sigma24 Sigma28
GUs related to the two-component systems.
ArcA AtoC BaeR DcuR DpiA EvgA KdpE NarL NarP OmpR PhoB PhoP QseB RcsB RstA TorR UhpA ZraR
GUs related to carbon source utilization.
AlsR AraC ChbR FucR GatR GntR GutR-SrlR IdnR LacI MelR RbsR RhaS TreR UidR XylR
Previously, the structure of this database was accessible via the internet through four major navigation paths, by Genes, Operon, Regulon, and Growth Condition, combining graphics and literature information. Here, we provide three new types of searches: by Gensor Unit, Sigmulon, and small RNA (sRNA).
On the other hand, we have also corrected and relocated the DNA binding sites for FhlA, Ada, CaiF, NhaR, and YiaJ. Initially, Leonhartsberger et al. in 2000 showed that FhlA binds to inverted repeat sequences of 16 bp (CATTTCGTACGAAATG) Leonhartsberger et al. (2000). However, our alignment results for all the regions that FhlA binds showed that this sequence is not conserved. This result also showed that the motif TGTCGnnnnTGACA is conserved in the sequences examined, and for this reason we have relocated, reassigned, and corrected the binding sites of the FhlA regulon in the database. The FhlA-binding sites are represented in the database by an inverted repeat motif of 14 bp.
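A consensus written with "n" for any base, such as the 14-bp FhlA motif above, can be turned into a search pattern. This is a minimal sketch under the assumption that "n" matches any of A, C, G, or T and that all other letters must match exactly; the helper names and the toy sequence are hypothetical.

```python
import re


def motif_to_regex(motif):
    """Compile a consensus in which 'n' stands for any base."""
    pattern = "".join("[ACGT]" if ch in "nN" else ch.upper() for ch in motif)
    return re.compile(pattern)


def find_sites(motif, sequence):
    """Return (0-based start, matched text) for non-overlapping matches."""
    regex = motif_to_regex(motif)
    return [(m.start(), m.group()) for m in regex.finditer(sequence.upper())]


FHLA_MOTIF = "TGTCGnnnnTGACA"  # 14-bp motif quoted in the text

# Hypothetical toy sequence, not a real upstream region.
toy = "aaTGTCGacgtTGACAtt"
print(len(FHLA_MOTIF), find_sites(FHLA_MOTIF, toy))
```

Exact matching of the conserved positions is a simplification; real binding sites usually tolerate some mismatches, which a position-weight matrix handles better than a regular expression.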
In the cases of Ada and CaiF, we performed alignments of the sequences upstream of the genes regulated by these proteins and evaluated the previous consensus sequences of the binding sites Teo et al. (1986), Nakamura et al. (1988), Buchet et al. (1999). In addition, the lengths of the degenerate binding sites of NhaR were defined according to the matrix shown for this regulator in the database RegPrecise. That database contains matrices generated from alignments of orthologous regions Novichkov et al. (2010).
In 2000, Ibañez et al. showed that transcription of the divergent operon yiaJ-yiaKLMNO-lyxK-sgbHUE depends on the YiaJ repressor. However, those authors suggested that this regulator binds a long region of 35 bp Ibañez et al. (2000). The alignment of this region with the orthologous sequence of Klebsiella pneumoniae showed a conserved palindrome of 21 bp Campos et al. (2008). The position and length of the YiaJ-binding site reported in our database have been changed to reflect this.
A new portable drawing tool for genomic features is available, as well as several new forms for downloading the data, including web services, files for several relational database manager systems, and files in BIOPAX format.
August 18, 2010. This release corresponds to Release 14.1 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
The motif obtained from aligning OxyR binding sites is highly variable due to the length of sequences, even though, through manipulation of the alignment it is possible to detect four conserved regions. For this reason we have relocated, reassigned, and corrected binding sites of the OxyR regulon, corresponding to 19 transcription units. Toledano et al. (1994) showed that OxyR binds in tandem to four ATAG elements and defines a consensus motif, ATAGntnnnanCTATnnnnnnnATAGntnnnanCTAT covering around 40pb (Toledano et al. (1994)).
We now propose a new consensus sequence, GATAGGTTnAACCTATCnnnnnGATAGGTTnAACCTATC, which contains two inverted repeat motifs, GATAGGTTnAACCTATC, of 17 bp separated by 5 bp. This consensus sequence is based on the agreement between alignments of these upstream regions performed by the curator and the corresponding evidence obtained from the literature for every operon, including the similarity to the consensus sequence, data from footprinting assays, computational analysis of these sequences, and profiling of OxyR-dependent gene expression. In the database the OxyR-binding sites are represented by an inverted repeat motif of 17 bp.
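The composition of the proposed consensus can be checked with a few lines of Python. This is only a sketch: the helper name reverse_complement and the convention that n stays n on the opposite strand are assumptions made for the example.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C", "n": "n"}


def reverse_complement(site):
    """Reverse-complement a site; 'n' (any base) is kept as 'n'."""
    return "".join(COMPLEMENT[base] for base in reversed(site))


HALF_SITE = "GATAGGTTnAACCTATC"   # 17-bp inverted repeat from the text
SPACER = "n" * 5                   # 5 bp between the two repeats
CONSENSUS = HALF_SITE + SPACER + HALF_SITE

# A perfect inverted repeat reads the same on both strands.
is_inverted_repeat = reverse_complement(HALF_SITE) == HALF_SITE

print(len(HALF_SITE), len(CONSENSUS), is_inverted_repeat)
```

Each repeat is its own reverse complement, and two repeats plus the spacer give 17 + 5 + 17 = 39 bp, in line with the "around 40 bp" span of the earlier motif.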
During this last period, we have updated curation on transcription initiation to include publications through the end of April 2010.
March 24, 2010. This release corresponds to Release 14.0 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
We have corrected and relocated the binding sites of the CytR transcription factor. This regulator negatively controls the expression of genes that encode the proteins required for transport and utilization of ribonucleosides and deoxyribonucleosides. The CytR binding sites were previously represented as long regions, which were determined by footprinting of several promoter sequences.
Computational analysis of these sequences showed that the optimal CytR binding site consists of two octamer repeats, GTTGCATT, in direct or inverted orientation and preferably separated by 2 bp. Experimental support for this consensus sequence was obtained from footprinting, site-directed mutagenesis experiments, and gene expression Pedersen et al. (1997), Jorgensen et al. (1998). We have updated curation on transcription initiation to include publications through the end of December 2009.
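The site architecture described above (two GTTGCATT octamers, direct or inverted, 2 bp apart) can be sketched as a simple search. The function name, the exact-match criterion, and the test sequences are assumptions for illustration; real CytR sites are degenerate and would need a scoring matrix.

```python
import re

OCTAMER = "GTTGCATT"
OCTAMER_RC = "AATGCAAC"  # reverse complement of GTTGCATT


def find_cytr_like_pairs(sequence, spacer=2):
    """Find two exact octamers separated by `spacer` bp.

    Returns a list of (0-based start, matched text, orientation), where
    orientation is 'direct' if both halves are identical and 'inverted'
    if one half is the reverse complement of the other.
    """
    seq = sequence.upper()
    half = "(%s|%s)" % (OCTAMER, OCTAMER_RC)
    gap = "[ACGT]{%d}" % spacer
    # Group 1 is the whole site, groups 2 and 3 are the two halves.
    regex = re.compile("(?=(" + half + gap + half + "))")
    hits = []
    for m in regex.finditer(seq):
        left, right = m.group(2), m.group(3)
        orientation = "direct" if left == right else "inverted"
        hits.append((m.start(), m.group(1), orientation))
    return hits


# Hypothetical sequences, made up for this example only.
print(find_cytr_like_pairs("CCGTTGCATTAAGTTGCATTCC"))
print(find_cytr_like_pairs("CCGTTGCATTAAAATGCAACCC"))
```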
Different genetic networks are now available through pre-computed datasets, web services, dump files, and direct connection to a MySQL database.
New dataset files have been created in order to have a complete repertoire of genetic networks. They are available at the Downloads/Data Sets option.
The new files are:
- The network between TFs and operons.
- The network between TFs and their regulated TFs.
- The network between sigma factors and their regulated operons.
- The network between sigma factors and their regulated sigma factors.
A set of pre-computed images of different genetic networks is available from the Tool menu, in the Transcriptional Regulatory Network option.
3.- Web services
The description of the NetWork Web service is also available. Perl and Java clients were developed for this service.
4.- Connection to the database
We created an additional public repository in MySQL for users who want to connect directly to the database to access these genetic networks. The configuration is the following:
Password: *** (request it by sending an e-mail to email@example.com) Using the mysql database driver.
Aug 10, 2009. This release corresponds to Release 13.1 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
Our curation update project is progressing; we have substantially curated information on regulation of transcription initiation up to the end of June 2009.
We have now completed summaries for all 170 Transcription Factors (TFs) that have at least one experimentally characterized binding site or interaction. These regulators represent 33 families of TFs, and the summaries describe relevant characteristics of each regulatory protein. A summary of the functions of these 170 TFs is the following:
Seven TFs are considered to be global regulators and are involved in regulating multiple operons and genes of different functional classes or gene ontologies, such as anaerobiosis (ArcA and FNR), carbon source utilization (CRP), and DNA architecture, including organization and maintenance of the nucleoid, as well as other cellular processes (FIS, the factor for inversion stimulation, together with HNS, Lrp, and IHF).
Additionally, 21 response regulators belong to two-component systems, 42 TFs are included in the carbon sources system, 17 TFs are related to processes such as transport, biosynthesis and catabolism of the amino acids, 13 TFs are involved in the transport and metabolism of different nitrogen sources, and 8 TFs are classified as metallo-regulators. Note that the TFs can be involved in more than one function.
The rest of the TFs are considered to be local regulators that control the transcription of genes involved in different cellular processes and functional classes, for instance, flagellar and chemotaxis systems, metabolism of nucleosides, transport and synthesis of fatty acids, DNA replication, quorum sensing, toxin-antitoxin systems, and adaptation and resistance to different stress conditions, among others.
We have completed adding references and evidence codes to 210 promoters. Now every manually curated promoter has a reference and an evidence code associated with it.
On the web page that displays gene information, users can now download FASTA files with the nucleotide sequence of the gene and the amino acid sequence of its products. The gene, operon, and regulon web pages now also include links to the M3D database.
We implemented an object-relational mapping layer based on the iBATIS framework in order to reduce query response times. We are also replacing the network tool application with a new version that provides different graphical interfaces for the regulatory networks of genes, operons, and transcription factors.
Finally, we are releasing web services that enable developers to access RegulonDB’s data via SOAP.
Feb 10, 2009. This release corresponds to Release 12.5 of EcoCyc. Data on transcriptional regulation curated in our lab is the same in both databases.
Our general curation update project is progressing; this version contains curated information on regulation of transcription initiation up to the end of May 2008.
July 10, 2008. This release corresponds to Release 12.1 of EcoCyc. Data on transcriptional regulation curated in our lab is the same in both databases.
In order to expand the information on transcriptional regulation of E. coli, the laboratory of Dr. Julio Collado has been making an effort to generate and analyze data from high-throughput experimental mapping of promoters. The initial results of this approach are available in this release, which includes a collection of 259 new transcription start sites (TSSs) that have been experimentally determined using a modified high-throughput RACE approach (with the corresponding new evidence code: EV-EXP-IDA-HPT-TRANSCR-INIT-M-RACE-MAP). Of those 259 sites, 110 are from TUs with hypothetical genes for which no function has been inferred.
These promoters were linked to new transcription units with exactly the same number of genes as the previously existing ones.
To validate the accuracy of this strategy, we used it to identify the previously published TSSs for 50 TUs, 92% of which showed a perfect match (with a discrepancy of up to one nucleotide with respect to the published TSS). The rest showed slight ambiguity, inherent to the RACE protocol, of up to six nucleotides. We detected more than one TSS in 14 of these TUs. Interestingly, additional TSSs had previously been reported for only two of them. Thus, our results are highly accurate and identify additional promoters for >25% of the previously characterized TUs.
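The figures in this validation can be reproduced with simple arithmetic. The sketch below assumes that the ">25%" figure is derived from the 14 TUs with more than one detected TSS out of the 50 validated TUs; the text does not state this explicitly, and the variable names are illustrative.

```python
validated_tus = 50             # TUs with previously published TSSs
perfect_match_fraction = 0.92  # fraction with a perfect match (from the text)
tus_with_multiple_tss = 14     # TUs where more than one TSS was detected

# Number of TUs matching the published TSS within one nucleotide.
perfect_matches = round(validated_tus * perfect_match_fraction)

# Remaining TUs, which showed an ambiguity of up to six nucleotides.
ambiguous = validated_tus - perfect_matches

# Assumed basis of the ">25%" statement: 14 of 50 TUs.
multi_tss_share = tus_with_multiple_tss / validated_tus

print(perfect_matches, ambiguous, f"{multi_tss_share:.0%}")
```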
The experiments were performed in the laboratory of Dr. Enrique Morett, Institute of Biotechnology, in collaboration with the laboratory of Dr. Julio Collado-Vides, both at UNAM. This mapping has been supported by NIGMS grant RO1-GM71962.
This version contains curated information on regulation of transcription initiation up to March 2008.
April 15, 2008. This release corresponds to Release 12.0 of EcoCyc. Data on transcriptional regulation curated in our lab is the same in both databases.
We are currently updating our curation related to transcriptional regulation in E. coli, including the recent literature. In this release we have initiated the annotation of some promoters and DNA binding sites from computational predictions and from high-throughput experiments such as microarrays and ChIP-chip experiments. Only promoters or DNA binding sites that have evidence from at least two of these three types of experiments have been added to EcoCyc. Some examples are: Fur DNA binding sites identified by computational prediction and binding of purified protein in Chen et al. (2007), Sigma32 promoters identified by ChIP-chip, microarray analysis and in vitro transcription assays in Wade et al. (2006), and Sigma32 promoters identified by microarray analysis, transcription initiation mapping and in vitro transcription assays in Nonaka et al. (2006). Promoters identified by libraries of fluorescent transcriptional fusions (Zaslaver et al. (2006)) are also included in this release.
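The "at least two of these three types" rule can be expressed as a small filter. This is a minimal sketch: the evidence labels, the candidate names, and the function name are hypothetical and do not correspond to RegulonDB or EcoCyc evidence codes.

```python
# Hypothetical labels for the three evidence types named in the text.
REQUIRED_TYPES = frozenset({"computational", "microarray", "chip-chip"})


def passes_two_of_three(evidence):
    """True if at least two distinct evidence types of the three are present."""
    return len(REQUIRED_TYPES & set(evidence)) >= 2


# Made-up candidates; repeated or unrelated evidence does not count twice.
candidates = {
    "site_A": ["computational", "chip-chip"],
    "site_B": ["microarray"],
    "site_C": ["microarray", "microarray", "other-assay"],
}

accepted = sorted(name for name, ev in candidates.items() if passes_two_of_three(ev))
print(accepted)
```

Using a set intersection means that several experiments of the same type still count as a single line of evidence.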
January 15, 2008. This release corresponds to Release 11.6 of EcoCyc. Data on transcriptional regulation curated in our lab is the same in both databases.
The current release highlights the modifications and improvements that are transforming RegulonDB into a more comprehensive model of gene expression regulation.
In order to expand our knowledge of the regulatory universe of E. coli beyond literature searches, we started a genome-wide project to experimentally map as many promoters as possible in this organism. For this purpose, we made use of a modified 5'RACE protocol with gene-specific oligonucleotides. A total of 317 TSSs for 269 TUs (38 have more than one TSS) have been mapped with the 5'RACE methodology, 110 of which correspond to TUs with hypothetical genes for which no function has been inferred. The newly mapped TSSs have been included in RegulonDB. A detailed compendium of these findings will be published elsewhere. In addition to the already existing data for σ70 promoters, we have generated computational promoter predictions for four other sigma factors of the σ70 family: σ24, σ28, σ32, and σ38. Promoter predictions have also been generated for the σ54 factor, which defines a sigma factor family different from that of σ70. The putative +1 of transcription initiation, along with the -35 and -10 boxes, can be downloaded from RegulonDB.
In order to provide a more comprehensive annotation of gene expression regulation, RegulonDB now models not only transcriptional regulation data but also other kinds of regulatory elements, such as small RNAs. This inclusion consists of a graphic representation and textual information about their sequences, location, evidence, and references, which are shown on the Operon page.
RegulonDB literature can now be searched using the Textpresso text mining engine, customized for E. coli. Textpresso allows direct exploration of curated literature, both at the level of highly-specific keywords and with entire categories or ontology classes (derived from GO concepts or customized word lists). The user can, for example, search for papers that feature a type of regulation in which a gene or operon and a specific TF are mentioned in the same sentence. Currently, the tool can search through 2472 full-text papers, 3125 paper abstracts, and over 4200 curator notes. The addition of this text mining tool to RegulonDB will expand the possibilities for the end user to traverse the knowledge space of E. coli metabolism and gene regulation, and it will allow our curators to refine and confirm their annotations.
To facilitate the implementation of more regulatory objects, an additional classification of transcription factors has also been included. TFs have been labeled as "global" or "local" regulators based on the number of genes they directly regulate, the number of co-regulators they work with, the number of TFs they regulate, the diversity of types of promoters they regulate, and the number of different functional classes of their regulated genes. From the set of 160 TFs currently annotated in RegulonDB with experimental data, seven TFs are identified as global regulators: CRP, IHF, FNR, FIS, ArcA, Lrp, and HNS, while the rest are termed local regulators. In addition, TFs are classified in RegulonDB according to their "Sensing class" as: internal, external, hybrid or unknown, depending on the origin of their effectors. All genes that code for known and predicted TFs have been annotated with their corresponding Gene Ontology class, and we uploaded those for the rest from EcoCyc.
Furthermore, the objects presented in RegulonDB (promoters, sigma factors, TUs, and regulatory interactions) now feature tables and graphs for different relationships among the database objects.
Evidence associated with RegulonDB objects was classified as strong or weak according to the reliability of the experiments that support the object properties and the relationships between them. The more reliable evidence was called "strong" and the less reliable, "weak". Evidence strength can be distinguished graphically by a solid (strong) or dashed (weak) line. If the same object has evidence of both types, it is displayed according to the strongest one.
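The display rule can be summarized in a few lines. The following is an illustrative sketch only, not RegulonDB's drawing code; the dictionary and function names are assumptions.

```python
# Higher rank means more reliable evidence.
STRENGTH_RANK = {"weak": 0, "strong": 1}
# Line style used to draw each evidence strength, as described in the text.
LINE_STYLE = {"strong": "solid", "weak": "dashed"}


def display_style(evidence_strengths):
    """Pick the line style for an object from all of its evidence strengths."""
    if not evidence_strengths:
        raise ValueError("object has no evidence to display")
    strongest = max(evidence_strengths, key=lambda s: STRENGTH_RANK[s])
    return LINE_STYLE[strongest]


print(display_style(["weak", "strong"]))  # object with both types of evidence
print(display_style(["weak"]))            # object with weak evidence only
```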
Finally, upgrading the RegulonDB web application server from version 8 to 9.0 provides a major improvement in performance and stability, which allows a faster response to the user. In addition, RegulonDB users can now download data and schema in dump files for the most popular database management systems, such as MySQL, Postgres, Oracle, and Apache Derby. The data are also available in XML and flat file formats.
September 17, 2007. This release corresponds to Release 11.5 of EcoCyc. Data on transcriptional regulation curated in our lab is the same in both databases.
We have corrected and relocated the binding sites of the ArcA response regulator. ArcA is considered to be a global regulator and is involved in respiratory metabolism and controlling the expression of about 60 transcription units. The ArcA binding sites were previously represented as long regions of 60 bp, which were determined by footprinting of several promoter sequences. Computational analysis of these sequences showed a shorter 15 bp site, GTTAnnnnnnnGTTA, consisting of two direct repeats of 4 bp separated by 7 bp. Experimental support of this consensus sequence was obtained from footprinting, site-directed mutagenesis experiments and profiling ArcA-P dependent gene expression.
June 1, 2007. This release corresponds to Release 11.0 of EcoCyc. Data on transcriptional regulation curated in our lab is the same in both databases.
We finished curation of information about transcriptional regulation of genes involved in the transport and metabolism of different nitrogen sources (including the preferred source, ammonia). This curation included the annotation of sigma54 promoters and of nine transcription factors involved in nitrogen metabolism, together with their regulons: NtrC, FlhA, NorR, PspF, PspR, HyfR, NacC, ZraR and RtcR.
• The annotation of transcriptional promoters regulated by the sigma factors sigma19 (FecI), sigma28 (FliA), and sigma54 (RpoN) has been reviewed and updated. These factors are required for the transcription of specific sets of genes involved in the iron stress response, the flagellar system, and in nitrogen metabolism, respectively. Where experimental data was available, appropriate literature citations and notes were added.
• The TyrR, TrpR and Lrp regulons have been updated. These regulons are related to processes such as transport, biosynthesis and catabolism of the aromatic amino acids tyrosine, phenylalanine, and tryptophan, and also of serine, glycine, glutamate, leucine, isoleucine, valine and threonine. The Lrp regulon is also important for the assimilation of ammonia under nitrogen-poor conditions.
• All transcription factors have now been assigned Gene Ontology and MultiFun terms.
January 15, 2007. This release corresponds to Release 10.6 of EcoCyc. Data on transcriptional regulation curated in our lab is the same in both databases.
Release 5.6 of RegulonDB includes an update of the genomic sequence of E. coli K-12. GenBank entry U00096.2 now replaces the original U00096.1 deposited by the Blattner laboratory in 1997. The major new feature of this release is the addition of computationally predicted riboswitches and attenuators to RegulonDB, which are now properly curated, with their associated evidence, and displayed in our graphics.
This latest release includes several improvements to the web site and the underlying database:
- A new search mode that supports common names, sentences and even incomplete words. We can now search for terms such as “Lac Z” or “protein source”.
- We have enhanced the graphic display of objects, adding tooltips to genes, promoters, DNA binding sites, terminators, attenuators and riboswitches. For instance, binding site tooltips show their central position when moused-over.
- The names of objects are now visible within diagrams, simplifying their identification by the user.
- We have implemented supervised automatic consistency checks that improve our data integrity.
As always, the RegulonDB team (firstname.lastname@example.org) is continuously curating relevant literature to keep the database up to date.
- We have been expanding notes for 30 regulatory proteins and now include short notes about the evolutionary family to which they belong, their domain composition and the cellular processes in which the regulated genes are involved. When available, an indication of the active conformation of a complex (dimer, tetramer...) is given. Relevant physiological data about the effectors of transcription factors is also covered, with the aim of helping the understanding of regulation physiology. These summaries also have descriptive information about binding site features (size, consensus sequence, relative position to the transcription start, spatial arrangement of the site sequences). Appropriate literature citations were added for these 30 regulatory proteins: AraC, AscG, BglJ, BetI, BolA, CdaR, CueR, DicA, FabR, FeaR, GadX, GcvA, HcaR, HdfR, HipB, IdnR, MalI, Nac, NanR, PdhR, PerR, PhnF, PrpR, SlyA, TreR, UidR, YdeO, YeiL, ZntR and Zur.
- The evidence codes attached to 132 transcription factors have been updated, including experimental and computational evidence. Appropriate literature citations were added for 67 of these regulators.
October 30, 2006. This release corresponds to Release 10.5 of EcoCyc. Data on transcriptional regulation curated in our lab is the same in both databases.
• We finished curating information on the transcriptional regulation of genes involved in both flagellar and chemotaxis systems; eight new promoters and five new transcription units as well as 21 new DNA-binding sites for the transcriptional regulator FlhDC were added.
• We have completed the curation of 364 transcription units based on single-gene directons. A directon is one or a set of genes transcribed in the same direction, organized into one or several transcription units and operons (Salgado et al. (2000)). In other words, these 364 genes are surrounded by genes that are transcribed in a different direction, and therefore they must be transcribed in isolation.
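The directon definition above can be sketched as a grouping of consecutive genes by strand. This is a simplified illustration with made-up gene names; it treats the gene list as linear, whereas the E. coli chromosome is circular, so the first and last runs would need to be merged when they share a strand.

```python
from itertools import groupby


def directons(genes):
    """Group (name, strand) pairs, given in genome order, into directons.

    A directon is a maximal run of adjacent genes on the same strand.
    """
    return [
        [name for name, _strand in run]
        for _key, run in groupby(genes, key=lambda gene: gene[1])
    ]


def single_gene_directons(genes):
    """Genes whose neighbors on both sides are on the other strand."""
    return [run[0] for run in directons(genes) if len(run) == 1]


# Hypothetical gene order, invented for this example only.
example_genes = [("a", "+"), ("b", "+"), ("c", "-"), ("d", "+"), ("e", "-"), ("f", "-")]
print(directons(example_genes))
print(single_gene_directons(example_genes))
```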
May 12, 2006. This release corresponds to Release 10.0 of EcoCyc. Data on transcriptional regulation curated in our lab is the same in both databases.
EcoCyc and RegulonDB have recently been updated with additional regulatory information and represent the largest comprehensive and constantly curated regulatory network of E. coli K-12. A report on our progress has been published in Salgado et al. (2006).
Regulation of degradation pathways: We expanded a project to curate within EcoCyc information about transcriptional regulation of gene expression for genes involved in the degradation of carbon sources, including the catabolism of sugars, polysaccharides and sugar derivatives. Pathways whose gene regulation has been curated are:
METABOLISM OF SUGAR DERIVATIVES: SUGAR CARBOXYLATES
Conversion of succinate to propionate
CATABOLISM OF SUGAR DERIVATIVES: SUGAR ALCOHOLS
Glycerol degradation I
Glycerol degradation II
Superpathway of glycol metabolism and degradation
CATABOLISM OF AROMATIC COMPOUNDS
3-phenylpropionate and 3-(3-hydroxyphenyl)propionate degradation
Regulation of expression of enzymes involved in the degradation or utilization of melibiose, maltose, fructose, chitobiose, N-acetylgalactosamine, and beta-glucosides was curated.
Regulation of the following additional pathways was curated:
Superpathway of gluconate degradation
March 16, 2006. This release corresponds to Release 9.6 of EcoCyc. Data on transcriptional regulation curated in our lab is the same in both databases.
o We have curated within EcoCyc gene regulatory interactions identified in datasets from Ma et al. (2004) and Shen-Orr et al. (2002) that were not present in EcoCyc.
o Regulation of respiration pathways: We completed a project to curate within EcoCyc information about transcriptional regulation of gene expression for genes involved in respiration pathways in E. coli, which include aerobic and anaerobic phases, as well as those for electron transfer, electron donors and electron acceptors. In particular, we modified all NarL binding sites, assigning new central positions based on a consensus sequence of the site that is now defined by 7 nucleotides. Pathways whose gene regulation has been curated are:
Aerobic electron transfer
Aerobic respiration (electron donors reaction list)
Electron transfer (anaerobic)
Respiration (anaerobic)-electron acceptors reaction list
Respiration (anaerobic)-electron donors reaction list
Regulation of degradation pathways: We completed a project to curate within EcoCyc information about transcriptional regulation of gene expression for genes involved in the degradation of carbon sources, including the catabolism of sugars, polysaccharides and sugar derivatives. Regulation of operons encoding enzymes of glycolysis, the pentose phosphate pathway, the TCA cycle, and the Entner-Doudoroff pathway was also curated in this phase. Pathways whose gene regulation has been curated are:
CATABOLISM OF SUGAR AND POLYSACCHARIDES
Lactose degradation III
Galactose degradation I
Glucose and glucose-1-phosphate degradation
Trehalose biosynthesis and degradation-low osmolarity
CATABOLISM OF SUGAR DERIVATIVES: SUGAR ACIDS
CATABOLISM OF SUGAR DERIVATIVES: SUGAR ALCOHOLS
Superpathway of hexitol degradation
CATABOLISM OF SUGAR DERIVATIVES: AMINO SUGARS
N-acetylneuraminic acid dissimilation
Regulation of the following additional pathways was curated:
Non-oxidative branch of the pentose phosphate pathway
Oxidative branch of the pentose phosphate pathway
Pyruvate oxidation pathway
Entner-Doudoroff pathway I
Our general curation update project is progressing. Of the 4471 polypeptides within EcoCyc, 3758 now have comments or citations or are components of a complex that has a comment or citations. The database now contains 12026 citations.
Please send any comments by e-mail to: email@example.com