About ProNIT

Rapid progress in genome analysis has made the complete genome sequences of many organisms available, which enables us to study biological function at the genome scale. Proteins play critical roles in many biological processes. To elucidate the molecular function of proteins, we need knowledge of their amino acid sequences and structures. However, sequence and structural information alone is not enough to infer protein function, because a protein is a microscopic entity whose behavior obeys the laws of thermodynamics. Thus, thermodynamic knowledge of proteins is as fundamental as sequence and structural information. Although databases for sequence and structure are well established, databases of thermodynamic quantities for proteins and their interactions are scarce.

Thermodynamic data for protein-nucleic acid interactions are very useful for understanding the principles of molecular recognition. Although a large number of interacting systems have been structurally characterized, the mechanism of specific molecular recognition is still poorly understood. Integrating structural and thermodynamic knowledge of molecular recognition would help us to delineate the molecular mechanisms underlying the affinity and specificity of interactions. Gene regulation is achieved by a complex network of many transcription factors, cofactors and target genes. Some databases describe molecular interactions as binary relations (present or absent). However, whether an interaction actually occurs depends critically on thermodynamic quantities, namely the binding constant, the protein concentration, and environmental conditions in the cell, including temperature, pH and ion concentrations. Hence, thermodynamic data on interacting systems are sorely needed to understand quantitatively the processes involved in gene regulation. The resulting knowledge will also lead to a wide spectrum of applications, such as the design of novel nucleic acid-binding proteins, methods for predicting target sites, and the quantitative simulation of gene regulatory networks. To this end, we began collecting thermodynamic data on protein-nucleic acid interactions from published articles, constructed an online database, ProNIT: Thermodynamic Database for Protein-Nucleic Acid Interactions, and opened it to the public through the Internet in 2000.
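
To illustrate why a binary "interacts / does not interact" description is not enough, the following minimal sketch computes the fraction of binding sites occupied under the simplest possible model: 1:1 binding at equilibrium, with the free protein concentration taken as given. The function name and the numerical values are hypothetical and chosen only for illustration; they are not taken from ProNIT.

```python
def fraction_bound(protein_conc_molar: float, kd_molar: float) -> float:
    """Fraction of nucleic acid sites occupied for simple 1:1 binding.

    Assumes equilibrium and that protein_conc_molar is the FREE protein
    concentration, so that theta = [P] / ([P] + Kd).
    """
    if protein_conc_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return protein_conc_molar / (protein_conc_molar + kd_molar)


if __name__ == "__main__":
    # Hypothetical example: the same protein concentration (10 nM)
    # against a tight site (Kd = 1 nM) and a weak site (Kd = 1 uM).
    protein = 1e-8
    for kd in (1e-9, 1e-6):
        print(f"Kd = {kd:.0e} M -> fraction bound = {fraction_bound(protein, kd):.3f}")
```

Under these assumptions the same pair of molecules is mostly bound in one case and almost entirely unbound in the other, which is the quantitative information a binary interaction record cannot carry.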

Content of ProNIT
ProNIT currently contains more than 4900 data entries. Each entry of the database contains the following information:

Protein information: name, source, fragment and sequence of the protein; EC, PIR and PDB codes; monomeric or oligomeric state; ProTherm number; and details of mutations, including the mutant residue, its position, and the secondary structure and accessibility at the mutation site.

Nucleic acid information: name, source and sequence of the nucleic acid; information on mutations and the sequence of the mutant nucleic acid; GenBank accession number and NDB code.

Complex information: PDB and NDB codes, links to protein-nucleic acid complex structures, and a description of the conformational changes of the protein and nucleic acid upon binding.

Experimental information: temperature, pH, buffers, ions, additives and experimental method.

Binding thermodynamic data: dissociation constant (Kd), ΔG, ΔH and ΔCp for wild-type and mutant entities, stoichiometry of binding, and activity (Km and kcat).

Literature information: journal name, authors, publication year, keywords and remarks.
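
As a rough illustration of how a few of the fields listed above relate to one another, the sketch below models a heavily simplified entry and derives the standard binding free energy from the dissociation constant using ΔG = RT ln(Kd / 1 M). The class and field names are hypothetical and do not reflect the actual ProNIT schema; the sign convention (negative ΔG for favorable binding) and the 1 M standard state are assumptions of this example.

```python
import math
from dataclasses import dataclass
from typing import Optional

GAS_CONSTANT_KJ = 8.314462618e-3  # kJ/(mol*K)


@dataclass
class BindingEntry:
    """Hypothetical, simplified record; NOT the ProNIT schema."""
    protein_name: str
    nucleic_acid_name: str
    temperature_kelvin: float
    ph: Optional[float] = None
    kd_molar: Optional[float] = None  # dissociation constant, mol/L
    method: Optional[str] = None

    def binding_free_energy_kj(self) -> Optional[float]:
        """Standard free energy of binding in kJ/mol (1 M standard state).

        Uses dG = R * T * ln(Kd), which is negative when Kd < 1 M.
        Returns None when no Kd was recorded for the entry.
        """
        if self.kd_molar is None:
            return None
        if self.kd_molar <= 0:
            raise ValueError("Kd must be positive")
        return GAS_CONSTANT_KJ * self.temperature_kelvin * math.log(self.kd_molar)


if __name__ == "__main__":
    # Made-up values for illustration only.
    entry = BindingEntry(
        protein_name="example repressor",
        nucleic_acid_name="example operator DNA",
        temperature_kelvin=298.15,
        ph=7.0,
        kd_molar=1e-9,
    )
    # A nanomolar Kd at 25 degrees C corresponds to roughly -51 kJ/mol.
    print(f"dG = {entry.binding_free_energy_kj():.1f} kJ/mol")
```

Keeping Kd together with the temperature in one record matters because the conversion to ΔG is only meaningful at the temperature of the measurement.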

ProNIT is implemented in 3DinSight, an integrated relational database for the structure, function and properties of biomolecules. 3DinSight has been designed to integrate PDB structures, their non-redundant subsets, structures of protein-nucleic acid complexes, PROSITE motifs, ligand information, mutations, disease-related mutations, amino acid sequences and thermodynamic data into a relational database (SYBASE). These data are connected in relational tables, and their correspondence can be efficiently searched by flexible queries. The visualization tools depict the relations in 3D space and as graph plots; for example, motif sites, mutation sites and ligand binding sites are automatically mapped onto the structures and can be viewed with 3D viewers such as RasMol and VRML viewers. The relevant objects in RasMol and VRML images are hyperlinked to the corresponding document data, so that the documents can be easily viewed by clicking on them.

The entries in ProNIT are linked to the Protein-Nucleic Acid Complex Database built within 3DinSight, where complex structures are classified according to the recognition motif and other characteristics, and where one can examine the complex structures and the sequence-dependent conformation and flexibility of the DNA molecules in each complex. The ProNIT data are also linked to the Base-Amino Acid Interaction Database available via 3DinSight, in which specific base-amino acid interactions can be analyzed in detail: users who want to compare the specific base-amino acid interactions in a complex with the binding thermodynamic data can search for the pairs by specifying atom, residue and distance criteria. The matching base-amino acid pairs are automatically highlighted in the complex and visualized by the 3D viewers.

3DinSight has several form-based WWW interfaces with search, display and sorting options that allow users to retrieve relevant information according to their purpose and convenience.
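
The kind of search by atom, residue and distance criteria described above can be sketched as a simple distance filter over atomic coordinates. This is only an illustration of the idea, not the 3DinSight implementation: the `Atom` type, the `find_pairs` function and the toy coordinates are all hypothetical.

```python
import math
from typing import List, NamedTuple, Tuple


class Atom(NamedTuple):
    residue: str                     # e.g. "ARG" for a protein, "G" for a base
    name: str                        # atom name, e.g. "NH1" or "O6"
    xyz: Tuple[float, float, float]  # coordinates in angstroms


def find_pairs(
    protein_atoms: List[Atom],
    base_atoms: List[Atom],
    residue: str,
    base: str,
    cutoff: float,
) -> List[Tuple[Atom, Atom, float]]:
    """Return (protein atom, base atom, distance) for every pair that
    matches the requested residue and base and lies within the cutoff."""
    pairs = []
    for p in protein_atoms:
        if p.residue != residue:
            continue
        for b in base_atoms:
            if b.residue != base:
                continue
            distance = math.dist(p.xyz, b.xyz)
            if distance <= cutoff:
                pairs.append((p, b, distance))
    return pairs


if __name__ == "__main__":
    # Toy coordinates, invented for illustration only.
    protein = [
        Atom("ARG", "NH1", (0.0, 0.0, 0.0)),
        Atom("ARG", "NH2", (10.0, 0.0, 0.0)),
        Atom("LYS", "NZ", (0.0, 0.0, 1.0)),
    ]
    bases = [
        Atom("G", "O6", (0.0, 0.0, 2.9)),
        Atom("G", "N7", (0.0, 3.0, 0.0)),
        Atom("A", "N7", (0.0, 0.0, 2.0)),
    ]
    for p, b, d in find_pairs(protein, bases, "ARG", "G", cutoff=3.5):
        print(f"{p.residue}:{p.name} - {b.residue}:{b.name}  {d:.2f} A")
```

A real search would read coordinates from the complex structure and would likely use a spatial index rather than the all-against-all loop shown here; the nested loop is kept for clarity.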

We intend to continue collecting data and improving the database. Collecting data from the literature is very laborious. In the future, therefore, we would like to implement a system through which we can collect experimental data directly from researchers. Please send your opinions and suggestions to pronit@rtcmain.bio.kyutech.ac.jp; they will help us to improve the database.