Help for using Qgrid: |
To get a Qgrid cluster tree diagram, the following steps are required:
1. File upload: If you wish to get a Qgrid plot for your own coordinate file, you may upload it here. Please note that the uploaded file must be smaller than 1 MB and must strictly follow the PDB format.
Alternatively you can go to step 2:
2. PDB code: Enter the four-character Protein Data Bank code of the protein you want to analyze. Do not add .pdb at the end, and do not use pdbxxx.ent type names. Your PDB code should simply be the four-character code, e.g. 2lzm.
3. Chain name: Choose the chain of the Protein Data Bank file that needs to be clustered. Note that the chain name is case sensitive and must be entered in the same case as it appears in the corresponding PDB file.
4. Choice of Clustering type. You may get a hydrophobic cluster tree or a charged cluster tree. Please use the pull-down menu for making your choice about the clustering type here.
5. Which charges to cluster: If you are trying to find charged clusters, you can choose here whether you are looking for positively charged clusters or negatively charged ones. This option has no effect for hydrophobic clusters.
6. Charge cutoff: When charged grids are built, some of them need to be removed from the clustering process because they carry an insignificant amount of charge. You can choose a cutoff, and all grids with less than this charge will be eliminated from clustering. Please note that the hydrophobic clusters are based on the assumption that CA atoms of hydrophobic residues have a net pseudo-charge of 1.0 and all other atoms have no (pseudo-)charge. Note that currently no charge is assigned to DNA, hetero atoms (HETATM), ligands and non-standard residues.
7. Grid size: Here you can change the size of the grid so that each grid accommodates more atoms or charge before clustering starts. Choosing a larger grid size will decrease the number of grids and (perhaps) increase the amount of charge on each grid. This option should be used in conjunction with the charge cutoff to get a suitable plot of your charged or hydrophobic clusters. Please note that the server has been restricted to plot no more than 300 nodes in a cluster tree diagram. Grid size and charge cutoff should be selected accordingly. We find that the proteins we tested can be successfully plotted within the allowed ranges. However, if you notice any protein which cannot be plotted at all within the ranges of allowed values, please inform us by clicking here. For NMR and modeled structures, only the first model will be plotted.
8. Tree joining method: Two methods can be used for clustering. In the single linkage method, the distance between two clusters is the distance between the nearest pair of points, taking one point from each cluster. In the group average (or average linkage) method, the geometric center of each cluster is calculated first, and the distance between these two centers is taken as the distance between the two clusters. The two methods differ only in their definition of distance. Once the distance method is selected, a hierarchical method of tree construction is used, which means that the nearest neighbors are joined first and the nearest groups are then joined successively until all nodes have been clustered. The single linkage method is helpful when two clusters may be associated with each other through an association between any pair of their members. A transmembrane helical segment, for example, could perhaps be clustered this way. However, when an overall distribution of charges or hydrophobicity is of interest, the group average clustering method should be used.
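The interplay between the grid size and charge cutoff options described above can be illustrated with a small Python sketch. This is not the server's code: the function names bin_charges and apply_cutoff, the cubic grid anchored at the coordinate origin, and the use of the absolute charge when applying the cutoff are all assumptions made for illustration only.

```python
import math
from collections import defaultdict


def bin_charges(atoms, grid_size):
    """Sum the charge falling into each cubic grid cell.

    atoms: iterable of (x, y, z, charge) tuples, coordinates in Angstrom.
    grid_size: edge length of a grid cell in Angstrom.
    Assumption: cells are cubes aligned to the coordinate origin.
    """
    grids = defaultdict(float)
    for x, y, z, charge in atoms:
        index = (
            math.floor(x / grid_size),
            math.floor(y / grid_size),
            math.floor(z / grid_size),
        )
        grids[index] += charge
    return dict(grids)


def apply_cutoff(grids, cutoff):
    """Drop grids whose charge magnitude is below the cutoff.

    Assumption: the cutoff is compared against the absolute charge.
    """
    return {idx: q for idx, q in grids.items() if abs(q) >= cutoff}


# Hypothetical input: three CA atoms with pseudo-charge 1.0 and one
# uncharged atom, mimicking the hydrophobic clustering convention.
atoms = [
    (1.0, 1.0, 1.0, 1.0),
    (2.0, 3.0, 1.5, 1.0),
    (7.5, 1.0, 1.0, 1.0),
    (12.0, 12.0, 12.0, 0.0),
]

small = bin_charges(atoms, 5.0)    # three occupied grids
large = bin_charges(atoms, 10.0)   # two occupied grids, more charge per grid
kept = apply_cutoff(small, 1.5)    # only the grid holding two CA atoms remains
```

In this toy input, doubling the grid size merges the three charged atoms into one grid, and raising the cutoff removes weakly charged grids. Both options therefore reduce the number of nodes that reach the tree.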
A note on interpreting the Qgrid output results: For every query, the server first returns some basic information about the protein structure. This includes the dipole moment and the charge, their per-residue values, and the total number of atoms read from the PDB file.
The tree diagram can then be downloaded from a link provided at the bottom of this page. Three formats are currently provided: PostScript, Acrobat PDF and GIF image. Please note that, due to pagination problems, the PDF format may not work properly if your tree is larger than one A4 page. In that case, the PostScript plot (or GIF) should be used.
Every node in the tree diagram represents a grid in the protein. Lines joining two adjacent grids show the distance between the two grids, and lines joining two clusters represent the distance between the midpoints of the two clusters. This method of tree joining is called "Average Linkage". In other words, the location of a cluster corresponds to the simple arithmetic average of the coordinates of all the nodes making up that cluster. To get an idea of the shape of a cluster, one can look at the residue numbers in the grid labels. If the residue numbers are close to each other, this suggests that several residues that are near each other in the sequence also come close together to form the cluster. In a single linkage cluster tree, this may indicate helical or strand regions coming from one segment of the protein sequence.
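The two distance definitions and the hierarchical joining can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the server's implementation: the function names are hypothetical, nodes are treated as unweighted points, and ties between equally distant pairs are broken by list order.

```python
import math
from itertools import combinations


def single_linkage(a, b):
    """Distance between the nearest pair of points, one from each cluster."""
    return min(math.dist(p, q) for p in a for q in b)


def centroid(points):
    """Simple arithmetic average of the coordinates of all points."""
    n = len(points)
    return tuple(sum(axis) / n for axis in zip(*points))


def center_distance(a, b):
    """Distance between the geometric centers of two clusters."""
    return math.dist(centroid(a), centroid(b))


def build_tree(points, cluster_distance):
    """Join the two nearest clusters repeatedly until one cluster is left.

    Returns the list of merges as (cluster_a, cluster_b, distance).
    """
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        d = cluster_distance(clusters[i], clusters[j])
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical grid positions along one axis, in Angstrom.
nodes = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
single_tree = build_tree(nodes, single_linkage)
center_tree = build_tree(nodes, center_distance)
```

Both trees join the first two nodes at a distance of 1.0. The third node is then attached at 4.0 under single linkage (nearest member) and at 4.5 when measured from the cluster center, which shows how the choice of method changes the branch lengths but not the joining procedure.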
How to trace a node on a protein structure?
A link to the complete map of atoms in each grid is provided at the bottom of the query result. If necessary, one can view that file. However, in most cases examining the node labels may be enough. Each node has been labeled by numbers such as
The first three numbers are numerical indices of the grid and do not need to be located on the protein. However, the complete atomic contents of the grid can be looked up using these three numbers, which form the first three columns of the grid map file mentioned above. Next comes the residue name, followed by the residue number. These two indices are the most convenient way of locating the corresponding grid on the protein. Next is the name of one of the atoms that fall within that grid. Note that only one atom is written here, but there may actually be more atoms in that grid (the complete map is in the grid map file). This is followed by the index of this atom, e.g. in the above notation CZ is the 8th atom of ARG118 in the given PDB file. The last index shows the total amount of charge on that grid. In the case of hydrophobic clusters, this is simply the number of CA atoms of the hydrophobic residues within that grid.
Distances between the clusters can be estimated from the scale provided at the bottom of the tree diagram. All distances are in Angstrom units.