Selection of random sequences in ReadOut:
Energy Z-scores in the ReadOut web server are computed against a reference space of randomly drawn DNA sequences. These random sequences serve as a reference data set against which the specificity of the given (target) sequence in the protein-DNA complex is measured: the mean and standard deviation of their energies are used to compute the final Z-score. For a target DNA of length N there are 4^N possible DNA strands, and this number grows very quickly (e.g. for a 5-nucleotide sequence there are 1,024 possible sequences; for a 6-nucleotide sequence this number is 4,096, and so on). It is not possible to compute the energies of all combinations. We therefore try to construct a random, representative subset of this sequence space. For a rapid calculation, the number of sequences should be small. In most cases, increasing this number beyond a few thousand does not make much difference, because the random sample has already converged, i.e. it is large enough to be representative.
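The idea can be illustrated with a minimal Python sketch. It is not ReadOut's actual code: the function names, the default sample size of 2000, the fixed seed and, above all, toy_energy are hypothetical stand-ins chosen for illustration. The Z-score is assumed to take the standard form (target energy minus the mean of the random energies, divided by their standard deviation), and the population standard deviation is used here as a further assumption.

```python
import random
import statistics

BASES = "ACGT"


def sequence_space_size(length):
    """Number of distinct DNA strands of the given length (4^N)."""
    return 4 ** length


def sample_random_sequences(length, count, seed=0):
    """Draw `count` sequences uniformly at random, with replacement."""
    rng = random.Random(seed)
    return ["".join(rng.choice(BASES) for _ in range(length))
            for _ in range(count)]


def toy_energy(seq):
    """Hypothetical energy function, NOT ReadOut's potential.

    Each base contributes a made-up weight scaled by its position.
    """
    weights = {"A": -1.0, "C": 0.5, "G": 0.25, "T": -0.5}
    return sum(weights[base] * (i + 1) for i, base in enumerate(seq))


def z_score(target, energy_fn, count=2000, seed=0):
    """Z-score of the target energy against a random reference sample."""
    reference = sample_random_sequences(len(target), count, seed)
    energies = [energy_fn(s) for s in reference]
    mean = statistics.fmean(energies)
    std = statistics.pstdev(energies)  # assumption: population std. dev.
    return (energy_fn(target) - mean) / std


if __name__ == "__main__":
    print(sequence_space_size(5), sequence_space_size(6))
    # Compare Z-scores from a small and a larger random sample.
    for count in (500, 2000, 8000):
        print(count, z_score("AAAAA", toy_energy, count=count))
```

Running the script with increasing values of count is one way to check, for a given energy function, whether the reference sample is already large enough for the Z-score to stabilise.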