Selection of random sequences in ReadOut:
Energy Z-scores in the ReadOut web server assume a universal space of
DNA sequences drawn randomly. These random sequences serve as
a reference data set against which the given (target) sequence
(in the protein-DNA complex) has a certain specificity. The mean
energy and standard deviation in these random sequences are used
to compute the final Z-score. For a target DNA of length N, there
are 4^N possible DNA strands, and this number grows enormously
with N (e.g. for a 5-nucleotide sequence there are 1,024 possible
sequences; for a 6-nucleotide sequence this number is 4,096, and so on).
It is not feasible to compute the energies of all combinations. We thus
try to construct a random, representative sample of these
sequences. For a rapid calculation, the number of sequences should be small.
In most cases, increasing this number beyond a few thousand does not make
much difference, as the random space has already converged, i.e. it is
large enough to be representative.
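
The procedure described above can be sketched in a few lines of Python.
This is only an illustration and not the ReadOut implementation: the
function names, the toy_energy stand-in for the real energy calculation,
the sample size of 2000 and the sign convention of the Z-score
(target energy minus mean, divided by standard deviation) are all
assumptions made for the example.

```python
import random
import statistics

BASES = "ACGT"


def random_sequences(length, count, seed=0):
    """Draw `count` uniformly random DNA sequences of the given length."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return ["".join(rng.choice(BASES) for _ in range(length))
            for _ in range(count)]


def z_score(target_energy, reference_energies):
    """Z-score of the target energy against the reference sample."""
    mean = statistics.fmean(reference_energies)
    sd = statistics.stdev(reference_energies)
    return (target_energy - mean) / sd


def toy_energy(seq):
    """Hypothetical stand-in for the real protein-DNA energy function."""
    weights = {"A": -1.0, "C": 0.5, "G": 0.0, "T": 1.0}
    return sum(weights[base] for base in seq)


target = "AAAAAA"
space_size = 4 ** len(target)  # size of the full sequence space, 4^N

# Sample a few thousand random sequences instead of enumerating 4^N.
reference = random_sequences(len(target), 2000)
reference_energies = [toy_energy(s) for s in reference]

z = z_score(toy_energy(target), reference_energies)
print(f"space size = {space_size}, Z-score = {z:.2f}")
```

Re-running the sketch with a larger sample size and comparing the
resulting mean and standard deviation is one simple way to check
whether the random sample has converged.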