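DNA oligomers: All possible DNA words from 3 to 6 nucleotides long can be used to discover putative new functional motifs. The number (N) of possible DNA words depends on the size and strand selected. With strand=both, each word is grouped with its reverse complement, so only palindromes (words identical to their own reverse complement) are counted on their own.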


SIZE   N (STRAND FORWARD)   N PALINDROMES   N NON-PALINDROMES   N (STRAND BOTH)
3      64                   0               32                  32
4      256                  16              120                 136
5      1024                 0               512                 512
6      4096                 64              2016                2080
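
The counts in the table can be reproduced by enumerating every word of a given size and grouping it with its reverse complement. The following is a minimal sketch under that assumption, not the tool's own code; the names `reverse_complement` and `count_words` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Translation table mapping each nucleotide to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word):
    """Return the reverse complement of a DNA word (e.g. ATAG -> CTAT)."""
    return word.translate(COMPLEMENT)[::-1]


def count_words(size):
    """Count DNA words of the given size for strand=forward and strand=both."""
    words = ["".join(p) for p in product("ACGT", repeat=size)]
    forward = len(words)  # 4 ** size
    # A palindrome is a word equal to its own reverse complement.
    palindromes = sum(1 for w in words if w == reverse_complement(w))
    # Every other word pairs with a distinct reverse complement,
    # so each pair counts once when both strands are considered.
    non_palindromes = (forward - palindromes) // 2
    return {
        "forward": forward,
        "palindromes": palindromes,
        "non_palindromes": non_palindromes,
        "both": palindromes + non_palindromes,
    }


for size in range(3, 7):
    print(size, count_words(size))
```

Odd sizes have no palindromes because the middle nucleotide would have to be its own complement, which is why N (strand both) is exactly half of N (strand forward) for sizes 3 and 5.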


For example, selecting size=4 and strand=both on a dataset of plant promoters extracted from EPD, 3 of the 136 possible oligomers were significant with p-values < 1e-20. TATA, ATAG and ATAA (or ATAT, CTAT and TTAT) show positional overrepresentation between -40 and -20 relative to the transcription start site. These 3 tetramers correspond to the core binding motif of the TATA-binding protein (TBP).