RNA Parse Pseudoknots
An RNA pseudoknot is a secondary structure containing at least two stem-loop structures in which each interacts or intercalates with the other. Pseudoknots do not lend themselves well to computational detection because their base pairs cross or “overlap”, which makes the structure context-sensitive. As a result, recursive energy-scoring techniques cannot practically predict pseudoknots in sequences longer than a few hundred nucleotides. Popular folding programs such as Mfold therefore do not predict them, and sequence search engines such as BLAST cannot detect them.
The methods used here essentially ignore RNA molecular energy and instead “look” for, or match, a specific configuration, which can then be verified by slower and more computationally expensive methods. This grammatical method bypasses the NP-hard problem of trying every possible fitting permutation by matching a predefined system of stems and loops in a linear fashion.
A match is achieved by matching, as a unit, one stem of some length with another of some length such that the two are connected by canonical base pairs. Loops are added to the grammar as appropriate.
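The stem test described above can be sketched in a few lines of Python. This is only an illustration, not the project's own grammar code: the function name `pairs_as_stem` and the `allow_wobble` flag are hypothetical, and treating G-U wobble pairs as an optional extension of the canonical Watson-Crick pairs is an assumption.

```python
# Watson-Crick pairs; G-U wobble is optional and off by default (an assumption).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def pairs_as_stem(five_prime, three_prime, allow_wobble=False):
    """Return True if the two strands form a stem of equal, non-zero length.

    The strands are antiparallel: the first base of `five_prime` pairs
    with the last base of `three_prime`.
    """
    if not five_prime or len(five_prime) != len(three_prime):
        return False
    allowed = WATSON_CRICK | WOBBLE if allow_wobble else WATSON_CRICK
    return all((a, b) in allowed
               for a, b in zip(five_prime, reversed(three_prime)))
```

For example, `pairs_as_stem("GGCA", "UGCC")` accepts the stem as one unit, while a single mismatched base rejects it as a whole.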
RNA searchable database: http://rnavlab.utep.edu/portal
Type 1 Languages - RNA Pseudoknots, Complement repeats and other interleaved structures.
RNA Pseudoknot: The first half of stem one, some number of nucleotides, is followed by a loop and the first half of stem two. After another loop comes the second half of stem one, which must pair with the same number of nucleotides, then a further loop and the second half of stem two. The agreement is crossing and thus context-sensitive: stem1-loop-stem2-loop-stem1'-loop-stem2'
A context-sensitive grammar handles this by accepting, as a unit, some number of nucleotides and constructing their complement, then finding a second run of nucleotides and constructing its complement in such a way that the first pair overlaps the second. Loop and bulge predicates are easily added to the grammar as required.
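A minimal Python sketch of such a scan for the stem1-loop-stem2-loop-stem1'-loop-stem2' pattern follows. It is not the grammar implementation itself: the name `find_pseudoknots`, the fixed stem lengths, the loop bounds, and the restriction to Watson-Crick pairs are all assumptions chosen for illustration, and bulges are not modelled.

```python
from itertools import product

# Watson-Crick complements only (an assumption); other symbols never match.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(strand):
    """Build the strand that would pair with `strand` in a stem."""
    return "".join(PAIR.get(base, "?") for base in reversed(strand))


def find_pseudoknots(seq, stem1_len=3, stem2_len=3, min_loop=1, max_loop=4):
    """Return (start, loop1, loop2, loop3) for each crossing match found."""
    hits = []
    loops = range(min_loop, max_loop + 1)
    for i in range(len(seq)):
        stem1 = seq[i:i + stem1_len]
        for l1, l2, l3 in product(loops, repeat=3):
            j = i + stem1_len + l1   # start of stem2
            k = j + stem2_len + l2   # start of stem1'
            m = k + stem1_len + l3   # start of stem2'
            if m + stem2_len > len(seq):
                continue             # pattern would run off the sequence
            stem2 = seq[j:j + stem2_len]
            # stem1' sits between stem2 and stem2', so the pairs cross.
            if (seq[k:k + stem1_len] == reverse_complement(stem1)
                    and seq[m:m + stem2_len] == reverse_complement(stem2)):
                hits.append((i, l1, l2, l3))
    return hits
```

With the stem lengths and loop bounds fixed, the work done at each start position is constant, so the scan is linear in the sequence length. Candidates reported this way would still need to be checked by the slower energy-based methods.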