Terminology used in XRNA
All terminology used in XRNA and in the CMB-RNA XRNA faq and web pages applies to
the creation, annotation and viewing of RNA secondary structure graphs created
by XRNA only. The terminology may or may not have a counterpart in RNA 3D viewing
programs or any other type of RNA software.
A secondary structure graph is the final visual output of XRNA. The output contains 1 or
more rna strands each containing 1 or more nucleotides that may be base paired. The
base pairing may occur across rna strands. The graph may also contain labels either specific
to the nucleotides, rna strand, collection of rna strands or to the graph in general
(see What is an extraneous label?).
A nucleotide is represented by one of the nucleotide letters 'A', 'U', 'G', 'C',
'R', 'Y' or 'N'. These letters are the defaults for XRNA (and what is initially
viewed) and will always be known by the program. To change the viewed nucleotide
letter to something else (like the lowercase equivalent, or some kind of mutation
symbol), see the CMB-RNA XRNA faq for annotation.
A viewed nucleotide also implies a possible nucleotide number. This can be a line
pointing from a nucleotide to a number or just a line. When editing and referring
to a nucleotide, a translation of the nucleotide will also translate its nucleotide
number.
Refers to a single stranded region within a rna strand. A single stranded
region is a contiguous set of nucleotides which contains no base pairs.
For editing purposes, a single stranded region has 2 nucleotides that are
considered the 5' and 3' ends of the single strand. These are called the single
strand delineators. In a rna strand there are naturally occurring single strand
delineators:
- If the first nucleotide of a rna strand is not base paired then
that first nucleotide is a 5' delineator and its corresponding 3' delineator is
the first base paired nucleotide in the strand.
- Likewise, if the last nucleotide in a rna strand is not base paired then it is a
3' delineator and its corresponding 5' delineator is the last base paired nucleotide
in the rna strand.
- Any single stranded region connecting two helices has as the 5' delineator the
last nucleotide in the 5' side of the most 5' helix and the 3' delineator
the first nucleotide of the 3' most helix.
- Any hairpin loop of a helix has the last base pair in the helix as the
delineators with the 5' nucleotide of the base pair the 5' delineator and the
3' nucleotide of the base pair the 3' delineator.
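The delineator rules above can be sketched in a few lines of Python. This is only
an illustration, not XRNA code: the function name and the pair-map layout (a dict
from each paired nucleotide number to its partner) are assumptions, and the sketch
reads the rules as "the paired nucleotides flanking an unpaired run are its
delineators, except at the strand ends". It handles a single strand only and does
not report loops of length 0.

```python
def single_strand_delineators(length: int, pairs: dict[int, int]) -> list[tuple[int, int]]:
    """Return (5' delineator, 3' delineator) for each unpaired run.

    length: number of nucleotides in the strand, numbered 1..length.
    pairs:  hypothetical pair map, nucleotide number -> partner number,
            containing both directions of every base pair.
    """
    result = []
    pos = 1
    while pos <= length:
        if pos in pairs:
            pos += 1
            continue
        first = pos                      # first unpaired nucleotide of the run
        while pos <= length and pos not in pairs:
            pos += 1
        last = pos - 1                   # last unpaired nucleotide of the run
        # At a strand end the unpaired end nucleotide is itself the delineator;
        # otherwise the flanking base paired nucleotide is used.
        five = first if first == 1 else first - 1
        three = last if last == length else last + 1
        result.append((five, three))
    return result


# Toy strand of 12 nucleotides with one helix: 3 pairs with 10, 4 pairs with 9.
toy_pairs = {3: 10, 4: 9, 9: 4, 10: 3}
print(single_strand_delineators(12, toy_pairs))  # [(1, 3), (4, 9), (10, 12)]
```

In the toy strand, (4, 9) is the hairpin loop case: both delineators belong to the
last base pair of the helix.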
When editing a secondary structure graph, delineators can be arbitrarily added
to redefine the endpoints of a single stranded region. These delineators are only
defined for the life of the editing session so care must be used in redefining
arbitrary delineators on subsequent editing sessions. In this fashion single
stranded regions can be easily edited to include arbitrary runs of straight
lines and arcs.
Refers to 2 nucleotides base paired together. Any 2 nucleotide types
can be base paired together. The most 5' nucleotide in a base pair is referred
to as the 5' side of the base pair. The most 3' nucleotide is referred to as the
3' side of the base pair. The default viewing of base pairs is:
- Watson-Crick: straight line
- GU Wobble: small solid dot
- everything else: larger open dot with nucleotide letters bulged out a little
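As a rough illustration of the default viewing rules above, the choice of symbol
can be written as a small lookup. The function name and the returned strings are
made up for this sketch and are not XRNA identifiers; A-U and G-C are taken as the
Watson-Crick pairs.

```python
# Hypothetical names; only the three-way rule comes from the text above.
WATSON_CRICK = {frozenset("AU"), frozenset("GC")}
GU_WOBBLE = frozenset("GU")


def base_pair_style(nuc5: str, nuc3: str) -> str:
    """Pick the default symbol drawn between two base paired nucleotides."""
    pair = frozenset((nuc5.upper(), nuc3.upper()))
    if pair in WATSON_CRICK:
        return "straight line"
    if pair == GU_WOBBLE:
        return "small solid dot"
    return "larger open dot"


print(base_pair_style("G", "C"))  # straight line
print(base_pair_style("U", "G"))  # small solid dot
print(base_pair_style("A", "G"))  # larger open dot
```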
- Refers to a contiguous set of rna base pairs. A single, isolated, base pair can also be
regarded as a helix of length 1. The start of a helix is the base pair that
contains the most 5' nucleotide as well as the most 3'. The end of a helix is
the last base pair on the opposite end of the start of a helix. The reference nucleotide
to a helix is the 5' most nucleotide (i.e., if a rna strand has nucleotides 10->20
base pairing with nucleotides 30->40 then the helix can be referred to as helix 10).
The side of a helix that is most 5' is referred to as the 5' side of the helix.
The side most 3' is referred to as the 3' side of the helix.
- A hairpin loop helix is a helix that ends in a single stranded loop. This loop could
be of length 0, but usually is 3 or more nucleotides.
- A helix may be visually represented either in a clockwise (or left-handed) or
counter-clockwise (or right-handed) fashion. This representation can be visualized
by imagining being able to walk along a strand from the 5' end to the 3' end.
In a clockwise representation of a hairpin helix, for instance, walking around it
would place your left hand outside of the helix. In a counter-clockwise
representation your right hand would point outside of the helix. A typical
secondary structure graph of tRNA is counter-clockwise. The standard secondary
structure graph for e.coli 16S is mostly represented by helices formatted in a
clockwise fashion, but this changes to counter-clockwise at the stacked helix
1404 and changes back at helix 1506.
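The helix definition above (a contiguous set of base pairs, referred to by its
5' most nucleotide) can be sketched as follows. The function name and the pair-map
layout are assumptions made for the example, and the sketch only considers pairs
within a single strand.

```python
def find_helices(pairs: dict[int, int]) -> list[tuple[int, int, int]]:
    """Return (reference nucleotide, its partner, length) for each helix.

    pairs: hypothetical pair map, nucleotide number -> partner number,
           containing both directions of every base pair.
    """
    helices = []
    for i in sorted(pairs):
        j = pairs[i]
        if i > j:
            continue              # visit each base pair once, from its 5' side
        if pairs.get(i - 1) == j + 1:
            continue              # an earlier base pair already starts this helix
        length = 1
        # Extend while the next nucleotide on the 5' side pairs with the
        # previous nucleotide on the 3' side.
        while pairs.get(i + length) == j - length and i + length < j - length:
            length += 1
        helices.append((i, j, length))
    return helices


# The example from the text: nucleotides 10->20 base pair with 30->40.
pairs = {}
for k in range(11):
    pairs[10 + k] = 40 - k
    pairs[40 - k] = 10 + k
print(find_helices(pairs))  # [(10, 40, 11)], i.e. "helix 10"
```

A single isolated base pair comes back with length 1, matching the statement that
it can be regarded as a helix of length 1.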
First read the terminology for a rna cycle. A stacked helix is a run of helices,
5' to 3', connected by cycles that have 1 entry helix and 1 exit helix. In the
e.coli 5S example the helices at 16, 18, 28, and 31 form a stacked helix. A
stacked helix doesn't necessarily imply that they are visually aligned. A stacked
helix may end in a hairpin loop helix.
Refers to a rna helix and every nucleotide in the 5' direction from the end
of the helix, 5' side, back to the end of the helix, 3' side.
Similar to a "rna color unit" but nucleotides are grouped together by
virtue of a common name. This is useful for the same reasons a "rna
color unit" is and further for referring to commonly known "domains"
in familiar secondary structure graphs like the 16S e.coli 5' domain.
Refers to a collection of nucleotides that can all be referred to at once
by virtue of their visual color only. This is useful for the creation of
complicated secondary structure graphs (like the group I Intron from the Cech lab)
where there are many disjoint regions, arbitrary groupings of structure,
and mixtures of helices with both clockwise and counter-clockwise winding.
A typical usage would be to annotate groups of nucleotides (using the standard
grouping constraints) with some arbitrary but distinct color. One can then
edit this similarly colored group of nucleotides as a unit.
A cycle is a collection of helices, and their connecting single stranded regions,
within a single rna strand that have the following definitions and relationships:
- there is a starting cycle, called the cycle at level 0, that contains
the 5' end of the rna strand and the 3' end.
- every cycle, except for the cycle at level 0, has what is called an entry
helix. An entry helix is also an exit helix from another cycle.
- Every cycle has 0 or more exit helices. If the cycle at level 0 has no
exit helices then the rna strand doesn't have any base pairs.
- A cycle with 1 entry helix and 0 exit helices is a helix with a hairpin loop.
- In the example of e.coli 5S, helix 1 is the entry helix to a cycle that
has 2 exit helices: helix 16 and helix 70. Helix 1 is also the exit helix to
the cycle at level 0. Helix 16 is the entry helix to a cycle that has 1 exit helix,
helix 18.
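The entry/exit relationships above can be sketched by walking a region of the
strand and jumping over every helix that opens inside it. The function name, the
pair-map layout and the toy structure below are assumptions for illustration only.
The toy structure is not the e.coli 5S structure.

```python
def exit_helices(pairs: dict[int, int], start: int, stop: int) -> list[int]:
    """Return the 5' most nucleotide of each exit helix of one cycle.

    pairs: hypothetical pair map, nucleotide number -> partner number,
           containing both directions of every base pair.
    start, stop: the nucleotides to walk, inclusive. For the cycle at level 0
           this is the whole strand; for a cycle entered through a helix whose
           last base pair is (i, j) it is i + 1 .. j - 1.
    """
    exits = []
    pos = start
    while pos <= stop:
        partner = pairs.get(pos)
        if partner is not None and partner > pos:
            exits.append(pos)     # a helix leaves the cycle here
            pos = partner + 1     # skip everything enclosed by that helix
        else:
            pos += 1
    return exits


# Toy strand of 20 nucleotides: helix 1 (1-20, 2-19), helix 4 (4-8),
# helix 11 (11-17, 12-16).
one_way = {1: 20, 2: 19, 4: 8, 11: 17, 12: 16}
toy = {**one_way, **{b: a for a, b in one_way.items()}}

print(exit_helices(toy, 1, 20))  # [1]      cycle at level 0
print(exit_helices(toy, 3, 18))  # [4, 11]  cycle entered by helix 1
print(exit_helices(toy, 5, 7))   # []       hairpin loop of helix 4
```

An empty result for a cycle entered by a helix corresponds to the "1 entry helix
and 0 exit helices" case, a helix with a hairpin loop.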
The concept of a cycle is extremely useful during editing. A cycle is easily
made circular. If it is circular then it can be interactively expanded or contracted
maintaining the original helical angles. Also an exit helix and its subdomain
can be edited such that it can move around the existing circle keeping an angle
that is perpendicular to the circular cycle.
A way of referring to a collection of nucleotides that could be single
stranded or base paired. 2 nucleotides are chosen in a rna strand as end
points, inclusive, of the collection. The endpoints are chosen using
the right mouse button only. After a nucleotide is chosen a properties
menu will appear prompting to choose the next nucleotide. When the
second nucleotide is chosen with the right mouse button then editing or
annotating through the properties menu can commence.
A contiguous set of nucleotides, numbered 1 -> N, that may have single
stranded regions and base paired regions. The base paired regions may
base pair with a different rna strand. The contiguous set of nucleotides
may contain gaps in the numbering and still be considered a contiguous
set.
Refers to a grouping of one or more rna strands.
Sometimes used in editing if extraneous labels have been intermixed
with rna structure. This allows one to pick only the labels and
never pick structure.
Refers to the entire scene in a secondary structure graph and is a container
for "rna strand groups" and labels global to the whole scene. This
concept is useful for edit and annotate commands that need to affect the entire scene.
A nucleotide always belongs to a rna strand. The rna strand always starts its
numbering from 1 and ends with the total number of nucleotides in the strand.
The ordinal value of a nucleotide is therefore the number of the position in
the strand that the nucleotide resides at.
Any nucleotide can have a number next to it with a line pointing from the
number to the nucleotide (alternatively, just the line pointing to the
nucleotide). This number isn't necessarily the ordinal number
of the nucleotide in its strand as sometimes it is useful to use a numbering
system that is derived from a different strand. Typically every 10th
nucleotide number in a secondary structure graph is labeled. Sometimes
every 50th nucleotide is numbered while every 10th has a line next to it.
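The two labeling schemes just described can be sketched as a small planning
function. The names are hypothetical, and for simplicity the sketch labels by
ordinal position, although as noted above the displayed number need not be the
ordinal number.

```python
from typing import Optional


def label_plan(strand_length: int, number_every: int = 10,
               line_every: Optional[int] = None) -> dict[int, str]:
    """Map ordinal positions to the kind of nucleotide label to draw.

    "number" means a line plus a number, "line" means just the line.
    """
    plan = {}
    for n in range(1, strand_length + 1):
        if n % number_every == 0:
            plan[n] = "number"
        elif line_every is not None and n % line_every == 0:
            plan[n] = "line"
    return plan


# Every 10th nucleotide numbered.
print(label_plan(30))  # {10: 'number', 20: 'number', 30: 'number'}

# Every 50th numbered, every 10th marked with a line only.
sparse = label_plan(100, number_every=50, line_every=10)
print(sparse[10], sparse[50])  # line number
```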
Any label that is not a nucleotide letter, nucleotide symbol, schematic,
or nucleotide label. These can be text labels, arrows, circles, triangles, and
parallelograms. There is an association that must be made between an
extraneous label and the container in which it resides. This is important
for any translation that is done on a container to also translate its extraneous
labels. When adding an extraneous label check the "Main Panel:" button called
"Pick Strand." See 'How do I use the Pick Strand button?' for details.
Refers to the single stranded region at the end of a helix.
Any single stranded region that isn't a hairpin loop. This includes single
stranded regions that connect helices (including a single "bulged" nucleotide)
and the first and last contiguous sets of unbasepaired nucleotides in a rna strand.
The term "loop" is still used no matter how the individual nucleotides are
positioned in a single stranded region.
A 2d vector whose tail is at the midpoint of the starting base pair in a helix
and whose head is at the midpoint of the end base pair in the same helix.
The helical axis for a RNA stacked helix and for a RNA subDomain will be
the helical axis of the starting helix in each.
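The helical axis definition above amounts to subtracting two midpoints. A minimal
sketch, assuming each base pair is given as the 2d coordinates of its two
nucleotides (the names and the data layout are made up for the example):

```python
Point = tuple[float, float]
BasePair = tuple[Point, Point]   # coordinates of the 5' and 3' nucleotides


def midpoint(pair: BasePair) -> Point:
    (x1, y1), (x2, y2) = pair
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def helical_axis(start_pair: BasePair, end_pair: BasePair) -> Point:
    """Vector from the midpoint of the starting base pair (tail)
    to the midpoint of the end base pair (head) of the same helix."""
    tail = midpoint(start_pair)
    head = midpoint(end_pair)
    return (head[0] - tail[0], head[1] - tail[1])


# A helix drawn straight up the page: start pair at y = 0, end pair at y = 4.
print(helical_axis(((0, 0), (2, 0)), ((0, 4), (2, 4))))  # (0.0, 4.0)
```

For a helix of length 1 the start and end base pairs coincide, so this sketch
returns a zero vector; how XRNA treats that case is not stated here.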