Bioinformatics Glossary
Carboxyl group: The -COOH functional group,
acidic in nature, found in all amino acids.
cDNA (complementary DNA): A DNA strand copied
from mRNA using reverse transcriptase. A cDNA library represents all of
the genes expressed in a cell.
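As a rough illustration of the copying step, the first cDNA strand can be modeled as the reverse complement of the mRNA, written in the DNA alphabet. The function name below is hypothetical, and the sketch ignores priming, incomplete extension and second-strand synthesis.

```python
# Hypothetical helper, not part of any library: models first-strand cDNA
# as the reverse complement of the mRNA, with U pairing to A and A to T.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna: str) -> str:
    """Return the first-strand cDNA (5'->3') for an mRNA given 5'->3'."""
    # Reverse the template so the result also reads 5'->3'.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(mrna.upper()))


if __name__ == "__main__":
    print(first_strand_cdna("AUGGCC"))
```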
cDNA library: A set of DNA fragments prepared
from the total mRNA obtained from a selected cell, tissue or organism.
Chimeric clone: A cloning artifact in which
DNA from two different sources is joined in a single clone, for example
when a foreign gene is inserted into a vector in an incorrect orientation,
resulting in the expression of a protein consisting of a fusion of two
different gene products.
Chromat: Data file output from most popular
DNA sequencers. Chromat files consist of the fluorescent traces generated
by the sequencer for each of the four chemical bases, A, C, G, and T, together
with the sequence and measures of the error in the traces at each sequence
position.
Chromatin: The complex of DNA and associated
proteins (mainly histones) that makes up the chromosomes. Chromatin is
most tightly condensed during cell division.
Chromosome: A structure in the cell nucleus
that contains part of the nuclear DNA together with a number of proteins
that compact and package the DNA.
Clone: A population of genetically identical
cells or DNA molecules.
Cloning: The formation of clones or exact genetic
replicas.
Cluster: The grouping of similar objects in
a multidimensional space. Clustering is used for constructing new
features which are abstractions of the existing features of those objects.
The quality of the clustering depends crucially on the distance metric
in the space. In bioinformatics, clustering is performed on sequences,
high-throughput expression and other experimental data. Clusters of partial
or complete gene sequences can be used to identify the complete (contiguous)
sequence and to better identify its function. Clustering expression data
enables the researcher to discern patterns of co-regulation in groups of
genes.
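The dependence on the distance metric can be seen in a small sketch. It assumes equal-length sequences, Hamming distance as the metric, and single-linkage grouping at a fixed cutoff. These are illustrative choices and the function names are hypothetical; real analyses typically use alignment-based distances and library implementations.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def single_linkage_clusters(seqs, max_distance):
    """Group sequences so that any two linked by a chain of pairs
    within max_distance end up in the same cluster."""
    parent = list(range(len(seqs)))  # union-find forest over indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if hamming(seqs[i], seqs[j]) <= max_distance:
            parent[find(i)] = find(j)

    groups = {}
    for i, seq in enumerate(seqs):
        groups.setdefault(find(i), []).append(seq)
    return sorted(groups.values())


if __name__ == "__main__":
    print(single_linkage_clusters(["ACGT", "ACGA", "TTTT", "TTTA"], 1))
```

Changing the cutoff or the metric changes the grouping, which is the sensitivity the entry describes.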
Coding regions (CDS): The portion of a genomic
sequence bounded by start and stop codons that identifies the sequence
of the protein being coded for by a particular gene.
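A minimal sketch of locating such a region, assuming a single forward strand, the standard stop codons and no introns. The function name is hypothetical, and real gene finders also consider the reverse strand, splicing and statistical signals.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_first_cds(dna: str):
    """Return the first ATG...stop stretch read in frame, or None."""
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Walk codon by codon in the frame set by this ATG.
        for i in range(start + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return dna[start:i + 3]
        # No in-frame stop found; try the next ATG.
        start = dna.find("ATG", start + 1)
    return None


if __name__ == "__main__":
    print(find_first_cds("CCATGAAATGGTAAGG"))
```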
Codon: A sequence of three adjacent nucleotides
that designates a specific amino acid or a start/stop signal for translation.
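The codon-to-amino-acid mapping can be sketched as a lookup table. The example below assumes the standard genetic code with DNA-alphabet codons and stops at the first stop codon; the function name is hypothetical.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon until a stop codon."""
    cds = cds.upper()
    protein = []
    # A trailing incomplete codon, if any, is ignored.
    for i in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[i:i + 3]]
        if residue == "*":  # stop codon
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGAAATGGTAA"))
```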
Combinatorial chemistry: The use of chemical
methods to generate all possible combinations of chemicals starting with
a subset of compounds. The building blocks may be peptides, nucleic acids
or small molecules. The libraries of compounds formed by this methodology
are used to probe for new pharmaceutical reagents (see high-throughput
screening).
Complementarity determining region (CDR): The
hypervariable regions of an antibody molecule, consisting of three loops
from the heavy chain and three from the light chain, that together form
the antigen-binding site.
Complexity (of gene sequence): The term "low
complexity sequence" may be thought of as synonymous with regions of locally
biased residue composition. In these regions, the sequence composition
deviates from the random model that underlies the calculation of the statistical
significance (P-value) of an alignment. Alignments among low-complexity
sequences can therefore be statistically but not biologically significant,
i.e., one cannot infer homology (common ancestry) or functional similarity
from them.
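One simple way to flag locally biased composition is to compute the Shannon entropy of a sliding window. The sketch below is an illustration under assumed parameters (the window size and entropy cutoff are arbitrary), not the method of any particular filtering program, and the function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(sequence: str) -> float:
    """Entropy in bits of the residue composition of a sequence."""
    counts = Counter(sequence)
    total = len(sequence)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def low_complexity_windows(sequence: str, window: int = 12, max_entropy: float = 2.2):
    """Start positions of windows whose entropy is at or below the cutoff."""
    return [
        start
        for start in range(len(sequence) - window + 1)
        if shannon_entropy(sequence[start:start + window]) <= max_entropy
    ]


if __name__ == "__main__":
    print(low_complexity_windows("ACGTAAAA", window=4, max_entropy=0.5))
```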
Conformation: The precise three-dimensional
arrangement of atoms and bonds in a molecule describing its geometry and
hence its molecular function.
Consensus sequence: A single sequence derived
from an alignment of multiple constituent sequences that represents a "best
fit" for all those sequences. A "voting" or other selection procedure is
used to determine which residue (nucleotide or amino acid) is placed at
a given position when the constituent sequences do not all have the
identical residue at that position.
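The "voting" procedure can be sketched as a per-column majority vote over an existing alignment. Placing "N" at a tied position is an assumption made here for illustration, and the function name is hypothetical; other selection rules are equally possible.

```python
from collections import Counter


def consensus(aligned, tie_symbol="N"):
    """Majority-vote consensus of equal-length aligned sequences."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("need at least one sequence, all of equal length")
    result = []
    for column in zip(*aligned):
        counts = Counter(column).most_common()
        top_residue, top_count = counts[0]
        # A tie for first place yields the ambiguity symbol.
        tied = len(counts) > 1 and counts[1][1] == top_count
        result.append(tie_symbol if tied else top_residue)
    return "".join(result)


if __name__ == "__main__":
    print(consensus(["ACGT", "ACGA", "TCGT"]))
```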
Constitutive synthesis (expression): Synthesis
of mRNA and protein at an unchanging or constant rate regardless of a cell’s
requirements (see housekeeping genes).
Contig: A length of contiguous sequence assembled
from partial, overlapping sequences, generated from a "shotgun" sequencing
project. Contigs are typically created computationally, by comparing
the overlapping ends of several sequencing reads generated by fragmenting
a segment of genomic DNA (for example, by restriction enzyme digestion).
Creating contigs in the presence of sequencing errors, ambiguities and
repeats is one of the most computationally challenging aspects of the role
of bioinformatics in genome analysis.
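The overlap comparison can be sketched as a greedy merge of error-free reads. This is a toy illustration with hypothetical function names: it assumes exact suffix-prefix overlaps on a single strand, and it does not handle the sequencing errors and repeats that make real assembly difficult.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a equal to a prefix of b
    (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-n:] == b[:n]:
            return n
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(reads)
    while True:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    print(greedy_assemble(["ATGCGT", "CGTACC", "ACCTGA"]))
```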
Convergence: The end-point of any algorithm
that uses iteration or recursion to guide a series of data processing steps.
An algorithm is usually said to have reached convergence when the difference
between successive steps (or between computed and observed values) falls
below a pre-defined threshold.
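A threshold-based stopping rule can be sketched with Newton's iteration for a square root, chosen here only as a familiar example. The function name, tolerance and iteration cap are assumptions for illustration.

```python
def newton_sqrt(value: float, tolerance: float = 1e-12, max_iterations: int = 100):
    """Iterate until successive estimates differ by less than tolerance.

    Returns the estimate and the number of iterations used."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if value == 0:
        return 0.0, 0
    estimate = value if value >= 1 else 1.0
    for iteration in range(1, max_iterations + 1):
        new_estimate = 0.5 * (estimate + value / estimate)
        # Convergence test: change between successive steps is below threshold.
        if abs(new_estimate - estimate) < tolerance:
            return new_estimate, iteration
        estimate = new_estimate
    raise RuntimeError("did not converge within max_iterations")


if __name__ == "__main__":
    print(newton_sqrt(2.0))
```

The iteration cap guards against inputs for which the threshold is never reached.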
Cosmids: DNA vectors that allow the insertion
of long fragments of DNA (up to 50 kbases).
Crystal structure: Term used to describe the
high-resolution molecular structure derived by X-ray crystallographic
analysis of protein or other biomolecular crystals.
Cytoplasm: The medium of the cell between the
nucleus and the cell membrane.
Cytosine: A pyrimidine base found in DNA and
RNA.