Bioinformatics Glossary
Part:
A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
S
T
U
V
W
X
Y
Z
(Continued from previous part...)
Data Cleaning
A process whereby automated or semi-automated algorithms are used to
remove noise, experimental errors and other artifacts from experimental
data, in order to generate and store high-quality data for use in
subsequent analysis. Data cleaning is typically required in high-throughput
sequencing, where compression or other experimental artifacts limit the
amount of usable sequence data generated from each sequencing run or "read."
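As a rough illustration of automated cleaning, the sketch below trims low-quality bases from the end of each read and discards reads that end up too short or that contain ambiguous bases. The per-base quality scores, the thresholds `min_q` and `min_len`, and the function names are assumptions made for this example and are not part of the definition above.

```python
def trim_read(bases, quals, min_q=20):
    """Trim low-quality bases from the end of a read.

    bases: string of base calls; quals: list of integer quality scores
    (assumed to be one score per base, higher meaning more reliable).
    """
    if len(bases) != len(quals):
        raise ValueError("bases and quals must have the same length")
    end = len(bases)
    # Walk back from the end while the last kept base is below threshold.
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return bases[:end], quals[:end]


def clean_reads(reads, min_q=20, min_len=5):
    """Trim every (bases, quals) read; drop short or ambiguous results."""
    cleaned = []
    for bases, quals in reads:
        trimmed_bases, trimmed_quals = trim_read(bases, quals, min_q)
        if len(trimmed_bases) >= min_len and "N" not in trimmed_bases:
            cleaned.append((trimmed_bases, trimmed_quals))
    return cleaned
```

Real pipelines usually apply more elaborate rules (sliding windows, adapter removal), so this should be read as a minimal sketch of the idea only.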
Data Mining
The ability to query very large databases in order to test a hypothesis
("top-down" data mining); or to interrogate a database in order to generate
new hypotheses based on rigorous statistical correlations ("bottom-up"
data mining).
Data Processing
Data processing is the systematic performance of operations on data,
such as handling, merging, sorting, and computing. The original data should
be preserved with its semantic content unchanged, although the semantic
content of the processed output may differ.
Data Warehouses
Vast arrays of heterogeneous (biological) data, stored within a single
logical data repository, that are accessible to different querying and
manipulation methods.
Database
Any system in which data is stored according to a logical organization.
(see also relational database)
Deconvolution
Mathematical procedure to separate out the overlapping effects of molecules
such as mixtures of compounds in a high-throughput screen, or mixtures
of cDNAs in a high density array.
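One simple deconvolution scheme for pooled screens, shown here only as an assumed example, places compounds on a grid and tests each row and each column as a mixture. A compound is a candidate hit when both its row pool and its column pool are active. The grid layout and the name `deconvolve_pools` are hypothetical.

```python
def deconvolve_pools(grid, active_rows, active_cols):
    """Return the compounds lying at intersections of active pools.

    grid: list of rows, each a list of compound names.
    active_rows / active_cols: sets of pool indices that scored as active.
    """
    candidates = []
    for r in sorted(active_rows):
        for c in sorted(active_cols):
            candidates.append(grid[r][c])
    return candidates


plate = [
    ["cpd1", "cpd2", "cpd3"],
    ["cpd4", "cpd5", "cpd6"],
    ["cpd7", "cpd8", "cpd9"],
]
print(deconvolve_pools(plate, {1}, {2}))  # the single intersection: cpd6
```

When more than one compound is active, the intersections can include false candidates (two true hits yield four intersections), so candidates would normally be retested individually.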
Deletion
A chromosomal alteration in which a portion of the chromosome or the
underlying DNA is lost.
Deletion mapping
Process in which different deletions in a region of DNA are created
and used to map the functionally critical areas of that DNA. For example,
the minimal region of DNA required for a test promoter can be ascertained
by systematic deletions in the region of interest.
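The reasoning behind deletion mapping can be sketched in a few lines: any base that can be deleted without loss of activity is dispensable, and whatever remains is the candidate critical region. The input format (start, end, activity retained) and the function name are assumptions for this illustration.

```python
def critical_positions(length, deletions):
    """Return positions never removed by an activity-preserving deletion.

    length: size of the region under study, in bases.
    deletions: list of (start, end, active) tuples, with end exclusive and
    active True when the deleted construct still shows activity.
    """
    candidates = set(range(length))
    for start, end, active in deletions:
        if active:
            # These bases were removed and activity survived: dispensable.
            candidates -= set(range(start, end))
    return sorted(candidates)


constructs = [(0, 3, True), (7, 10, True), (3, 7, False)]
print(critical_positions(10, constructs))  # [3, 4, 5, 6]
```

This conservative version uses only the activity-preserving deletions; inactive constructs serve as a consistency check that something essential lies inside the deleted interval.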
Dendrogram
A graphical procedure for representing the output of a hierarchical
clustering method. A dendrogram is strictly defined as a binary tree
with a distinguished root, that has all the data items at its leaves.
Conventionally, all the leaves are shown at the same level of the drawing.
The ordering of the leaves is arbitrary, as is their horizontal position.
The heights of the internal nodes may be arbitrary, or may be related to
the metric information used to form the clustering.
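To make the tree structure concrete, the sketch below builds a binary tree by single-linkage agglomerative clustering of labelled one-dimensional values. Single linkage, the one-dimensional data, and the nested-tuple representation are choices made for this example; the definition above does not prescribe a particular clustering method.

```python
def single_linkage(points):
    """Cluster labelled 1-D values into a binary tree.

    points: dict mapping label -> float (at least one entry).
    Returns a label (leaf) or a tuple (left, right, height), where height
    is the distance at which the two subtrees were merged.
    """
    # Each cluster is a pair: (tree built so far, list of member labels).
    clusters = [(label, [label]) for label in sorted(points)]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(abs(points[a] - points[b])
                        for a in clusters[i][1] for b in clusters[j][1])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merged = ((clusters[i][0], clusters[j][0], d),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


def leaves(tree):
    """List the data items at the leaves, left to right."""
    if isinstance(tree, str):
        return [tree]
    left, right, _height = tree
    return leaves(left) + leaves(right)
```

Every data item ends up at a leaf, each internal node has exactly two children, and the stored height corresponds to the metric information mentioned above. Swapping the left and right child of any node gives an equally valid drawing, which is why the leaf order is arbitrary.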
Dimer
A composite molecule formed by the binding of two molecules (see homodimers
and heterodimers).
Disulfide bond
Covalent link formed between the sulfur atoms of two different cysteine
residues in a protein. Important in maintaining the folded structure of
a protein, and also for linking different proteins in a complex.
DNA (deoxyribonucleic acid)
The chemical that forms the basis of the genetic material in virtually
all organisms. DNA is composed of the four nitrogenous bases Adenine, Cytosine,
Guanine, and Thymine, which are covalently bonded to a backbone of deoxyribose-phosphate
to form a DNA strand. Two complementary strands (where all Gs pair with
Cs and As with Ts) form a double helical structure which is held together
by hydrogen bonding between the cognate bases.
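The pairing rule can be expressed as a small lookup table. Because the two strands of a double helix run in opposite directions, the partner of a strand is conventionally written as its reverse complement. The function name below is chosen for this illustration.

```python
# Watson-Crick pairing: G pairs with C, A pairs with T.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand):
    """Return the complementary strand, read in its own 5'-to-3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


print(reverse_complement("ATGC"))  # GCAT
```

Applying the function twice returns the original strand, which is a quick sanity check on the pairing table.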
DNA fingerprinting
A technique for identifying human individuals based on a restriction
enzyme digest of tandemly repeated DNA sequences that are scattered throughout
the human genome, and whose pattern of repeat lengths is effectively unique
to each individual.
DNA microarrays
Arrays made by depositing oligonucleotides or cDNAs onto an inert substrate
such as glass or silicon. Thousands of molecules may be organized spatially
into a high-density matrix. These DNA chips may be probed to allow expression
monitoring of many thousands of genes simultaneously. Other uses include
the study of polymorphisms in genes, de novo sequencing and molecular
diagnosis of disease.
DNA polymerase
An enzyme that catalyzes the synthesis of DNA from a DNA template given
the deoxyribonucleotide precursors.
DNA probes
Short single stranded DNA molecules of specific base sequence, labeled
either radioactively or immunologically, that are used to detect and identify
the complementary base sequence in a gene or genome by hybridizing specifically
to that gene or sequence.
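In sequence terms, a probe detects the stretch of a target strand that is its reverse complement. The sketch below lists the positions in a target where a probe could pair perfectly; it assumes exact matching only and ignores the mismatch tolerance and temperature effects of real hybridization. The function name is hypothetical.

```python
def probe_sites(target, probe):
    """Return 0-based start positions in target that pair with the probe.

    Both sequences are written 5' to 3'. Only perfect matches are reported.
    """
    pairs = str.maketrans("ACGT", "TGCA")
    # The probe pairs with the reverse complement of its own sequence.
    site = probe.upper().translate(pairs)[::-1]
    target = target.upper()
    hits = []
    start = target.find(site)
    while start != -1:
        hits.append(start)
        start = target.find(site, start + 1)  # allow overlapping sites
    return hits


print(probe_sites("TTATGCGGATGC", "GCAT"))  # [2, 8]
```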
DNA sequencing
The technique in which the specific sequence of bases forming a particular
DNA region is deciphered.
DNase (Deoxyribonuclease)
One of a series of enzymes that can digest DNA.
Domain (protein)
A region of special biological interest within a single protein sequence.
However, a domain may also be defined as a region within the three-dimensional
structure of a protein that accomplishes a specific function and that may
encompass regions of several distinct protein sequences. A domain class is
a group of domains that share a common set of well-defined properties or
characteristics.
Drug
An agent that affects a biological process. Specifically, a molecule
whose molecular structure can be correlated with its pharmacological activity.
Drug discovery cycle
The cycle of events required to develop a new drug. Typically this involves
research, preclinical testing and clinical development, and can take from
5 to 12 years.
(Continued on next part...)