A process in which automated or semi-automated algorithms are used to
process experimental data, removing noise, experimental errors and other
artifacts, in order to generate and store high-quality data for use in
subsequent analysis. Data cleaning is typically required in high-throughput
sequencing, where compressions or other experimental artifacts limit the
amount of usable sequence data generated from each sequencing run or "read."
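As a rough illustration of one cleaning step, the sketch below trims low-quality bases from the 3' end of each read and discards reads that become too short. It assumes each read arrives as a sequence plus one Phred-style quality score per base; the function names and the thresholds `min_q` and `min_len` are hypothetical choices for this example, not recommended values.

```python
def trim_read(bases, quals, min_q=20):
    """Trim low-quality bases from the 3' end of a single read.

    bases: the called sequence (string).
    quals: one Phred-style quality score per base (list of ints).
    min_q: illustrative quality threshold (an assumption, not a standard).
    """
    if len(bases) != len(quals):
        raise ValueError("bases and quals must have the same length")
    end = len(bases)
    # Walk back from the 3' end until a base meets the threshold.
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return bases[:end], quals[:end]


def clean_reads(reads, min_q=20, min_len=5):
    """Trim every read and drop those that end up shorter than min_len."""
    cleaned = []
    for bases, quals in reads:
        trimmed, kept_quals = trim_read(bases, quals, min_q)
        if len(trimmed) >= min_len:
            cleaned.append((trimmed, kept_quals))
    return cleaned
```

Only the trailing end is trimmed here; a low-quality base in the middle of a read is kept, which is a deliberate simplification.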
The ability to query very large databases in order to test a hypothesis ("top-down" data mining), or to interrogate a database in order to generate new hypotheses based on rigorous statistical correlations ("bottom-up" data mining).
Data processing is the systematic performance of operations on data, such as handling, merging, sorting, and computing. The semantic content of the original data should not be changed, although the semantic content of the processed output may differ from it.
Vast arrays of heterogeneous (biological) data, stored within a single logical data repository, that are accessible to different querying and manipulation methods.
Any file system in which data are stored according to a logical scheme. (See also relational database.)
Mathematical procedure to separate out the overlapping effects of molecules such as mixtures of compounds in a high-throughput screen, or mixtures of cDNAs in a high density array.
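One simple form of deconvolution in pooled screening can be sketched as follows. The example assumes an orthogonal pooling design, in which compounds are laid out on a grid and each compound is tested once in a row pool and once in a column pool; this design, the function name, and the data layout are assumptions made for illustration only. A compound is a candidate hit when both of its pools score as active.

```python
def deconvolute_pools(grid, active_rows, active_cols):
    """Return candidate hits from an orthogonal (row/column) pooling design.

    grid: list of rows, each a list of compound identifiers.
    active_rows / active_cols: indices of the pools that scored as active.
    """
    hits = []
    for r in sorted(active_rows):
        for c in sorted(active_cols):
            # A compound is a candidate only if both of its pools were active.
            hits.append(grid[r][c])
    return hits
```

With a single active compound the intersection is unambiguous. With several active rows and columns, every intersection is returned, so some candidates may be false positives that need individual retesting.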
A chromosomal alteration in which a portion of the chromosome or the underlying DNA is lost.
Process in which different deletions in a region of DNA are created and used to map the functionally critical areas of that DNA. For example, the minimal region of DNA required for the activity of a test promoter can be determined by making systematic deletions in the region of interest.
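The reasoning behind deletion mapping can be sketched in a few lines. The example assumes each deletion construct is described by the inclusive coordinates of the DNA it retains and by whether it was still active in the assay; the function name and data layout are hypothetical. Because every active construct must contain all critical sequence, the stretch retained in all active constructs bounds the critical region.

```python
def map_critical_region(constructs):
    """Estimate the critical region from a set of deletion constructs.

    constructs: list of (start, end, is_active) tuples, where start and end
    are the inclusive coordinates of the DNA retained in that construct.
    Returns (start, end) of the region present in every active construct,
    or None if there is no active construct or no shared region.
    """
    active = [(s, e) for s, e, ok in constructs if ok]
    if not active:
        return None
    # Intersect the retained intervals of all active constructs.
    start = max(s for s, _ in active)
    end = min(e for _, e in active)
    return (start, end) if start <= end else None
```

The result is an upper bound on the minimal region: finer deletions inside it would be needed to narrow it further.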
A composite molecule formed by the binding of two molecules (see homodimers and heterodimers).
Covalent link formed between the sulfur atoms of two different cysteine residues in a protein. Important in maintaining the folded structure of a protein, and also for linking different proteins in a complex.
DNA (deoxyribonucleic acid)
The chemical that forms the basis of the genetic material in virtually all organisms. DNA is composed of the four nitrogenous bases Adenine, Cytosine, Guanine, and Thymine, which are covalently bonded to a backbone of deoxyribose-phosphate to form a DNA strand. Two complementary strands (where all Gs pair with Cs and As with Ts) form a double helical structure which is held together by hydrogen bonding between the cognate bases.
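The base-pairing rule described above (G with C, A with T) is enough to derive one strand from the other. The following minimal sketch assumes a strand is given as a string of the letters A, C, G and T; the function names are illustrative.

```python
# Watson-Crick pairing: each base maps to its cognate base.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Return the base-by-base complement, read in the same direction."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


def reverse_complement(strand):
    """Return the complementary strand written 5' to 3'.

    The two strands of the helix run antiparallel, so the partner strand
    is the complement read in reverse.
    """
    return complement(strand)[::-1]
```

Any character other than A, C, G or T raises a `KeyError`; ambiguity codes such as N are not handled in this sketch.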
A technique for identifying human individuals based on a restriction enzyme digest of tandemly repeated DNA sequences that are scattered throughout the human genome, but are unique to each individual.
The deposition of oligonucleotides or cDNAs onto an inert substrate such as glass or silicon. Thousands of molecules may be organized spatially into a high-density matrix. These DNA chips may be probed to allow expression monitoring of many thousands of genes simultaneously. Uses include study of polymorphisms in genes, de novo sequencing or molecular diagnosis of disease.
An enzyme that catalyzes the synthesis of DNA from a DNA template given the deoxyribonucleotide precursors.
Short single stranded DNA molecules of specific base sequence, labeled either radioactively or immunologically, that are used to detect and identify the complementary base sequence in a gene or genome by hybridizing specifically to that gene or sequence.
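In sequence terms, a probe can hybridize wherever the target strand reads as the probe's reverse complement. The sketch below looks for such sites under the simplifying assumption of perfect, full-length base pairing; real hybridization tolerates some mismatches depending on conditions. The function name is hypothetical.

```python
def find_probe_sites(probe, target):
    """Return 0-based start positions on `target` where `probe` pairs perfectly.

    Both sequences are written 5' to 3'. The probe anneals antiparallel to
    the target, so it binds where the target matches the probe's reverse
    complement.
    """
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    site = "".join(pairs[base] for base in reversed(probe.upper()))
    target = target.upper()
    width = len(site)
    return [
        i
        for i in range(len(target) - width + 1)
        if target[i:i + width] == site
    ]
```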
The technique in which the specific sequence of bases forming a particular DNA region is deciphered.
One of a series of enzymes that can digest DNA.
A region of special biological interest within a single protein sequence. A domain may also be defined as a region within the three-dimensional structure of a protein that accomplishes a specific function and that may encompass segments of several distinct protein sequences. A domain class is a group of domains that share a common set of well-defined properties or characteristics.
An agent that affects a biological process. Specifically, a molecule whose molecular structure can be correlated with its pharmacological activity.
Drug discovery cycle
The cycle of events required to develop a new drug. Typically this involves research, preclinical testing and clinical development, and can take from 5 to 12 years.