Glossary Molecular Biology and Computational Biology

Part: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

(Continued from previous part...)

P value The probability of an alignment occurring with the score in question or better. The P value is calculated by relating the observed alignment score, S, to the expected distribution of HSP scores from comparisons of random sequences of the same length and composition as the query to the database. The most significant P values are those closest to 0. P values and E values are different ways of representing the significance of an alignment.
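
The relationship between P values and E values can be sketched in a few lines. This assumes the Karlin-Altschul model used by BLAST, in which the number of chance HSPs scoring at least S follows a Poisson distribution with mean E, so that P = 1 - exp(-E). The function names are illustrative and not part of any BLAST interface.

```python
import math

def e_to_p(e_value: float) -> float:
    """Probability of at least one chance hit, given E expected chance hits."""
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E.
    return -math.expm1(-e_value)

def p_to_e(p_value: float) -> float:
    """Inverse conversion: E = -ln(1 - P)."""
    return -math.log1p(-p_value)

if __name__ == "__main__":
    for e in (1e-50, 0.01, 1.0, 10.0):
        print(f"E = {e:g}  ->  P = {e_to_p(e):g}")
```

For small values (roughly below 0.01) the two numbers are nearly identical, which is why they are often used interchangeably for highly significant hits.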

P1-derived artificial chromosome (PAC) One type of vector used to clone DNA fragments (100- to 300-kb insert size; average, 150 kb) in Escherichia coli cells. Based on the genome of bacteriophage P1 (a virus).

PAM Point Accepted Mutation. A unit introduced by Dayhoff et al. to quantify the amount of evolutionary change in a protein sequence. One PAM unit is the amount of evolution that will change, on average, 1% of the amino acids in a protein sequence. A PAM(x) substitution matrix is a look-up table in which scores for each amino acid substitution have been calculated based on the frequency of that substitution in closely related proteins that have experienced a certain amount (x) of evolutionary divergence.
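
As a rough illustration of how a PAM(x) matrix relates to 1 PAM, the sketch below follows the Dayhoff approach of multiplying a PAM1 mutation-probability matrix by itself x times and then converting the result to log-odds scores. The four-letter alphabet, the probabilities in PAM1, and the uniform background frequencies are invented toy values chosen so that 1% of residues change per PAM unit; they are not real amino acid data.

```python
import math

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matrix_power(m, x):
    """Return m multiplied by itself x times (the identity for x = 0)."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(x):
        result = matmul(result, m)
    return result

def log_odds(prob, background):
    """Score = 10 * log10(P(i -> j) / f_j), rounded to the nearest integer."""
    return [[round(10 * math.log10(prob[i][j] / background[j]))
             for j in range(len(prob))] for i in range(len(prob))]

# Toy PAM1: PAM1[i][j] is the probability that residue i is replaced by j
# after 1 PAM. Each row sums to 1 and each residue has a 1% chance of changing.
# Residues 0/1 and 2/3 are "similar" pairs that exchange more often.
PAM1 = [
    [0.990, 0.006, 0.002, 0.002],
    [0.006, 0.990, 0.002, 0.002],
    [0.002, 0.002, 0.990, 0.006],
    [0.002, 0.002, 0.006, 0.990],
]
BACKGROUND = [0.25, 0.25, 0.25, 0.25]  # assumed uniform residue frequencies

if __name__ == "__main__":
    pam250 = matrix_power(PAM1, 250)
    for row in log_odds(pam250, BACKGROUND):
        print(row)
```

Repeated multiplication is used here for clarity; for large x a faster exponentiation scheme would give the same matrix.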

Paralogous Homologous sequences within a single species that arose by gene duplication.

Partly open barrel Has the edge strands not properly hydrogen bonded because one of the strands is in two parts connected with a linker of more than one residue. These edge strands can be treated as a single but interrupted strand, allowing classification with the effective strand and shear numbers, n* and S*. In the few open barrels the beta sheets are connected by only a few side-chain hydrogen bonds between the edge strands. (SCOP)

Patent In genetics, conferring the right or title to genes, gene variations, or identifiable portions of sequenced genetic material to an individual or organization.

Pedigree A family tree diagram that shows how a particular genetic trait or disease has been inherited.

Penetrance The probability of a gene or genetic trait being expressed. "Complete" penetrance means the gene or genes for a trait are expressed in every member of the population who carries them. "Incomplete" penetrance means the genetic trait is expressed in only part of the population that carries the genes. The percent penetrance also may change with the age range of the population.
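
Percent penetrance is simple arithmetic, sketched below. The function name and the example counts are hypothetical.

```python
def percent_penetrance(carriers_expressing: int, carriers_total: int) -> float:
    """Percentage of carriers of a genotype who actually express the trait."""
    if carriers_total <= 0:
        raise ValueError("carriers_total must be positive")
    if not 0 <= carriers_expressing <= carriers_total:
        raise ValueError("carriers_expressing must lie between 0 and carriers_total")
    return 100.0 * carriers_expressing / carriers_total

if __name__ == "__main__":
    # Hypothetical counts: 80 of 100 carriers show the trait.
    print(percent_penetrance(80, 100))   # incomplete penetrance
    print(percent_penetrance(100, 100))  # complete penetrance
```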

Peptide Two or more amino acids joined by a bond called a "peptide bond."

Phage A virus for which the natural host is a bacterial cell.

Pharmacogenomics The study of the interaction of an individual's genetic makeup and response to a drug.

Phenocopy A trait that is not caused by inheritance of a gene but appears to be identical to a genetic trait.

Phenotype The physical characteristics of an organism or the presence of a disease that may or may not be genetic.

Physical map A map of the locations of identifiable landmarks on DNA (e.g., restriction-enzyme cutting sites, genes), regardless of inheritance. Distance is measured in base pairs. For the human genome, the lowest-resolution physical map is the banding patterns on the 24 different chromosomes; the highest-resolution map is the complete nucleotide sequence of the chromosomes.
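
A very small physical map can be sketched by locating restriction-enzyme recognition sites and measuring the distances between them in base pairs. The example uses the EcoRI recognition sequence GAATTC; the DNA sequence and function names are made up for illustration.

```python
def find_sites(sequence: str, site: str) -> list[int]:
    """Return 0-based start positions of every occurrence of a recognition site."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions

def distances_bp(positions: list[int]) -> list[int]:
    """Distances in base pairs between consecutive landmarks."""
    return [b - a for a, b in zip(positions, positions[1:])]

if __name__ == "__main__":
    dna = "AAGAATTCGGGCCCTTGAATTCAA"  # toy sequence, not real genomic data
    sites = find_sites(dna, "GAATTC")  # EcoRI recognition sequence
    print("site positions:", sites)
    print("distances (bp):", distances_bp(sites))
```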

Plasmid Autonomously replicating extra-chromosomal circular DNA molecules, distinct from the normal bacterial genome and nonessential for cell survival under nonselective conditions. Some plasmids are capable of integrating into the host genome. A number of artificially constructed plasmids are used as cloning vectors.

Pleiotropy A single gene that causes many different physical traits, such as multiple disease symptoms.

Pluripotency The potential of a cell to develop into more than one type of mature cell, depending on environment.

Polygenic disorder Genetic disorder resulting from the combined action of alleles of more than one gene (e.g., heart disease, diabetes, and some cancers). Although such disorders are inherited, they depend on the simultaneous presence of several alleles; thus the hereditary patterns usually are more complex than those of single-gene disorders.

Polymerase chain reaction (PCR) A method for amplifying a DNA base sequence using a heat-stable polymerase and two primers (typically about 20 bases each), one complementary to the (+) strand at one end of the sequence to be amplified and one complementary to the (-) strand at the other end. Because the newly synthesized DNA strands can subsequently serve as additional templates for the same primer sequences, successive rounds of primer annealing, strand elongation, and dissociation produce rapid and highly specific amplification of the desired sequence. PCR also can be used to detect the existence of the defined sequence in a DNA sample.
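
The primer geometry and the doubling per cycle can be sketched as a simple "in silico" PCR. This is an idealized model that assumes exact primer matches and perfect doubling in every cycle; the template, the six-base primers (shortened for readability), and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def amplicon(plus_strand: str, forward_primer: str, reverse_primer: str):
    """Return the (+) strand region bounded by the two primers, or None.

    The forward primer is complementary to the (-) strand, so it reads the
    same as the (+) strand. The reverse primer is complementary to the (+)
    strand, so its reverse complement is searched for on the (+) strand.
    """
    start = plus_strand.find(forward_primer)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = plus_strand.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return plus_strand[start:end + len(reverse_site)]

def ideal_copies(initial_copies: int, cycles: int) -> int:
    """Copies after n cycles if every cycle doubles the target exactly."""
    return initial_copies * 2 ** cycles

if __name__ == "__main__":
    template = "TTTACGTACGGGGGCCCAATTGGTTT"  # toy (+) strand
    print(amplicon(template, "ACGTAC", "CCAATT"))
    print(ideal_copies(1, 30))
```

Real reactions fall short of perfect doubling, so the copy number from ideal_copies is an upper bound rather than a prediction.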

Polymerase, DNA or RNA Enzyme that catalyzes the synthesis of nucleic acids on preexisting nucleic acid templates, assembling RNA from ribonucleotides or DNA from deoxyribonucleotides.

Polymorphism Difference in DNA sequence among individuals that may underlie differences in health. Genetic variations occurring in more than 1% of a population would be considered useful polymorphisms for genetic linkage analysis.
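
The 1% criterion can be sketched as an allele-frequency check. This assumes the threshold applies to the frequency of the second most common allele at a site, and that each genotype is given as a string of its alleles (for example "AG"); both are simplifying assumptions, and the function names are illustrative.

```python
from collections import Counter

def allele_frequencies(genotypes: list[str]) -> dict[str, float]:
    """Frequency of each allele across all genotypes at one site."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def is_polymorphic(genotypes: list[str], threshold: float = 0.01) -> bool:
    """True if a second allele occurs at a frequency above the threshold."""
    freqs = sorted(allele_frequencies(genotypes).values(), reverse=True)
    return len(freqs) > 1 and freqs[1] > threshold

if __name__ == "__main__":
    sample = ["AA"] * 97 + ["AG"] * 3  # hypothetical genotypes for 100 people
    print(allele_frequencies(sample))
    print(is_polymorphic(sample))
```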

Polypeptide A protein or part of a protein made of a chain of amino acids joined by peptide bonds.

Population genetics The study of variation in genes among a group of individuals.

Positional cloning A technique used to identify genes, usually those that are associated with diseases, based on their location on a chromosome.

Primer Short preexisting polynucleotide chain to which new deoxyribonucleotides can be added by DNA polymerase.

Privacy In genetics, the right of people to restrict access to their genetic information.

Probe Single-stranded DNA or RNA molecules of specific base sequence, labeled either radioactively or immunologically, that are used to detect the complementary base sequence by hybridization.

Profile A table that lists the frequencies of each amino acid in each position of protein sequence. Frequencies are calculated from multiple alignments of sequences containing a domain of interest. See also PSSM.
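
A frequency profile of this kind can be sketched directly from a multiple alignment. The example assumes an ungapped alignment in which every sequence has the same length; the sequences and the function name are invented for illustration.

```python
from collections import Counter

def frequency_profile(alignment: list[str]) -> list[dict[str, float]]:
    """Per-position amino acid frequencies for an ungapped alignment."""
    if len(set(len(seq) for seq in alignment)) != 1:
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for column in zip(*alignment):  # one tuple of residues per position
        counts = Counter(column)
        profile.append({res: n / len(column) for res, n in counts.items()})
    return profile

if __name__ == "__main__":
    toy_alignment = ["MKV", "MKI", "MRV", "MKV"]  # toy domain alignment
    for position, freqs in enumerate(frequency_profile(toy_alignment), start=1):
        print(position, freqs)
```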

Profile A profile is a table of position-specific scores and gap penalties, representing a homologous family, that may be used to search sequence databases. In CLUSTAL-W-derived profiles those sequences that are more distantly related are assigned higher weights.

Prokaryote Cell or organism lacking a membrane-bound, structurally discrete nucleus and other subcellular compartments. Bacteria are examples of prokaryotes.

Promoter A DNA site to which RNA polymerase will bind and initiate transcription.

Pronucleus The nucleus of a sperm or egg prior to fertilization.

Protein A large molecule composed of one or more chains of amino acids in a specific order; the order is determined by the base sequence of nucleotides in the gene that codes for the protein. Proteins are required for the structure, function, and regulation of the body's cells, tissues, and organs; and each protein has unique functions. Examples are hormones, enzymes, and antibodies.
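
How the base sequence determines the amino acid order can be sketched with a small translation routine. It assumes the standard genetic code, a DNA coding sequence read in frame from its first base, and one-letter amino acid codes; the function name is illustrative.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(coding_dna: str) -> str:
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(coding_dna) - 2, 3):
        residue = CODON_TABLE[coding_dna[pos:pos + 3].upper()]
        if residue == "*":  # stop codon ends translation
            break
        protein.append(residue)
    return "".join(protein)

if __name__ == "__main__":
    print(translate("ATGGCCTAA"))
```

Any trailing bases that do not form a complete codon are ignored.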

Proteome Proteins expressed by a cell or organ at a particular time and under specific conditions.

Proteomics Systematic analysis of protein expression of normal and diseased tissues that involves the separation, identification and characterization of all of the proteins in an organism.

Pseudogene A sequence of DNA similar to a gene but nonfunctional; probably the remnant of a once-functional gene that accumulated mutations.

PSI-BLAST Position-Specific Iterative BLAST. An iterative search using the BLAST algorithm. A profile is built after the initial search and is then used in subsequent searches. The process may be repeated, if desired, with new sequences found in each cycle used to refine the profile. For details, see Altschul et al.

PSSM Position-specific scoring matrix. The PSSM gives the log-odds score for finding a particular matching amino acid in a target sequence.
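
A minimal sketch of a log-odds PSSM is shown below. It assumes an ungapped alignment, a uniform background frequency of 1/20 for every amino acid, and a simple pseudocount to avoid taking the logarithm of zero; real tools use observed background frequencies and more careful sequence weighting. The alignment and function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_pssm(alignment: list[str], pseudocount: float = 1.0) -> list[dict[str, float]]:
    """Log-odds (base 2) score of each amino acid at each alignment position."""
    background = 1.0 / len(AMINO_ACIDS)  # assumed uniform background
    total = len(alignment) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for column in zip(*alignment):
        counts = Counter(column)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm

def score_segment(pssm: list[dict[str, float]], segment: str) -> float:
    """Sum the position-specific scores for a segment of a target sequence."""
    return sum(column[aa] for column, aa in zip(pssm, segment))

if __name__ == "__main__":
    pssm = build_pssm(["MKV", "MKI", "MRV", "MKV"])  # toy alignment
    for candidate in ("MKV", "MRI", "WWW"):
        print(candidate, round(score_segment(pssm, candidate), 2))
```

Segments that resemble the aligned family receive positive scores, while unrelated segments score below zero.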

Purine A nitrogen-containing, double-ring, basic compound that occurs in nucleic acids. The purines in DNA and RNA are adenine and guanine.

Pyrimidine A nitrogen-containing, single-ring, basic compound that occurs in nucleic acids. The pyrimidines in DNA are cytosine and thymine; in RNA, cytosine and uracil.

(Continued on next part...)
