Genomics comes with a lot of terminology. Our glossary features some of the terms you may come across in genetics and genomics. From allele to zygosity, we explain everything in easy-to-understand language.
A variant form (or version) of a gene. Some genes have different forms that are found in the same place in the genome. Humans have two alleles for most genes, with one inherited from each parent. Individuals can have two of the same allele (homozygous) or have different alleles (heterozygous).
Molecules that act as building blocks for proteins. A protein is made up of chains of amino acids. Properties of amino acids, such as size and charge, determine the structure, function, and folding of a protein.
Adding information about a variant in the genomic data file, such as the variant’s chromosome location, gene location, or predicted effect on protein structure or function.
A numbered chromosome, unrelated to the sex of an organism.
Parts of the DNA building blocks that are commonly called the four ‘letters’ of DNA: adenine (A), cytosine (C), guanine (G) and thymine (T). The ‘letters’ of RNA are adenine (A), cytosine (C), guanine (G) and uracil (U).
Having variants on both copies (alleles) of a gene. An affected individual could be homozygous or compound heterozygous.
A field of biology that uses algorithms and software to analyse biological data, using the data to make biological discoveries, construct models or make predictions.
Call (a variant)
The process of identifying a variant from DNA sequence data. Sample DNA is sequenced and aligned to a reference genome for comparison. Differences in the sample are determined to be variants – they are ‘called’.
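As a rough illustration of the comparison step, the sketch below reports the positions where a sample differs from a reference. It assumes the two sequences are already aligned and the same length, and it only finds single-base substitutions. The function name `call_snvs` is hypothetical; real variant callers also weigh read depth and base quality.

```python
def call_snvs(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, ref_base, alt_base)
        for pos, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != alt_base
    ]


# The sample differs from the reference at position 4 (T in reference, A in sample).
print(call_snvs("ACGTACGT", "ACGAACGT"))  # [(4, 'T', 'A')]
```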
Genetic testing of biological relatives of an individual with a variant known to cause a genetic condition. Testing aims to identify family members carrying the variant and their chance of developing the condition or passing the variant on to their children.
A compact, threadlike structure composed of a DNA molecule coiled around histone proteins. Humans have 22 pairs of numbered chromosomes (autosomes) and one pair of sex chromosomes (XX or XY), with one of each pair inherited from each parent.
A diagnostic test to identify structural changes in chromosomes, such as an altered number of whole chromosomes and copy number variants.
A sequence of 3 bases in mRNA that codes for a particular amino acid in a protein.
Copy number variant; copy number variation (CNV)
A difference in the number of copies of a specific section of DNA, such as large sequence duplications and deletions.
A ‘new’ variant that arises in a gamete (sperm or egg), early in embryonic development, or in cancer cells is a de novo variant. De novo variants will be seen in an affected individual but not their parents – the variant is not inherited.
The loss (or deletion) of one or more nucleotides (DNA building blocks) from a DNA sequence.
Having two complete sets of chromosomes, with each parent contributing one of each pair.
DNA (Deoxyribonucleic Acid)
The genetic material of life on earth. DNA is built from 4 nucleotides – adenine (A), cytosine (C), guanine (G) and thymine (T) – joined in strands by phosphodiester bonds. Two linked strands form a double helix of complementary base pairs (A-T and C-G).
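The base-pairing rule (A-T and C-G) can be sketched in a few lines. The helper names below are illustrative only. The reverse complement is included because the two strands of the helix run in opposite directions.

```python
# Pairing rule from the definition: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement_strand(strand: str) -> str:
    """Return the base that pairs with each base, in the same order."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the opposite strand read in its own direction."""
    return complement_strand(strand)[::-1]


print(complement_strand("ATGC"))   # TACG
print(reverse_complement("ATGC"))  # GCAT
```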
The order of the nucleotide bases in a DNA molecule.
In cancer, a gene with one or more variants that increases the rate of cell replication.
Modification of a DNA molecule by addition of chemical ‘tags’ without changing the DNA sequence. These changes can alter the way genes are turned on and off, and can be inherited.
The protein-coding region of a gene.
A change in the reading frame – groups of 3 nucleotides – of a gene. An insertion or deletion that is not a multiple of 3 nucleotides will produce a frameshift.
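A minimal sketch of the 'multiple of 3' rule follows. The function names and the example sequence are made up for illustration. Deleting one base changes every downstream group of three.

```python
def codons(sequence: str) -> list[str]:
    """Split a sequence into complete groups of 3 bases, dropping any remainder."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


def is_frameshift(indel_length: int) -> bool:
    """An insertion or deletion shifts the frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0


original = "ATGGCCAAATAA"
one_base_deleted = original[:3] + original[4:]  # remove the single base at index 3

print(codons(original))          # ['ATG', 'GCC', 'AAA', 'TAA']
print(codons(one_base_deleted))  # ['ATG', 'CCA', 'AAT']
print(is_frameshift(1), is_frameshift(3))  # True False
```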
A section of DNA that carries the code for a protein or RNA molecule. Individuals inherit genes from their parents. They contain information that determines physical and biological characteristics.
The process of turning genes on and off to decode a DNA sequence into a protein. Technically, this involves two processes: transcription, which ‘copies’ a gene into an mRNA molecule, and translation, which ‘reads’ the mRNA to make a protein.
The entire set of DNA information of an organism, including all of the genes. It contains all the information necessary for the human body to develop and function. The human genome has about 3 billion DNA base pairs and around 20,000 protein coding genes.
Genetic variants that are present in gametes (egg and sperm cells) and can potentially be inherited by offspring.
Having only one copy of a gene as a result of having only one copy of the chromosome. Examples include the genes on the X-chromosome in males, or loss of alleles due to deletion of a section of chromosome.
A protein complex within the cell nucleus. The long strands of chromosomal DNA coil around histones for a more compact shape.
Addition of one or more nucleotides into a DNA sequence.
A laboratory-produced representation of a person’s complete set of chromosomes in numerical order.
Patterns describing how characteristics are passed down from parents. The patterns establish how children can inherit traits due to a single gene (monogenic conditions). Examples of patterns include recessive, dominant and X-linked inheritance.
A collective term used to refer to around 4000 genes known to carry variants associated with conditions caused by a single gene variant (monogenic conditions). The term derives from ‘Mendelian’ inheritance.
A genetic variant (nucleotide substitution) causing a change in one amino acid in the resulting protein.
A condition caused by a variant in a single gene.
Messenger RNA (mRNA) carries the information needed to produce proteins. mRNA is produced by transcription of the DNA template. The initial transcript of the gene contains both introns and exons. Introns are spliced out to produce mature mRNA.
Multigene panel test
A laboratory test that looks at several candidate genes known to cause a condition. It is used to identify variants that may be the cause of the condition.
A change in the DNA sequence. In a clinical setting, mutations are usually now called variants.
Next-generation sequencing (NGS)
DNA sequencing technology used for sequencing many genes at once. It is faster than preceding sequencing methods, such as Sanger sequencing. It is also called massively parallel sequencing. NGS technology is the method used for sequencing the entire genome.
Nonsense mediated decay (NMD)
A gene change that causes a premature stop codon, a signal to stop producing the protein, rather than coding for an amino acid. This results in producing a short or truncated protein product. It can cause NMD (see above).
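The difference between a missense change and a premature stop can be sketched by comparing what the original and altered codons encode. The table below holds only four DNA codons from the standard genetic code, and `classify_substitution` is a hypothetical helper, not a clinical tool.

```python
# A small subset of the standard genetic code, written as DNA codons.
CODON_TO_AMINO_ACID = {"GAG": "Glu", "GAA": "Glu", "GTG": "Val", "TAG": "Stop"}


def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Label a codon change as synonymous, missense or nonsense."""
    ref_aa = CODON_TO_AMINO_ACID[ref_codon]
    alt_aa = CODON_TO_AMINO_ACID[alt_codon]
    if ref_aa == "Stop":
        raise ValueError("reference codon is assumed to code for an amino acid")
    if alt_aa == ref_aa:
        return "synonymous"  # same amino acid, protein unchanged
    if alt_aa == "Stop":
        return "nonsense"    # premature stop codon, truncated protein
    return "missense"        # one amino acid replaced by another


print(classify_substitution("GAG", "GAA"))  # synonymous
print(classify_substitution("GAG", "GTG"))  # missense
print(classify_substitution("GAG", "TAG"))  # nonsense
```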
See multigene panel test.
A publicly available knowledgebase and source of gene panels for the analysis of a genomic sequence. See PanelApp Australia.
Disease-causing. A pathogenic variant affects cell function and causes a genetic condition.
A chart with symbols representing inheritance over 2 or more generations of a family.
The physical appearance and physiology of an individual, resulting from expression of an individual’s genetic makeup (genotype) and influenced by environmental factors.
A variant that occurs frequently in a population, with a frequency >1%. Polymorphic genes contribute to typical variations within the population, e.g., the genes that control hair colour are polymorphic.
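The >1% threshold can be applied with simple arithmetic. This sketch assumes a location on a numbered chromosome, where each individual carries two alleles. The function names and the counts are invented for illustration.

```python
def allele_frequency(variant_allele_count: int, individuals: int) -> float:
    """Fraction of all alleles in the sample that are the variant allele."""
    total_alleles = 2 * individuals  # assumption: two alleles per individual
    return variant_allele_count / total_alleles


def is_polymorphism(frequency: float) -> bool:
    """Apply the >1% threshold from the definition above."""
    return frequency > 0.01


common = allele_frequency(30, 1000)  # 30 of 2000 alleles
rare = allele_frequency(10, 1000)    # 10 of 2000 alleles
print(common, is_polymorphism(common))  # 0.015 True
print(rare, is_polymorphism(rare))      # 0.005 False
```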
The individual through whom a family with a genetic disorder is ascertained – the first person in a family to be diagnosed with a genetic disorder.
Molecules encoded by genes, composed of amino acids in a sequence specified by the DNA sequence. The sequence of amino acids determines how a protein folds and functions.
An inactive version of a gene. Pseudogenes began as functional protein-coding genes but have lost their ability to code for proteins due to accumulated mutations through evolution.
The sequenced copies of a stretch of DNA. Many reads of the same DNA region are needed for reliable variant identification when compared to a reference genome.
Reference sequence or reference genome
RNA (Ribonucleic Acid)
A nucleic acid similar to DNA but containing ribose sugar instead of deoxyribose sugar in its structure. RNA is often single-stranded, and the nucleotide bases are adenine (A), cytosine (C), guanine (G) and uracil (U).
A method of determining the order of nucleotides in DNA, one gene at a time. It is used to confirm variants and to sequence single genes.
In mammals, the X chromosome and Y chromosome that typically determine the biological sex of the individual.
Genes located on the sex chromosomes (X or Y chromosomes).
Single gene test
Genomic testing performed on an individual subject, as compared to trio analysis, where the affected individual and their parents are tested.
SNP (Single nucleotide polymorphism)
A single base pair in DNA that shows polymorphism – having alternate alleles – in a population.
SNV (Single nucleotide variant)
A single base difference between an individual’s DNA sequence and a reference sequence in a genomic test.
A change in DNA that occurs after fertilisation of egg and sperm and is not inherited.
Splice site variant
Structural variant (SV)
Large deletions, insertions, inversions, translocations, gene fusions and gene duplications occurring in chromosomes.
A variant where one nucleotide is replaced by one other nucleotide.
The RNA produced by transcription of a gene, where the DNA sequence is ‘copied’ into an RNA sequence.
The process of a ribosome reading the mRNA sequence to bring the correct amino acids needed to produce a polypeptide or protein.
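Transcription and translation can be sketched together as follows. The codon table holds only a handful of entries from the standard genetic code, so any other codon raises a `KeyError`. The input is assumed to be the coding strand of the DNA, and the function names are illustrative.

```python
# A handful of mRNA codons from the standard genetic code (not the full table).
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCC": "Ala", "AAA": "Lys",
    "UGG": "Trp", "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def transcribe(coding_dna: str) -> str:
    """'Copy' the coding strand into mRNA by writing U in place of T."""
    return coding_dna.replace("T", "U")


def translate(mrna: str) -> list[str]:
    """Read the mRNA 3 bases at a time until a stop codon is reached."""
    amino_acids = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        amino_acids.append(amino_acid)
    return amino_acids


mrna = transcribe("ATGGCCAAATAA")
print(mrna)             # AUGGCCAAAUAA
print(translate(mrna))  # ['Met', 'Ala', 'Lys']
```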
Trinucleotide repeat; triplet repeat
Three consecutive nucleotides that repeat in tandem at one location, e.g. CCGCCGCCGCCG. Also called triplet repeat expansion.
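Counting how many times a triplet repeats in tandem can be sketched with a regular expression. The function name and the flanking bases in the example are made up. Only the CCG repeat comes from the definition above.

```python
import re


def longest_tandem_repeat(sequence: str, unit: str = "CCG") -> int:
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    run_lengths = (len(match.group()) for match in pattern.finditer(sequence))
    return max(run_lengths, default=0) // len(unit)


print(longest_tandem_repeat("AACCGCCGCCGCCGTT"))  # 4
print(longest_tandem_repeat("AATTGG"))            # 0
```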
The genomic testing of an affected individual and both their biological parents.
A situation where an individual has two copies of a chromosome (or part of a chromosome) originating from one biological parent rather than one from each parent.
A change or variation in a DNA sequence as compared to a reference sequence. Variants range from single base changes to large rearrangements of DNA.
The scale used to describe the likelihood of a variant being pathogenic or benign. The classifications used are typically: class 5-Pathogenic, class 4-Likely Pathogenic, class 3-Variant of Uncertain Significance, class 2-Likely Benign and class 1-Benign.
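In software, the five-class scale is often held as a simple lookup. The sketch below is one possible representation, with a hypothetical helper name. The labels are taken from the definition above.

```python
CLASS_LABELS = {
    5: "Pathogenic",
    4: "Likely Pathogenic",
    3: "Variant of Uncertain Significance",
    2: "Likely Benign",
    1: "Benign",
}


def describe_class(class_number: int) -> str:
    """Return the text label for a class number between 1 and 5."""
    if class_number not in CLASS_LABELS:
        raise ValueError("class number must be between 1 and 5")
    return f"class {class_number}-{CLASS_LABELS[class_number]}"


print(describe_class(3))  # class 3-Variant of Uncertain Significance
```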
The process of gathering evidence for and against a variant being pathogenic or benign.
The overall process of finding and prioritising the variants (gene changes) found in a genomic test, collecting and curating evidence (variant curation) to determine how likely they are to explain the cause of a condition or cancer (variant classification), and identifying whether the result provides information on treatment changes for a patient.
VUS (VOUS), variant of uncertain significance
A change in DNA sequence where it is unclear whether it is the cause of a condition.
WGS (whole genome sequencing)
Determining the sequence of all the DNA in an individual – both the regions that code for protein and the ‘non-coding’ regions.
The inactivation of one copy of the X-chromosome in females. Biological females have two X chromosomes, XX, compared to males with one X chromosome and one Y chromosome. X-inactivation ‘evens up’ the dosage of X-linked genes in males and females.
The degree of similarity of the alleles at a given genomic location, usually defined by the terms homozygous, heterozygous or hemizygous.
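The three terms can be told apart from the alleles observed at one location. This is a minimal sketch with a hypothetical function name. It assumes the alleles are given as one or two single-letter strings.

```python
def zygosity(alleles: tuple[str, ...]) -> str:
    """Name the zygosity for the allele(s) seen at one genomic location."""
    if len(alleles) == 1:
        return "hemizygous"  # only one copy present
    if len(alleles) == 2:
        return "homozygous" if alleles[0] == alleles[1] else "heterozygous"
    raise ValueError("expected one or two alleles")


print(zygosity(("A", "A")))  # homozygous
print(zygosity(("A", "G")))  # heterozygous
print(zygosity(("A",)))      # hemizygous
```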