
Population genetics and epigenetics

Allele frequency

In the wild, each species may exist as one population or multiple populations. Different populations correspond to defined areas - habitats.

The sum of all present alleles for a given gene in a given population is known as the gene pool.

This is essentially a way of thinking about all the individuals in a population contributing their alleles towards the overall allele frequency. The extent of different alleles present gives the genetic diversity of a population.

The allele frequency in a population's gene pool can change as a result of selection. The effectors of selection can be varied, yet the outcome is similar: advantageous or preferred alleles and the traits associated with them increase in frequency, while detrimental or disfavoured alleles and the traits associated with them decrease in frequency.

Natural selection

Here is an all-time classic example. The most frequent initial moth colour in a population landing on tree trunks was dark, to match that of the tree trunks. Few moths could get away with being light-coloured. Once the tree trunks were painted white, the dark moths became very apparent to predators, while the light-coloured moths evaded predation much better and survived to reproduce. Essentially, the tables had turned!

This caused the allele for light colour to spread and become more frequent than the allele for dark colour. The latter sharply dropped in frequency and became the minority.

This is an example of directional selection. It tends towards an extreme, either the light-coloured or the dark-coloured, depending on scenario. 

Selection can also tend towards a "happy medium" and avoid either extreme. This is stabilising selection. If really small lions don't survive long, but really large lions can't supply themselves enough food, then the average lions are selected for and achieve the highest frequency.

Directional selection also takes place when antibiotics are used against bacteria. The adaptive pressure favours bacteria that have the antibiotic resistance gene and can survive the hostile environment.

On the other hand, a scenario such as human birth weight showcases stabilising selection. The average weight is large enough to keep the newborn healthy and increasingly able to survive independently, but small enough to enable the actual birth.

Natural selection therefore results in species increasingly and consistently adapted to their environment via anatomical, physiological or behavioural changes.

The train of thought leading to natural selection includes these key points:

1. Individuals within a population exhibit a variety of phenotypic traits caused by both their alleles and the environment.

Primarily the source of this variation is mutation. Secondarily it is meiosis and the random fertilisation of gametes in the case of sexual reproduction.

2. The balance of survival and reproduction is affected by factors including predation, disease and competition. Some appearances and behaviours can attract more predators, while others, such as camouflage, can avert them.

Disease can impede survival and reproduction, while competition allows hidden traits that might have gone unnoticed or been "neutral" before to come in handy when unforeseen selection pressures arise. If the prize of such competition, such as the resources needed for survival, is limited relative to the population seeking it, then competition acts further to select certain traits.

3. Any favourable traits controlled by alleles will end up in more offspring, thereby shifting the allele frequency and, over time, the entire gene pool of a population or species.

Types of selection

We looked at stabilising and directional selection previously.

There is a third type called disruptive selection. Instead of shifting the traits towards an end, or towards a middle ground, disruptive selection splits the pool down the middle, where both extremes of a trait are favourable, but not a middle value.

An example of this is an original population of purple individuals that stand out quite a lot amongst the red and blue flowers in a field. Over generations the population will shift towards either red or blue rather than staying purple, as purple attracts predators (bats or birds, say).
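The three types of selection can be compared with a small numerical sketch. The Python below is only an illustration: the nine trait classes, the bell-shaped starting population and the three fitness functions are made-up assumptions, and it applies a single round of selection to phenotypes without modelling inheritance.

```python
# Toy comparison of directional, stabilising and disruptive selection.
# All numbers are hypothetical and chosen only for illustration.

def select(freqs, fitness):
    """Apply one round of selection: weight each trait value by its fitness."""
    weighted = {x: f * fitness(x) for x, f in freqs.items()}
    total = sum(weighted.values())
    return {x: w / total for x, w in weighted.items()}

def mean(freqs):
    """Average trait value of the population."""
    return sum(x * f for x, f in freqs.items())

def variance(freqs):
    """Spread of trait values around the mean."""
    m = mean(freqs)
    return sum(f * (x - m) ** 2 for x, f in freqs.items())

# Starting population: trait classes 1..9, bell-shaped around class 5.
counts = [1, 8, 28, 56, 70, 56, 28, 8, 1]
start = {x: c / sum(counts) for x, c in zip(range(1, 10), counts)}

directional = select(start, lambda x: x)                       # larger is fitter
stabilising = select(start, lambda x: 1 / (1 + (x - 5) ** 2))  # middle is fitter
disruptive = select(start, lambda x: 1 + (x - 5) ** 2)         # extremes are fitter

for name, pop in [("start", start), ("directional", directional),
                  ("stabilising", stabilising), ("disruptive", disruptive)]:
    print(f"{name:12s} mean={mean(pop):.2f} variance={variance(pop):.2f}")
```

In this sketch, directional selection moves the mean, stabilising selection keeps the mean but narrows the spread, and disruptive selection keeps the mean but widens the spread.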

Malaria and sickle-cell anaemia

The evolution of different allele frequencies under specific selection pressures can be seen in humans by looking at the frequency of the sickle-cell anaemia allele in the areas where malaria, a seemingly unrelated infection, is most prevalent.

Why would the sickle cell allele for haemoglobin S spread and increase in frequency when it causes sickle cell disease in its homozygous form? Upon observing the association of this frequency with the prevalence of malaria, it becomes clear that haemoglobin S may actually be beneficial against malaria. This would explain its presence and increase in the allele pool of these populations.

Indeed, the heterozygous state, i.e. having just one copy of the sickle-cell allele, does not cause the full-blown effects of sickle-cell anaemia, yet it protects against malaria. Sickle-shaped red blood cells are a poor host for the malaria parasite (Plasmodium falciparum).

Therefore, the frequency of what would be a deleterious allele in isolation increases, as it behaves like an advantageous allele in the context of prevalent malaria infections. 
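The balance described above can be sketched with a simple one-gene model. The Python below is a minimal illustration, assuming random mating and fixed relative fitness values for the three genotypes (AA, AS and SS, where S is the sickle-cell allele). The fitness numbers are hypothetical, not measured values.

```python
# Minimal sketch of selection on the sickle-cell allele (S).
# Fitness values are hypothetical and only illustrate the principle.

def next_allele_freq(q, w_AA, w_AS, w_SS):
    """Frequency of S after one generation of random mating plus selection."""
    p = 1.0 - q
    mean_fitness = p * p * w_AA + 2 * p * q * w_AS + q * q * w_SS
    # AS individuals pass on S half the time, SS individuals always do.
    return (p * q * w_AS + q * q * w_SS) / mean_fitness

def simulate(q0, w_AA, w_AS, w_SS, generations):
    """Repeat the one-generation step and return the final frequency of S."""
    q = q0
    for _ in range(generations):
        q = next_allele_freq(q, w_AA, w_AS, w_SS)
    return q

# With malaria: heterozygotes (AS) do best, AA suffers from malaria.
with_malaria = simulate(0.01, w_AA=0.9, w_AS=1.0, w_SS=0.2, generations=500)
# Without malaria: AA and AS do equally well, SS still suffers.
without_malaria = simulate(0.10, w_AA=1.0, w_AS=1.0, w_SS=0.2, generations=500)

print(f"with malaria:    q = {with_malaria:.3f}")
print(f"without malaria: q = {without_malaria:.3f}")
```

Under these assumed values, the allele does not take over when malaria is present. It settles near s / (s + t), where s and t are the fitness costs of AA and SS relative to AS (here 0.1 / (0.1 + 0.8), roughly 0.11). Without malaria it steadily declines.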

Hardy-Weinberg equations

How could we keep track of the frequency of each allele for a given trait when we have a dominant-recessive interaction? More specifically, how could we account for the visible dominant traits as homozygous or heterozygous, since both look the same?

This is where the Hardy-Weinberg principle comes in. Firstly, there are criteria for when this principle may be applied to a population:

1. Random mating must take place.

2. No migration must occur either inwards or outwards of the population.

3. No mutations must arise in the population.

4. No natural selection must take place due to one trait being better or worse adapted to the environment.

These conditions are rarely, if ever, all met in a real wild population. However, the Hardy-Weinberg principle is still useful as a reliable mathematical model for predicting allele frequencies.

The frequency of the dominant allele is noted p, while that of the recessive allele is noted q. Together they must account for all the alleles of that gene in the population, therefore:

p + q = 1

The values are frequencies, so they are written as proportions between 0 and 1 and can be read as percentages: 1 is 100%, 0.5 is 50%, 0.05 is 5%, etc.

Worked exercise

If we know that the frequency of the allele for dark fur in a population of koala bears is 0.2, and this allele is dominant over the one for light fur, work out the frequency of the allele for light fur in the population.

p = 0.2

p + q = 1

Therefore, 0.2 + q = 1 so q = 1 - 0.2

q = 0.8 or 80%.

Now that the allele frequencies have been worked out, how could we work out the actual phenotypes of the koala bears in the population? How many are actually dark-furred? How many of the dark-furred ones are homozygous?

For this we use the same equation as before, but squared: (p + q)^2 = 1

This is equivalent to p^2 + 2pq + q^2 = 1

Where 2pq is the frequency of heterozygotes, and p^2 and q^2 are the frequencies of homozygous dominant and homozygous recessive individuals respectively.

We want to know how many koala bears have dark fur. We know that the allele frequency for dark fur is p = 0.2, so the frequency of homozygous dark fur individuals is p^2 = 0.2 * 0.2 = 0.04 (4%).

This trait being dominant, the heterozygotes must also have dark fur. The frequency of heterozygous dark fur is 2pq = 2*0.2*0.8 = 0.32 (32%).

So overall, there are 0.04 + 0.32 = 0.36 or 36% dark-furred koala bears in the population.

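This leaves the remaining 64% with light fur (q^2 = 0.8 * 0.8 = 0.64). Note the contrast between the light phenotype being only 64% while the allele frequency for light fur is 80%. If the light fur allele were dominant over dark fur, the frequency of the light phenotype would be higher than the allele frequency rather than lower, since heterozygotes would also have light fur.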
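The koala exercise can be checked with a few lines of Python. This is a minimal sketch that assumes the Hardy-Weinberg conditions listed earlier hold. The function names are illustrative choices, not part of any library.

```python
import math

def hardy_weinberg(p):
    """Genotype frequencies for a gene with dominant allele frequency p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    q = 1.0 - p  # p + q = 1
    return {
        "homozygous_dominant": p * p,   # p^2
        "heterozygous": 2 * p * q,      # 2pq
        "homozygous_recessive": q * q,  # q^2
    }

def recessive_allele_freq(recessive_phenotype_freq):
    """Work backwards: q is the square root of the recessive phenotype frequency."""
    return math.sqrt(recessive_phenotype_freq)

freqs = hardy_weinberg(0.2)  # dark fur allele, p = 0.2
dark = freqs["homozygous_dominant"] + freqs["heterozygous"]
light = freqs["homozygous_recessive"]
print(round(dark, 2), round(light, 2))  # 0.36 0.64
```

The second function covers the more common exam situation: only the recessive phenotype can be counted directly, so q is found from q^2 first and p follows from p + q = 1.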

The founder effect

Suppose a boat travelled from one island to another. In the process, several lizards were transferred from the first island to the other. The lizards breed and settle down to form a new lizard population on their new island. This is called the founder effect. The small number of founding lizards formed the genetic base on which the whole population was built. This genetic base is significantly smaller than that of the original lizard population on the first island.

Therefore, the genetic diversity of the new population is lower than that of the original population.

Genetic bottlenecks

The only difference between the founder effect and genetic bottlenecks is the way in which the new gene pool is formed. In the founder effect, the new gene pool is formed when a few individuals from a population become geographically isolated, while in genetic bottlenecks the new gene pool is formed when only a few individuals from a population survive a mass disaster, or are the only ones to breed.

The effect is the same: the genetic variation of the new population is decreased compared to the original population.
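The loss of variation can be put into numbers with a short sketch. The Python below assumes the founders (or survivors) are a random sample from a large population, so that each of their gene copies is drawn independently. The allele frequencies in the source pool are made up for illustration.

```python
# Sketch: how many alleles of one gene are expected to survive a founder
# event or bottleneck. The frequencies below are hypothetical.

def prob_allele_lost(freq, founders):
    """Chance that an allele at frequency `freq` is absent from all founders."""
    gene_copies = 2 * founders  # diploid: two copies of the gene per individual
    return (1.0 - freq) ** gene_copies

def expected_alleles_retained(freqs, founders):
    """Expected number of distinct alleles carried by the founders."""
    return sum(1.0 - prob_allele_lost(f, founders) for f in freqs)

source_pool = [0.60, 0.30, 0.08, 0.02]  # four alleles in the original population
for n in (2, 5, 50, 500):
    kept = expected_alleles_retained(source_pool, n)
    print(f"{n:4d} founders: about {kept:.2f} of {len(source_pool)} alleles expected")
```

Rare alleles are the ones most likely to be missing from a small group, which is why a small founding population carries less genetic diversity than the population it came from.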

Human examples of the founder effect and genetic bottlenecks are found throughout the history of migration, as exemplified by blood groups.

The migrations of humans from North Africa and Europe to Australia, North America, North Asia, South Africa and South America can be traced through the prevalence of the O- blood group, while the migration of humans from Africa and the Middle East to South Asia and East Asia can be traced through the prevalence of the B+ blood group.

Another example of a much rarer outcome of the founder effect in humans is a condition called Ellis-van Creveld syndrome. It is an autosomal recessive condition associated with mutations in two genes whose protein products are smaller than they should be, resulting in multiple symptoms such as short limbs, presence of teeth at birth, heart defects, dwarfism, cleft palate and others.

The syndrome is very rare overall, but it is more common in reproductively isolated communities founded with a small gene pool, e.g. the Old Order Amish population in the US and Indigenous communities in Western Australia.


Speciation

What is at the heart of new species formation? It all starts with a single population of a species which for whatever reason (genetic bottlenecks, founder effect, etc.) ends up being split geographically to the point where no interbreeding occurs for a certain length of time.

Given that the two habitats are different, the individuals in each population will adapt differently to counteract different selection pressures. Say for example the ants in the forest experience a warmer and more nutrient-rich surrounding compared to the emigrated ants on a nearby, although disconnected, beach.

The adaptations acquired by both populations over a long time will get increasingly disparate. When these pass a threshold, the two populations can no longer interbreed, even if the opportunity were given (due to excessive genetic difference). They have now become separate species! This process is called speciation.

Speciation due to an established barrier such as geographical separation is termed allopatric.

Speciation can also occur in the absence of a barrier. The individuals of a starting species can share the same physical space and be able to come into contact with each other, yet for other reasons subspecies can still separate within that population in what is termed sympatric speciation.

Sympatric speciation may occur as a result of different members of the former species occupying different niches within the same habitat. Perhaps they start feeding on different sources, behaving differently, having different mating signals, etc.

The study of primate speciation, humans included, looks at how and when the species of the Hominidae family diverged to form the present species: chimpanzees, bonobos, gorillas, orangutans and humans. By analysing the DNA of these species' chromosomes, and looking at how specific regions of each chromosome have been conserved between species, a map of speciation, i.e. a gene tree, could be drawn.

As it turns out, this wasn't an easy task. Factors contributing to confusion over determining how closely related different species are include different rates of evolution for different genes, gene flow between populations pre-speciation, gene duplications and deletions, and recombination of adjacent regions on chromosomes.

Comparing the sequences that are equivalent between species revealed that humans have 99% identity with chimpanzees and bonobos, 98% with gorillas and 97% with orangutans.
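A percentage identity of this kind boils down to counting matching positions between equivalent (aligned) sequences. The Python below is a toy sketch of that idea. The two short sequences are invented, and real comparisons also have to deal with alignment, insertions and deletions, which are ignored here.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two aligned sequences carry the same base."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)

# Hypothetical 10-base fragments that differ at the last position only.
species_1 = "ACGTACGTAC"
species_2 = "ACGTACGTAT"
print(percent_identity(species_1, species_2))  # 90.0
```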

The pattern of speciation between our respective ancestor species pre-divergence has been subject to heated debate. Since the divergence was not clear-cut, questions around hybridisation were raised, where an initial separation would have been shortly followed by a cross-breeding of populations, and then a final, permanent divergence to form the two distinct species which exist today. Therefore, the species tree and the gene tree do not necessarily match completely over time.

While it is safe to say that chimps are the closest relatives of humans, the period of time during which all 4 groups of Hominidae speciated was relatively brief, and so it was subject to differing patterns of cross-over between them.


Epigenetics

In eukaryotes, epigenetics refers to the heritable changes in gene function that do not involve any change to the DNA sequence. This underpins an embryo's ability to differentiate its cells into specialised lineages for different organs and tissues in the adult: skin tissue, muscle tissue, nervous tissue, etc.

Transcription can be inhibited by specific means. A common way is increased DNA methylation. The methyl (CH3) group acts as a tag on the DNA at various locations and prevents transcription that might've occurred otherwise.

Another chemical modification that can induce epigenetic effects and control gene expression is histone deacetylation. Histones are the proteins that DNA is wound around to form chromatin, and they help to compress it. When the histones are acetylated, the chromatin is relaxed and the DNA can be accessed by the transcription machinery. Deacetylation tightens the chromatin around the histones, so the genetic material is no longer accessible.

Knowledge of epigenetics can help in addressing various illnesses, including cancer. Controlling gene expression remotely is much easier than having to change the DNA sequence itself. Drugs can act as signals for specific genes to be activated or deactivated. In the case of cancer, it has been shown that cancer cells switch off the genes associated with tumour detection. They also show additional epigenetic anomalies such as histone modifications and deregulation of proteins that bind DNA.

Epigenetic variation spans different tissues in the body, even different cells in the same tissue, and different ages of an individual.

The field of epigenetics was spurred on by a series of studies dubbed the Norrbotten studies, carried out in the remote and sparsely populated Norrbotten County in Sweden, near the Arctic Circle. Initially, the studies were concerned with following cohorts of newborns, their parents and then their own children and grandchildren, to study the effects of bouts of famine on their development and lifespan.

The studies drew on records going back to the 19th century, when crop failures happened unpredictably over the years and decades. By collecting data on the food available to people, as well as their health over the years, it was found that plentiful food and overeating during childhood were associated with a shorter lifespan than living through famine during childhood.

The offspring of those who ate too much as children were themselves affected by some of the illnesses their parents suffered from, as well as a shorter lifespan. Therefore, the effects of diet were being inherited.

Another similar study looking into the effects of diet was carried out on the children born following the Dutch Hunger Winter of 1944-45 during which people, including those pregnant, had access to little food in the range of 500 - 1,000 calories per day. This was during WWII with the Netherlands being under German occupation.

The long-term follow-up of those born following the hunger winter revealed that those subjected to hunger in the first trimester of pregnancy ended up becoming more likely to suffer from obesity and cardiovascular disease than those exposed to hunger later in pregnancy, and those born before or after the period of hunger.

A higher incidence of nervous system disorders such as schizophrenia was also found in those starved in the first trimester of pregnancy. This is the critical period for nervous system development.

People of the hunger winter cohort had lower birth weights than their control siblings. Those exposed to hunger early in pregnancy also presented decreased methylation of the IGF-2 gene compared to siblings and those exposed later in pregnancy.

IGF-2 (insulin-like growth factor 2) was investigated specifically because it promotes growth during gestation.

Twin studies have also contributed knowledge on epigenetics. Since identical twins are expected to share their DNA, any purely environmental pressures such as stress, diet, exercise, etc. could be monitored, and could show how much of an outcome e.g. appearance, disease, etc. would be attributable to genetics and how much was attributable to the environment.

Indeed, epigenetic differences between twins accrue over time. Older pairs of twins have been shown to have collected more epigenetic differences (such as DNA methylation and histone acetylation) between them compared with younger pairs of twins. Differences were also observed in twin pairs that had been separated for a long time, or experienced very different environments throughout life.

