GoAL - Go Artificial Life - Part II

In the last article, we created a trivial genome. We’ll now extend this with a far more expressive DNA.
We’ll increase the number of valid codons but retain a simple approach to evaluating them.

We’ll focus on features of the world with which our organisms can interact. The features will be ideas like food and threats. They should require our genomes to have longer sequences to cope with the complexity. Hopefully, there will no longer be an obvious solution.

Complexity

The level of complexity we can simulate is necessarily lower than that which we can observe. We’ll create a hugely abstract system to model our virtual world. Perhaps in future post we will revisit this choice.

That being said, perhaps this might still be of inspiration. As of late last year, this is the most detailed image of a human cell, and it’s mind-boggling.

Human Cell

There’s also some interesting open source simulations of artificial life. These two projects aim to emulate the neural structure of fruit flies and roundworms. In the game Creatures, neural networks were married with a rich biochemistry. Perhaps we can later look at evolving these systems in tandem.

Genetic Algorithms - Fun Fact

I struggle to find a source for it now, but many years ago I heard of GAs being used to design integrate circuits.
There was a strange winding pattern in an evolved circuit which seemingly offered no advantage, but when it was removed the circuit failed to function correctly. It was hypothesised that the GA was taking advantage of minute delays in the propagation of current along this path.

Meet the Codons

Energy and food

Energy and food seem like good candidates to emulate. The energy available to our organisms can limit their lifespans, and spawning children must be balanced with attaining more food.

Let’s add some new codons to our repertoire:

A - No Op
B - Spawn child
C - Locate food
D - Eat food

Iterating now costs us energy. Food must be located before its attempted to be eaten, otherwise energy is wasted. Spawning has a very high energy cost.

	case 'C': // Locate Food
		g.FoundFood = true
		logWithFields(&g).Debug("Located Food")
	case 'D': // Eat Food
		if g.FoundFood {
			g.Energy += 10.0
		}

	if gene != "C" {
		g.FoundFood = false
	}

	g.Energy = g.Energy - 1.0

Size

Lots of organisms grow! It’ll cost exponentially more energy to grow and sustain our organisms, but they will be rewarded with more energy each time they gather food. Let’s also increase the cost of spawning children, which should put some pressure on our creatures to invest in costly growing. However, they must balance it against the increased maintenance cost.

type GenomeState struct {
	Children  uint
	Energy    float64
	FoundFood bool
	Size      uint
}
...
	case 'B': // Spawn Child
	if g.Energy > 15.0 {
...
	case 'D': // Eat Food
		if g.FoundFood {
			g.Energy += 10.0 * float64(g.Size)
		}
	case 'E': // Grow
		g.Energy -= math.Pow(float64(g.Size), 1.05)
		g.Size += 1

Threats

Now we’ll provide a simple model of threat. The threat level increases on every iteration. Once it reaches a threshold, it starts to damage (take energy from) our organism. Threats can be mitigated by defending or evading. Defending scales well with size. Evading scales inversely with size.

This threat level will help to compete against unbounded growth and spawning.

I evolved genomes of length 12 and 20. We see the smaller genome taking advantage of evade and defend, however the larger genome converges on evading only in this case. There seems to be a healthy trade-off to made with size.

Conditional genes

At this point I evaluated genomes of various lengths. Up until a length of 10 the genomes had very low fitness.

Length Genome Children
8 CDCDCDBG 6
9 CDCDCDABG 6
10 CDCDDGCDBG 99
11 CDCDCCDBGGA 90
12 ACDCDCDBGDAG 82
13 CDCDAGCDCBAGA 76
20 ECDCDBCDGCDEFCDCDBGG 99

I assumed they could be made viable by adding conditionals to our DNA logic. The hypothesis is that this additional functionality will allow the GA to stop repeated sequences of locate/eat, and replace those with conditional evade/defend/grow codons.

Let’s implement two such genes. The first will skip the next instruction if our threat is considered low.
We’ll set the threshold to 5.0. We also make sure to clamp the Threat value to a minimum of 0.0.

Likewise, we’ll add another to skip if our energy is too low. We’ll set this threshold at 30.0, slightly larger than the energy required to spawn a child.

		case 'H': // Skip if low threat
			if g.Threat < 5.0 {
				index++
			}
		case 'I': // Skip if low energy
			if g.Energy < 30.0 {
				index++
			}

Analysis

We now have a much larger search space for our GA to explore. Let’s evolve some populations and take a deeper look at the results.

Length Genome Children
4 CDIB 3
5 GCDIB 66
6 HGCDIB 66
7 ECDIBGI 62
8 IHEIBCDF 127
9 GCDCDCDGB 110
10 CDFIHEIBCH 107
11 CDGIECDIBHF 101
12 GCDFIHIHEIBH 93
13 CDCDIBCDGBGHF 90
14 GECDHFCDIBIBCD 131
18 HCCDCDCDGHCBGIHEIB 117
19 CDIBCDEFCAADFHBHAIB 119
20 IBCDEIBCDCDCDFFIBIHG 132
40 CDHFCDECDBFGIBCDCDEG 125
GHDHECDHBBCDIBIBCDFE

We see a range of performance. Genomes need to be at least 8 codons long to be competitive.

Let’s take a look at 3rd place which has a short enough genome to disect.

➜  go-al git:(master) ✗ ./ga parse IHEIBCDF
,-Skip if Low Energy
|-Skip if Low Threat
`>Grow
,-Skip if Low Energy
| Spawn Child
`>Locate Food
  Eat Food
  Defend

There’s a conditional grow, a conditional spawn, locate/eat food and defend. It seems to check all the boxes. But what is the purpose of Skip if Low Energy followed by Skip if Low Threat? This isn’t immediately obvious to me, and it appears frequently in a number of genomes. If either command is replaced with a no-op the fitness is drastically worsened.

What does our search space look like now?

We now have 9 instructions. For a length of 8 we have 9^8 possibilities, or 43,046,721 different genomes. At length 10 this jumps to about 3.5 billion permutations. This is definitely in the realm where brute force is no longer an option.

By length 20, where we found our best solution, there’s 12,157,665,459,056,928,801 - 12 quintillion - possible variants. At a rate of 1 billion evaluations per second it would take about 385 years.

What’s next

I’m not sure what the next post will cover. Maybe we’ll try a more intricate conditional instruction set. Perhaps attempt some form of coevolution. Or we could look to discard the idea of fixed genome lengths (this requires some work, eaopt does not seem to support variable lengths with its native crossover methods).

Resources

You can find the source code for this here at v0.2.0

eaopt - The GitHub repository for eaopt
go-al master - The latest version of go-al