GoAL - Go Artificial Life - Part II
In the last article, we created a trivial genome. We’ll now extend this with a far more expressive DNA.
We’ll increase the number of valid codons but retain a simple approach to evaluating them.
We’ll focus on features of the world with which our organisms can interact. The features will be ideas like food and threats. They should require our genomes to have longer sequences to cope with the complexity. Hopefully, there will no longer be an obvious solution.
Complexity
The level of complexity we can simulate is necessarily lower than that which we can observe. We’ll create a hugely abstract system to model our virtual world. Perhaps in future post we will revisit this choice.
That being said, perhaps this might still be of inspiration. As of late last year, this is the most detailed image of a human cell, and it’s mind-boggling.
There’s also some interesting open source simulations of artificial life. These two projects aim to emulate the neural structure of fruit flies and roundworms. In the game Creatures, neural networks were married with a rich biochemistry. Perhaps we can later look at evolving these systems in tandem.
Genetic Algorithms - Fun Fact
I struggle to find a source for it now, but many years ago I heard of GAs being used to design integrate circuits.
There was a strange winding pattern in an evolved circuit which seemingly offered no advantage, but when it was removed the circuit
failed to function correctly. It was hypothesised that the GA was taking advantage of minute delays in the propagation
of current along this path.
Meet the Codons
Energy and food
Energy and food seem like good candidates to emulate. The energy available to our organisms can limit their lifespans, and spawning children must be balanced with attaining more food.
Let’s add some new codons to our repertoire:
A - No Op
B - Spawn child
C - Locate food
D - Eat food
Iterating now costs us energy. Food must be located before its attempted to be eaten, otherwise energy is wasted. Spawning has a very high energy cost.
case 'C': // Locate Food
g.FoundFood = true
logWithFields(&g).Debug("Located Food")
case 'D': // Eat Food
if g.FoundFood {
g.Energy += 10.0
}
if gene != "C" {
g.FoundFood = false
}
g.Energy = g.Energy - 1.0
Size
Lots of organisms grow! It’ll cost exponentially more energy to grow and sustain our organisms, but they will be rewarded with more energy each time they gather food. Let’s also increase the cost of spawning children, which should put some pressure on our creatures to invest in costly growing. However, they must balance it against the increased maintenance cost.
type GenomeState struct {
Children uint
Energy float64
FoundFood bool
Size uint
}
...
case 'B': // Spawn Child
if g.Energy > 15.0 {
...
case 'D': // Eat Food
if g.FoundFood {
g.Energy += 10.0 * float64(g.Size)
}
case 'E': // Grow
g.Energy -= math.Pow(float64(g.Size), 1.05)
g.Size += 1
Threats
Now we’ll provide a simple model of threat. The threat level increases on every iteration. Once it reaches a threshold, it starts to damage (take energy from) our organism. Threats can be mitigated by defending or evading. Defending scales well with size. Evading scales inversely with size.
This threat level will help to compete against unbounded growth and spawning.
I evolved genomes of length 12 and 20. We see the smaller genome taking advantage of evade and defend, however the larger genome converges on evading only in this case. There seems to be a healthy trade-off to made with size.
Conditional genes
At this point I evaluated genomes of various lengths. Up until a length of 10 the genomes had very low fitness.
Length | Genome | Children |
---|---|---|
8 | CDCDCDBG | 6 |
9 | CDCDCDABG | 6 |
10 | CDCDDGCDBG | 99 |
11 | CDCDCCDBGGA | 90 |
12 | ACDCDCDBGDAG | 82 |
13 | CDCDAGCDCBAGA | 76 |
20 | ECDCDBCDGCDEFCDCDBGG | 99 |
I assumed they could be made viable by adding conditionals to our DNA logic. The hypothesis is that this additional functionality will allow the GA to stop repeated sequences of locate/eat, and replace those with conditional evade/defend/grow codons.
Let’s implement two such genes. The first will skip the next instruction if our threat is considered low.
We’ll set the threshold to 5.0. We also make sure to clamp the Threat value to a minimum of 0.0.
Likewise, we’ll add another to skip if our energy is too low. We’ll set this threshold at 30.0, slightly larger than the energy required to spawn a child.
case 'H': // Skip if low threat
if g.Threat < 5.0 {
index++
}
case 'I': // Skip if low energy
if g.Energy < 30.0 {
index++
}
Analysis
We now have a much larger search space for our GA to explore. Let’s evolve some populations and take a deeper look at the results.
Length | Genome | Children |
---|---|---|
4 | CDIB | 3 |
5 | GCDIB | 66 |
6 | HGCDIB | 66 |
7 | ECDIBGI | 62 |
8 | IHEIBCDF | 127 |
9 | GCDCDCDGB | 110 |
10 | CDFIHEIBCH | 107 |
11 | CDGIECDIBHF | 101 |
12 | GCDFIHIHEIBH | 93 |
13 | CDCDIBCDGBGHF | 90 |
14 | GECDHFCDIBIBCD | 131 |
… | ||
18 | HCCDCDCDGHCBGIHEIB | 117 |
19 | CDIBCDEFCAADFHBHAIB | 119 |
20 | IBCDEIBCDCDCDFFIBIHG | 132 |
40 | CDHFCDECDBFGIBCDCDEG | 125 |
GHDHECDHBBCDIBIBCDFE |
We see a range of performance. Genomes need to be at least 8 codons long to be competitive.
Let’s take a look at 3rd place which has a short enough genome to disect.
➜ go-al git:(master) ✗ ./ga parse IHEIBCDF
,-Skip if Low Energy
|-Skip if Low Threat
`>Grow
,-Skip if Low Energy
| Spawn Child
`>Locate Food
Eat Food
Defend
There’s a conditional grow, a conditional spawn, locate/eat food and defend. It seems to check all the boxes.
But what is the purpose of Skip if Low Energy
followed by Skip if Low Threat
? This isn’t immediately obvious to me,
and it appears frequently in a number of genomes. If either command is replaced with a no-op the fitness is drastically
worsened.
What does our search space look like now?
We now have 9 instructions. For a length of 8 we have 9^8 possibilities, or 43,046,721 different genomes. At length 10 this jumps to about 3.5 billion permutations. This is definitely in the realm where brute force is no longer an option.
By length 20, where we found our best solution, there’s 12,157,665,459,056,928,801 - 12 quintillion - possible variants. At a rate of 1 billion evaluations per second it would take about 385 years.
What’s next
I’m not sure what the next post will cover. Maybe we’ll try a more intricate conditional instruction set. Perhaps attempt some form of coevolution. Or we could look to discard the idea of fixed genome lengths (this requires some work, eaopt does not seem to support variable lengths with its native crossover methods).
Resources
You can find the source code for this here at v0.2.0
eaopt - The GitHub repository for eaopt
go-al master - The latest version of go-al