At each step, each organism's neural network receives inputs for navigating its world, finding food, and avoiding competitors, and its outputs indicate the heading for the next step the organism wants to take. The organisms evolve via a genetic algorithm employing roulette wheel selection, which makes it likely that the most successful will contribute more of their genes (the synaptic weights) to the next generation.
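As a rough illustration of roulette wheel selection, here is a minimal Python sketch. The function name, the use of plain lists, and the assumption that fitness scores are non-negative are all mine and are not taken from the experiment's actual code. Crossover and mutation are left out because the text does not describe them.

```python
import random

def roulette_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to its fitness.

    Assumes every fitness is non-negative (an assumption of this sketch).
    """
    total = sum(fitnesses)
    if total <= 0:
        # Degenerate case: nobody scored, so fall back to a uniform pick.
        return rng.choice(population)
    spin = rng.random() * total  # a point on the wheel, in [0, total)
    running = 0.0
    for individual, fitness in zip(population, fitnesses):
        running += fitness
        if spin < running:
            return individual
    return population[-1]  # guard against floating-point round-off
```

Calling `roulette_select` twice would give two parents. A parent with a higher score occupies a larger slice of the wheel and so tends to be chosen more often, though low scorers are never excluded outright.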
However, with nothing more than these inputs, the organisms inevitably tend to pile up in places, all pursuing the same remaining piece of food in a given locale and then, still piled atop one another, moving on to the next.
To discourage the organisms from clumping up, a rule was introduced: an organism derives no nutrition from any food eaten in the near presence of a competitor. This sets up an imperative to develop strategies for avoiding neighbours when eating in order to be successful.
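The rule can be pictured with a small Python sketch. The radius value, the two-dimensional coordinates, and the function name are hypothetical; the text does not say how "near presence" was measured.

```python
import math

NEAR_RADIUS = 2.0  # hypothetical value; the actual distance is not stated

def nutrition_gained(eater_pos, competitor_positions,
                     food_value=1.0, near_radius=NEAR_RADIUS):
    """Return the nutrition an organism gets from eating at eater_pos.

    The food is worth nothing if any competitor is within near_radius.
    """
    for pos in competitor_positions:
        if math.dist(eater_pos, pos) <= near_radius:
            return 0.0
    return food_value
```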
The neural network was then given an extra input that is signaled only when there is contention over a particle of food.
This in itself may not be very useful. It was discovered, however, that if each participant was assigned an arbitrary identity, the organisms spontaneously developed a protocol that enabled them, in effect, to take turns over the long run when a piece of food was in contention. The identity then served as the signal indicating contention.
To allow the network to develop a protocol, a tri-state concept of "Handedness" was adopted as a label to distinguish one organism from another during an encounter. Each organism was arbitrarily assigned either "Right-handedness" or "Left-handedness" at the start of each generation, with equal numbers of each. Then, as the organisms navigated the world and encountered competitors, if both happened to have the same handedness at the moment of encounter, the first to arrive on the scene was switched to the opposite handedness.
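The assignment and the relabelling rule might be sketched in Python as follows. The `Hand` enum and both function names are illustrative inventions, and the sketch assumes an even population so that the two labels can be split equally.

```python
import random
from enum import Enum

class Hand(Enum):
    LEFT = "left"
    RIGHT = "right"

def assign_handedness(n, rng=random):
    """Label n organisms, half LEFT and half RIGHT, in random order."""
    labels = [Hand.LEFT] * (n // 2) + [Hand.RIGHT] * (n - n // 2)
    rng.shuffle(labels)
    return labels

def resolve_encounter(first_arrival, second_arrival):
    """Return the (first, second) labels after an encounter.

    If both share a label, the first to arrive is switched to the
    opposite one, so the pair always ends up with one of each.
    """
    if first_arrival is second_arrival:
        first_arrival = Hand.RIGHT if second_arrival is Hand.LEFT else Hand.LEFT
    return first_arrival, second_arrival
```

For example, `resolve_encounter(Hand.LEFT, Hand.LEFT)` returns `(Hand.RIGHT, Hand.LEFT)`, while a pair that already differs is returned unchanged.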
These organisms had no memory and would never know that their identity differed from one moment to the next. They could, however, carry with them at all times a repertoire of different behaviours, any of which could be triggered by whichever identity was reported on the neural input.
Since there is a very large number of encounters during a round, this ensures that, in the end, every organism spends an approximately equal amount of time labeled with each handedness.
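One way to probe this balancing argument is a toy Python simulation. It is not the original world: organisms here meet in uniformly random pairs rather than by moving toward food, and the population size, encounter count, and function name are arbitrary choices of mine. It records the fraction of encounters each organism spends labeled left-handed.

```python
import random

def left_time_fractions(n_organisms=20, n_encounters=20000, seed=0):
    """Fraction of encounters each organism spends labeled left-handed.

    Toy model: each encounter pairs two distinct organisms at random.
    """
    rng = random.Random(seed)
    # +1 = left-handed, -1 = right-handed; equal numbers at the start.
    hands = [1] * (n_organisms // 2) + [-1] * (n_organisms - n_organisms // 2)
    rng.shuffle(hands)
    left_steps = [0] * n_organisms
    for _ in range(n_encounters):
        first, second = rng.sample(range(n_organisms), 2)
        if hands[first] == hands[second]:
            # Same label: the first arrival takes the opposite label.
            hands[first] = -hands[second]
        for i, hand in enumerate(hands):
            if hand == 1:
                left_steps[i] += 1
    return [steps / n_encounters for steps in left_steps]
```

Under these assumptions the returned fractions can be inspected to see how closely each organism's time splits between the two labels.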
If there was no conflict over a piece of food, this input for current identity was set to zero. If there was a conflict, where an organism and its neighbour both happened to be headed for the same morsel, the organism's current handedness was input as 1 for "Left-handed" or -1 for "Right-handed". Thus there were three possible states: Left-handed, Right-handed, and non-signaled (zero).
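The three-state encoding can be written as a short Python function. The values 1, -1 and 0 come from the description above; the function name and the use of the strings "left" and "right" are assumptions of this sketch.

```python
def identity_input(handedness, in_contention):
    """Value fed to the identity input of the neural network.

    handedness: "left" or "right" (the organism's current label)
    in_contention: True when a neighbour is headed for the same food
    """
    if not in_contention:
        return 0.0  # non-signaled state
    if handedness == "left":
        return 1.0
    if handedness == "right":
        return -1.0
    raise ValueError(f"unknown handedness: {handedness!r}")
```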
Thus any given interaction always involved one right-handed organism and one left-handed organism. Each organism could, if it so chose, let its current handedness determine how that interaction played out, but none was given any hint as to what it should do.
In several experiments, some colonies developed a protocol on their own initiative within a few hundred generations, whereby right-handed organisms would consistently defer to left-handers, backing off and allowing the competitor to take the food. Other colonies developed the opposite protocol with equal probability. It follows that, since each organism spends an approximately equal amount of time being right-handed or left-handed, in the end all will eat without conflict.
In some colonies, no protocol ever developed, resulting in organisms that were always squabbling over food. These colonies scored significantly lower than civilized colonies.
I conducted a final experiment with the two-input model in which the identity input was no longer set to zero when there was no contention, so that handedness no longer served as an alert to contention. As the organisms wandered about the world, all they ever knew was the direction to the nearest food particle and their handedness of the moment. Even under this condition they developed a protocol, though the reward was small: 20 to 30 points above what they had achieved without any protocol. However, the reduction in the variability of the group scores could clearly be seen in the smoothing of the learning curve.
It is hypothesised that give-and-take behaviour begins as a genetic defect in one or more individuals that happens to be beneficial to the genome as a whole.