Adventures in Game Theory, Part Four

For those of you freshly joining this adventure, the last three posts have led us on a strange, thrilling journey that has passed through the valleys of introductory game theory, the jungles of applied improv, and the mountains of software simulation. Now, at last we arrive at our thunderous finale on the shores of Lake Awesome. I highly recommend reading from the start of the sequence, otherwise what I have to say may be too extraordinary and wonderful for your mind to fully hold!

The end of the last installment caught me teetering on the brink of a realization–that by adding just a little more functionality to my simulation, I could start exploring some more socially useful truths about how people behave. My insight was to add status.

What this meant in practice was splitting the population of agents in my model into two groups: bosses and workers, or in training community parlance: leaders and team-members. Then, in order to make the interactions between bosses and workers a little less benign, I added two extra constraints.

One: If bosses were aggressive (nose-thumbing) to workers, workers were not empowered to reciprocate and be aggressive back in their next encounter.

Two: Bosses were unable to remember the specifics of positive interactions they had with workers. So for instance, if a boss and a worker both chose paper in one round, the worker would remember the fact, but the boss would not.

Implementing these changes was easy: it simply meant making the two memory rules I’d already added for the first simulation dependent on status. (I also added a little extra logic around the movement of the agents to ensure that workers had to interact with bosses, and to make the movements of bosses dependent on other bosses but not workers. However, while necessary, that code is somewhat beside the point.)
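
For the programmers among you, here’s roughly what those two status checks boil down to. This is a minimal sketch rather than the applet’s actual source, and the names (Agent, noteAggressionFrom, notePositiveMatch) are mine, invented purely for illustration:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// A minimal sketch, not the applet's code: how status might gate the two memory rules.
class Agent {
    boolean isBoss;                                      // leader or team-member
    Set<Agent> grudges = new HashSet<>();                // partners to retaliate against
    Map<Agent, Integer> happyMatches = new HashMap<>();  // partner -> remembered shared choice

    // Constraint one: workers are not empowered to retaliate against aggressive bosses.
    void noteAggressionFrom(Agent aggressor) {
        if (!isBoss && aggressor.isBoss) return;         // status dependency blocks the grudge
        grudges.add(aggressor);
    }

    // Constraint two: bosses can't remember the specifics of positive interactions with workers.
    void notePositiveMatch(Agent partner, int sharedChoice) {
        if (isBoss && !partner.isBoss) return;           // status dependency blocks the memory
        happyMatches.put(partner, sharedChoice);
    }
}
```

Turning each of those early returns off independently is, in essence, all that the experiments described below amount to.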

What happened next was wonderfully clear. Within a few seconds, all the bosses were behaving aggressively while the workers normed on a set of social standards of their own. My simulation suddenly looked a lot like some of the more awful companies I’d worked for. Without having to say anything about the kinds of people who become leaders, or about the specifics of organizational culture, I’d captured a simple truth about leadership: that without the incentives to behave otherwise and the right skills to succeed, people with power slide towards bad behavior, even if they start off thinking like saints.

What was even more interesting was that as the simulation progressed, the bosses started to bump up against the corners of the virtual environment as if desperate to leave. Because aggressive behavior was so successful for bosses in their interactions with workers, they were applying the same behavior to each other, resulting in a rapid erosion of their ability to collaborate. The lesson: by letting leaders behave badly, we ensure that leaders have less pleasant interactions with each other, as well as with us.

My goal, though, was not to engage in rhetoric about leaders, but instead to see whether models like the one I was looking at could tell us something about how to help organizations do better. To do this, I looked at what happened when I turned each of the status dependencies off in isolation.

Turning off the status dependency for remembering positive interactions is rather like sending your managers on an employee recognition course. They learn to value the specific information they get from each person they work with, and to let their team members know that they’re seen and valued.

The result in the simulation is that the culture improves significantly. The workers integrate more tightly and the bosses take on the same cultural colors as the workers they lead. Interestingly, the bosses don’t all start cooperating at once. Many of them initially retain their aggressive behavior. Then, one by one, they figure out that collaboration is more effective.

The lesson here: that training leaders to listen can make a huge difference in their effectiveness, but that the change they take on depends on their willingness to implement what they learn.

If, instead, we turn off the status dependency for worker retaliation to boss aggression, the effects are even more interesting. Making this change is rather like implementing a shared accountability system like the one that revolutionized the airline industry and transformed the safety standards in air travel. Under this system, the pilots of planes are no longer the unquestionable captains of the air that they once were. If copilots think that they’re witnessing a mistake, they’re duty-bound to call the pilot on it and to report it to air traffic control if necessary. In our simulated business, we can imagine that we’re instructing the worker agents to hold their bosses accountable if they don’t uphold the collaborative social standards of their organization.

What happens when we make this change is that the behaviors of the bosses have trouble settling onto any specific color. When we watch the ‘mood’ of the agents to see how many positive or negative interactions they’re having, we see that the tables have been turned. The workers are now having a pretty great time all round and the bosses are mostly miserable–the opposite of what we see if status dependence for retaliation is left on. This is because the workers now have an advantage that the bosses don’t–they can remember and repeat positive interactions whereas bosses cannot. Because aggression no longer secures automatic results, bosses don’t have an easy way of stabilizing on a successful behavior.

The lesson here is that enabling everyone in an organization to hold leaders accountable for their behavior is what creates the incentive for leaders to improve, but that without the right training and direction, the main result is leader unhappiness.

As you might expect, turning off both status-dependent features creates a benign, functional organization that settles rapidly onto a cooperative culture. If you want to play around yourself, and have Java installed, the simulation is the second applet on this page. (It has four buttons.)

As before, red, blue and green denote different positive interactions. Gray denotes aggressive behavior. Swapping to ‘mood view’ shows the success of the agents’ interactions, ranging from blue (unhappy agents) to yellow (cheerful ones).

Clearly there’s a lot more to do here. For a start, in order to turn this into a scientific result, the simulations will need to be a lot more rigorous, which will probably mean sacrificing the visual playfulness. Furthermore, we’ve only looked at one memory model for agents, and solid research would need to try out others. However, the results seem pretty clear. We’ve gone from a simple game played in a room full of people to a model that turns business intuition into something rather like unavoidable, mathematical fact.

Thus, in the wake of our adventure, we can say with real confidence that any society or organization that doesn’t empower its people to hold its leaders accountable, and which doesn’t teach those leaders how to listen, can expect its leaders to turn out badly, regardless of how ‘good’ we believe them to be as people.

This is something most of us already believe but which we often fail to implement. For instance, we’re all used to the idea of holding elected officials accountable, but explicit training in ‘voter recognition’? We leave that to chance. Similarly, we’re used to the idea that good managers are the ones who pay attention, but company-wide accountability systems? Those are pretty rare. I believe that simulations like this can make these points unavoidable, and also perhaps show us how to build measures that make our adherence to such standards quantifiable.

For any skeptics out there, my huge thanks for reading this far, and here’s a final thought to consider. Agent-based simulations of this sort have been used by biologists for years on the following basis: we can’t capture all the details of natural systems like cultures or the lives of organisms, so instead we capture only what we know is true. From that, we look to see what else must be true as a consequence. Thus we attempt to make the simplicity of the model a strength, not a weakness. In this instance, the agents are so simple that we can expect the same effects to arise regardless of the memory model we employ for our agents, so long as that memory model permits learning. Further work in this area will hopefully make that point even clearer.

That’s it. The adventure is finished. And while the ending perhaps isn’t unexpected, it feels like a step forwards to me. After all, if we can do this starting with Rock Paper Scissors, think what we can do with the game of Twister.

Causal Sets and Leaning Towers

An article freshly transplanted from my digital physics blog:

Last year I had the incredible good fortune to spend a couple of months collaborating with Tommaso Bolognesi at CNR-ISTI, in Pisa, Italy. Tommaso runs his own research program into the interface between computation and physics and is a champion of the Digital Physics cause. He hired me to see if together we could answer a very specific question:

Is it possible to build networks that have the same properties as spacetime using simple algorithms, and if so, how?

I’ve had plenty to say on the subject of modeling space before this. However, what Tommaso was looking for was very specific. He wanted us to find ways to build causal sets. Causal set theory is probably the point of closest approach between digital physics and more mainstream quantum gravity research and it’s a fascinating subject. In a nutshell, causal set theorists believe that spacetime is most usefully thought of as a discrete structure and that the way to model it is to try to mimic the kinds of relationships between events that we see in relativity. To achieve this, they connect nodes using something called a partial order—a set of relationships that define which nodes must come before others, but which falls short of providing an exact numbering for all nodes.
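
For those who like their definitions precise, the standard formulation is pleasingly short: a causal set is a set C of events together with an order relation ≺ that satisfies

```latex
\begin{align*}
& x \prec y \ \wedge\ y \prec z \;\Rightarrow\; x \prec z           && \text{(transitivity)}\\
& x \nprec x                                                        && \text{(irreflexivity: no causal loops)}\\
& \bigl|\{\, z \in C : x \prec z \prec y \,\}\bigr| < \infty        && \text{(local finiteness: discreteness)}
\end{align*}
```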

Broadly speaking, the Causal Set Program uses two methods to build its sets. The first, called sprinkling, is to deposit nodes at random onto a surface, and hook them together based on the geometry of that surface. The other way, called percolation dynamics, is to add nodes one by one to a set, and randomly add links from existing members of that set to each new node.

Sprinkling is useful for exploring how causal sets behave but it has a huge problem: in order to construct the discrete structure of spacetime, you have to deposit your points onto a smooth spacetime first! Clearly, if we want to come up with a background-independent theory of physics, we need to build the sets some other way. On the other hand, percolation dynamics has all the nice statistical properties that physicists would like to see and doesn’t need a background, but sadly doesn’t actually produce graphs that look like spacetime (though people are working on that).
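
To give a flavor of the percolation approach, here’s a toy version in Java. To be clear, this is my own illustration and not the code Tommaso and I worked with: each new node is linked back to each existing node with some fixed probability p, and the causal order is the transitive closure of those links.

```java
import java.util.Random;

// Toy percolation dynamics: nodes arrive one at a time, each existing node links to
// the newcomer with probability p, and transitivity is enforced as we go.
public class Percolation {
    public static boolean[][] grow(int n, double p, long seed) {
        Random rng = new Random(seed);
        boolean[][] precedes = new boolean[n][n];     // precedes[i][j]: node i is before node j
        for (int j = 1; j < n; j++) {                 // j is the newly added node
            for (int i = 0; i < j; i++) {
                if (rng.nextDouble() < p) precedes[i][j] = true;   // random direct link
            }
            for (int i = 0; i < j; i++) {             // close under transitivity:
                if (precedes[i][j]) {
                    for (int k = 0; k < i; k++) {
                        if (precedes[k][i]) precedes[k][j] = true; // ancestors of i precede j too
                    }
                }
            }
        }
        return precedes;
    }

    public static void main(String[] args) {
        boolean[][] order = grow(50, 0.1, 42L);
        System.out.println("node 0 precedes node 49: " + order[0][49]);
    }
}
```

Even in this tiny form you can see the appeal: there’s no background surface anywhere in the construction.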

The right solution would seem to be to come up with a third way: a process that produces the right structures without needing a background surface. However, this comes with problems. The key features that differentiate spacetime-like causal sets from others are dimensionality and Lorentz invariance.

Dimensionality essentially says that we should expect the graph that we build to have some consistent number of dimensions, rather than just being a tangled mess. Lorentz invariance is a little trickier. What it implies is that if you build your network first and then lay the nodes onto a flat surface afterward, the positions of the nodes should appear random. There should be no way you can stretch or squish the network to make it look otherwise. This is vitally important because in order to treat every relativistic reference frame the same way, as special relativity says we must, we need about the same number of links between nodes in each frame.

Another way to say this is that, thanks to Einstein, we know that no matter how fast we’re moving, space will always feel the same to us. The way a causal set works is that each link corresponds to a step through time and space taken at a certain speed. So, if for some speed of travel, our network doesn’t have enough links, it’s definitely not going to feel the same to someone traveling through it. If this happens, our model has failed. The only way that people have ever found to make Lorentz-invariant causal sets is to have the network be random.

My collaboration with Tommaso was founded on a neat way around this problem that works like this:

  • Because any causal set we can build is finite, it can only ever approximate perfect randomness.
  • Furthermore, for a finite network of given size, we can always find some algorithm that can approximate that level of randomness through a deterministic process.
  • Thus, no matter how big our network needs to be, we should still always be able to find an algorithm that could give rise to it.
  • This will always be true so long as we believe that spacetime is discrete, that the universe has finite size, and that it has existed for finite time.

In essence, what this tells us is that just because the network we want to build needs to look random, that doesn’t mean that we can’t use a completely non-random method for building it. This is all great as far as it goes, but it leaves us with an enormous problem: how to find an algorithm that can build spacetime.

In the two months we had, Tommaso and I didn’t manage to crack this problem (otherwise you would have heard about it on the news by now) but we learned some fascinating things along the way. I hope to share some of them with you in my later posts.

However, in the meantime, there are plenty of really excellent introductory papers on causal sets that are very approachable for those who’re interested. While my favorite approach to discrete physics is a little different from the causal set methodology, I can recommend this field very highly to anyone interested in learning more about quantum gravity without taking on a full-time career as a string theorist.

Adventures in Game Theory, Part Three

To those fresh to this sequence of postings, let me give you a little context. Two posts ago, I implied that some kind of wildly significant insight about how organizations and societies worked could be derived from looking at simple playground games like Rock Paper Scissors. Over the course of the last two posts, I’ve been building up the case for that statement. Now comes the next thrilling, life-changing installment—this time with some simulation results!

Before I can fully explain, though, I have to give you a little more background. Last week I had the good fortune to speak at the ASTD conference in Orlando, Florida, the world’s largest training and development business event. The topic of the session was the use of Tokenomics as a tool for organizational culture change. I delivered the talk with my good friend Cindy Ventrice, from MakeTheirDay.com, and to support the session we captured a large amount of material on the subject, which those interested can find on our collaboration website, techneq.com. The session went wonderfully and generated plenty of interest. However, what I’m most keen to talk about here doesn’t relate to that talk, exactly, but to its unexpected consequences.

In order to demonstrate to the audience what the Tokenomics approach was capable of, I put together a short computer simulation based on Scissors Dilemma Party, a game which readers of the last two posts will have already heard of.

To make the model more intuitively approachable for a conference audience, I chose to have the agents move around in a virtual environment rather like people in a workplace, interacting when they met. As well as making the simulation more visually appealing, it demonstrated how the agents’ behavior evolved over time as they learned more about their environment, much as players of the game do when they experience it at Behavior Lab.

Each agent had eight memory slots initially filled with random behaviors. With each interaction, an agent would pick a behavior from its memory and apply it. If the interaction resulted in a positive outcome for the agent (unreciprocated nose-thumbing, or a successful rock-paper-scissors match), that behavior was copied to another slot in memory. If the behavior resulted in any other outcome, that memory slot was overwritten with a new random behavior. Agents were designed to move towards other agents with whom they’d interacted positively, and away from those with whom interaction had failed.
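
In code, the whole memory model comes down to just a few lines. Here’s a stripped-down sketch (illustrative names only, not the applet source):

```java
import java.util.Random;

// A stripped-down sketch of the memory model: eight slots of remembered behaviors,
// reinforced on success and overwritten on failure.
// Behaviors 0-2 stand for rock/paper/scissors; 3 stands for nose-thumbing.
class SimpleAgent {
    static final int SLOTS = 8;
    static final int BEHAVIORS = 4;
    final int[] memory = new int[SLOTS];
    final Random rng = new Random();

    SimpleAgent() {
        for (int i = 0; i < SLOTS; i++) memory[i] = rng.nextInt(BEHAVIORS);  // random start
    }

    int chooseSlot() {
        return rng.nextInt(SLOTS);                       // pick a behavior to play this round
    }

    void learn(int slot, boolean positiveOutcome) {
        if (positiveOutcome) {
            memory[rng.nextInt(SLOTS)] = memory[slot];   // copy the winning behavior to another slot
        } else {
            memory[slot] = rng.nextInt(BEHAVIORS);       // overwrite with a fresh random behavior
        }
    }
}
```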

At first, the simulation didn’t work very well. Aggressive behavior (nose-thumbing) was too seductive for the dim-witted agents and stable social groups never formed. In order to get the agents to behave a little more like people, I had to add a little extra subtlety. This came in the form of two new rules.

The first rule was that if agent A was aggressive to agent B, B would remember that fact and be aggressive back at the next opportunity. This captures the idea of ‘Tit for Tat’—a strategy that has proved very successful in Prisoner’s Dilemma tournaments.

The second rule was that if A and B had a successful match of rock, paper, or scissors, they’d both remember it and try for the same topic of conversation next time. This gave the agents a chance to reinforce positive relationships.
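
Again, as a rough sketch (the names are mine, and I’ve assumed a grudge is discharged once it has been acted on), the two rules might sit on top of the slot memory like this:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch of the two relationship rules, layered over the slot memory.
class RelationshipMemory {
    static final int THUMB = 3;                          // the aggressive behavior
    Set<Object> grudges = new HashSet<>();               // rule one: partners to thumb back at
    Map<Object, Integer> happyTopics = new HashMap<>();  // rule two: partner -> last matched choice

    int pickBehavior(Object partner, int defaultChoice) {
        if (grudges.remove(partner)) return THUMB;       // tit for tat (then the grudge is spent)
        Integer topic = happyTopics.get(partner);
        return (topic != null) ? topic : defaultChoice;  // otherwise repeat what worked, if anything
    }

    void recordOutcome(Object partner, boolean theyThumbedMe, boolean weMatched, int myChoice) {
        if (theyThumbedMe) grudges.add(partner);
        if (weMatched) happyTopics.put(partner, myChoice);
    }
}
```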

These two rules together did the trick and produced a somewhat mesmeric simulation. You can see it here by clicking on the first simulation button that appears. (Sadly, WordPress isn’t enthusiastic about supporting applets, otherwise I would have included it in this blog. Also, note that you’ll need Java installed for this to work. If you don’t have Java, let me know. I’m thinking of writing an HTML5 version and am keen to know whether that would make life easier for people.) In this simulation, the colors red, green, and blue take the place of rock, paper, and scissors. The color gray takes the place of nose-thumbing.

However, once I’d finished the simulation, it occurred to me that I’d only scratched the surface of what could be demonstrated with this approach. I could go further, do more, and start saying something really meaningful. Better still, the tools to achieve it were already in my hands! However, I’ve promised myself that each one of these postings will be short and readable by people with day jobs, so in order to discover what I did next, you’ll have to join me for Episode Four.

Adventures in Game Theory, Part Two

In the previous installment of this adventure, I promised to reveal how the secrets to business effectiveness and social harmony could be achieved by playing games like Rock Paper Scissors. Will I be able to deliver on that outrageous promise? Only by reading on will you get to find out.

For the next part of our journey, let’s consider a new game which we’ll call Scissors Party. The rules are simple and very much like those of Rock Paper Scissors. Players bounce their fists as usual and then pick any one of the three gestures normally used in the game. However, the scoring system in this version is different. In Scissors Party, players get two points each if they successfully match their opponent’s choice and no points if they don’t match. So if two players both choose paper, they get two points each. If one player chooses scissors and the other chooses paper, nobody gets any points. As in Dilemma Party, players are free to stay with the same partner or mingle in the group as they like. Any guesses as to what happens?

You may have already guessed that players tend to form pairs and small clusters that make the same choice every time, e.g. always rock or always paper. Even though lots of people will still mingle, they figure out fairly quickly that they’re not making as many points as the people who stay put. Just as in Dilemma Party, interpersonal dynamics add complexity to the game. Some people want to move around and take risks, while others just want to ace the game, so the results are never as perfectly consistent as we might imagine. However, the patterns are still pretty clear.

So far so good. But where it gets really interesting is when you put Dilemma Party and Scissors Party together. This gives you Scissors Dilemma Party, a game with four options: rock, paper, scissors, and nose-thumbing. The scoring works as you’d expect (there’s a tiny code sketch after the list for anyone who wants one):

  • Thumbing gets you three points against rock, paper, or scissors but only one point against another thumb.
  • Successfully matching rock, paper, or scissors with your partner gets you two points.
  • Failing to match with rock, paper or scissors, or coming up against a thumb, gets you zero points.
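
As promised, here is the same scoring as a tiny function. The numeric encoding is my own, not anything from the talk:

```java
class Scoring {
    // 0, 1, 2 are rock, paper, scissors; 3 is the thumb.
    // Returns the number of points playerA earns from a single interaction.
    static int score(int playerA, int playerB) {
        boolean aThumbs = (playerA == 3);
        boolean bThumbs = (playerB == 3);
        if (aThumbs && bThumbs) return 1;     // two thumbs: one point each
        if (aThumbs) return 3;                // thumbing a cooperator: three points
        if (bThumbs) return 0;                // getting thumbed: nothing
        return (playerA == playerB) ? 2 : 0;  // a match pays two, a mismatch pays nothing
    }
}
```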

Everyone confused yet?

What’s bizarre is what happens when you play this game with a room full of people who have just played Scissors Party moments before. Even though they know full well that they can form cliques and collaborate to get two points each turn, people will form little clusters that repeatedly thumb noses instead, getting just one point each. This means that they’re being half as effective at playing as they were thirty seconds ago, simply because they’ve been given the option to play it safe at the cost of other players. This, to me, is a fascinating example of how being given the option to tune out and avoid cooperation produces instant defensiveness and a change in social cohesion.

Perhaps some of you will by now have figured out where I’m going with these games. Choosing different gestures in the game is very much like choosing tokens to collect in life. Pairwise interactions are rather like small versions of the conversations we have every day. Rock, paper and scissors equate to different forms of social value, such as sexiness, intelligence, or likability. Nose thumbing equates to extracting involuntary tokens from others for personal validation gain. Whereas our choice of gestures in the game is conscious and our choice of tokens in life is non-conscious, the same patterns of defensive behavior can be seen. In fact, in non-conscious group behavior, we tend toward more predictable responses. Thus, playing Scissors Dilemma Party gives us an interesting, lightweight model for looking at how social groups form and interact.

Intriguing, I hear you say, but still not yet a conclusive solution to the world’s ills. True. To see the awesome social significance of Scissors Dilemma Party in all its glory, you’ll have to read Adventures in Game Theory Part Three.

Adventures in Game Theory, Part One

Question: Can playing simple games like Rock Paper Scissors teach us how to be better leaders, help us build effective, equitable organizations, and pave the way to a more harmonious world?
Answer: Yes! Undoubtedly!

If you want to know how, and why I would make such a ridiculous-sounding assertion, then I invite you to come with me on a journey into a dark and mysterious world of theoretical applied improv. The journey will be long and arduous (four blog posts), but for those who stick with me, there is treasure in store.

The starting point in this adventure is the Prisoner’s Dilemma–perhaps the best-known model from game theory: the branch of math that studies how people or animals compete and cooperate. Simply put, the Prisoner’s Dilemma is a formal description of a kind of situation we often face in life, in which cooperation between two parties comes with both risks and benefits, but where failing to cooperate is both safe and predictable.
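
For the mathematically inclined, the textbook way of pinning this down is to call the payoff for mutual cooperation R, the payoff for mutual defection P, the ‘temptation’ payoff for defecting against a cooperator T, and the exploited cooperator’s payoff S. A game is a Prisoner’s Dilemma whenever

```latex
\[ T \;>\; R \;>\; P \;>\; S \qquad \text{and, for repeated play,} \qquad 2R \;>\; T + S \]
```

so defection is always the individually safe choice, even though both players end up better off under mutual cooperation.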

People have studied Prisoner’s Dilemma very extensively. There have been research papers about it, world-spanning experiments, online tournaments between competing software programs, and dozens of books on the subject. Not satisfied by all this, I wanted to see what happened when I turned Prisoner’s Dilemma into an improv game and took it to Behavior Lab.

To this end, I created a game called Dilemma Party–a little like Rock Paper Scissors but with two options per player instead of the traditional three. Here’s a slide I used at the ASTD conference in Orlando recently (more on that in later posts) that shows how to play and how the scoring works.

As you can see, players have the option of thumbing their nose at their opponent or offering them an invisible gift. Offering a gift presents the best opportunity for mutual gain but comes with a risk. If the other player thumbs their nose at you, you get nothing and your opponent walks away with a nice stack of points. Thumbing your nose means that you always win something, regardless of what the other player does–it’s a safer bet but not a particularly friendly one.

Players of the game interact for an unspecified period of time, trying to rack up as many points as they can. They’re milling in a large group and can swap partners any time they like, or stay with their current partner if they prefer. What do you suppose happens if you put fifty random people in a room together and get them to play? Any guesses on what strategies they pick?

The answer is that it depends on the group. Put members of the general public together and the group norms to almost universally thumbing noses after a short time, with a few individuals doggedly giving gifts regardless of the losses they incur. However, put a room full of professional trainers together and the group norms to universal gift giving almost as fast. Perhaps unsurprisingly, pairs of players who settle on gift-giving tend to stay together. Pairs where one or more players thumb noses don’t stay together very long.

For the most part, people who aren’t already familiar with the Prisoner’s Dilemma do a very natural thing when reasoning about scores. They realize that by nose-thumbing, they can’t lose, so they keep doing it, even though they miss out on the chance to make more points by building stable relationships. No big surprises there.

Where the game gets interesting is when you look at how the rich, multi-layered nature of human interaction interferes with our stable assumptions about how the game should work. For instance, in one group, players repeatedly thumbed their opponents but then shared high-fives after each interaction. What this suggests is that the players knew they were making cautious, uncooperative choices, but still wanted to check in with each other to show that they were really friendly people at heart. Thumbing their noses felt awkward and antisocial but they didn’t want to change tactics and consequently lose! Giving high-fives was a way of subverting the game, and showing their opponents that they weren’t really in competition.

Also, those people who’ve spent a lot of time in a training, group therapy, or social workshop setting tend to repeatedly offer gifts, regardless of the consequences. I suspect that this has more to do with how those people are mentally parsing the game, rather than suggesting that they have fundamentally different personalities. These are people who’ve played similar games before and are aware of the implications of cooperation. That makes them behave differently because perceiving themselves as cooperative affords them more validation than the points offered by the game. They’d rather feel positive and socially useful than win, even if that feeling comes with a very light dose of martyrdom.

Underpinning both of these reactions is the fascinating interplay between the choices made consciously in the game, and the very similar game of token exchange that the players are playing underneath. Because we load the game into the conscious awareness of the players, the acquisition of points can’t help but be held as an extrinsic goal. And because there aren’t cash prizes on offer, that goal comes with low priority. This means that the intrinsic motivations of the players guide their strategies. Thus, while we’re unlikely to get unbiased information about Prisoner’s Dilemma itself from the game, it shines a fascinating light on our motivations.

Interesting, I think, but not a recipe for social harmony just yet. There’s more we can do with these games. Much more. And for that, you’ll have to read my Adventures in Game Theory Part Two.

NOTE: This blog entry first appeared in my improv blog: Thinking Improv
