[IEEE 2013 IEEE Conference on Self-Adaptive and Self-Organizing Systems Workshops (SASOW) -...


Reasoning and Reflection in the Game of Nomic: Self-Organising Self-Aware Agents with Mutable Rule-Sets

Stuart Holland, Jeremy Pitt, David Sanderson, Dídac Busquets

Department of Electrical & Electronic Engineering
Imperial College London, SW7 2BT, UK

Email: {stuart.holland09, j.pitt, dws04, didac.busquets}@imperial.ac.uk

Abstract—The Game of Nomic was developed to investigate the idea that any modifiable rule-based system could result in situations where the ruleset is paradoxical, contradictory or incomplete. This has interesting and important implications for designers of open, self-organising, rule-based systems, if our concern is to ensure that the system should operate within a ‘corridor’ of behaviour, or should avoid certain non-normative states. To investigate this issue, this paper presents the preliminary design, implementation and operation of a self-organising multi-agent system in which the agents play the Game of Nomic. While not yet in a position to test Suber’s hypothesis fully, we can see how different agent strategies can reason, reflect, and make decisions that benefit their internal objectives relative to the game itself, by using an awareness of themselves, other players, the ruleset and the projected outcome of proposed rule modifications.

Keywords-Self-Organising Systems, Multi-Agent Systems, Rule-based Systems, Reflection, Awareness, Nomic

I. INTRODUCTION

The Game of Nomic [1] involves several players interacting in the context of a set of rules. Players start with zero points and take it in turns to propose a rule modification; the proposal is voted on; the player scores some points; first to 100 points wins. Nomic is more than a game: its inventor, Peter Suber, wanted to make a point about legal or parliamentary systems, namely that any modifiable ruleset might end up being incomplete or inconsistent. He also wanted to investigate what he called the paradox of self-amendment: any proposed rule amendment might apply to itself, and therefore the rule would authorize its own amendment.

Our concern is the effect of self-modification on a set of conventional, mutually-agreed rules, such as those found in self-organising institutions for common-pool resource management [2]. We want to examine what we are calling Suber’s Thesis, that any system allowing unrestricted self-modification of the rules will tend to paradox. A group of agents specifically motivated to avoid paradox might be able to do so, but without both a specific intention to avoid paradox and careful execution, their modifications to the rules will be subject to a probabilistic or entropic tendency to paradox.

Suber’s thesis, if true, has interesting and important implications for designers of open, self-organising, rule-based systems, if their concern is that the system should operate within a ‘corridor’ of behaviour [3] or should avoid certain non-normative states [4], or if there is a risk of unintended consequences such as inconsistency, deadlock or exploitable loopholes.

To investigate this issue, this paper presents the design, implementation and operation of a self-organising multi-agent system in which the agents play the Game of Nomic. Designing and implementing a Nomic-playing agent poses two significant challenges: firstly, how to decide on a move in the game when it is the agent’s turn, i.e. what to propose as a rule modification; and secondly, how to evaluate proposed rule modifications to inform the decision on whether to vote for or against the proposal. Taking inspiration from ideas of computational reflection [5], generative simulation [6], and internal modelling for robotics [7], we address these challenges by sub-simulation: the simulated agents invoke the simulation environment with their own (sub) simulation of the other agents, and animate their expected behaviour with the proposed modification of the ruleset.

While not (yet) in a position to test Suber’s hypothesis fully in relation to self-organising rule-based systems, we can see how different agent strategies can reason, reflect, and make decisions that benefit their internal objectives relative to the game itself, by using their awareness of themselves (self-awareness), other players, the ruleset and the projected outcome of proposed rule modifications.

II. THE GAME OF NOMIC

A. Gameplay

The Game of Nomic is an n-player turn-based game in which the rules of the game include mechanisms by which the players change the rules.

Initially, the order of turns is fixed before the game starts (positional, alphabetical, etc.), and play passes to the next player in turn according to the rule specifying the ordering. (However, even this rule could be changed.) In each turn, a player proposes a rule-change, has it voted on, and then throws a die to determine the addition to his/her score. In fact, the way the game is played is also specified as a rule, and therefore can also be changed.
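The turn structure just described can be sketched as a small loop. This is an illustrative Python toy, not the Presage2/Drools implementation: the function name `play`, its parameters, and the unanimity test are assumptions made for the example, and adopted proposals are only recorded rather than changing how the game is played.

```python
import random


def play(players, propose, vote, target=100, max_turns=10_000, rng=None):
    """Toy turn loop: propose, vote, throw one die, check for a winner.

    `propose(player, scores)` and `vote(player, proposal, scores)` are
    hypothetical callbacks standing in for agent reasoning.
    """
    rng = rng or random.Random()
    scores = {p: 0 for p in players}
    adopted = []
    for turn in range(max_turns):
        current = players[turn % len(players)]      # fixed turn order
        proposal = propose(current, scores)
        votes = [vote(p, proposal, scores) for p in players]
        if all(votes):                               # assumption: unanimity required
            adopted.append(proposal)                 # recorded only, not enforced
        scores[current] += rng.randint(1, 6)         # one throw of one die
        if scores[current] >= target:                # first to `target` wins
            return current, scores, adopted
    return None, scores, adopted
```

In real Nomic every hard-coded step in this loop (turn order, voting threshold, scoring, the win condition) is itself a rule that a proposal could change, which is exactly what makes automation difficult.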

2013 IEEE 7th International Conference on Self-Adaptation and Self-Organizing Systems Workshops
978-1-4799-5086-7/13 $31.00 © 2013 IEEE
DOI 10.1109/SASOW.2013.27

In Suber’s initial ruleset, there were two types of rule: mutable rules and immutable rules. In any turn, a player could propose the addition, amendment or repeal of a mutable rule, or they could propose what was called transmutation, by proposing to change a mutable rule into an immutable one, or vice versa. Thus supposedly ‘immutable’ rules were not fixed either: they could be amended or repealed provided they were converted (transmuted) into mutable rules first.

Immutable rules were numbered from 101, mutable rules from 201. At the start of the game according to Suber’s specification there were 16 immutable rules and 13 mutable rules. An example of each type of rule is as follows:

– 103. A rule-change is any of the following: (1) the enactment, repeal, or amendment of a mutable rule; (2) the enactment, repeal, or amendment of an amendment of a mutable rule; or (3) the transmutation of an immutable rule into a mutable rule or vice versa.

– 202. One turn consists of two parts in this order: (1) proposing one rule-change and having it voted on, and (2) throwing one die once and adding the number of points on its face to one’s score.

There are other rules which specify how a rule is passed, or not; and which specify the precedence of rules. There are also complex rules about adjudication of rules. For full details, see [1].
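The interplay between amendment, repeal and transmutation can be made concrete with a small data structure. The sketch below is a Python illustration only; the class and method names (`Rule`, `RuleSet`, `transmute`, and so on) are hypothetical, and the rule text is abbreviated.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    number: int
    text: str
    mutable: bool


@dataclass
class RuleSet:
    rules: dict = field(default_factory=dict)

    def enact(self, rule):
        self.rules[rule.number] = rule

    def _require_mutable(self, number):
        # Immutable rules cannot be touched until they are transmuted.
        if not self.rules[number].mutable:
            raise PermissionError(f"rule {number} is immutable; transmute it first")

    def amend(self, number, text):
        self._require_mutable(number)
        self.rules[number].text = text

    def repeal(self, number):
        self._require_mutable(number)
        del self.rules[number]

    def transmute(self, number):
        # Flip a rule between mutable and immutable.
        rule = self.rules[number]
        rule.mutable = not rule.mutable
```

A rule that has been transmuted to immutable rejects `amend` and `repeal` until it is transmuted back, which mirrors the two-step route by which an ‘immutable’ rule can still be changed.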

B. Automating Nomic

There are some obvious limitations in attempting to automate a game of Nomic. Human players acknowledge and comply with a set of implicit rules that are not codified in the initial ruleset, so some behaviour has to be hard-wired into all the agents. Furthermore, Nomic has many subtle aspects which may be considered peculiar to ‘intuitive’ decision making [8]. Human players are likely to suggest rule changes or vote according to a ‘feeling’ with respect to other players or the playing of the game itself, rather than any specific objective of winning. Moreover, Suber’s expectation was that unrestricted self-modification will create inconsistency, indeterminacy, and loopholes which can intentionally be exploited. But reasoning under these conditions remains a challenge for Artificial Intelligence.

Therefore the internal reasoning components of an agent that attempts to emulate these processes involve two related but importantly differentiated tasks. Firstly, the analysis of a proposed rule change, and the potential advantages it confers to an agent, requires an internal model of the environment within which the agent exists, and how that environment is changed by the given rule change. Secondly, the proposing of any new rule change requires a degree of creativity and lateral thinking. The search space of new rules (or modifications to existing rules) is infinite in an unbounded game of Nomic, and the evaluation of those changes has to take into consideration that, as with all conventional rule-based systems, what matters is not whether people comply with a rule but how they react to the incentives implied by the rule [9].

    rule "Example rule"
        // attributes such as salience
    when
        // preconditions here
    then
        // consequences here
    end

Figure 1: Standard Drools rule structure

These factors require bounding the Game of Nomic as specified by Suber (pure-Nomic) through the rule design and the agent design, as described in the next two sections, to produce the system implementation for the game of bounded-Nomic, which still provides a starting point for a meaningful investigation of Suber’s hypothesis for self-organising rule-based systems.

C. Target Platform

For the implementation of bounded-Nomic, the multi-agent simulation and animation platform Presage2 [10] has been used to implement the agents, and the business rule engine Drools [11] has been used to represent the game rules, which are used by the agents to inform their decision-making about the moves they propose in their turns.

1) Presage2: Presage2 is a general purpose platform for developing animations and simulations of collective self-organising multi-agent systems. Presage2 provides services to simulate large, heterogeneous agent populations, multiple different networks, inter-agent communication, policy modelling, the physical environment, event recognition, data logging and visualisation. Crucially, it extends the original PreSage platform [12] by adding support for declarative rule specifications using Drools.

2) Drools: Drools is a business rule engine that allows declarative programming using a Prolog-like syntax.

Drools works by allowing rules and queries (and some other constructs) to be specified in a declarative language. At the core of a given Drools rules engine instance are rule bases and knowledge bases. Knowledge bases encapsulate rule base implementations, providing appropriate access to the structures that define a rule and how it functions. In order for rules to be ‘modifiable’, and so that previously removed rules can later be re-added within a single simulation, this system separately keeps track of all available rules and the required resources to recompile them.
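One way to picture this bookkeeping is a registry that remembers the source of every rule it has seen, so that a repealed rule can be reactivated later. The following Python sketch is an assumption about the general shape of such a registry, not the authors’ code; `RuleRegistry` and its methods are hypothetical names, and the rule source is just a string.

```python
class RuleRegistry:
    """Tracks every known rule and which of them are currently active."""

    def __init__(self):
        self._sources = {}    # rule name -> source text, never discarded
        self._active = set()  # names of rules currently in force

    def register(self, name, source, active=True):
        self._sources[name] = source
        if active:
            self._active.add(name)

    def remove(self, name):
        # Deactivate the rule but keep its source for later recompilation.
        self._active.discard(name)

    def re_add(self, name):
        if name not in self._sources:
            raise KeyError(f"unknown rule: {name}")
        self._active.add(name)

    def active_sources(self):
        # What would be handed to the rule engine when rebuilding it.
        return {name: self._sources[name] for name in sorted(self._active)}
```

Keeping the sources separate from the active set is what allows a removal to be undone within the same simulation run.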

The primary interaction with the Drools knowledge bases is through knowledge sessions. Knowledge sessions come in two varieties: stateless and stateful. Stateless knowledge sessions do not use inference, though they can be useful for validation and calculation operations. However, stateful knowledge sessions offer logical inference relationships and persistent state over time. Through stateful knowledge sessions, facts can be inserted into the Drools rules engine knowledge base. Any Java object can be treated as a fact.

Figure 2: Nomic rule represented as Drools rule

Most rule specifications involve the interaction between, inserting of, or properties of new facts that are added to the knowledge base. When a new fact is inserted, retracted, or modified, then any number of rules may be triggered.

Any newly triggered rule is added to its associated knowledge base’s agenda and scheduled for execution the next time an appropriate API call is received. A rule consists of a precondition block and a consequence block (see Figure 1). The precondition block (when) specifies all conditions that must be met for the rule to become active. Preconditions are written using the declarative syntax used by the Drools rules engine. The consequence block (then) specifies the actions that should be taken when a given rule is activated.
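The separation between triggering a rule and executing it can be illustrated with a toy session. This is a Python sketch of the idea only and makes no attempt to reproduce the Drools API or its matching algorithm; `Rule`, `Session`, `insert` and `fire_all_rules` are names chosen for the example, and facts are plain tuples.

```python
class Rule:
    def __init__(self, name, when, then):
        self.name = name
        self.when = when    # precondition: fact -> bool
        self.then = then    # consequence: (session, fact) -> None


class Session:
    """Toy stateful session: inserting a fact schedules rules on an agenda."""

    def __init__(self, rules):
        self.rules = list(rules)
        self.facts = []
        self.agenda = []

    def insert(self, fact):
        self.facts.append(fact)
        for rule in self.rules:
            if rule.when(fact):
                self.agenda.append((rule, fact))   # scheduled, not yet executed

    def fire_all_rules(self):
        fired = 0
        while self.agenda:                          # consequences may insert more facts
            rule, fact = self.agenda.pop(0)
            rule.then(self, fact)
            fired += 1
        return fired
```

Inserting a fact only populates the agenda; nothing happens to the game state until `fire_all_rules` is called, which corresponds to the explicit API call mentioned above.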

Salience is an attribute of a rule that determines the order in which rules execute after they have been added to the knowledge base’s agenda. Rules with higher salience values will execute first, ensuring that some rules can override others. This is useful in rules such as ‘Each Agent Can Vote Only Once per Turn’, where one rule needs to execute before another (a second vote within a turn shouldn’t count toward the current proposal vote if it is going to be denied for being a duplicate afterwards).
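The duplicate-vote example can be sketched by sorting agenda entries on salience before running them. The Python below is illustrative only: the salience values 100 and 0, the `Vote` class and the helper names are assumptions for the example rather than values taken from the bounded-Nomic ruleset.

```python
from dataclasses import dataclass


@dataclass
class Vote:
    voter: str
    in_favour: bool
    valid: bool = True


def reject_duplicates(votes):
    # Only the first vote from each voter remains valid.
    seen = set()
    for v in votes:
        if v.voter in seen:
            v.valid = False
        seen.add(v.voter)


def count_votes(votes, tally):
    tally["for"] = sum(1 for v in votes if v.valid and v.in_favour)
    tally["against"] = sum(1 for v in votes if v.valid and not v.in_favour)


def run(votes):
    tally = {}
    agenda = [
        (0, lambda: count_votes(votes, tally)),      # low salience: runs last
        (100, lambda: reject_duplicates(votes)),     # high salience: runs first
    ]
    for _, action in sorted(agenda, key=lambda item: item[0], reverse=True):
        action()
    return tally
```

Although the counting rule is listed first on the agenda, the higher salience of the duplicate check means the second vote from the same agent has already been invalidated by the time votes are counted.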

III. RULE IMPLEMENTATION

A. Representation in Drools

Representation of Nomic rules is a matter of encoding them as Drools rules. For example, the scoring element of Rule 202 is encoded by the Drools rule shown in Figure 2.

However, there are some limitations, and some rules had to be hard-wired and others merged.

1) Hard-Wired Rules: Several of the core rules of Nomic need to be part of the framework within which the agents exist, rather than being modifiable constructs as they are in pure Nomic. For example, Rule 202 specifies that turns must be composed of a single proposal stage followed by a round of voting. Technically, this is a mutable rule and therefore changeable. However, we have implemented this as an unmodifiable component in bounded-Nomic.

The fact that all agents must vote is codified in a rule specified as a part of the initial active set, but there is a degree of implicit agreement regarding which agents will be asked for votes. All agents will always be asked for a vote during every voting phase, even though some of those votes can be made ineffective by changes to relevant rules. This means that rules which limit the set of agents that can vote are effective, but it is difficult for the agents to recognize their effects, which limits the scope for reasoning about such changes.

There are other core concepts to pure Nomic which are technically mutable but are hard-wired into bounded Nomic: the fact that there are turns, that players can win, that the order of voting does not matter, that there can be only one active player per turn, and several more. However, these do not substantially alter the fundamental nature of the game, i.e. unrestricted player-modification of game rules.

2) Merged Rules: Several rules in pure Nomic interact in ways that produce conceptual relationships between separate rules and determine flow of play. In bounded-Nomic, several such rules are merged into a single rule that implements the overarching concept. This is primarily due to difficulties with rules affecting each other’s outcomes when they interact.

A side effect is that multiple core systems essential to the continued proper execution of the game of Nomic within the simulations are codified in single rules, affected by single removals and modifications. This means that the continued stability of the game (where stability is the ability of the game and agents within the simulation to continue play in a meaningful way) can be unseated by individual rule changes. For example, if the ability to pass proposals was revoked in the first few turns, this would prevent any agents from making progress for the remainder of that simulation.

This makes any rule modifications which reduce the majority required for a proposal to be successful extremely volatile, tending to either end the simulation quickly or cause instability such that little happens until the simulation ends.

B. ¬(Representation in Drools)

Some aspects of the pure-Nomic rules are not implemented in bounded-Nomic.

The distinction between mutable and immutable rules has been removed to limit the search space of agents so that the two transmutation proposals do not need to be considered. Note that the concept of transmutation can be removed from a game of pure Nomic by the players if all relevant rules are repealed; the implementation of bounded-Nomic is thus a game of pure Nomic where these moves have already been made.

Pure-Nomic uses judges and arbitration to make decisions. The necessity for judges in a game of pure Nomic (with human players) is largely due to player interpretation of rules. In bounded Nomic, the behaviour of rules and the interaction between them is entirely determined by the rule engine implementation. The inclusion of a dispute resolution mechanism [13] as a system of arbitration and overriding decisions has been left to future work.

Finally, players in pure-Nomic are allowed to leave the game, but in bounded-Nomic they cannot, although implementing this would not be overly complex. The concept of ‘losing’ is not specified as a part of pure-Nomic’s default ruleset, but it can be introduced by new rule additions. None of the rules in the pool available to bounded-Nomic create a potential for agents to ‘lose’ (except implicitly, where any agents that do not win can be considered to have lost).

IV. AGENT IMPLEMENTATION

A. Agent Proposals and Rule Flavours

The creation of new proposals from no initial information requires a degree of creativity not easily emulated in software applications. A new rule can be triggered by any known action and affect any facet of the existing simulation. Furthermore, before making any proposal, the agent must be able to formalize the concepts it finds desirable (requiring an ability to generate correct Java code) and analyze the result of this formalization. While sub-simulations (see below) offer a facility to evaluate a given formalization, it is not computationally feasible to analyze the scope of permutations available to each agent.

With these limitations and challenges in mind, bounded-Nomic agents, instead of ‘creating’ new rules themselves, draw from a preset pool of available rule proposals. These proposals represent only a very small subset of available rule proposals for a game of Nomic, but are used to represent the agents’ capacities to reason about which changes afford them the most advantage.

This pool of available rules offers a number of valid proposals that agents can make on any given turn. To further reduce the search space, the concept of rule flavours was introduced. Rule flavours are conceptual markers that allow an agent to make an informed decision about the properties of a given rule without expensive computation. This allows rules to be flagged, for example, as introducing new win conditions or being beneficial to all players. This information means the agents can make decisions on the kinds of proposals they wish to pursue, given the current simulation state, before having to run entire sub-simulations.

Bounded-Nomic offers eight compile-time specified rule flavours that allow agents to quickly categorize the pool of proposals available to them and analyze only those that are relevant. The eight available flavours are complex, destructive, simple, desperation, beneficial, winCondition, stable and detrimental.

For example, a new rule proposal’s complexity flavour represents how difficult it is to gauge the effect the rule has on the game. A rule with a high complexity requires a longer sub-simulation to properly assess its effects, while a rule with a low complexity requires only a short time to produce effects. For a low complexity rule such as “Agent4 wins”, the effects are seen immediately, and agents need run only short sub-simulations to determine them. A high complexity rule proposal might be “If all other agents vote against a given agent’s proposal, they each steal 7 points from the proposer”, which is complex because it is unlikely to occur and assessing whether this behaviour is desired requires a long projection further into the future.
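A flavour-based pre-filter of this kind might look like the Python sketch below. The numeric 0 to 100 scale, the threshold of 50, the turn bounds and the names `Proposal`, `shortlist` and `sub_simulation_turns` are all assumptions made for illustration; the text above fixes only the eight flavour names and the idea that higher complexity calls for a longer sub-simulation.

```python
from dataclasses import dataclass, field

FLAVOURS = ("complex", "destructive", "simple", "desperation",
            "beneficial", "winCondition", "stable", "detrimental")


@dataclass
class Proposal:
    name: str
    flavours: dict = field(default_factory=dict)   # flavour -> 0..100 (assumed scale)


def shortlist(pool, wanted, threshold=50):
    """Keep proposals that score at least `threshold` on every wanted flavour."""
    return [p for p in pool
            if all(p.flavours.get(f, 0) >= threshold for f in wanted)]


def sub_simulation_turns(proposal, min_turns=2, max_turns=20):
    """Map the complexity flavour linearly onto a sub-simulation length."""
    complexity = proposal.flavours.get("complex", 0)
    return min_turns + round((max_turns - min_turns) * complexity / 100)
```

Under these assumptions a rule like “Agent4 wins” would carry a low complexity value and be assessed in a couple of simulated turns, whereas the point-stealing rule would be projected much further ahead.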

B. Sub-Simulation

In order for agents to be able to reason about the effects of rule changes and whether or not those changes are preferable to the current state of the simulation, they must be able to model what effects those changes will have. To do this, the agents are capable of running sub-simulations, which are secondary simulations that do not affect the ‘super-simulation’ where Nomic is actually being played. Explicitly, an agent in Presage2 with a Drools knowledge base invokes Presage2 with a variant of that knowledge base and its own, internal simulation of the other (super-simulation) simulated agents.

These sub-simulations are defined by the rules from the super-simulation and any proposed changes whose effects the invoking agent wants to analyze. The sub-simulation is then populated by a set of proxy agents corresponding to the agents in the super-simulation. This is a limited form of reflective reasoning [5], [7] or awareness, by which is meant that each agent has a model of its environment, including itself (i.e. self-awareness), and animates that model to inform its decisions (rather than ‘awareness’ as any deep subjective reflective experience).

Each agent is capable of executing entirely isolated sub-simulations (often in parallel with other agents) in which the actions and the consequences of those actions can be used to evaluate the value of a proposed rule change (or a rule change that the agent is considering proposing) and inform the voting decision. The effectiveness of these sub-simulations depends on the controlling agent’s ‘avatar’, the proxy that represents the super-simulation agent that invoked the sub-simulation. The avatar agent measures its preference for the sub-simulation using a set of preference rules defined by the controlling agent in the super-simulation.
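The essential steps (copy the state, apply the candidate change, animate proxy agents, let the avatar score the result) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: game state is a plain dictionary, proxies are functions, and `sub_simulate`, `make_proxy` and `selfish_preference` are hypothetical names, not part of Presage2.

```python
import copy


def sub_simulate(state, rule_change, proxies, preference, turns):
    """Run an isolated projection and return the avatar's preference score."""
    sandbox = copy.deepcopy(state)       # the super-simulation is never touched
    if rule_change is not None:          # None means "project the current rules"
        rule_change(sandbox)
    for _ in range(turns):
        for proxy in proxies:            # proxies stand in for the real agents
            proxy(sandbox)
    return preference(sandbox)


def make_proxy(name):
    # Hypothetical proxy: gains 1 point per turn plus any rule-granted bonus.
    def act(state):
        state["scores"][name] += 1 + state["bonus"].get(name, 0)
    return act


def selfish_preference(me):
    # Hypothetical avatar preference: lead over the best-placed opponent.
    def score(state):
        others = [v for k, v in state["scores"].items() if k != me]
        return state["scores"][me] - max(others)
    return score
```

Comparing the score returned with and without the rule change gives the invoking agent a basis for its vote, while the deep copy ensures that nothing in the projection leaks back into the game actually being played.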

C. Agent Implementation: Strategies

The agents’ reasoning about mutable rulesets in bounded Nomic falls into four major categories of strategic interactions. Each strategy differs mostly in how it decides on a rule change to propose during its own turn, where each different strategy requires branching decision-making depending on the current state of the simulation.

Most agents, when required to propose a rule change, begin by running a sub-simulation with a ‘blank’ rule change, to analyze the current state of the game and decide whether or not the current state is a desirable one. This decreases the likelihood of the agent making decisions that would later come to disadvantage itself. What action an agent takes after deciding on its preference for the current state of the simulation varies from one strategy to the next.

Several of the agent strategies make some assumptions about the objectives of the game of Nomic in the super-simulation. These assumptions limit the scope of new rules that can be acted upon intelligently by these agents, but are not indicative of such limitations being a part of the framework within which they operate. More flexible rule specifications for agent preference could deal with much larger sets of available rule changes without any further alteration to the simulation architecture’s framework.

An example of this kind of assumption is that many of the agents’ strategies view gaining points as positive and losing points as negative (for the player whose point total is changing). However, in Nomic (or any equivalent mutable rule-defined environment) a new rule can be introduced that causes a player to win after reaching a large negative point total. If the initial ‘100 points wins’ rule is then also repealed, the agents’ objectives have swapped polarity, where losing points is beneficial and gaining points is detrimental.
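One way to avoid baking in the ‘more points is better’ assumption is to judge a score change by its distance to the nearest active win condition. The Python sketch below illustrates that idea only; it is not how the bounded-Nomic agents are described as working, and the `("at_least", n)` / `("at_most", n)` encoding of win rules is an assumption made for the example.

```python
def distance_to_win(score, win_rules):
    """Points still needed to satisfy the closest active win rule."""
    distances = []
    for direction, threshold in win_rules:
        if direction == "at_least":
            distances.append(max(0, threshold - score))
        else:  # "at_most": win by reaching a sufficiently low total
            distances.append(max(0, score - threshold))
    return min(distances) if distances else float("inf")


def prefers(score_before, score_after, win_rules):
    """True if the change moves the player closer to some win condition."""
    return distance_to_win(score_after, win_rules) < distance_to_win(score_before, win_rules)
```

With only the ‘100 points wins’ rule active, gaining six points is preferred; once that rule is repealed and replaced by a negative-total win rule, the same gain is rejected and a loss is preferred instead.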

There are four basic agent strategies: selfish, harmonious, vindictive and destructive.

1) Selfish Agent: A selfish agent attempts to maximize its own value in the game and eventually achieve victory for itself. Selfish agents prioritize gaining points for themselves and any sub-simulation that leads to their own victory is locked in as a positive result. Selfish agents are uncooperative with other agents, in terms of voting for and against proposals. Simulations composed solely of selfish agents tend to be very stable with few rule changes, because the agents are unlikely to allow changes that aid their opponents.

2) Harmonious Agent: The harmonious agent is the counterpoint to the selfish agent, prioritizing helping other agents achieve victory. Harmonious agents prefer other agents’ point totals to exceed their own and seek to assist other agents in winning the game. Harmonious agents tend to be the primary driving force for ‘positive’ change in a simulation. They introduce rules that most often lead to other agents winning, particularly those that have a self-interested strategy, who will support such proposals.

3) Vindictive Agent: The vindictive agent is the most variable of the four agent strategies. At the beginning of the super-simulation, each vindictive agent selects an opponent to be their ‘nemesis’ for the duration of that simulation. They prefer any changes that disadvantage their nemesis and have no particular preference toward their own victory.

4) Destructive Agent: The destructive agent attempts to modify the rules in ways that obstruct the continued sensible execution of the game of Nomic active within the given simulation. This is often achieved by repealing the rules that define basic Nomic activities, such as determining when a proposal has succeeded (once that rule has been repealed, no further proposals can pass) or determining whose turn it is (which means it will be the proposing player’s turn until the end of the simulation or until that agent proposes that the rule be re-added).
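The four strategies can be contrasted as preference functions over a projected outcome. The Python sketch below is a deliberately crude reading of the descriptions above: the `Outcome` type, the specific scoring formulas and the `vote` helper are assumptions for illustration, and the real agents use branching decision-making and preference rules that are not shown in the text.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    scores: dict            # projected point totals per agent
    playable: bool = True   # can the game still proceed sensibly?


def _others(me, outcome):
    return [v for k, v in outcome.scores.items() if k != me]


def selfish(me, outcome):
    # Wants to be ahead of everyone else.
    return outcome.scores[me] - max(_others(me, outcome))


def harmonious(me, outcome):
    # Wants other agents' totals to exceed its own.
    return max(_others(me, outcome)) - outcome.scores[me]


def vindictive(nemesis):
    # Cares only about disadvantaging the chosen nemesis.
    def preference(me, outcome):
        return -outcome.scores[nemesis]
    return preference


def destructive(me, outcome):
    # Prefers states in which sensible play can no longer continue.
    return 0 if outcome.playable else 1


def vote(preference, me, baseline, projected):
    """Vote in favour only if the projection beats the current trajectory."""
    return preference(me, projected) > preference(me, baseline)
```

Given the same pair of projections, the four functions can disagree sharply, which is one source of the varied voting behaviour in mixed populations.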

V. EXPERIMENTAL OBSERVATIONS

The system was run with different populations of the four different types of agent. Several runs were used for each population distribution because the randomness and non-linearities in the system meant that the same outcomes were not observed from the same population.

The key observations are twofold. Firstly, the sub-simulation strategy worked as intended: in its turn, an agent was able to explore a search space (bounded by the use of the proposal pool and rule flavours) to come up with possible rule modifications to propose, and when out-of-turn, the agents were able to use sub-simulations to decide whether to vote for or against a proposed rule modification.

Secondly, the different agent strategies gave rise to some unusual proposals and variations in behaviour. For example, when placed in larger simulations composed of multiple agent strategies, selfish agents tend to be victorious quite often. With other agents’ votes making many rule changes more likely, the simulation tends to be more dynamic. Selfish agents tend to benefit from the rule changes proposed by other agents and then introduce or vote for win conditions that cause them to win in the final few turns of the game.

Simulations composed entirely of harmonious agents

tended to be quite short. All ‘negative’ rules were generally

removed in the first few turns and new win conditions

introduced easily (in the sense that such proposals are passed

as soon as they are introduced). Often, there was a winner

soon after unanimity expired at the end of the second round,

when the eventual winner’s vote against a proposal ceased to

prevent it from being applied.

Vindictive agents’ nemeses vary as rule changes are

proposed and voted on. When a vindictive agent’s proposal

fails to pass, it chooses a new nemesis from among those

agents that voted against the proposal. Because vindictive

agents prioritize the success of any other agent (not only

themselves) above that of their nemesis, they often act

similarly to harmonious agents, but are prone to sudden

changes of attitude if their current nemesis is succeeding.
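The nemesis re-selection described above might look like the following sketch. The function name `choose_nemesis`, the vote dictionary, and in particular the uniformly random choice among dissenting agents are assumptions; the text only states that the new nemesis comes from among those that voted against the proposal.

```python
import random
from typing import Dict


def choose_nemesis(me: str, current: str, votes: Dict[str, bool],
                   passed: bool, rng: random.Random) -> str:
    """Return the nemesis to hold after one of `me`'s own proposals is resolved.

    `votes` maps agent name -> True (in favour) or False (against).
    """
    if passed:
        return current  # nothing changes when the proposal passes
    dissenters = sorted(
        agent for agent, in_favour in votes.items()
        if not in_favour and agent != me
    )
    if not dissenters:
        return current  # assumption: keep the old nemesis if nobody dissented
    # Assumption: pick uniformly at random among the dissenting agents.
    return rng.choice(dissenters)


rng = random.Random(0)
votes = {"agent0": True, "agent1": False, "agent2": True}
after_failure = choose_nemesis("agent0", "agent2", votes, passed=False, rng=rng)
after_success = choose_nemesis("agent0", "agent2", votes, passed=True, rng=rng)
```

Here agent1 is the only dissenter, so it becomes the new nemesis after the failed proposal, while a successful proposal leaves agent2 in place.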

Destructive agents proved to be precisely that, due to

the non-immediacy of the rule changes they propose.

Simulations involving even a single destructive agent often

had no winner, mostly due to blocking rule changes.

Finally, there were some unexpected consequences, per-

haps of the kind Suber would have appreciated. For example,

Figure 3 shows a run in which agent0 ‘invented’ a proposal

which gave it an infinite number of turns, and the other voted

in favour of accepting it.

Figure 3: Agent0 gets infinite turns

VI. SUMMARY AND CONCLUSIONS

This paper has described the implementation of a self-

organising system of self-aware agents that are designed to

play the game of Nomic. Like Nomic’s inventor, Peter Suber,

we are interested in what happens in a self-organising rule-

based system which allows unrestricted self-modification of

the rules. Suber’s belief was that, in such circumstances, the

rules would inevitably end in a paradoxical state. This is

a significant concern for the design of self-organising rule-

based systems, which may be required to stay in corridors

of operation [3] or avoid non-normative states [4].

Despite the limitations and restrictions imposed by the

difficulty in automating pure-Nomic, we contend that the

bounded-Nomic platform presented here is a valid approx-

imation for investigations of this kind. However, there is

substantial scope for development, including extending the

knowledge base of each agent, mimicking successful prox-

ies, and reasoning about third-party decisions, as well as

implementing the judgement rules.

We are not in a position to confirm or deny Suber’s hypothesis

in relation to artificial systems, but it is, we believe,

an essential question to investigate: do self-organising rule-

based systems which allow unrestricted modification of the

rules inevitably end in paradoxical rulesets, or contradiction,

impasse, or creation and exploitation of loopholes? Indeed,

the reported observations do show how different agent

strategies can reason, reflect, and make decisions that benefit

their internal objectives relative to the game itself, by using

their awareness and self-awareness, of themselves, of other

players, of the ruleset, and of the projected outcome of

proposed rule modifications.

Finally, we intend to make the platform fully open source,

and invite others to write their own Nomic playing strategies.

It would be interesting to open up a “Nomic Playing Competition”,

similar to other agent-based competitions such as the Trading

Agent Competition (TAC) and RoboCup Rescue, to observe

the effects of unrestricted interaction of Nomic-playing

strategies and examine what happens to the rulesets.

Acknowledgments

We are grateful for the extensive and helpful comments

from both Peter Suber and the anonymous reviewers.

REFERENCES

[1] P. Suber, The Paradox of Self-Amendment: A Study of Law, Logic, Omnipotence, and Change. Peter Lang, 1990.

[2] J. Pitt, J. Schaumeier, and A. Artikis, “Axiomatisation of socio-economic principles for self-organising institutions: Concepts, experiments and challenges,” ACM Trans. Auton. Adapt. Syst., vol. 7, no. 4, pp. 1–39, Dec. 2012.

[3] F. Nafz, J.-P. Steghofer, H. Seebach, and W. Reif, “Formal modeling and verification of self-* systems based on observer/controller-architectures,” in Assurances for Self-Adaptive Systems, ser. LNCS, vol. 7740. Springer, 2013.

[4] A. Artikis, “Dynamic specification of open agent systems,” Journal of Logic and Computation, vol. 22, no. 6, pp. 1301–1334, 2012.

[5] C. Landauer and K. Bellman, “Meta-analysis and reflection as system development strategies,” in International Symposium on Metainformatics, ser. LNCS, D. Hicks, Ed., vol. 3002. Springer, 2004, pp. 178–196.

[6] Y. Demiris and B. Khadhouri, “Hierarchical attentive multiple models for execution and recognition of actions,” Robotics and Autonomous Systems, vol. 54, no. 5, pp. 361–369, 2006.

[7] A. Winfield, “Robots with internal models: A route to self-aware and hence safer robots,” in The Computer After Me, J. Pitt, Ed. Imperial College Press, (to appear).

[8] G. Vreeswijk, “Formalizing nomic: working on a theory of communication with modifiable rules of procedure,” Vakgroep Informatica, Rijksuniversiteit, Tech. Rep., 1995.

[9] E. Lopez, The Pursuit of Justice: Law and Economics of Legal Institutions. New York, NY: Palgrave MacMillan, 2010.

[10] S. Macbeth, D. Busquets, and J. Pitt, “System Modeling: Principled Operationalisation of Social Systems Using Presage2,” in Modeling & Simulation-based Systems Engineering Handbook. Taylor and Francis, (to appear).

[11] The JBoss Drools team, “Drools introduction and general user guide,” http://www.jboss.org/drools/documentation, 2013.

[12] B. Neville and J. Pitt, “Presage: A programming environment for the simulation of agent societies,” in Programming Multi-Agent Systems, ser. LNCS, vol. 5442, 2008, pp. 88–103.

[13] J. Pitt, D. Ramirez-Cano, L. Kamara, and B. Neville, “Alternative dispute resolution in virtual organizations,” in Proc. ESAW’07, ser. LNCS, vol. 4995, 2007, pp. 72–89.
