Theory-Choice, Transient Diversity and the E ciency of...

27
Theory-Choice, Transient Diversity and the Efficiency of Scientific Inquiry Forthcoming in the European Journal for Philosophy of Science AnneMarie Borg 1 , Daniel Frey 2 , Dunja ˇ Seˇ selja 3 , and Christian Straßer 1 1 Institute for Philosophy II, Ruhr-University Bochum 2 Faculty of Economics and Social Sciences, Heidelberg University 3 Munich Center for Mathematical Philosophy, LMU Munich Abstract Recent studies of scientific interaction based on agent-based models (ABMs) suggest that a crucial factor conducive to efficient inquiry is what Zollman, 2010 has dubbed ‘transient diversity’. It signifies a process in which a community engages in parallel exploration of rivaling theories lasting sufficiently long for the community to identify the best theory and to converge on it. But what exactly generates transient diversity? And is transient diversity a decisive factor when it comes to the efficiency of inquiry? In this paper we examine the impact of different conditions on the efficiency of inquiry, as well as the relation between diversity and effi- ciency. This includes certain diversity-generating mechanisms previously proposed in the literature (such as different social networks and cautious decision-making), as well as some factors that have so far been neglected (such as evaluations underlying theory-choice performed by scientists). This study is obtained via an argumentation-based ABM (Borg et al., 2017, 2018). Our results suggest that cautious decision-making does not always have a significant impact on the efficiency of inquiry while differ- ent evaluations underlying theory-choice and different social networks do. Moreover, we find a correlation between diversity and a successful per- formance of agents only under specific conditions, which indicates that transient diversity is sometimes not the primary factor responsible for ef- ficiency. Altogether, when comparing our results to those obtained by structurally different ABMs based on Zollman’s work, the impact of spe- cific factors on efficiency of inquiry, as well as the role of transient diversity in achieving efficiency, appear to be highly dependent on the underlying model. Keywords: agent-based modeling, cautious decision-making, theory- choice, transient diversity, scientific inquiry, scientific interaction. 1

Transcript of Theory-Choice, Transient Diversity and the E ciency of...

Page 1: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

Theory-Choice, Transient Diversity and the

Efficiency of Scientific InquiryForthcoming in the European Journal for Philosophy of Science

AnneMarie Borg1, Daniel Frey2, Dunja Seselja3, and ChristianStraßer1

1Institute for Philosophy II, Ruhr-University Bochum2Faculty of Economics and Social Sciences, Heidelberg University

3Munich Center for Mathematical Philosophy, LMU Munich

Abstract

Recent studies of scientific interaction based on agent-based models(ABMs) suggest that a crucial factor conducive to efficient inquiry is whatZollman, 2010 has dubbed ‘transient diversity’. It signifies a process inwhich a community engages in parallel exploration of rivaling theorieslasting sufficiently long for the community to identify the best theory andto converge on it. But what exactly generates transient diversity? Andis transient diversity a decisive factor when it comes to the efficiency ofinquiry? In this paper we examine the impact of different conditions onthe efficiency of inquiry, as well as the relation between diversity and effi-ciency. This includes certain diversity-generating mechanisms previouslyproposed in the literature (such as different social networks and cautiousdecision-making), as well as some factors that have so far been neglected(such as evaluations underlying theory-choice performed by scientists).This study is obtained via an argumentation-based ABM (Borg et al.,2017, 2018). Our results suggest that cautious decision-making does notalways have a significant impact on the efficiency of inquiry while differ-ent evaluations underlying theory-choice and different social networks do.Moreover, we find a correlation between diversity and a successful per-formance of agents only under specific conditions, which indicates thattransient diversity is sometimes not the primary factor responsible for ef-ficiency. Altogether, when comparing our results to those obtained bystructurally different ABMs based on Zollman’s work, the impact of spe-cific factors on efficiency of inquiry, as well as the role of transient diversityin achieving efficiency, appear to be highly dependent on the underlyingmodel.

Keywords: agent-based modeling, cautious decision-making, theory-choice, transient diversity, scientific inquiry, scientific interaction.

1

Page 2: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

1 Introduction

Recent studies on epistemic effects of scientific interaction, conducted via agent-based models (ABMs), have largely focused on the context of theoretical diver-sity, where a scientific community pursues different rivaling theories within agiven scientific domain (Borg et al., 2017, 2018; Frey and Seselja, 2018a; Grim,2009; Grim et al., 2013; Kummerfeld and Zollman, 2016; Zollman, 2007, 2010).Since one of the rivaling theories is assumed to be the best, agents are successfulif they manage to converge on it. A take-home message from a number of thesestudies has been the following: in order for an inquiry to be successful it needsa property of ‘transient diversity’ (Zollman, 2010). Transient diversity refersto a process in which a community engages in a parallel exploration of differ-ent theories, which lasts sufficiently long to prevent a premature abandonmentof the best of the available theories, but which eventually gets replaced by aconsensus on the best theory. Or as Poyhonen and Kuorikoski, 2016 specify it,transient diversity represents “a proper balance between the diversity of beliefsand consensus”.

But what does exactly generate this kind of balance? Zollman has sug-gested that transient diversity can be obtained either by limiting informationflow among scientists or by equipping them with extreme prior values for theirinitial hypotheses (though not by both of these mechanisms at the same time).Kummerfeld and Zollman (2016) suggest that institutional encouragement ofunpopular, risky paths of inquiry may be necessary to obtain such a diversity.Finally, Frey and Seselja (2018) suggest that cautious decision-making may beyet another mechanism that increases the chance of the community achievingthe optimal degree of diversity. In all of these models mechanisms that gener-ate transient diversity function by preventing fully connected communities fromprematurely converging on a possibly wrong theory.1

Since all of the above ABMs are inspired by Zollman’s models,2 which rep-resent the situation of theoretical diversity in terms of ‘bandit problems’, thisraises the question whether the same kind of mechanisms still played a role (inthe sense of generating transient diversity) if we represented scientific inquiryin a structurally different way (as e.g. Grim et al., 2013 or Borg et al., 2018 do).Moreover, whether transient diversity is a robust property in the sense that acertain degree of a diversity is a ‘difference-making’ factor when it comes tosuccessful inquiry, is another open question. Addressing this issue is important

1Alexander, 2013 presents a slightly different scenario, where the number of rivaling theoriesgrows over time. His results suggest that some learning strategies (namely, the combinationof reinforcement and social learning via preferential attachment) can lead to the optimal levelof diversity, under the condition that agents discount the knowledge of past theories.

2We have omitted a class of models employing epistemic landscapes (such as Alexander,Himmelreich, and Thompson, 2015; Poyhonen, 2017; Thoma, 2015; Weisberg and Muldoon,2009) since they tend to represent a different kind of diversity than the one we are focusingon in this paper: they rather examine what would better be labeled as ‘cognitive diversity’(Poyhonen and Kuorikoski, 2016), which concerns different research heuristics employed byindividual agents across the given community. Moreover, efficiency of inquiry in these modelsis usually measured in terms of success of the community in discovering certain patches of thegiven landscape, rather than in terms of agents converging on a single theory.

2

Page 3: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

not only for the examination of the robustness of previous results, but also fora more precise understanding of the phenomenon of transient diversity and itsrelation to the efficiency of inquiry (where efficiency is a function of the rate ofsuccessful convergence and the required time).

In this paper we will examine this question by means of an argumentation-based ABM (ArgABM) of scientific interaction, which we previously presentedin Borg et al., 2017.3 We will focus on two kinds of interrelated mechanisms:

1. On the one hand, we will examine mechanisms that represent cautiousdecision-making (previously discussed by Frey and Seselja, 2018a with re-spect to a Zollman-inspired model). The first such mechanism is ‘rationalinertia’ that an agent has towards her pursued theory, which assures sheabandons the theory only after having repeatedly gathered evidence infavor of its rival for a significant period of time. The second mechanism isa relative threshold value which a rivaling theory has to surpass in orderto count as superior to one’s current theory.

2. On the other hand, we will examine different evaluative procedures, inview of which scientists decide which theory to pursue and on top of whichcautious decision-making is employed. For instance, agents in the modelmay prefer theories that have a wider scope than their rivals, or they mayavoid theories that exhibit more anomalies than their rivals. These mea-sures may come down to different preference orders on the given theories.While ABMs of scientific interaction have usually employed a specific kindof assessment, which of these assessments is either descriptively adequateor normatively desirable has largely remained open. To this end, it ishelpful to understand their impact on the efficiency of inquiry.

What makes ArgABM especially suitable for this research question is that,on the one hand, it employs both of the above mechanisms representing cautiousdecision-making as parameters of the model. On the other hand, the model al-lows for a straightforward approach to studying different assessment proceduresunderlying theory-choice of scientists. In addition, the model employs a specificapproach to knowledge representation, which is structurally different from Zoll-man’s or Grim & Singer’s models. For instance, both defensible and anomalousparts of knowledge can be located as specific parts of the given theories. Thismakes the model apt for the above mentioned robustness analysis.

Our results suggest that, a certain degree of diversity can be clearly iden-tified as correlated with efficient inquiry only when agents employ a specifictheory-choice assessment—namely, when they prefer theories that are based ona comparatively larger body of solidified research, relative to their rivals. Inthat case cautious decision-making has a positive impact on the efficiency offully connected communities. When it comes to other evaluations, as well asto less connected communities, cautious decision-making either has no impact

3The model is programmed in NetLogo (Wilensky, 1999). The code of the model employedin this paper can be found at: https://github.com/g4v4g4i/ArgABM.

3

Page 4: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

on the efficiency or it is harmful for it. Hence, this study indicates that deter-mining factors conducive to the efficiency of inquiry is highly dependent on thespecific model and its idealizations. This points to an important task for futureresearch: specifying which types of inquiry (for example, related to specific sci-entific domains) are more adequately represented by some of these conditionsand certain ABMs of science, rather than others.

Here is how we will proceed. In Section 2 we will present the central featuresof ArgABM. In Section 3 we will introduce four different types of evaluationunderlying scientists’ decision as for which theory to pursue. In Section 4 wewill explicate how we model cautious decision-making. In Section 5 we willpresent our results: we will show how different social networks perform in eachof the four evaluations, with and without the mechanisms of cautious decision-making. Moreover, we will analyze the impact of diversity on successful inquiry.In Section 6 we will conclude the paper suggesting some questions for futureresearch.

2 ArgABM: an overview

In this section we introduce ArgABM, an argumentation-based ABM of scientificinquiry, which has previously been used for the examination of epistemic effectsof scientific interaction under different types of social networks (Borg et al.,2017, 2018). The model is designed to measure the efficiency of groups of agentsin their knowledge acquisition. Knowledge acquisition is represented in termsof agents exploring a number of rivaling scientific theories, where they have todetermine which theory is the best one. Efficiency of their inquiry is representedin terms of their success in converging on the best of the available theories, andin terms of time they need to achieve this convergence.4

A specific feature of this model is that it aims to represent argumentativedynamics among scientists who explore rivaling theories or research programs5

and exchange arguments pro and con these theories along the way. To thisend, the model represents the argumentative context underlying theories, withinwhich scientists gather evidence for the hypotheses constituting the given theoryand against the rivaling ones. Such an argumentative context is represented interms of an argumentative landscape, explored by agents.

4In (Borg et al., 2017, 2018) the efficiency in terms of time is measured in a slightlydifferent way. Moreover, Borg et al., 2017 presents an alternative, ‘pluralist’ measure ofsuccess, according to which agents are successful if at the end of the run the best theoryisn’t less populated than any of its rivals. In the current section we will try to keep technicaldetails at the minimum. An interested reader can take a closer look at the above mentionedpublications on this model.

5For the sake of simplicity, we use the terms ‘theory’ and ‘research program’ interchange-ably in this paper.

4

Page 5: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

2.1 The argumentative landscape

As mentioned above, the model represents scientific inquiry in which scien-tists explore their research programs, gradually fleshing them out. They do soby exploring the argumentative landscape, which represents the argumentativecontext underlying the rivaling research programs. Each theory is representedas consisting of a number of arguments. These arguments are represented ab-stractly, as nodes in a directed graph, connected via a discovery relation. Anargument can be understood as a hypothesis supported by evidence gained bymeans of a certain study (e.g. an experiment).6 The discovery relation repre-sents paths that agents take when moving on the landscape, from one argumentto another. Its role is to track the temporal aspect of research where new re-search steps build on the previous ones. Moreover, arguments belonging to oneresearch program can attack arguments of one of the rivaling programs. Suchan attack represents, for instance, a discovery of a methodological problem ina certain study of the rivaling research program, or results of a novel studywhich provides a better explanation of a certain phenomenon than a study of-fered within the rivaling program.7 The landscape then consists of differentargumentative rooted trees, with nodes as arguments and edges as discoveryrelation, where an argument in one tree may attack an argument in anothertree (see Figure 1).8 The extent to which each research program is attacked isa parameter of the model. We represent all theories as trees of the same size,i.e. consisting of the same number of arguments.

While at the beginning of the run, agents only see the root argument ofeach theory, over the course of the run they gradually discover the rest of thelandscape. Each argument can be understood as a hypothesis investigated by

6For a concrete example of a scientific controversy—namely, the continental drift debate—represented by means of a similar framework (based on abstract argumentation) see Seseljaand Straßer, 2013.

7While we can imagine a situation in which a single argument serves as an objectionattacking the rivaling theory in whole (for example, showing the theory cannot explain acertain set of phenomena) in the current model we abstract away from such cases by employingthe idealization that attacks always target a specific part of a theory (e.g. an attack on a studyin a rivaling research program pointing to a methodological problem doesn’t necessarily attackresults of other studies within the same program—i.e. other arguments). Note that this isalready a step further in the direction of representational adequacy in comparison to Zollman-inspired ABMs. It remains a task for future research to examine whether our results remainrobust if we implemented a more detailed representation of argumentative attacks, e.g. byintroducing an explanatory relation between arguments and a set of explananda (as it is doneby Seselja and Straßer, 2013) and more refined evaluation procedures (as compared to theones to be introduced in Section 3).

8The representation of our landscape is inspired by abstract argumentation frameworks(Dung, 1995). Formally, the landscape is given by a triple 〈A, , ↪→〉 where ↪→ is the discoveryrelation, is the attack relation, and A = 〈A1, . . . ,Am〉 is partitioned in m many theoriesTi = 〈Ai, ai, ↪→〉 which are trees with ai ∈ Ai as a root and

⊆⋃

1≤i,j≤mi 6=j

(Ai ×Aj) and ↪→ ⊆⋃

1≤i≤m

(Ai ×Ai).

Specifying like this ensures that the theories are conflict-free, i.e. that there are no attacksbetween the arguments of the same theory.

5

Page 6: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

ResearchProgram 1

ResearchProgram 2

Figure 1: An example of an argumentative landscape consisting of 2 theories (orresearch programs). Darker shaded nodes represent arguments that have beeninvestigated by agents and are thus visible to them; brighter shaded nodes standfor arguments that aren’t visible to agents. The biggest node in each theoryis the root argument, from whcih agents start their exploration via discoveryrelation, which connects arguments within one theory. Arrows stand for attacksfrom an argument in one theory to an argument in another theory.

scientists. Throughout their exploration of the landscape, our scientists willoccasionally encounter defeating evidence, represented as attacks coming fromarguments in a rivaling theory. Moreover, they may encounter arguments thatdefend their attacked hypotheses, where—informally speaking—an argument ais defended in the theory if it is not attacked or if each attacker b from anothertheory is itself attacked by some defended argument c in the current theory.9,10

Let’s look at the example illustrated in Figure 2. In graph (a) we haveargument a1 from theory T1, which is attacked by argument b1 from theoryT2. In this case, a2 defends a1 since it attacks b1, the attacker of a1. If in the

9More precisely, we call a subset of arguments A of a given theory T admissible iff for eachattacker b of some a in A there is an a′ in A that attacks b. Since every theory in the modelis conflict-free, it can easily be shown that for each theory T there is a unique maximallyadmissible subset of T (with respect to set inclusion). An argument a in T is said to bedefended in T iff it is a member of this maximally admissible subset of T .

10Agents discover attacks to and from their current arguments, as well as the child argumentsof their current arguments gradually, depending on the degree of exploration assigned to thecurrent argument at a given time point of a run: for each agent ag and each argument a ∈ A,expl(a, ag) ∈ {0, . . . , 6} where 0 indicates that the argument is unknown to ag and 6 indicatesthat the argument is fully explored and cannot be further explored. Since the model is round-based, each round may be interpreted as one research day. Each of the 6 levels of an argumenttakes a researcher 5 rounds/days of exploration. Thus, each argument represents a hypothesisthat needs altogether 30 research days to be fully investigated.

6

Page 7: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

a1 b1

a2 b2

(a)

a1 b1

a2 b2

(b)

Figure 2: Argumentation graph with arguments a1 and a2 to theory T1, andarguments b1 and b2 belonging to theory T2. Discovery relations are omitted.

further course of exploration agents encounter b2, which attacks a2 (graph (b)),then the previous defense becomes unsuccessful and both a1 and a2 will now beundefended (for a formally precise definition of defended arguments see belowSection 3.1).

The idea behind such argumentative dynamics stems from the defeasiblecharacter of scientific reasoning, where throughout inquiry scientists may en-counter defeating evidence for their previously accepted hypotheses, and defenseof hypotheses that they have earlier rejected. This feature allows for the repre-sentation of errors that commonly appear in scientific research: false positives(accepting a hypothesis that is actually false) and false negatives (rejecting ahypothesis that is actually true). This is important in a model that aims to ex-amine the efficiency of scientific inquiry, since these errors have a direct impacton it. Cases in which scientists accepted a false hypothesis (sometimes simula-teously with rejecting a true one) are well known from the history of science.11

This is precisely why Zollman-inspired models examine the efficiency of inquiryby focusing on the mechanisms that are conducive to minimizing the risk offalse positives and false negatives.

The argumentative dynamics in our model allows for a straightforward rep-resentation of false positives and false negatives: the former are arguments thatinitially appear defensible, though further inquiry would reveal that they arenot; the latter are arguments that are attacked and undefended, but for whichdefense can eventually be found.

Now, an important feature of the model is that one of the rivaling researchprograms is designed as the ‘best one’. In this way we can measure the effi-ciency of scientists by assessing their success and time needed to converge onthis particular theory. The best theory is simply the one which is designed asfully defended from all the attacks, in the fully explored landscape.12 This is,of course, an idealization, but it helps to represent the above mentioned ap-pearance of false positives and false negatives: while at early stages of inquiry,the best theory may appear to have many anomalies (undefended arguments),if scientists keep on exploring it, they will find solutions for these anomalies

11For instance, the continental drift debate or the research on peptic ulcer disease are someof the cases in point (see Seselja and Straßer, 2014b; Seselja and Weber, 2012).

12The other theories are modeled as having a certain percentage of their arguments attackedand undefended.

7

Page 8: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

(namely, defenses of the attacked arguments).

2.2 Behavior of agents

The model is round-based and each round agents perform one of the followingactions:

A1 exploring a single argument, thereby gradually discovering possible attacks(on it, and from it to arguments that belong to other theories) as well asdiscovery relations to neighboring arguments;

A2 moving to a neighboring argument along the discovery relation within thesame theory;

A3 moving to an argument of a rivaling theory.

As mentioned above, agents start the run of the simulation at the root ofa given theory and gradually discover more and more of the argumentativelandscape. In this way each turn an agent operates on the basis of her own(subjective) fragment of the landscape, which consists of arguments that shehas explored to a specific degree, and (attack and discovery) relations that shehas found between the arguments.

To decide whether to keep on pursuing their current theory (actions A1 andA2 above), or whether to better start working on an alternative theory (A3)agents are equipped with the ability to evaluate theories.13 Every few roundsthey apply an evaluative procedure, with respect to the set of arguments andattacks they currently know (i.e. their subjective memory). We will introducefour such procedures in the next section. For now, it will suffice to say that allsuch evaluations are based on the question, how many defended or undefendedarguments the theory has.

2.3 Social networks

Just like other models of scientific interaction, ArgABM employs social net-works. In particular, agents form their subjective knowledge of the landscapenot only in view of information they gather on their own, but also in view ofinformation they receive from other agents, with whom they are linked in anetwork. There are two types of such networks:

1. Collaborative groups, which consist of five individuals who start from thesame theory. While each agent gathers information about the landscapeon her own, every five time steps this information is shared with all otheragents forming the same collaborative network.

13In addition, agents are equipped with a certain heuristic behavior, which allows them tosearch for the defense of their current argument in case it is attacked. See Borg et al., 2017,Section 2.2.

8

Page 9: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

Figure 3: A cycle, a wheel and a complete graph. Each node is a collaborativegroup, while the edges represent communication channels.

2. Community-networks, between collaborative groups, which are formed outof representative agents from each of the linked collaborative networks(one representative agent per collaborative network). Within community-networks agents share information (arguments and attack relations) thatthey have recently gathered via their exploration. This could be inter-preted as having a scientist report on her recent (positive and negative)findings concerning her current theory, by writing a paper or giving aconference talk.14 Community-networks can have one of the followingstructures: a cycle, in which each collaborative group is connected to ex-actly two other groups, a wheel which is similar to the cycle, except thata unique group is connected to every other group, and a complete graphwhere each group is connected to all other groups (see Figure 3).

3 Evaluations underlying theory-choice

As mentioned in the previous section, agents in ArgABM assess their theoryin order to decide whether to stick with it, or to switch to one of the rivalingtheories. In this section we will present four evaluative procedures, in view ofwhich scientists can make such a theory-choice.

In order to explore the space of possibilities, we start with two simple mea-sures, and then proceed by adjusting them towards two additional, more com-plex measures. Of course, which of these measures (or yet some other ones)is actually employed by scientists is an empirical question, which cannot beanswered from a philosophical armchair. We will motivate four suggestions forimplementing such evaluative procedures in the context of ArgABM (see Section6 for some additional proposals).

14In the current model we assume that agents reliably share information, i.e. that they shareboth positive and negative findings about their current theory. Borg et al., 2018 examine inaddition deceptive information sharing, i.e. agents who share only positive findings abouttheir theory (arguments and attacks to other theories), while witholding the informationabout attacks on their own theory. Whether the results presented in this paper also hold fordeceptive agents remains a question for future research.

9

Page 10: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

g

e a c

f b d

theory defended degree of def.T1 = {e, f} (white) {f} 1T2 = {a, b, g} (gray) {} 0

T3 = {c, d} (dark gray) {} 0

Figure 4: Argumentation Framework 1

3.1 The degree of defensibility (assessment D)

Our first measure is the assessment of theories in terms of their degree of de-fensibility.15 We will call it for short: assessment D. The degree of defensibilityof a theory is the number of defended arguments in this theory. T1 is preferredto T2 iff T1 has more defended arguments than T2.

This strategy represents scientists who are easily impressed by the size of atheory, that is, by the size of its defensible parts.16 In other words, they keepon pursuing their current theory unless one of the rivaling theories turns out tohave more defended arguments.

Let’s give a more precise formal definition. First, we call a subset of argu-ments A of a given theory T admissible iff for each attacker b of some a in Athere is an a′ in A that attacks b. Since every theory is conflict-free, it can easilybe shown that for each theory T there is a unique maximally admissible subsetof T (with respect to set inclusion). An argument a in T is said to be defendedin T iff it is a member of this maximally admissible subset of T .17 The degreeof defensibility of T is equal to the number of defended arguments in T .

Figure 4 depicts a situation with three theories as it might occur from theperspective of a given agent: T1 consisting of arguments e and f (white nodes),T2 consisting of arguments a, b and g (gray nodes), and T3 consisting of ar-guments c and d (dark gray nodes). The arrows represent attacks, we omitdiscovery relations. We are now interested in the degrees of defensibility ouragent would ascribe to the given theories. The table shows which argumentsare defended in each theory and their corresponding degree of defensibility. Theonly defended argument in this situation is f in theory T1. Note for instancethat in T3 the argument d is not defended since no argument in T3 is able to

15This measure is employed in previous versions of ArgABM (Borg et al., 2017, 2018).16An alternative way to interpret this assessment is in terms of an explanatory scope of a

theory, where we are assuming that the arguments constituting the given theory are explana-tory in nature (see Seselja and Straßer, 2013). A less idealized measure of explanatory powercould be implemented by introducing a set of explananda E and an explanatory relation fromsome of the arguments in the theory to a subset of E.

17Given that theories in the model are conflict-free, the notion of admissibility is here thesame as the one introduced in Dung, 1995. In Dung’s terminology, our sets of defendedarguments correspond to preferred extension (which are exactly the maximally admissiblesets), except that we determine these sets relative to given theories.

10

Page 11: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

g

e a c

f b d

theory defended degree of def.T1 = {e, f} (white) {} 0T2 = {a, b, g} (gray) {a, b, g} 3

T3 = {c, d} (dark gray) {} 0

Figure 5: Argumentation Framework 2

defend it from the attack by b. Although the argument f in T1 attacks b, itdoesn’t count as a defender of d for theory T3 when determining the defendedarguments in T3 since in our account a theory is supposed to defend itself.

Figure 5 depicts the situation after an attack from a to f has been discovered.Consider theory T2. In this situation a defends b from the attack by f , b defendsa from the attack by d, a defends g from the attack by e and g defends a fromthe attack by c. Hence, all arguments are defended resulting in a degree ofdefensibility of 3.

3.2 The degree of anomaly (assessment A)

According to this measure, T1 is preferred to T2 iff T2 has more undefendedarguments than T1. We call it for short: assessment A. If we interpret thenumber of undefended arguments as the degree of anomaly of the given researchprogram, this strategy can be taken as representing scientists who abandontheories that become more anomalous than their rivals. This approach couldbe seen as corresponding to a Kuhnian scientist who resists converting to a newparadigm until her theory is clearly more anomalous than its rival (see Kuhn,1962 [1996]).

Taking a look at the scenario in Figure 4, T1 has a degree of anomaly 1,while T2 has a degree of anomaly 2 and T3 has a degree of anomaly 3. Hence,agents will prefer T1. In Figure 5 T1 and T3 have a degree of anomaly 2, whileT2 has a degree of 0. Here they will thus prefer T2.

3.3 Multiplication (assessment M)

We now turn to more sophisticated assessments. According to the measurewhich we call ‘multiplication’, T1 is preferred to T2 iff |Undef(T1)| · |Disc(T1)| <|Undef(T2)| · |Disc(T2)|, where Undef(Ti) stands for undefended arguments oftheory Ti, and Disc(Ti) stands for all discovered arguments of Ti (i.e. argumentsthat belong to the knowledge base of the agent). We call this procedure for short:assessment M.

This strategy represents scientists who are less forgiving toward anomaliesin their research program the more advanced it is (i.e. the more arguments ithas). This approach could be seen as corresponding to the Lakatosian idea that

11

Page 12: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

in their early stages research programs are infested with anomalies, which areexpected to be resolved as time passes by (see Lakatos, 1978).

Taking a look at the example in Figure 4, if we assume all the argumentsin the framework are actually discovered, then T1 has a multiplication score of1× 2 = 2, T2 has a score of 3× 3 = 9 and T3 2× 2 = 4. Agents will thus preferT1.

3.4 Normalization (assessment N)

Our final measure is labeled ‘normalization’ since according to it, T1 is pre-ferred to T2 iff |Undef(T1)|/|Disc(T1)| < |Undef(T2)|/|Disc(T2)|, where againUndef(Ti) stands for undefended arguments of theory Ti, and Disc(Ti) standsfor all discovered arguments of Ti.

18 We call this evaluation for short: assess-ment N.

This strategy represents scientists who evaluate the defended (or anomalous)scope of their research program relative to how advanced it is. The idea behindthis assessment is similar to Bayesian updating via beta-distributions (employedby Zollman, 2010), the mean of which is given by the ratio of the number ofsuccessful draws from the distribution through the number of all draws.

Considering the example in Figure 4 and assuming all the arguments arediscovered, the normalization score for T1 is 1/2 = 0.5, for T2 it is 3/3 = 1, andfor T3 it is 2/2 = 1. Thus, agents prefer T1.

While applying our four evaluations to the example in Figure 4 has led tothe same preference order (with T1 being selected in each case), the followingexample illustrates that our four assessments may not always lead to the sametheory-choice.

The example in Figure 6 consists of two theories, a blue one, T1, with ar-guments a1-a3, and a green one, T2, with arguments b1-b6. We have thatDisc(T1) = 3, Def(T1) = 1, Undef(T1) = 2, Disc(T2) = 6, Def(T2) = 3 andUndef(T2) = 3. Hence, T1 has a multiplication score of 6 and a normalizationscore of 2

3 and T2 has a multiplication score of 18 and a normalization scoreof 3

6 . Therefore, T1 is preferred over T2 if theories are compared by means ofassessments A or M, and T2 is preferred over T1 when evaluation is done bymeans of assessments D or N.

18It is easy to show that the following measure results in an equivalent preference order:T1 is preferred to T2 iff |Def(T1)|/|Disc(T1)| > |Def(T2)|/|Disc(T2)|, where Def(Ti) stands fordefended arguments.

12

Page 13: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

a1 b1 b4

a2 b2 b5

a3 b3 b6

T1 better? T2

D 1 ≺ 3

A 2 � 3

M 6 � 18

N 23 ≺ 3

6

Figure 6: Argumentation framework illustrating different evaluations underlyingtheory-choice. D: degree of defensibility, A: degree of anomaly, M: multiplica-tion, N: normalization.

4 Modeling cautious reasoning

We will now explicate two types of diversity-preserving mechanisms, each ofwhich can be understood in terms of cautious reasoning, that functions in com-bination with evaluations presented in the previous section.

4.1 Rational inertia: temporal threshold

The first mechanism has the aim to prevent agents from being hastily swayed bynew evidence. It functions in the following way: an agent abandons her currenttheory and switches to a rivaling one only after she has received consistentevidence showing that the latter is better for X number of evaluations (whereX is a parameter of the model). We will refer to X as temporal threshold.This corresponds to the situation in which scientists don’t easily abandon theirtheory, even after discovering problems with it. Instead, they stick with it untiland unless they are convinced that it can no longer be saved from the defeatingevidence and that its rival is superior to it.

We call such an inertia rational for it wouldn’t make much sense for a scien-tist to prematurely abandon her theory, before she is sure the current anomaliescannot be resolved and the theory improved. In this sense, it is rational for ascientist to stick to her theory for a while longer (see Kelp and Douven, 201219).Moreover, such an inertia is rational also in view of the fact that changing one’sinquiry usually comes with various costs (such as acquiring additional knowl-edge, new equipment, etc.).

19Rational inertia shouldn’t be confused though with the ‘Steadfast Norm’ discussed byKelp and Douven in the same paper, and well-known in the literature on peer disagreement.Unlike in their account, in our model we may interpret a scientist as having a rational inertiatowards her theory, while having lowered her confidence that the theory is actually true.

13

Page 14: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

4.2 Similarly successful theories count as equally good:epistemic threshold

While a rational inertia keeps agents ‘sticky’ on their theories for a certain periodof time, our second mechanism keeps them ‘sticky’ for as long as the rivalingtheory isn’t significantly better than their current one. To this end, agents stayon their current theory unless it has been surpassed by a rival beyond a giventhreshold value, relative to the employed evaluation procedure. We call such athreshold – epistemic threshold.

More precisely, an agent abandons her current theory only if it fails to beone of the best theories, where the set of ‘best theories’ is calculated by meansof the four evaluative procedures together with the epistemic threshold in thefollowing way:

• For the evaluation in terms of assessment D: if Ti stands for a theory thathas the highest degree of defensibility, then the set of best theories consistsof those theories that have at least the following assessment D:

|Def(Ti)| · [epistemic threshold]

where epistemic threshold is a value from the interval (0, 1].

• For the evaluation in terms of assessments A, M and N: if Ti stands fora theory that has the lowest Evaluation Score(Ti) for each of the threemeasures, then the set of best theories consists of those theories that havemaximally the following score:

Evaluation Score(Ti)

[epistemic threshold]

where epistemic threshold is a value from the interval (0,1].

For instance, let Ti be a theory with Disc(Ti) = 20, Undef(Ti) = 10 andDef(Ti) = 10 and assume Ti is the theory with the most defended argumentsand the lowest evaluative score according to A, M and N procedures. We choosethe epistemic threshold of 0.9. For each of the evaluation procedures we get thefollowing scores:

• D: all theories that have at least 10 · 0.9 = 9 defended arguments will fallamong the set of best theories,

• A: all theories whose degree of anomalies is smaller than 10/0.9 = 11.11count among the best ones,

• M: all theories whose multiplication score is less that (10·20)/0.9 = 222.22count among the best ones,

• N: all theories whose normalization score is less than (10/20)/0.9 = 0.55count among the best ones.

The primary idea behind this mechanism is that a rivaling theory has to passa sufficiently wide margin to be considered superior to one’s current theory. Thiscorresponds to the reasoning of a scientist who uses a dose of caution in such

14

Page 15: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

evaluations, knowing that future inquiry might reveal new evidence. As a result,she will abandon her current theory not merely after she has seen it performworse multiple number of times (as in the case of rational inertia), but onlyafter its rival has become sufficiently superior to it.

In Table 1, we show the sets of best theories for the example in Figure 6, fordifferent values of epistemic thresholds.

Epistemic threshold 14

13

12

23

34

D {T1, T2} {T1, T2} {T2} {T2} {T2}A {T1, T2} {T1, T2} {T1, T2} {T1, T2} {T1}M {T1, T2} {T1, T2} {T1} {T1} {T1}N {T1, T2} {T1, T2} {T1, T2} {T1, T2} {T1, T2}

Table 1: Set of best theories for the example in Figure 6, for different epistemicthreshold values.

5 Our findings

In this section we present the results of our simulations, focusing on two mea-sures: how successful agents are in converging on the best theory, and how muchtime they need to converge on it.20 Each of the plots shows a mean of 10,000simulations for each data point (unless otherwise indicated). All the simulationswere run with a landscape consisting of 3 theories, each having 85 arguments.While the best theory is fully defended, the other two theories have a certainportion of undefended arguments.

Concerning the last point, we employ two types of landscapes:

1. an ‘easy’ landscape, in which the two suboptimal theories have around35% of undefended arguments,21

2. a ‘difficult’ landscape, in which the two suboptimal theories have around85% of undefended arguments.

That a landscape is easy/difficult means that theories are more or less similarin terms of their degree of defensibility, which makes it easier or harder todetermine which one is the best.

A simulation stops when one of the theories is fully explored. At this pointwe examine whether the agents have converged on the best theory, and if so, atwhich step of the simulation they have done so.22

20Due to space restructions, many of the plots are omitted from the paper and can be foundin the online appendix of this paper.

21For the exact procedure of how the attacks are generated, and the degree of defensibilityof the two suboptimal theories such a procedure results in, see Borg et al., 2018.

22The reason why we stop the simulation at this point is that otherwise some agents wouldbecome ‘idle’: since they have explored their preferred theory fully, the only way they wouldchange their preference is by waiting for other agents to send them new information. Borg

15

Page 16: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

As for our two mechanisms explicated in the previous section—which we callfor short ‘threshold mechanisms’ or ‘thresholds’—we have employed the tempo-ral threshold of 10. This means that in order for an agent to switch to a rivalingtheory, she has to consistently evaluate that theory as one of the best ones (andbetter than her current theory) for 10 (not necessarily consecutive) rounds.23

For the epistemic threshold, we have opted for a relatively small value of 0.9.We have tested our results with higher thresholds (e.g. temporal threshold of50, and the epistemic threshold of 0.7) and they have remained robust underthese changes, except for the time agents need to achieve convergence, which,as expected, increases with higher thresholds.

5.1 Results

We will now focus on four interesting points revealed by the simulations. In thenext subsection we will discuss these findings.

Impact of threshold-mechanisms. First, the impact of the threshold mech-anisms varies across different evaluative procedures. The only case where weobserve a positive effect of thresholds on the success of agents is the completegraph employing procedure D. The impact of thresholds on different networksemploying assessment D can nicely be observed in case of a larger population (of70 agents), represented in Figure 7. In case of all other evaluations and networkstructures, thresholds either have no effect or they have a negative effect, acrossboth easy and difficult landscapes (see Table 2).

et al., 2019 propose an alternative model in which the simulation continues after this point,eventually bringing all agents on the best theory, so that the efficiency is measured in termsof time only (similarly to the ABM proposed by Frey and Seselja, 2018a).

23In view of an interpretation suggested by Borg et al., 2017, according to which a roundstands for a working day, this threshold means that scientists have to wait 10 weeks beforebeing able to change their theory. Of course, different interpretations of the time in the modelare possible.

16

Page 17: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

0

20

40

60

80

Perc

ent o

f suc

cess

ful r

uns

Easy landscape Difficult landscapeNo thresholds With thresholds No thresholds With thresholds

cycle wheelcomplete

Figure 7: Success of agents employing procedure D (70 agents).

Success TimeEvaluation Complete Sub-Complete Complete Sub-CompleteD + + − − − − − −A ± ± ± ± ± ± ± ±M ± ± − − ± ± ± ±N ± ± ± ± − ± − ±

Table 2: The impact of threshold-mechanisms on different social networks withrespect to the four evaluative procedures on the easy landscape (on the left) andon the difficult landscape (on the right with shaded background). Complete:complete graph; Sub-Complete: cycle and wheel networks; +: positive impact;−: negative impact; ±: neither positive nor negative impact. Note that theeffect is the strongest for larger populations.

Efficiency of different evaluative procedures. Second, different evaluativeprocedures result in drastically different degrees of efficiency, across all threenetworks. While D assessment results in the worst performance for all threenetworks in case of both types of landscapes, N procedure makes all three net-works very efficient on the easy landscape. Nevertheless, a complete graphemploying the M procedure overtakes the N one on the difficult landscape (seeFigures 8 and 9).

Efficiency of different social networks. Third, the relative efficiency ofdifferent social networks remains pretty robust across all explored scenarios,with the complete graph outperforming less connected networks in terms ofboth – the success of agents in converging on the best theory, and the amount

17

Page 18: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

0

20

40

60

80

100

Perc

ent o

f suc

cess

ful r

uns

0 20 40 60 80 100Scientists

D AM N

Figure 8: Easy landscape: success of agents connected in the complete graphfor different evaluation procedures (aggregated over both runs with thresholdsand without thresholds).

0

20

40

60

80

100

Perc

ent o

f suc

cess

ful r

uns

0 20 40 60 80 100Scientists

D AM N

Figure 9: Difficult landscape: success of agents connected in the complete graphfor different evaluation procedures (aggregated over both runs with thresholdsand without thresholds).

18

Page 19: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

0

5

10

15

20

0 .5 1 0 .5 1

No thresholds With thresholds

Den

sity

of o

bser

vatio

ns

Degree of diversity

(a) D procedure

0

2

4

6

8

0 .5 1 0 .5 1

No thresholds With thresholds

Den

sity

of o

bser

vatio

ns

Degree of diversity

(b) N procedure

Figure 10: The effect of thresholds on the degree of diversity (70 agents, difficultlandscape, complete graph)

of time they need to achieve such convergence. In the case of A, M and Nevaluations the complete graph is extremely successful on the easy landscape,while being a bit less successful on the difficult one.

Transient Diversity. As mentioned in Section 1, the literature on ABMs ofscience has advanced the idea that in order to optimize efficiency of scientificinquiry we seek a diversifying mechanism that creates a tension among agentssuch that it is (a) strong enough to prevent agents from an early convergence onthe wrong theory and (b) sufficiently soft to enable them to eventually convergeon the right theory. The wanted type of diversity has been labeled transient.One ingredient of such diversity was identified in the social network structure,another one in epistemic biases (Zollman, 2010). In this paper we have studiedother parameters, such as evaluative standards of agents and (temporal andepistemic) thresholds used by agents when deciding when to choose anothertheory.

Our first expectation is that higher thresholds have a diversifying effect sim-ilar to loser network structures. And indeed this is what we see for instance inFigure 10 for the D and N procedure. We measure diversity of a run in terms ofthe number of rounds in which agents have no consensus on any theory dividedby the number of rounds it took to terminate the run. We can see that the centerof mass is moved to the right (more diversity) when introducing thresholds.

When considering the relation between the degree of diversity and efficiencywe may naively expect a bell-shaped curve at whose peak we find runs withmost efficiency while moving to more or less diversity the situation worsens.Things are more complicated, though. We find, for instance, a camel-like curvefor the D-procedure and difficult landscapes (see Figure 11) with one peak forruns with diversity degrees between 0 and 0.1, and another peak for runs withdiversity degrees between 0.7 and 0.8. Furthermore, the difficulty of the land-scape influences the shape of the curve: for easy landscapes more diversity ishighly beneficial as we can see for the interval from 0.5 to 0.8, but less so for

19

Page 20: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

0

20

40

60

80

100

Perc

ent o

f suc

cess

ful r

uns

0 .2 .4 .6 .8 1Degree of diversity

D evaluation, difficult landscape D evaluation, easy landscape

N evaluation, difficult landscape N evaluation, easy landscape

(a) Percentage of successful runs relative todegree of diversity

0

50

100

150

200

Aver

age

roun

d co

nver

ged

0 .2 .4 .6 .8 1Degree of diversity

D evaluation, difficult landscape D evaluation, easy landscape

N evaluation, difficult landscape N evaluation, easy landscape

(b) Round of convergence relative to degree ofdiversity

Figure 11: The effect of diversity on efficiency (70 agents, complete graph)

low diversity degrees (unlike in the difficult landscape). Also the evaluation cri-terion matters, as we can observe when considering the N procedure where wesee a continuous (for a long time slow) decline of efficiency with higher degreesof diversity.

In sum, the efficiency-diversity relation does not in general exhibit a simplebell-like curve. Moreover, the shape of the curve is highly dependent on factorssuch as the underlying evaluative procedure and the difficulty of the problem.Furthermore, in some cases (like the N procedure) diversity has not much ofan influence on efficiency (except for extreme degrees). This also highlights theimportance of studying other factors which influence the efficiency of scientificinquiry, such as evaluation procedures, as done in this paper.

5.2 Discussion

We will now comment on a few most important aspects of our findings.Highly successful communities. The first striking point that deems anexplanation is the extremely high success rate of fully connected communitiesin case of A, M and N evaluations. Why do these populations perform so well?To answer this question, we will first explain (i) why fully connected networkstend to be at least as successful as the less connected ones, and in most casesmuch more successful, and then turn to (ii) the success of A, M and N evaluationsin particular.

As for (i), the reason for their success lies in the way information is repre-sented in our model. How accurate one’s assessment about the given theories is,directly depends on how much knowledge of the landscape the agent has. Largergaps in such knowledge can easily lead to errors in theory assessments. Now,since our agents share only recently acquired information (rather than their en-tire knowledge of the landscape), in less connected communities some of thisinformation may easily be missed, and hence their knowledge of the landscapewill be ‘patchier’. As a result, they may fail to accurately determine the best

20

Page 21: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

theory.24 Note that this is also why larger communities linked in sub-completegraphs have a low success rate: since in our community-networks not every agentcommunicates with every other agent (instead collaborative groups appoint rep-resentative agents who then share information in community-networks), the de-gree of connectedness gets smaller the larger the overall population is, and as aresult, subjective knowledge can in larger populations be rather different acrossdifferent collaborative groups. Moreover, since agents share recently gatheredinformation, there may be a permanent information loss in such groups. This isin contrast to, e.g., Zollman’s model, where any shared information is represen-tative of the entire theory, which makes information losses much less harmful.We take ArgABM, however, to be representative of a situation in which scientistswho don’t share all their results may fail to have an encompassing understandingof each of the rivaling theories (e.g. they might lack an insight into an importantstudy in one of the theories). This means that larger populations of scientistswill have a harder time converging on the same theory due to the fact that theyassess theories in view of different evidence. This is, however, not unrealistic:larger scientific communities that are not tightly connected indeed tend to havea harder time achieving consensus on one theory.

As for (ii), the reason why A, N and M evaluations perform better than Dbecomes clear when we observe that agents in the case of the former assessmentstend to switch more often from one theory to another (see Figures 12 and 13). Inother words, these assessments generate diversity by allowing agents to changetheir theories and gain enough information about them to accurately decidewhich one is the best.

Cautious decision-making. What do our results tell us about cautiousdecision-making and its conduciveness to efficient inquiry? The impact of ourthreshold mechanisms seems to be highly dependent on (i) the degree of con-nectedness of the given community, and (ii) the evaluation underlying theorychoice employed by agents (as visible from Table 2). Altogether, the thresh-olds increase the efficiency only of fully connected communities that employ Dassessment, while sometimes having the opposite effect on the less connectedones. Moreover, for A, N and M assessments the addition of thresholds justslows them down.

In view of these considerations it might seem like our mechanisms of cautiousdecision-making play no beneficial epistemic role at all unless scientists applythe assessment in terms of D procedure. Nevertheless, a closer look at thesimulations reveals that thresholds do play an important role, which is notimmediately clear when analyzing the results for success and time. Lookingat the exploratory behavior of agents—how many times they switch from onetheory to another—we observe that without the presence of thresholds, agents

24Though we haven’t examined the situation in which agents share a random subset oftheir knowledge of the landscape (rather than only recently acquired information), the fullyconnected community would most likely still outperform the less connected networks since,on the one hand, it would still have a less patchier knowledge of the landscape than the othertwo networks, while on the other hand, such a change is not likely to increase the chance thatthe community prematurely abandons the best theory.

21

Page 22: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

0

5

10

15

Aver

age

jum

ps

D A M N

cycle wheelcomplete

Figure 12: The number of times agents switch from one theory to anotherwith no threshold-mechanisms, averaged over all population sizes for the easylandscape.

05

1015

Aver

age

jum

ps

D A M N

cycle wheelcomplete

Figure 13: The number of times agents switch from one theory to anotherwith temporal threshold of 10 and epistemic threshold of 0.9, averaged over allpopulation sizes for the easy landscape.

22

Page 23: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

frequently switch between theories (see Figures 12 and 13). While our modeldoesn’t take into account that changing theories can be costly (in terms of timeone needs to learn the necessary background knowledge or in terms of costsof acquiring the right equipment), in many domains this can be an importantissue.25

This brings us to the following conclusion: while in view of previous ABMs(such as Frey and Seselja, 2018a), it seemed that threshold mechanisms playedan important role in generating transient diversity in fully connected commu-nities, our results indicate that this is the case only under certain conditions.More precisely, threshold mechanisms will have a beneficial impact only if thecosts of changing theories, occurring in the absence of cautious decision-making,are high enough to make incautious communities slower than the cautious ones.This points to the importance of including this factor in ABMs of scientific in-quiry. Note, however, that a proper study of such costs would require empiricalcalibration of the given model. First, the time in the model would have to bemapped to the real time of inquiry, and second, the costs associated with chang-ing one’s theory would have to be based on empirical data concerning the givendomain of science.

The role of diversity. Let’s take now a closer look at the D procedure toget a better understanding of the role diversity plays in our simulations. Aswe can observe in Figure 10a, without thresholds the majority of the runs isroughly located between diversity degrees 0 and 0.5 while with thresholds it isroughly between 0.5 and 1. When introducing thresholds we only get a slightincrease in successful runs for the difficult landscape despite the vast difference indiversity (see Figure 7). How to explain this? The answer is given in Figure 11a.Given the information from Figure 10a, We notice that without thresholds manysuccessful runs will be located around the steep peak at 0.1 and not many aroundthe peak at 0.7 to 0.8. When introducing thresholds the situation is exactly viceversa. Since overall the area between 0.5 and 0.8 is more elevated as comparedto the area from 0 to 0.5 we get a slight boost in efficiency, but not too much.

This analysis demonstrates that when analyzing the given dynamics in ourruns, diversity has explanatory value: only by combining the data given inFigures 10a and 11a we were able to explain the only slight performance boostin Figure 7. Nevertheless, we consider the investigations into diversity in thissection preliminary for several reasons. For instance, our way of measuringdiversity is still very rough. A more refined approach may provide measures thatdistinguish between synchronic and diachronic diversity: the former concernsthe distribution of agents among different theories at a given time point, thelatter concerns the number of times agents change theories over the course of arun. Our current measure can be considered as a rough way of measuring theformer. We postpone a more in-depth analysis for future work.

More general take-home message. More generally, these results show that

25For example, different hypothesis in medicine concerning the main causes behind a givendisease may require knowledge in different medical disciplines. For a further discussion on theimportance of including costs of this kind into ABMs of science see Muldoon, 2017.

23

Page 24: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

determining the impact of a specific factor on the efficiency of scientific inquiry ishighly dependent on the specific model and its idealizations. While in Zollman-inspired ABMs threshold mechanisms had a big impact, in ArgABM they do soonly under very specific conditions. In the former, their main role is in prevent-ing the community from prematurely converging on the wrong hypothesis byallowing for more data to be gathered before the decision is made. This is alsothe case in ArgABM when agents employ D assessment on the easy landscape.However, a much better approach to increasing the efficiency seems to lie in thetype of assessment underlying scientists’ decisions as for which theory to pursue.Altogether, our analysis provides further support to the argument that ABMsof science are in need of detailed robustness analysis before we can draw fromthem any conclusions about actual scientific practice.26

It is also worth noticing that differences between our procedures for theory-choice could be understood as representing specific epistemic and methodolog-ical values preferred by scientists. While such preferences are still highly ide-alized across ABMs of science, our results suggest that methodological valuesmay play an important role in the efficiency of inquiry and that they deservefurther attention. Beside Weisberg and Muldoon’s (2009) ‘mavericks’ and ‘fol-lowers’, or Currie and Avin’s (2018) ‘obligates’ and ‘omnivores’27, other types ofmethodological preferences could be considered: for instance, a method basedon the search for defeaters vs. a method that prioritizes corroborating evidencefor one’s current hypothesis, etc.

Another important take-home message is that some relevant factors may verywell remain hidden unless we take an in-depth analysis of the given simulations.For instance, while the impact of the threshold-mechanisms seemed rather neu-tral or even harmful for three of our evaluations, only once we have examinedhow often scientists change theories, it has become obvious that they did playan important role—by reducing possibly high costs that may be involved in ascientist’s frequent change of a pursued theory.

6 Outlook and conclusion

In this paper we have investigated the impact of different factors on the effi-ciency of scientific inquiry by means of ArgABM. To this end, we have examinedthe impact of cautious decision-making, different assessments underlying theory-choice, and different network structures on the efficiency of inquiry. In addition,we have examined the phenomenon of transient diversity by studying the re-lationship between diverse, non-consensual spread of scientists across differenttheories and their performance under varying conditions. Our results suggest

26For the importance of robustness analysis for models in general see e.g. Weisberg, 2006and for ABMs of science in particular see Frey and Seselja, 2018a,b; Seselja, 2019.

27While maverics and followers stand for more or less epistemically risk-averse agents, om-nivorse are agents that prioritize independent evidence for their hypotheses, i.e. evidence thatis supported by background theories that overlap as little as possible. Obligates, on the otherhand, seek sharp evidence that “speaks clearly and firmly” (p.5): the sharper the evidencethe more it allows us to increase our credence in a given hypothesis.

24

Page 25: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

that, on the one hand, cautious decision-making has a significant impact on theefficiency of inquiry only under specific conditions. On the other hand, differentassessments underlying theory-choice and different network structures result invarying degrees of efficiency. Moreover, diversity is not always correlated with asuccessful performance of scientists, but only under some conditions employedin the model. Such correlation occurs when scientists prefer theories that havea relatively larger scope of solidified results, in comparison to their rivals.

It is important to add though that the nature of this model and our resultsare primarily exploratory (rather than having normative consequences for actualscientific inquiry). The next step in this investigation includes, for instance,examining the performance of other evaluation procedures, which include themeasure of the growth of the given research program.28 Next, it would bevaluable to relate these evaluations with philosophical and historical accountsof decision-making in the context of pursuit (such as Nickles, 2006; Seseljaand Straßer, 2014a; Whitt, 1992), as well as to empirically calibrate differentaspects of the model (such as the time of inquiry, the degree of anomaly of giventheories, etc.). Furthermore, it remains a task for future research to determinewhich types of inquiry (e.g. more related to some scientific domains rather thanothers) are more adequately captured by Zollman-inspired models, which byGrim & Singer’s ones, and which by ArgABM. Finally, our results point to theimportance of further studies of the phenomenon of transient diversity and itsrelation to efficient inquiry.

Acknowledgments We are grateful to two anonymous reviewers for valuablecomments on the previous draft of this paper. The research by AnneMarie Borgand Christian Straßer is supported by a Sofja Kovalevskaja award of the Alexan-der von Humboldt Foundation and by the German Ministry for Education andResearch. The research by Dunja Seselja is supported by DFG-SNF ResearchGrant 278967060 “Inferentialism, Bayesianism and Scientific Explanation”.

References

Alexander, J McKenzie (2013). “Preferential attachment and the search forsuccessful theories”. In: Philosophy of Science 80.5, pp. 769–782.

Alexander, Jason McKenzie, Johannes Himmelreich, and Christopher Thomp-son (2015). “Epistemic landscapes, optimal search, and the division of cog-nitive labor”. In: Philosophy of Science 82.3, pp. 424–453.

Borg, AnneMarie, Daniel Frey, Dunja Seselja, and Christian Straßer (2017).“Examining Network Effects in an Argumentative Agent-Based Model ofScientific Inquiry”. In: Logic, Rationality, and Interaction: 6th InternationalWorkshop, LORI 2017, Sapporo, Japan, September 11-14, 2017, Proceedings.

28Such approach could capture, for instance, Laudan’s (1977) suggestion that “it is alwaysrational to pursue any research tradition which has a higher rate of progress than its rivals(even if the former has a lower problem-solving effectiveness)” (p. 111, italics in original).

25

Page 26: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

Ed. by Alexandru Baltag, Jeremy Seligman, and Tomoyuki Yamada. Berlin,Heidelberg: Springer Berlin Heidelberg, pp. 391–406.

Borg, AnneMarie, Daniel Frey, Dunja Seselja, and Christian Straßer (2018).“Epistemic effects of scientific interaction: approaching the question withan argumentative agent-based model”. In: Historical Social Research 43.1,pp. 285–309.

— (2019). “Using Agent-Based Models to Explain Past Scientific Episodes: to-wards robust findings”. In: Forthcoming.

Currie, Adrian and Shahar Avin (2018). “Method Pluralism, Method Mismatch& Method Bias”. In: Philosopher’s Imprint.

Dung, Phan Minh (1995). “On the Acceptability of Arguments and its Funda-mental Role in Nonmonotonic Reasoning, Logic Programming and n-PersonGames”. In: Artificial Intelligence 77, pp. 321–358.

Frey, Daniel and Dunja Seselja (2018a). “Robustness and Idealization in Agent-Based Models of Scientific Interaction”. In: British Journal for the Philoso-phy of Science https://doi.org/10.1093/bjps/axy039.

— (2018b). “What is the Epistemic Function of Highly Idealized Agent-Based Models of Scientific Inquiry?” In: Philosophy of the Social Scienceshttps://doi.org/10.1177/0048393118767085.

Grim, Patrick (2009). “Threshold Phenomena in Epistemic Networks.” In:AAAI Fall Symposium: Complex Adaptive Systems and the Threshold Ef-fect, pp. 53–60.

Grim, Patrick, Daniel J Singer, Steven Fisher, Aaron Bramson, William JBerger, Christopher Reade, Carissa Flocken, and Adam Sales (2013). “Sci-entific networks on data landscapes: question difficulty, epistemic success,and convergence”. In: Episteme 10.04, pp. 441–464.

Kelp, C. and I. Douven (2012). “Sustaining a Rational Disagreement”. In: EPSAPhilosophy of Science: Amsterdam 2009, pp. 101–110.

Kuhn, Thomas (1962 [1996]). Structure of Scientific Revolutions. 3rd ed.Chicago: The University of Chicago Press.

Kummerfeld, Erich and Kevin JS Zollman (2016). “Conservatism and the scien-tific state of nature”. In: The British Journal for the Philosophy of Science67.4, pp. 1057–1076.

Lakatos, Imre (1978). The methodology of scientific research programmes. Cam-bridge: Cambridge University Press.

Laudan, Larry (1977). Progress and its Problems: Towards a Theory of ScientificGrowth. London: Routledge & Kegan Paul Ltd.

Muldoon, Ryan (2017). “Diversity, Rationality and the Division of CognitiveLabor”. In: Scientific Collaboration and Collective Knowledge: New Essays.Oxford University Press.

Nickles, Thomas (2006). “Heuristic Appraisal: Context of Discovery or Justifica-tion?” In: Revisiting Discovery and Justification: Historical and philosophicalperspectives on the context distinction. Ed. by Jutta Schickore and FriedrichSteinle. Netherlands: Springer, pp. 159–182.

Poyhonen, Samuli (2017). “Value of cognitive diversity in science”. In: Synthese194.11, pp. 4519–4540.

26

Page 27: Theory-Choice, Transient Diversity and the E ciency of ...philsci-archive.pitt.edu/15653/1/Theory-choice.pdf · 1Institute for Philosophy II, Ruhr-University Bochum 2Faculty of Economics

Poyhonen, Samuli and Jaakko Kuorikoski (2016). “Modeling epistemic commu-nities”. In: The Routledge Handbook of Social Epistemology (forthcoming).Ed. by M. Fricker, P. J. Graham, D. Henderson, N. Pedersen, and J. Wyatt.Routledge.

Seselja, Dunja (2019). “Exploring Scientific Inquiry via Agent-Based Modeling”.In: Fortchoming.

Seselja, Dunja and Christian Straßer (2013). “Abstract argumentation and ex-planation applied to scientific debates”. In: Synthese 190, pp. 2195–2217.

— (2014a). “Epistemic justification in the context of pursuit: a coherentist ap-proach”. In: Synthese 191.13, pp. 3111–3141.

— (2014b). “Heuristic Reevaluation of the Bacterial Hypothesis of Peptic UlcerDisease in the 1950s”. In: Acta Biotheoretica 62, 429–454.

Seselja, Dunja and Erik Weber (2012). “Rationality and Irrationality in the His-tory of Continental Drift: Was the Hypothesis of Continental Drift Worthyof Pursuit?” In: Studies in History and Philosophy of Science 43, pp. 147–159.

Thoma, Johanna (2015). “The Epistemic Division of Labor Revisited”. In: Phi-losophy of Science 82.3, pp. 454–472.

Weisberg, Michael (2006). “Robustness analysis”. In: Philosophy of Science 73.5,pp. 730–742.

Weisberg, Michael and Ryan Muldoon (2009). “Epistemic landscapes and thedivision of cognitive labor”. In: Philosophy of science 76.2, pp. 225–252.

Whitt, Laurie Anne (1992). “Indices of Theory Promise”. In: Philosophy ofScience 59, pp. 612–634.

Wilensky, Uri (1999). “NetLogo.(http://ccl.northwestern.edu/netlogo/)”. In:Center for Connected Learning and Computer Based Modeling, Northwest-ern University.

Zollman, Kevin J. S. (2007). “The communication structure of epistemic com-munities”. In: Philosophy of Science 74.5, pp. 574–587.

— (2010). “The epistemic benefit of transient diversity”. In: Erkenntnis 72.1,pp. 17–35.

27