DATA EXPLORATION AND INFORMATION RETRIEVAL IN THE B2C MARKETING DEPARTMENT OF THE FINANCIAL TIMES

Seun Odueko

December 2013


ABSTRACT

A case study of marketing professionals in the B2C Marketing department of the Financial Times investigated their patterns of behaviour when interacting with data sources and analytics tools to perform work tasks. The study was conducted within the framework of search and discovery proposed by Russell-Rose and Tate. Data was gathered by observing the participants in their natural work environment, and their data exploration goals and strategies were analysed using Attfield and Blandford's Information Journey Model and Russell-Rose et al.'s Dimensions of Search User Experience, respectively. The findings reveal that the participants' goals were mainly static rather than dynamic, but they performed 'exploratory search' activities to fulfil them. Evidence was also found in support of Search Modes, and their occurrence in clusters of Mode Chains, within a live work environment. Recommendations are made for design features of analytics tools that facilitate the observed patterns of behaviour, with examples of how they could be implemented.

Keywords: User Goal; Search Mode; Mode Chain; Data Exploration; Information Retrieval.


1 Table of Contents

ABSTRACT
LIST OF FIGURES
LIST OF TABLES
2 PROBLEM DESCRIPTION AND OBJECTIVES
   2.1 Background
   2.2 Question
   2.3 Objectives
   2.4 Outcomes
   2.5 Beneficiaries
   2.6 Structure
3 ACADEMIC CONTEXT
   3.1 Data Exploration
   3.2 User Type
   3.3 User Goal
   3.4 Search Context
   3.5 Search Mode
   3.6 Data Type
   3.7 Analytics Tool
4 METHODS
   4.1 Participant Selection
   4.2 Data Gathering
      4.2.1 User Type
      4.2.2 User Goal
      4.2.3 Search Context
      4.2.4 Observation
   4.3 Data Preparation
      4.3.1 Tools
      4.3.2 Coding
5 RESULTS AND DISCUSSION
   5.1 User Goal
      5.1.1 Code Frequency
      5.1.2 Code Sequences
      5.1.3 Designing for User Goals
   5.2 Search Mode
      5.2.1 Code Frequency
      5.2.2 Code Sequences
6 EVALUATION, REFLECTIONS AND CONCLUSIONS
   6.1 User Goal
   6.2 Search Mode
   6.3 Project Management
   6.4 Further Research
7 GLOSSARY
   7.1 Usability
   7.2 User Experience
   7.3 User Interface
   7.4 Cognition
8 REFERENCES


LIST OF FIGURES

Figure 1: Serialist
Figure 2: Holist
Figure 3: Frequency of User Goals for All Participants
Figure 4: Frequency of User Goals per Participant
Figure 5: Code Sequences - Participant H
Figure 6: Code Sequences - All Participants
Figure 7: Number of Occurrences per Code
Figure 8: Number of Participants per Code
Figure 9: Workbook Connections
Figure 10: Evaluation Comments
Figure 11: Current Selections
Figure 12: Code Sequences
Figure 13: Dashboard Supporting Monitor-Compare-Evaluate
Figure 14: Dashboard Supporting Explore-Compare-Evaluate
Figure 15: Dashboard Supporting Explore-Compare-Evaluate

LIST OF TABLES

Table 1: Domain Expertise of Participants
Table 2: Technical Expertise of Participants
Table 3: Learning Style of Participants
Table 4: User Type of Participants
Table 5: Job Roles of Participants
Table 6: Summary of Observation Sessions
Table 7: Codebook of User Goals
Table 8: Codebook of Search Modes (those highlighted in green are my own "grounded codes")


2 PROBLEM DESCRIPTION AND OBJECTIVES

2.1 Background

Business data has evolved from being perceived primarily as a by-product of business processes (Magal and Word, 2009, p.18) to a key productivity driver through which organisations derive competitive advantage (Barton and Court, 2012). In the present era of "big data", where organisations face a deluge of different forms of data from various internal and external sources and at near real-time speed (Beyer and Laney, 2012, cited in Sicular, 2013), those who have successfully harnessed it in their decision-making process achieved average improvements of 5% in relative productivity and 6% in relative profitability (McAfee and Brynjolfsson, 2012).

Research into user attitudes towards business intelligence software, conducted in October 2012 by Ventana Research (Smith, 2013), revealed that more respondents are concerned about usability (63%) than about functionality (49%) or reliability (46%). Analytics tools tend to be optimised for functionality; although accessible to experts, they are often too complex for end-users to make the most of them, or even to use them at all (Barton and Court, 2012). IDC researchers concluded that the business analytics software market will grow at a compound annual rate of 9.8% between 2012 and 2016 (Vesset et al., 2012), reaching $50.7 billion (Taft, 2012). Senior executives are increasingly deploying analytics tools in their organisations, contributing to a shortage of Information Technology (IT) and analytics staff with the necessary skills to use them (Vesset et al., 2012). Hence there is a requirement to design tools that fit the way managers and frontline staff, as opposed to data analysts, explore information (Smith, 2013). Understanding how they explore data and gain actionable insight is important to senior executives making strategic decisions on how data is used across the organisation, as well as to data analytics and visualisation professionals developing tools and models for end-users.

Any organisation producing goods or services has a requirement to identify and fulfil the needs of its target customers (The Chartered Institute of Marketing, 2009) and as such, the marketing function is central to the existence of organisations of virtually all sizes and sectors. Marketing departments are forecast to spend more on technology than IT departments by 2017 (McLellan, 2012) due to growth in electronic commerce (e-commerce) sales volumes and increasing adoption of social media and mobile channels for engaging customers (Murphy, 2012), with spending largely focused on business and web analytics (Moth, 2013).

The Business-to-Consumer (B2C) marketing department of the Financial Times – a global business news organisation – includes customer acquisition, customer retention, customer relationship management (CRM) and payment optimisation teams. Current analytics tools are based on user-centred Microsoft Excel (Microsoft, n.d.) models and expert-retrieved Oracle SQL (Oracle, n.d.) reports; as a member of the marketing operations team, I am responsible for retrieving SQL reports and building Excel models for the marketing professionals. These are in the process of being consolidated in Microsoft Excel 2010, using the PowerPivot (Microsoft, n.d.) add-in to empower end-user exploration of data on OLAP cubes (Microsoft, n.d.) and retrieval of information with minimal expert intervention. Thus the data exploration patterns of behaviour exhibited by members of the department may yield useful insight into how analytics tools may be optimised for the marketing function.


This will also complement scientific and industrial research into the data exploration patterns of behaviour of non-experts in a work context. It offers an opportunity to build on existing insight into the search patterns of professionals including sales and marketing specialists (O’Day and Jeffries, 1993) and feed into forthcoming research into data exploration tools designed for non-IT experts working in a variety of sectors (The Technology Strategy Board, 2013).

2.2 Question

This study seeks to contribute to the body of knowledge about the characteristics of non-experts and their data exploration patterns of behaviour, which can be used to inform the design of business analytics tools with high usability for staff and thus higher returns on investment (ROI) for organisations. The research question to be answered is "how do Financial Times' B2C marketing professionals in the customer acquisition, customer retention, CRM and payment optimisation teams explore customer data and retrieve required information?" and this will be investigated from two perspectives:

Why and how do they seek information?
What strategies do they employ to find and understand information?

Data is usually distinguished from information, with the former consisting of numbers and words that exist in a raw, unprocessed state while the latter has added qualities of organisation and relationships that make it meaningful (Yacci, 1999, Pohl, 2001). For the purposes of this study, however, no distinction is made between data and information and they are used interchangeably, following a precedent established in previous research (Baker et al., 2009).

2.3 Objectives

My primary objective is to research and analyse the behavioural processes exhibited by the marketing professionals as they interact with data sources and analytical tools to fulfil specific work tasks. My secondary objective is to use the results of the analyses to suggest user interface (UI) design features of analytics tools that support the behavioural processes.

2.4 Outcomes

The project outcomes will be:

Closed-format questionnaires will be used to gather data and analysed to classify the participants according to their domain expertise, technical expertise, cognitive style and learning style. Plans to classify the marketing professionals’ work data according to Cool and Belkin’s dimensions (2002, cited in Russell-Rose and Tate, 2013, p.58) have been discontinued in order to keep the study’s scope manageable.


Participants will be observed in their natural working environment while performing a regular work task; the observation data will be coded and analysed to yield a description of the methods and reasoning processes followed in selecting and interpreting information. The analysis framework to be used has been changed from Sensemaking (Pirolli and Card, 2005) to Dimensions of Search User Experience (Russell-Rose and Tate, 2013, p.1) as the latter generates a more robust insight into data exploration processes.

Desirable UI features for supporting those processes that can be implemented in Microsoft Excel 2010 and alternative business analytics tools will be suggested, drawing on prior research, existing software features and creative thinking. There were initial plans to go beyond suggesting the features and implement them by designing dashboards in Microsoft Excel for the marketing professionals, which would be subsequently evaluated. However, this has been shelved due to time constraints.

2.5 Beneficiaries

The intended beneficiaries of the project are:

Academics and researchers in the fields of information retrieval and user experience (UX) design.

Senior executives that make investment decisions on business analytics software for their organisations.

Data analysts and IT professionals that deploy analytics tools in organisations and design models for end users.

Business analytics software vendors that develop the applications.

2.6 Structure

The project report continues in Chapter 3 with a review of relevant academic literature in the field of information seeking to identify previous research and established theories that have shaped the current body of knowledge. These will provide the study with the requisite analytical frameworks that inform the data gathering and result evaluation strategies. In Chapter 4, suitable methods identified in the reviewed literature will be applied to the study, with a description of the participant selection process and explanations of how data has been gathered and analysed to yield insight. Chapter 5 presents the results of the analyses and discusses the findings in light of the objectives of the study, the extent to which established theories are supported, and recommendations for their implementation. The project report concludes in Chapter 6 with an evaluation of the management and execution of the project as well as its outcomes. Suggestions for follow-up research are also made, and personal lessons learned from undertaking the project are reflected upon.


3 ACADEMIC CONTEXT

The subject of data exploration has received considerable attention from the academic community and a range of established theories have emerged from empirical research. The research findings provide sound frameworks for understanding the various facets of data exploration, identifying gaps in the body of knowledge that need to be addressed and providing direction for future research efforts.

Hence a review of academic literature has been conducted to elucidate the impact of personal and professional traits on search behaviour, the differing effects of both information requirements and types of task on the search process, as well as the strategies employed in retrieving information and gaining insight. The appropriate formats for representing different types of data, and the analytics software features suitable for data exploration, are also examined. For each facet, limitations of existing research are highlighted where applicable. The facets provide the analytical framework for this study, with the theories guiding the choice of data to be gathered and the criteria for evaluating results.

3.1 Data Exploration

Exploration in its various forms – investigating, studying, analysing, testing, experimenting, discovering or examining ("Exploring," n.d.) – is a fundamental human experience and lies at the heart of our individual and collective advancement. It is the process by which we satisfy our curiosity (Edelman, 1997), whether in fulfilment of a defined objective or as an end in itself (Berlyne, 1954, cited in Loewenstein, 1994, p.77). In the domain of information science, exploration is synonymous with discovering or synthesising new knowledge from existing data; it is a journey rather than a race with a pre-conceived outcome (Gossen et al., 2012). It can involve "Exploratory Data Analysis", whereby data is analysed and information that cannot be immediately explained is hypothesised about and tested (Tukey, 1977). Although a statistical technique, "Exploratory Data Analysis" can be generalised to the manner in which users confronted with puzzling information go through an iterative process of forming possible explanations, which are tested with the available data to prove or disprove them (Gossen et al., 2012), making it synonymous with data exploration. In general terms, a task can be categorised as data exploration if it involves the examination of data without prior knowledge of the information to be discovered (Grinstein and Ward, 1997, Tukey, 1977, Tukey, 1980, cited in Baker et al., 2009).

Data exploration is a search activity (Tukey, 1977), that is, looking into or over data carefully or thoroughly in an effort to find or discover information ("Search," n.d.). However, there is a distinction between simple item retrieval, which does not require examination, and 'exploratory search', which employs learning and investigating methods for information discovery (Marchionini, 2006). In order to appreciate the intricacies of data exploration, it is beneficial to view it through the four "dimensions of search user experience" that frame information search and discovery (Russell-Rose and Tate, 2013, p.1), namely User Type, User Goal, Search Context and Search Mode, as well as data type and analytics tool.


3.2 User Type

The "level of knowledge and expertise" (Russell-Rose and Tate, 2013, p.1) of users has a significant impact on the way a given data set is explored and thus ought to be taken into consideration in the design of search and analytics tools. "Domain experts" possess a high level of subject-matter knowledge and experience relative to "domain novices", while "technical experts" are more versed in the use of information systems to extract information in comparison to "technical novices" (Russell-Rose and Tate, 2013, p.4). Users who possess both domain and technical expertise tend to drill deeply into a narrow subset of data and use advanced techniques to evaluate information, in contrast to novices in both areas, who explore a data set widely without venturing beyond the surface and engage in minimal evaluation (Jenkins et al., 2003). In between these polar extremes are users who are domain experts and technical novices, exploring widely with rigorous evaluation of information, and others who are domain novices and technical experts, exploring both widely and deeply with less sophisticated evaluation (Jenkins et al., 2003).

The study by Jenkins et al. (2003) and related research by Kim (2001) focused on web searches, and it would be interesting to see how their findings apply to business users of analytics tools. Although a study – involving participants with a broad range of domain knowledge and mostly high technical expertise – has been conducted into the use of interactive information systems in the execution of work tasks (Li and Belkin, 2010), the effects of user knowledge and expertise were neutralised rather than incorporated into the study. As there is a requirement to design analytics tools that fit the way business users explore information (Fisher et al., 2012), an understanding of this matrix of domain and technical expertise as it applies to the marketing professionals can inform UI design and potentially increase their speed in exploring data and making discoveries; this raises the hazards of lax analysis and poor judgement, however (Fisher et al., 2012). Expertise is a function of time and experience, and as such it is not advisable to design for a specific group (Russell-Rose and Tate, 2013, p.9); rather, UI features can be used to support the transition of users from novice to expert so that the tool remains relevant at each stage of their development.

In the same vein, users' cognitive style of information processing and learning style of concept representation (Russell-Rose and Tate, 2013, p.14) affect how they explore data and can be supported by the UI of analytics tools. "Holistic thinkers" process information by starting with a high-level view of a given data set in order to grasp its wider context while, conversely, "serial thinkers" zero in on details and analyse its constituent parts (Russell-Rose and Tate, 2013, p.11). "Holistic thinkers" are "global processors" who progress from developing a broad overview to focusing on specifics, in contrast to serial thinkers, who are "analytic processors", progressing from a step-by-step grasping of facts to an understanding of the overall concept (Dunn and Griggs, 2003, cited in Denig, 2004). This suggests that cognitive style influences users' starting point and progression path when seeking information, and the business analytics software industry may benefit from empirical research into the relationship between information-seeking patterns of behaviour and the cognitive styles of users.

A similar understanding of the preferred learning styles of business users can inform the representation of data, with tables and charts provided for verbal and visual learners respectively (Russell-Rose and Tate, 2013, p.14). On the other hand, representing data to suit such preferences may not always be appropriate, as the nature of the work task and exploration requirements may determine the format of the data; visual learners may be better served by tabular data for in-depth analysis, for example.


3.3 User Goal

Lookup, Learn and Investigate have been identified as search tasks conducted by users in fulfilment of their goals (Marchionini, 2006). Lookup tasks are carried out as a systematic information retrieval process when the goal is clearly defined, such as finding the answer to an unambiguous question like the number of new customers acquired each week during a particular year. On the other hand, goals that are vaguely defined or unknown during the search activity, such as comparing the performances of all acquisition channels employed or analysing the causes of a drop in acquisition numbers in the third quarter of the year, may require an iterative analysis of multiple data sets before satisfactory insight is derived. In other words, the more complex and loosely defined the User Goal, the more exploratory iterations are needed to uncover relevant insight (Li and Belkin, 2010). These learning tasks and investigating tasks are classified under 'exploratory search' (Marchionini, 2006).

Gossen et al. (2012, p.290) defined 'exploratory search' as "a highly dynamic process of a user to interact with an information space in order to satisfy an information need that requires learning about structure and/or content of the information space." The exploration process was seen as being shaped by the user's perspective, which may in turn be transformed by the insight gained and thus lead to further exploration. This notion is corroborated by dynamic models of information seeking, such as the Information Journey Model (Attfield and Blandford, 2010, p.29), which acknowledge that the act of interpreting data modifies the original goals (Belkin, 1993). It has been argued that the fundamental goal of users is to synthesise and further explore their initial information requirements and that, therefore, the process of exploration is often more important than the information discovered, since the successful discovery of information does not fulfil the users' goals but rather leads to new requirements (Gossen et al., 2012, p.290). However, while the importance of the exploration process is acknowledged, the postulation about the fulfilment of user goals could be challenged by the recognition that exploration does not iterate indefinitely; there are entry and exit points in the search process, the former arguably triggered by the desire to fulfil a requirement and the latter arguably triggered by the fulfilment of the requirement. Bridging these two views is the position that a user's goal in exploring data is sensemaking, that is, developing an understanding of the information and synthesising it before responding to it in fulfilment of a requirement (Baker et al., 2009).

From a different perspective, the Information Journey Model highlights that the entry point is not restricted to the desire to fulfil a need; rather, it could also be an accidental discovery of information or serendipitous insight gained during the evaluation of information (Russell-Rose and Tate, 2013, p.27). Likewise, the exit point may not necessarily be triggered by the fulfilment of the exact requirement; the demand placed on users' attention by the volume and variety of information explored, further amplified by time constraints, may lead to an exit from the search process when the requirements have been "satisficed" (Simon, in Greenberger, 1971) as opposed to fulfilled. A common thread is that an information journey, or data exploration, has a start and an end point. Regardless of their trigger, both are only meaningful when aligned with a pre-defined or newly formed goal of the user. Hence the identification of actual, as opposed to simulated, work goals of users in business organisations is essential to understanding their data exploration behaviour. Work-based research is needed to derive findings with a higher degree of validity for business organisations, yet the majority of research conducted in this field has not addressed this.
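To make the distinction between Lookup and exploratory tasks concrete in the marketing context used above, the sketch below contrasts a lookup task (weekly new-customer counts for a given year) with the first iteration of an exploratory task (comparing acquisition channels and drilling into one quarter). It is a minimal illustration in Python with pandas; the acquisitions data frame and its column names (acquired_date, channel) are hypothetical and only stand in for whatever data set a participant would actually query.

import pandas as pd

# Hypothetical acquisitions data: one row per newly acquired customer.
acquisitions = pd.DataFrame({
    "acquired_date": pd.to_datetime(["2013-01-07", "2013-01-09",
                                     "2013-04-02", "2013-07-15"]),
    "channel": ["search", "display", "search", "email"],
})

# Lookup task: a clearly defined goal with a single, direct answer,
# namely new customers acquired each week during 2013.
in_2013 = acquisitions[acquisitions["acquired_date"].dt.year == 2013]
weekly_counts = in_2013.resample("W", on="acquired_date").size()

# Exploratory task: a loosely defined goal, answered iteratively.
# First compare channels overall, then drill into a quarter of interest.
by_channel = in_2013.groupby("channel").size().sort_values(ascending=False)
q3 = in_2013[in_2013["acquired_date"].dt.quarter == 3]
q3_by_channel = q3.groupby("channel").size()

print(weekly_counts.head())
print(by_channel)
print(q3_by_channel)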


3.4 Search Context

Baeza-Yates and Ribeiro-Neto (2011, p.5) distinguish between data retrieval and information retrieval (IR), with the former concerned with matching query parameters to structured data and the latter involving concepts of interpretation and meaning. Hence context is a critical factor in evaluating the IR process. The context of an information-seeking task determines its scope, technique, time length, success criteria and the resources used. Building on research by Järvelin and Ingwersen (2004), Russell-Rose and Tate (2013, p.49) devised a model for representing search context in four progressive layers: "information retrieval" – akin to Baeza-Yates and Ribeiro-Neto's data retrieval definition and Marchionini's Lookup task; "information seeking" – akin to Baeza-Yates and Ribeiro-Neto's IR definition and Marchionini's Learn and Investigate tasks; "work task" – the organisational need or personal motive that motivates information seeking; and "cultural context" – the expectations associated with the organisation or person performing a work task. Each successive layer encompasses the preceding one; hence an organisation's cultural context informs the importance attached to work tasks, which in turn are fulfilled by information seeking activities that may involve information retrieval.

Work tasks in organisations are characteristically exploratory in nature and require clear communication of results to non-experts (Fisher et al., 2012). Functional teams such as marketing or finance identify an information need and rely on the technical expertise of the analytics team to retrieve the information (Fisher et al., 2012). This has a number of drawbacks: the time lag between the request for information and the communication of results; the increased potential for aspects of the original information need to be 'lost in translation'; and less opportunity to explore the available data and gain additional insight beyond the original information need. Hence there is a requirement for organisations to both build robust data models and provide easy-to-use tools for frontline staff to interact with the models directly (Barton and Court, 2012).

Furthermore, the work task layer highlights the influence of Search Context on User Goal and associated search activities. In Li and Belkin's (2010) experimental study of the relationship between work task and the process of using information systems to search for information, the type of work task – defined as any activity performed by people in fulfilment of their work responsibility, similar to Russell-Rose and Tate's organisational work task – determined how users searched for information. Work tasks that involved knowledge acquisition, for example, resulted in 'exploratory search' activities and were perceived to be difficult to accomplish, with increased time and information requirements. Similarly, work tasks with clearly defined goals that required Lookup search activities required less effort to accomplish. Findings that different types of tasks utilised different search processes (Toms et al., 2003) corroborate Li and Belkin's conclusions. Hence a better understanding of organisational needs, including those of the marketing function, could lead to improved UI design of analytics tools that support users' information seeking processes, providing easier access to relevant information in less time.

3.5 Search Mode

While Search Context is concerned with the external factors that influence users' information seeking activities (Russell-Rose and Tate, 2013, p.48), Search Mode focuses on the strategies employed in seeking information. The process of exploring data involves not only data retrieval but also analysing and interpreting the retrieved data to build understanding (Russell-Rose and Tate, 2013, p.2). Thus a clear grasp of the entire cycle of exploration, that is, the Search Mode, is useful for defining search behaviour and the supportive features of analytics tools (Russell-Rose and Tate, 2013, p.73).

The journey from realising an information need to satisfactorily fulfilling it often involves a series of search tasks (O'Day and Jeffries, 1993). Building on the search tasks of Lookup, Learn and Investigate (Marchionini, 2006), insights gained from the behaviour of business intelligence applications have been distilled into nine modes of enterprise search and discovery for which analytics tools can be optimised, ranging from the location of objectively defined data to the subjective interpretation of associated sets of data (Russell-Rose et al., 2011). In a similar vein, research into how professional clients of a library interpret and use search results pinpointed three Search Modes – monitoring, planned and exploratory (O'Day and Jeffries, 1993). Users tend to switch between different modes as a search activity progresses, forming a pattern of Mode Chains until evaluation is completed and synthesis is achieved (Russell-Rose et al., 2011). This was demonstrated by the library clients' use of multiple Search Modes (O'Day and Jeffries, 1993). O'Day and Jeffries focused on the specific information needs and resultant search behaviour of professionals; however, their study involved the use of librarians as intermediaries in the search process. Hence there remains an opportunity to apply, and possibly extend, their research findings to the direct IR activities of professionals.

The finding that the nature of work tasks may affect the level of effort and amount of information required by users to complete them (Li and Belkin, 2010) underscores the utility of optimising the UI for specific Search Modes, since modes are determined by the task being performed. This has implications for the amount of detail presented and the level of drill-down depth made available, depending on task complexity.

3.6 Data Type

Extensive research has been conducted to examine the various facets of information search and knowledge discovery. The established body of knowledge is predominantly focused on the exploration of textual data through the use of search engines and library databases. This leaves a gap in the understanding of how quantitative data is explored and analysed to yield insight. Tukey (1977) coined the term "exploratory data analysis" as a paradigm for simplifying the description of quantitative data and actively looking beneath its surface to discover new insight (Tukey, 1977, p.v). Data exploration is likened to a detective investigation, discovering and evaluating clues and following their trail until insight is gained (Tukey, 1977, p.1). The preferred investigation tool is a visual display of data, which is seen as vital to the discovery process (Church, 1979). Visualised data, consisting of a fixed scene, variable objects within the scene and the defining characteristics of each object, provides a graphical means by which users perceive data and, by extension, the underlying context which the data represents (Baker et al., 2009).

Visual representations such as graphs and charts are a popular way of depicting and enhancing the perception of quantitative data. Although non-experts might be inclined to represent quantitative data with graphs, the Cognitive Fit theory (Vessey, 1991) was developed as a framework for determining the optimal conditions for visualisation and, conversely, when such data are best represented in tabular form. The research concluded that graphs are suited to spatial tasks, requiring an overview of associations and relationships within the data, while tables are suited to symbolic tasks, requiring an analysis of discrete components of the data (Vessey, 1991). While the validity of these conclusions is accepted within the relevant academic community, it is worth noting that the type of task, rather than the user, determines the proposed cognitive fit. It may thus be of interest to find out whether users' cognitive style and learning style have an impact on the performance of graphs and tables. Specifically, it might be useful to investigate the relative performance of graphs in spatial tasks carried out by serialists and verbal learners, as well as the relative performance of tables in symbolic tasks carried out by holists and visual learners.

3.7 Analytics Tool

Marchionini (2006) asserts that data retrieval tools such as database systems and general-purpose search engines, which have varying levels of technical expertise requirements, can efficiently fulfil lookup tasks with human interaction limited to query formulation. Conversely, learning and investigating tasks require tools that facilitate continuous human interaction in the 'exploratory search' process (Marchionini, 2006). The need for continuous interaction with retrieved data places users, rather than tools, at the heart of the search process (Belkin, 1993), with design implications for user interaction with tools.

Users' ability to explore data and discover associations is enhanced by the development of bisociative (relationships between items in different data sets), as well as what I term intrasociative (relationships between items in one data set), knowledge discovery tools with appropriate UI design. Gossen et al. (2012, p.291) argue that such UIs need to support dynamic information seeking, the identification of associations between data sets, creative ways of exploring data and real-time access to live data. However, the exploration of live data, which is typically cloud-based, increases the complexity and cost of the analytics tool used and, as such, it may be more productive and feasible to explore subsets of live data stored locally (Fisher et al., 2012). Designing a UI to facilitate creative exploration implies a flexible environment with loose navigation structures, to avoid shepherding users along a predetermined path, and this may be daunting for non-experts and unsuitable for certain tasks. Hence, more robust knowledge discovery tools would cater for different search mode sequences (Russell-Rose, Lamantia and Burrell, 2011, cited in Russell-Rose and Tate, 2013, p.85), each with a customised UI.

The inherent superiority of graphs over lists in revealing associations between data sets (Gossen et al., 2012, p.292) is tempered by advances in analytics tools, particularly with quantitative data. Microsoft Excel (Microsoft, n.d.), one of the most pervasive tools, provides pivot tables as a means of aggregating and combining data to yield insight. PowerPivot (Microsoft, n.d.), a more advanced implementation available in Excel 2010 and above, adds the ability to combine separate data sets (albeit with at least one field in common) for analysis, as well as the Slicer tool (Microsoft, n.d.) for flexible and interactive data filtering. Specialised analytics tools such as Tableau (Tableau Software, n.d.) and QlikView (QlikTech, n.d.) are optimised for non-experts to explore associations between data sets in a flexible and interactive manner. The drawback from an organisation's perspective is the extra cost of deploying such tools, in contrast to Microsoft Excel, which is virtually available by default. While QlikView offers a free version for individual use, accessing corporate data and sharing insights require a paid licence.

The challenges of evaluating the usability of knowledge discovery tools involve the creation of realistic scenarios and the recruitment of representative users (Gossen et al., 2012, p.295). An effective way of sidestepping these challenges is to eschew laboratory-based testing and conduct the evaluation in an actual working environment, with typical end-users completing real tasks. The main drawback here would be the risk of participants modifying their normal behaviour as a result (Oates, 2006, p.204), whether consciously or subconsciously.
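As a rough illustration of the kind of combination and filtering described above for PowerPivot and the Slicer, the Python sketch below joins two small data sets on a shared field and then aggregates and filters the result. It is only an analogy in pandas, not the Excel implementation itself, and the table and column names (subscriptions, campaigns, channel, region) are invented for the example.

import pandas as pd

# Two hypothetical data sets sharing one field, "channel".
subscriptions = pd.DataFrame({
    "channel": ["search", "display", "email", "search"],
    "region": ["UK", "UK", "US", "US"],
    "new_subs": [120, 45, 60, 80],
})
campaigns = pd.DataFrame({
    "channel": ["search", "display", "email"],
    "spend": [10000, 4000, 1500],
})

# Combine the data sets on the common field, analogous to a PowerPivot relationship.
combined = subscriptions.merge(campaigns, on="channel")

# Pivot-table-style aggregation: new subscriptions by channel and region.
summary = combined.pivot_table(index="channel", columns="region",
                               values="new_subs", aggfunc="sum", fill_value=0)

# Slicer-style interactive filtering: restrict the view to one region before aggregating.
uk_only = combined[combined["region"] == "UK"].groupby("channel")[["new_subs", "spend"]].sum()

print(summary)
print(uk_only)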


4 METHODS

This is an explanatory case study (Yin, 2003, cited in Oates, 2006, p.143) investigating the data exploration and information retrieval patterns of behaviour of four sub-groups of marketing professionals in a business news organisation. The ubiquity of the marketing profession across industries and organisation sizes, established in section 2.1, makes the findings generalisable to other B2C marketing professionals performing similar tasks. The type of generalisation made is the "rich insight" yielded by the results (Walsham, 1995, cited in Oates, 2006, p.146). The four "dimensions" of User Type, User Goal, Search Context and Search Mode (Russell-Rose and Tate, 2013, p.1) provided the framework under which the study was conducted.

The participants completed closed-format questionnaires, which were subsequently analysed to classify them according to their domain expertise, technical expertise, cognitive style and learning style. A description of the goals was obtained from each participant at the beginning of the observation session, and analysis of the observation data yielded further insight into their goals. The participants' job roles, obtained from their LinkedIn profiles, and the specific details of their work task, gathered during the observation, defined the context of the information retrieval activities.

The observations were conducted in the participants' natural work environment while they performed a regular work task, and qualitative analysis of the data revealed their Search Modes, that is, the strategies employed in selecting and interpreting information. Further analysis yielded the sequence of occurrence of the Search Modes, and those with the highest levels of clustering are indicative of the Mode Chains followed by the participants. Desirable UI features supporting the Search Modes and Mode Chains – as well as User Goals – which can be implemented in Microsoft Excel 2010 and alternative business analytics tools are suggested, drawing on prior research, existing software features and creative thinking.
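To indicate how the clustering of Search Modes into Mode Chains can be surfaced from the coded observation data, the sketch below counts how often each run of consecutive codes occurs in a participant's coded sequence. It is a minimal illustration in Python under the assumption that each observation session has already been transcribed and coded into an ordered list of Search Mode labels; the example sequence is invented, though the labels follow the enterprise search modes referred to in Chapter 3.

from collections import Counter

def mode_chain_counts(coded_sequence, chain_length=2):
    """Count consecutive runs of Search Mode codes of the given length."""
    chains = zip(*(coded_sequence[i:] for i in range(chain_length)))
    return Counter(chains)

# Hypothetical coded sequence for one observation session.
session = ["Locate", "Verify", "Monitor", "Compare", "Evaluate",
           "Monitor", "Compare", "Evaluate", "Explore", "Compare", "Evaluate"]

# The most frequent pairs and triples point to candidate Mode Chains.
print(mode_chain_counts(session, 2).most_common(3))
print(mode_chain_counts(session, 3).most_common(3))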

4.1 Participant Selection

A presentation was given to members of the B2C marketing department explaining the purpose and format of the study. A detailed participant information sheet (Appendix C) was provided to all twelve members of the department based in the London offices at the commencement of the study, out of which eleven agreed to participate and signed an informed consent form (Appendix D). One subsequently withdrew from the study, leaving a total of ten participants for the duration of the study. Members of the department based in the New York, Hong Kong and Manila offices were not invited to participate, for logistical reasons.

All sub-divisions of the B2C Marketing department were represented, namely Audience Development; Customer Acquisition – Optimisation; Customer Acquisition – Direct Marketing; Customer Acquisition – Emerging Platforms; Customer Retention – Retention; Customer Retention – Customer Relationship Management; and Marketing Operations – Payments Optimisation. One participant was at Executive level, six were at Managerial level, one was at Senior Managerial level and a further two were at Departmental Head level.


4.2 Data Gathering

Russell-Rose and Tate propose the use of four "dimensions" as a framework for analysing search behaviour, namely User Type, User Goal, Search Context and Search Mode. While this study investigates the last of these dimensions, and User Goal to a lesser degree, data was gathered about the other dimensions through preliminary questionnaires and the observation itself in order to define the boundaries of the Search Modes observed. As such, the data gathered about User Type and Search Context is presented in this chapter rather than in Chapter 5, since they do not form a central part of the study.

4.2.1 User Type

Participants completed an 11-question self-assessment of their proficiency in using IT tools to analyse data, a 5-item questionnaire to determine their preferred learning style and a rod-and-frame diagram exercise to indicate how they process information. Data about participants' domain expertise was gathered from their LinkedIn profiles, with domain experts defined as possessing a relevant degree and more than three years of professional experience.

4.2.1.1 Domain Expertise

Data about participants' marketing domain expertise was gathered from their LinkedIn profiles, factoring in relevant academic qualifications, professional experience and industry certifications. For the purposes of this study, the benchmark for classification as a "domain expert" is the possession of a relevant first degree or higher and more than three years of professional experience, based on the Chartered Institute of Marketing's criteria (The Chartered Institute of Marketing, n.d.). Those not meeting these criteria are considered "domain novices". Table 1 shows the participants' qualifications and number of years of experience, with all meeting the "domain expert" criteria.

Table 1: Domain Expertise of Participants

4.2.1.2 Technical Expertise

An 11-question self-assessment of proficiency in using IT tools to analyse data (Appendix E), based on extracts from Bowling Green State University's self-assessment questionnaire (Bowling Green State University Career Center, n.d.), was completed by participants. Each question had five answer choices ranging from 'very low' to 'very high', with a corresponding number range of 1 to 5. The questionnaire was scored by adding up the answer choices, dividing by 55 and multiplying by 1,000 to place the scores on a 0 to 1,000 scale. Isograd's TOSA proficiency scoring scale (Isograd, n.d.) was adopted in interpreting the scores, with participants classified as 'beginner', 'basic', 'productive', 'advanced' or 'expert'. Those with a classification of productive or greater are considered "technical experts" while those classified as basic or lower are considered "technical novices". Table 2 shows the results of the participants' self-assessment, with nine meeting the "technical expert" criteria and one falling into the "technical novice" category.
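As a minimal sketch of the scoring just described, the Python snippet below sums the eleven answer choices, rescales the total to the 0 to 1,000 range and maps it to a proficiency band. The band boundaries are illustrative placeholders only; Isograd's published TOSA thresholds are not reproduced here, and the example answers are invented.

def technical_expertise_score(answers):
    """Rescale an 11-question self-assessment (choices 1 to 5) onto 0 to 1,000."""
    assert len(answers) == 11 and all(1 <= a <= 5 for a in answers)
    return sum(answers) / 55 * 1000

def classify(score):
    """Map a rescaled score to a proficiency band (hypothetical cut-offs)."""
    bands = [(200, "beginner"), (400, "basic"), (600, "productive"),
             (800, "advanced"), (1000, "expert")]
    for upper, label in bands:
        if score <= upper:
            return label
    return "expert"

# Example: one participant's eleven self-assessment answers.
score = technical_expertise_score([4, 5, 3, 4, 4, 5, 3, 4, 4, 3, 4])
print(round(score), classify(score))  # "technical expert" if productive or above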

4.2.1.3 Learning Style

A 5-question Learning Scenario Questionnaire (Mayer and Massa, 2003) (Appendix F), in which each question describes a learning scenario and offers a choice between a verbal and a visual learning style, was used to determine participants' preferred style of learning. Participants selecting a verbal learning style three or more times were classified as verbal learners, while those selecting a visual learning style three or more times were classified as visual learners. Table 3 shows the participants' answers, with all falling into the visual learning style category.

Table 2: Technical Expertise of Participants

Table 3: Learning Style of Participants


4.2.1.4 Cognitive Style

A rod-and-frame diagram exercise (Russell-Rose and Tate, 2013, p.9) (Appendix G) was completed by participants to indicate how they process information. Those who drew a line parallel to the edges of the rectangle were classified as "serialists" (Figure 1), while those who drew a line along the north-south axis of the square background were classified as "holists" (Figure 2). The diagrams indicated that eight participants are "holists" while two are "serialists".

4.2.1.5 User Type

The data on participants' domain expertise, technical expertise, learning style and cognitive style were combined to form a picture of the types of users whose search modes are being studied (Table 4). This information mainly contextualises the observation data within Russell-Rose and Tate's (2013, p.3) framework, but efforts were also made to identify any influence of participants' characteristics on their data exploration activities.

Figure 1: Serialist
Figure 2: Holist

Table 4: User Type of Participants


4.2.2 User Goal

Although the objective of this study is to gain insight into how users interact with data in order to fulfil their needs, rather than the actual nature of the needs, knowledge of the participants' goals provides a useful backdrop to their respective search modes. Prior to the observation, the following questions were sent to participants to ensure their tasks had a clearly defined goal:

1. What specific information do you need?
2. What triggered the need?
3. Which data set will you explore to meet the need?
4. Which software application is the data set contained in?
5. Which software application is the data set analysed in?
6. How will you act on the information once it is retrieved?

Where a participant performed more than one task, the questions had to be answered separately for each task. The questions correspond to the four steps of the Information Journey Model (Attfield and Blandford, 2010), which covers the lifecycle of information need fulfilment. Questions 1 and 2 correspond to "recognising an information need" while questions 3 and 4 correspond to "acquiring information". The third step, "interpreting and validating the information", can be inferred from question 5, and question 6 corresponds to "using the information".

A potential limitation of defining User Goals before the data exploration activity – as opposed to capturing them during the activity – is that it takes a static view of the goals and does not accommodate dynamic changes along the way. Hence, while participants were required to consider the questions when choosing a suitable observation task and to answer them at the beginning of the observation, further evidence of their goals was derived from the analysis of the observation data after it was transcribed and coded.

4.2.3 Search Context

To aid an understanding of the context in which information seeking is conducted, Russell-Rose and Tate (2013, p.50) propose viewing data exploration activity from four progressive perspectives: "information retrieval", "information seeking", "work task" and "cultural context". The work task layer in particular provides the professional context in which participants engage in data exploration and information retrieval. The participants' job descriptions and general areas of responsibility within the organisation were gathered from their LinkedIn profiles (Table 5), while information about the specific work task being fulfilled was gathered during the observation (Table 6), based on their answers to the User Goal questions in section 4.2.2. Their general job roles and the specific duties carried out during the observation convey the organisational need that motivated the information seeking. This information mainly contextualises the observation data within Russell-Rose and Tate's (2013, p.47) framework and, to a lesser extent, was used in analysing the influence of participants' work task on their data exploration activities.

Table 5: Job Roles of Participants


4.2.4 Observation Gaining insight into the behavioural processes followed by the participants as they explore data and retrieve information necessitated a case study research strategy, as it allows in-depth study of the phenomenon with all variables taken into consideration, and within their natural setting of occurrence (Oates, 2006, p.141). Basing the case study on a regular task makes it representative and thus generalisable to marketing professionals performing similar tasks while providing an opportunity to test the applicability of established theories (Oates, 2006, p.146), specifically the Information Journey model (Attfield and Blandford, 2010) and Russell-Rose and Tate's (2013) framework for search and discovery. Data was gathered by observing the participants while performing their tasks which, as a “practitioner-researcher” working in the organisation, I was able to carry out after obtaining necessary permissions without the need for familiarisation or access negotiations (Oates, 2006, p.209). However, challenges of balancing the study with my work responsibilities as well as minimising interference with participants’ work schedule had to be managed. Despite drawbacks of being relatively time-consuming and generating large amounts of unstructured data (Rogers et al., 2011, p.261), as well as the risk of participants modifying their normal behaviour, observation was selected over other data gathering methods such as interviews and questionnaires because it captures what is been done and how; this is more reliable than potentially selective verbal descriptions (Oates, 2006, p.198). Furthermore, observing participants in their natural work environment rather than in a controlled environment avoided the probable effects of artificial work conditions limiting the relevance of the results (Rogers et al., 2011, p.261). Observation sessions were scheduled for a maximum of 60 minutes but flexibility was allowed to address concerns about the potential impact on participants’ work schedule and productivity. 10 minutes was scheduled for introduction and equipment set up, followed by up to 50 minutes of direct observation at their workstations. Participants were observed while performing a regular task during their normal work schedule, using their normal data sets and analytics tools. Participants were required to think aloud as they performed their tasks so as to verbalise their thought processes (Rogers et al., 2011, p.256) and questions were asked for clarification during

Table 5: Job Roles of Participants

4.2.4.1 Introduction At the beginning of each session, I explained:

The participant’s role in the study;
My role as the moderator;
The format of the study;
Think-aloud technique.

4.2.4.2 Equipment Set Up The sessions were conducted in the natural work environment of the participants, using their regular furniture and Windows laptop computers as well as the data sources and data analysis software routinely used to complete tasks. The Screencast-O-Matic (n.d.) screen capture Java applet was used to record on-screen activities, with participants’ voices recorded simultaneously by the applet using the computer’s built-in microphone. The audio-visual recordings were exported in MP4 file format at the end of the sessions.

4.2.4.3 Observation Tasks At the beginning of each session, participants gave a description of their goals regarding:

The specific information needed;
The trigger of the need;
The data set to be explored to meet the need;
The software application the data set is contained in;
The software application the data set will be analysed in;
How the information will be acted on once it is retrieved.

Participants were requested beforehand to select a task that was performed regularly – defined as at least once a week – for the observation sessions. Nine participants selected one task while one participant selected two tasks. Of the eleven tasks observed, seven met this criterion. One, which was performed once a month at the time of the observation, was planned to move to a weekly schedule. Two tasks were performed regularly but without a fixed schedule; the last task, although performed on an ad-hoc basis, was considered to be representative by the participant, who otherwise did not perform regular data exploration tasks. Seven of the observed tasks involved the creation of a report on a spreadsheet, of which six were based on quantitative data extracted from web-based sources and one was based on quantitative data in an existing spreadsheet. Two tasks involved the analysis of quantitative data on an existing spreadsheet and a further two tasks involved the analysis of quantitative and qualitative data on web-based platforms. The shortest observation session lasted for almost 7 minutes while the longest lasted for just over 80 minutes. The total length of the observation sessions was approximately 4 hours and 50 minutes.


A task was deemed completed when:

The initially identified information need had been satisfied; and/or
Subsequent information needs triggered by the exploration activity had been satisfied; or
The exploration was terminated because the information need could not be satisfied.

Nine participants satisfied the identified information needs; Participant I was unable to retrieve all the required information due to technical problems with the data source but gave a verbal description of how the task would have been performed while providing on-screen demonstrations with the available data. A summary of the observation sessions is shown in Table 6:


Table 6: Summary of Observation Sessions


4.3 Data Preparation

4.3.1 Tools The Screencast-O-Matic Java applet was used to record the computer screen and sound during each observation, and the recordings were saved in MP4 file format. All but one of the observations were of good audio-visual quality, enabling accurate transcriptions. The exception was Participant F's recording, which suffered from poor audio quality and hence had gaps in the transcript, although that was partially mitigated by interpreting on-screen actions. The transcription of each recording was produced in HyperTranscribe 1.6 (Researchware, n.d.) and exported as a Microsoft Word (Microsoft, n.d.) document (Appendix O). The transcripts consisted of the vocal dialogues of participants, supplemented where appropriate by descriptions of on-screen events, using different-coloured text to differentiate them from the participants’ own words. Transcribed sentences were timestamped to facilitate referencing. The transcripts were imported into QDA Miner 4.1 (Provalis Research, n.d.) for coding and software-assisted analysis, with the results exported as Microsoft Excel documents for in-depth analysis.

4.3.2 Coding A prerequisite for analysing the qualitative data gathered in the observations was to look for themes and patterns that emerged from them. The transcripts were read, interpreted and coded with the themes reflected in each passage (Appendices I and N). Duplicate coding was avoided by interpreting the passages and coding each unique occurrence of a theme once, regardless of how many times it may have been referred to. However, where a passage was perceived to reflect more than one theme, it was coded with all applicable themes. Additional steps were taken to ensure themes were coded in their correct sequence by re-arranging the subject and predicate where appropriate. Descriptions of on-screen actions (in red text) were added to complement verbal descriptions, where deemed necessary. A combination of descriptive and analytic coding techniques (Gibbs and Taylor, 2010) was used in identifying the methods and reasoning processes followed by the participants in their selection and interpretation of information. Efforts were made to ensure consistency of coding by comparing similarly coded passages within each transcript and with transcripts of other participants (Gibbs and Taylor, 2010). Explanatory comments to justify coding decisions were also added where deemed necessary. However, it is acknowledged that coding decisions are subjective and based on the interpretations of a single coder, without the benefit of checking coding agreements with other coders. The transcripts were coded separately for User Goals and Search Modes and both dimensions were analysed independently.

4.3.2.1 User Goals The Information Journey Model (Attfield and Blandford, 2010) provided the underlying theory for identifying the participants’ goals (Table 7). The four steps of the model support a dynamic as opposed to static nature of information seeking (Russell-Rose and Tate, 2013, p.26) and activities of participants were interpreted within the framework.


Table 7: Codebook of User Goals

Code Description

Recognising Need: Recognising an information need

Acquiring Information: Acquiring information to fulfil the need, whether an entire data set or an item within a data set

Interpreting Information: Interpreting and validating the acquired information

Using Information: Using the information to fulfil the need

4.3.2.2 Search Modes The nine search modes developed by Russell-Rose et al. (2011), distilled from over one hundred conceptual user scenarios of search and business intelligence applications, provided the “priori codes” (Gibbs and Taylor, 2010) used for the coding. These include all the codes in Table 8 except Collaborate, Measure, Recognise and Update, which I developed. These are “grounded codes” (Gibbs and Taylor, 2010) that emerged from the observation data, independent of pre-existing theories. Forecast, though identified as a search activity (Marchionini, 2006), is not a distinct search mode under Russell-Rose et al.’s classification. Russell-Rose et al. classified each Search Mode according to the three types of search activities developed by Marchionini, that is, Lookup, Learn and Investigate. Search Modes that entail finding information are classified under Lookup; those that involve knowledge acquisition and development are classified under Learn while others concerned with dissecting and combining information or using judgement are classified under Investigate. Lookup searches retrieve precise information to meet clearly defined needs whereas Learn and Investigate searches are exploratory in nature, with users analysing information and making sense of it (Russell-Rose and Tate, 2013, p.72). These three top-level categories were in turn used in classifying the “grounded codes” with the exception of Update, as it does not fit precisely into any category.


Table 8: Codebook of Search Modes (Collaborate, Measure, Recognise and Update are my own “grounded codes”)

Code Description

Locate LOOKUP: To find a specific (possibly known) item

Verify LOOKUP: To confirm that an item meets some specific, objective criterion

Monitor LOOKUP: To maintain awareness of the status of an item for the purposes of management or control

Recognise LOOKUP: To identify an item based on prior knowledge

Compare LEARN: To examine two or more items to identify similarities and differences

Comprehend LEARN: To generate independent insight by interpreting patterns within a data set

Explore LEARN: To investigate an item or data set for the purposes of knowledge discovery

Measure LEARN: To determine the quantitative value of an item

Analyse INVESTIGATE: To examine an item or data set to identify patterns and relationships

Evaluate INVESTIGATE: To use judgement to determine the value of an item with respect to a specific goal

Synthesise INVESTIGATE: To create a novel or composite artefact from diverse inputs

Forecast INVESTIGATE: To extrapolate a future value of an item from current data (added from Marchionini's taxonomy)

Collaborate INVESTIGATE: To seek the input of others in interpreting an item

Update Editing the value of an item

The “grounded codes” were deemed sufficiently similar in level of abstraction to, and conceptually distinct from, the “priori codes” to merit inclusion. Forecast – classified under Investigate by Marchionini – was added to describe the derivation of future values from existing data. Collaborate stood out as an Investigate search mode that draws on the expertise and professional judgement of colleagues to gain insight about an item. Measure was identified as a Learn search mode peculiar to quantitative data, typically involving performing calculations on a given data set. Recognise, classified under Lookup, was considered to be sufficiently different from Locate as it emphasises the use of prior knowledge gained through experience in identifying an item. Update, perhaps technically not a search activity as it involves adding or changing items, was considered important as it describes a distinct activity within data analysis and interpretation. Russell-Rose et al. (2011) proposed three properties that a Search Mode should possess: ‘consistency’; ‘orthogonality’; and ‘comprehensiveness’. While ‘consistency’ and ‘orthogonality’ can be evaluated without any data, ‘comprehensiveness’ can only be assessed after data has been analysed and as such, the degree to which the “grounded codes” possess the three characteristics is evaluated in Chapter 5.2.1.3.


5 RESULTS AND DISCUSSION The participants were observed as they performed a regular task in their natural work settings. The observation data was coded separately for their User Goals and Search Modes, with coding themes derived from Attfield and Blandford’s Information Journey Model and Russell-Rose and Tate’s search and discovery framework respectively. The individually coded transcripts were combined within QDA Miner to derive an aggregate view of the data exploration and information retrieval behaviour patterns of the marketing professionals. Using the built-in tools of QDA Miner, the number of occurrences (frequency) and order of occurrences (sequence) of the codes were computed and the results exported in Microsoft Excel file format for analysis.
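
To make the computation concrete, the sketch below (illustrative only, not QDA Miner's own algorithm) derives code frequencies and first-order code sequences from an ordered list of coded passages; the code labels and their order here are hypothetical.

```python
from collections import Counter

# Hypothetical sequence of User Goal codes from one coded transcript,
# listed in the order in which the coded passages occur.
coded_passages = [
    "Recognising Need", "Acquiring Information", "Interpreting Information",
    "Interpreting Information", "Using Information", "Acquiring Information",
    "Interpreting Information", "Using Information",
]

# Frequency: total number of occurrences of each code.
frequency = Counter(coded_passages)

# Sequences: number of times one code is immediately followed by another.
sequences = Counter(zip(coded_passages, coded_passages[1:]))

for code, count in frequency.most_common():
    print(f"{code}: {count}")

for (first, second), count in sequences.most_common():
    print(f"{first} -> {second}: {count}")
```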

5.1 User Goal The User Goal provides insight into the need that motivates information-seeking activity and the different stages that lead to the fulfilment of the need. Such understanding can be leveraged by designing analytics tools to support users at each stage of their information seeking process as well as their transition between stages.

5.1.1 Code Frequency The code frequency conveys the total number of times each code occurred during the information seeking process, and is indicative of the relative significance of each User Goal to the process. Of the combined 491 occurrences of all goals, the predominant goal of the participants was Interpreting Information, occurring 161 times, followed by Acquiring Information at 155, Using Information at 101 and Recognising Need at 74 (Figure 3). This partially aligns with the assertion that in the Information Journey Model, information interpretation and use are the key drivers of information seeking behaviour (Russell-Rose and Tate, 2013, p.26), although information acquisition seems to play a comparably dominant role. Figure 3: Frequency of User Goals for All Participants

[Figure 3 values – Recognising Need: 74; Acquiring Information: 155; Interpreting Information: 161; Using Information: 101]

In fulfilling recognised needs, the participants were three times more engaged in information acquisition and interpretation activities than they were in using information. There is a range of probable reasons for the higher frequency of these activities. It is theoretically possible that the process of searching, retrieving and analysing data is demanding and labour-intensive relative to the process of using it once it is prepared. Neither time nor effort was measured, however, so this cannot be corroborated by evidence. It could also be due to difficulty in locating the required data or understanding its meaning. This is unlikely, however, given the regularity of the observed tasks and “domain expertise” of the participants. Another plausible factor could be relative difficulty in using the data sources and analytic tools, making the process less efficient. A distinguishing feature of ‘technical novices’ compared to experts is a higher tendency of the former to reformulate queries, resulting in more time and effort expended in acquiring information (Russell-Rose and Tate, 2013, p.4,6). The sole ‘technical novice’ in the group, Participant A, whose technical proficiency self-assessment was Basic, had a ratio of information acquisition to information usage of 2 (Figure 4), in line with the group average (Appendix H). Participant I, whose self-assessment was ‘Advanced’, had a higher ratio of 3, suggesting lower efficiency. On the other hand, Participant K, similarly self-assessed as Advanced, had a below-average ratio of 1, suggesting higher efficiency. The results are thus insufficient to demonstrate a link to “technical expertise”. The possible effects of the relative complexity of the tasks have not been investigated, however. Alternatively, the higher proportion of engagement in acquiring and interpreting information could suggest that the participants are comfortable with both the content and the tools used in extracting and analysing it, hence they are willing and able to explore the data in enough detail to gain insight. The regularity of the tasks and the participants’ “domain expertise” lend some credence to this hypothesis, but perhaps more important is the nature of the tasks – the performance of the marketing campaigns and the factors that influenced them needed to be understood and explained for monitoring and reporting purposes, placing greater emphasis on information acquisition and interpretation. Thus the majority of activities involved exporting data from its source into a separate analytics tool as well as identifying specific items within a data set for examination (Appendix I). Conversely, using the results for insight or decision making (Attfield and Blandford, 2010, p.33), or condensing them into a report, involved fewer iterations. Figure 4: Frequency of User Goals per Participant


It is striking that unlike other participants, the predominant goal observed in Participant H was Using Information, occurring 11 times compared to 9 and 8 for Acquiring and Interpreting Information respectively (Figure 4). The participant’s information seeking path is indicative, as seen in the code sequences heat map which uses brighter colours to indicate higher occurrences (Figure 5). Out of 35 total activities, 6 information acquisition activities were followed by information usage, without an apparent interpretation process. The nature of the task performed offers a possible explanation, as the “information seeking process” involved iteratively acquiring data from one source and inputting it into another before analysing the combined data. This is evident in statements such as “…and just go through, find the actual SegID and then input the numbers here…”, “…and then just put that number in here…”, “…and put it in here…” and “…so then basically once I fill in everything, I'm done with this data source…”. Other participants demonstrated similar information seeking behaviour albeit to a lesser extent (Appendix J) and this may be a reasonably pervasive process, as noted by Pirolli and Card (2005) in their study of intelligence analysis: “much day-to-day intelligence mainly consists of extracting information and repackaging it without much actual analysis”. Figure 5: Code Sequences - Participant H

5.1.2 Code Sequences The code sequences convey the number of times one code is followed by another and are indicative of the path followed in seeking information, that is, how users transition from one goal to another until their information need is met. Further analysis of the aggregated code sequences of all participants (Figure 6) yields results that are in line with the progressive steps of the Information Journey Model (Attfield and Blandford, 2010, p.30). The total number of times one code was followed by another is 485, with Recognising Need followed by Acquiring Information 62 times, which was in turn followed by Interpreting Information 100 times and the latter followed by Using Information 68 times. This suggests that when seeking information, the participants are likely to follow a linear path of need recognition – information acquisition – information interpretation – information usage; hence analytics tools could be optimised to smooth their journey along the path.


Figure 6: Code Sequences - All Participants

Considering the theory of the dynamic model of information seeking which the Information Journey Model is built on (Russell-Rose and Tate, 2013, p.26), it is interesting that Interpreting Information was followed by Acquiring Information 38 times. Examples of information acquisition following interpretation without the explicit recognition of a need can be seen in Participants A (“but then if you look at this, this is showing that clicks on this tweet…saying we had 7000 clicks and then when you look at the Bitly one, so that records the actual clicks on link so 598 clicks, but then when you go back and look at this MatterSight report, it only actually delivers 7 registrations”) and D (“so this gives us averages, so weekly increase or decrease, so we can see that was a particularly high week, so you get a flow of...subs, we should have acquisitions in here as well, that is the dark blue so we can see we've kind of been hovering around the 1000 mark recently which is really nice”). Similarly, Using Information was followed by Acquiring Information 37 times, bypassing Recognising Need. This is apparent in Participant G (“the last thing we need to do here is add these numbers in here so I can see the UK split so now that I've done that, the UK numbers will change for the week just so I can have the trial and sub numbers together… so the next bit, I'll just then take the end figures here as well as this graph”) and Participant I (“that I would try and answer with the data in a certain view and I'll be breaking that out further, but also I'd try and take qualitative data to answer that”). These seem to support the standard and cognitive models of information seeking that emphasise the iterative acquisition and interpretation of information until an initially recognised need is fulfilled (Russell-Rose and Tate, 2013, p.24). The lower instances of acquiring and interpreting information leading to the recognition of a new need suggest that the participants’ goals were mostly static in nature, in contrast to goals that are dynamically modified as information is acquired and interpreted. Although information acquisition that does not directly proceed from a recognised need could alternatively suggest serendipitous discovery, where “[information is encountered] without explicitly looking for it” (Blandford, featured in Russell-Rose and Tate, 2013, p.43), this could not be confidently inferred from the data available.

5.1.3 Designing for User Goals Implementing specific features in business analytics software can support the observed information seeking processes, thereby improving the efficiency of users.

Number of times that the User Goal in a row is immediately followed by the User Goal in the intersecting column:

                           Recognising Need   Acquiring Information   Interpreting Information   Using Information
Recognising Need                          3                      62                           6                   3
Acquiring Information                    15                      19                         100                  21
Interpreting Information                 24                      38                          31                  68
Using Information                        23                      37                          25                  10

5.1.3.1 Acquiring and Interpreting Information User Goals that involve using information that has been acquired without need for interpretation, as demonstrated by Participant H, are viable candidates for automation. Standardised data that are regularly exported into business analytics software would benefit from direct connection to the data source, which could be refreshed with a single click. Marketing professionals could thus invest a greater proportion of their effort in using the information. Participants did not merely acquire information for usage however; monitoring and reporting on the performance of marketing campaigns involved a high proportion of interpretation in order to make sense of the data. Multiple data sets from one or several data sources were typically acquired for interpretation, a practice described as building a “shoebox” collection of all data items before they are interpreted for relevance and meaning (Pirolli and Card, 2005). Russell-Rose and Tate (2013, p.34) assert that enabling rapid population of the “shoebox” can support this process and this reinforces the utility of a seamless integration between analytics software and data sources, empowering users to acquire and interpret data in a single workflow. Moreover, judging the relevance and meaning of data items is an internal mental process that benefits from externalisation to prevent memory overload and potentially allow collaboration (Russell-Rose and Tate, 2013, p.34). One way of facilitating this is to reorganise and reformat the data to make it suitable for interpretation (Pirolli and Card, 2005) and simple-to-use tools in Microsoft Excel for filtering and sorting partly fulfil the requirement. On the other hand, more complex processing such as charts for visualisation or pivot tables for sub-selection and aggregation require more technical nous and could benefit from simplification. The relative ease of switching between tabular and graphical representation of data in Tableau, for instance, demonstrates a more refined implementation that could be adopted more widely.
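
As a minimal sketch of such a direct connection, the snippet below pulls a weekly extract from a web-based source and writes it straight into the analysis workbook in one step; the URL, field names and file name are hypothetical and stand in for whichever data source and “shoebox” workbook a team actually uses.

```python
import pandas as pd
import requests

# Hypothetical endpoint and workbook; a real data source would differ.
SOURCE_URL = "https://example.com/api/weekly-acquisitions"
SHOEBOX_PATH = "campaign_shoebox.xlsx"

def refresh_shoebox() -> pd.DataFrame:
    """Fetch the latest extract and land it on a sheet ready for interpretation."""
    response = requests.get(SOURCE_URL, params={"weeks": 12}, timeout=30)
    response.raise_for_status()

    # Assume the source returns records such as
    # {"week": 42, "channel": "SEM", "subs": 140, "target": 150}.
    data = pd.DataFrame(response.json())

    # A single call replaces the manual export/import cycle.
    with pd.ExcelWriter(SHOEBOX_PATH, engine="openpyxl") as writer:
        data.to_excel(writer, sheet_name="raw_extract", index=False)

    return data

if __name__ == "__main__":
    refresh_shoebox()
```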


5.2 Search Mode Search Mode describes the different strategies employed by users in exploring data during the full cycle of their information seeking activity. A clearer understanding of such strategies, and of how they tend to be combined, could inform the design of analytics tools that better support data exploration and information retrieval processes.

5.2.1 Code Frequency The code frequency conveys the total number of times each code occurred during the information seeking process, and is indicative of the relative significance of each Search Mode to the process. The number of times each code was used and the number of participants in which it appeared were computed to identify recurring themes that could be suitable candidates for generalisation. An inconsistency with respect to Recognise, which was initially named Recall, was discovered late: the code was renamed part way through the coding process and was not updated in previously coded transcripts. Although the figures for Recognise and Recall could have been added to derive the total frequency, it might not be a suitable solution for deriving the code sequences (section 4.2.2) as the computation process used in QDA Miner is unknown. Since the error could not be rectified due to the expiry of the trial licence of QDA Miner, the frequency of Recognise was deemed unreliable and thus discounted from the results. The remaining thirteen codes had a total frequency of 433, ranging between 9 and 61 (Figure 7), and each code could feature in a maximum of 10 participants (Figure 8).

Figure 7: Number of Occurrences per Code


Overall, participants performed more ‘exploratory searches’ than Lookup searches, with 313 and 96 occurrences respectively. Of the ‘exploratory searches’, 160 were of a learning nature while 153 were investigative. A further 24 instances of updating items were conducted. The higher occurrence of ‘exploratory search’ activities was reflected in all ten participants, with no significant variations between those performing analysis on web-based platforms and those importing data into Microsoft Excel for analysis (Appendix L). Moreover, the highest occurring search activity in each participant was exploratory, with the exception of Participant G (Appendix L). The prevalence of Locate in Participant G is not correlated with the nature of the observation task, as this was not repeated in other participants who extracted weekly customer acquisition data from web-based platforms and analysed them in Microsoft Excel for a management report, including Participants F and H (Chapter 4.2.4.3: Table 6). No explanation for the divergence can be inferred from the data. The results suggest that finding information is a minor component of the marketing professionals’ data exploration and information retrieval patterns of behaviour; the majority of their activities involve analysing and making sense of information once it has been located. The implication for data sources and business analytics software used by such professionals is that improved features should enhance analytical and sensemaking processes rather than focus on finding information (Russell-Rose and Tate, 2013, p.72).

5.2.1.1 Analyses of Most Frequent Priori Codes The ‘priori codes’ (those derived from Russell-Rose et al.’s Search Modes) with the highest number of occurrences offer a relatively high degree of confidence for generalisation, since they occur in sufficient quantities to suggest commonality. The most frequently occurring code was Compare at 61, appearing in all observed tasks except Participant J’s. As a Learn activity concerned with common and/or contrasting qualities between two or more items, it appears to be a vital element in the interpretation of quantitative data. It was a recurring theme in three main contexts:

Figure 8: Number of Participants per Code


Assessing periodic performance against set targets, as seen in such statements as “so at the moment we're 500 short of that target, actually we're about 400 short of that target cuz that's minus 100 (Participant F)”, “so we've got a column to tell us what we should be achieving and we can look to see if that's happening or not (Participant C)” and “so again that's not quite hitting target but it's in line with weekly average (Participant D)”;

Assessing performance across different periods, for example “so we're seeing there were 1 or 2% of percentage point drop on a weekly basis then we get to week 10 and all of a sudden it drops by 40 percentage points (Participant I)”, “and then on week 29, right around here, we switched the sign up forms to just the new version of the sign up forms that performed better across the board on other sources, and you can see that on week 29 there was a jump (Participant H)” and “September got higher click-through (Participant A)”;

Assessing the validity of data sources, evidenced by comments like “what we've done is really said, if iJento is showing a huge number and MatterSight isn't, and it's big, we'll take some action in looking into it (Participant H)”, “and then I'll have a call on Thursday with the search agency and then we discuss the figures that they're getting through their tracking, Google Analytics, match it up against what we've got here, first to see how accurate it is (Participant G)” and “but then if you look at this, this is showing that clicks on this tweet, so it could be our handle or anywhere around it, saying we had 7000 clicks and then when you look at the Bitly one, so that records the actual clicks on link, so 598 clicks (Participant A)”.

Evaluate also featured highly, garnering 60 occurrences across all 10 participants. There were two contexts in which this Investigate activity was repeatedly identified:

Drawing an objective conclusion as a follow-up to a comparison, which can be seen in extracts such as “so we've got a column to tell us what we should be achieving and we can look to see if that's happening or not. So, we can see that we're generally doing pretty well on registration barriers; on subscription barriers we're struggling a bit (Participant C)”, “so clearly Friday is our poor one, a poor day for click-through and open rate, and consistently I'm seeing here Wednesday and Thursday actually which is quite interesting, consistently we've found Monday is a good day to send news on, and Sundays people are on their iPads and they're highly engaged as well (Participant K)” and “so for example for the last few weeks, SEO has delivered quite a phenomenal amount of subs which is great, but we're trying to understand why it's been delivering more subs than usual...so that said to me it wasn't necessarily an increase it was just more accurate attribution, so that could be important because we've had quite good acquisition results over the last few weeks, and it's quite important to know why we've been up… so that was quite good cuz then we could say there hadn't been a big increase in SEO, it's down to more the programmes that we've run (Participant D)”;

Making subjective professional judgements, as demonstrated by “the US one is exactly the same or we might see it on the site and think that might work well (Participant A)”, “maybe we can modify that by including live Twitter content now or advertising particular Twitter niche fields based on their industry... that would do better, but just to say follow us on Twitter, it's a bit ambiguous, there's no real drive, call to action is not very strong (Participant K)” and “this one I probably wouldn't use because it's probably out of date, with the exception if I'm doing something with comment pieces, they tend to last for a bit longer (Participant C)”.


With a frequency of 57, Explore had the third highest occurrence and featured in 8 of the 10 participants. Two common contexts were reflected in this Learn activity:

Sifting through a data set to gain insight about a specific item, illustrated by “has there been any sort of changes in the traffic, have we put more money towards something, have we done any optimisation, because we're seeing an increase in subscriptions, why is that the case (Participant H)”, “so we're seeing there were 1 or 2% of percentage point drop on a weekly basis then we get to week 10 and all of a sudden it drops by 40 percentage points, that then gives me to say why, and then I'd try and drill down further and say ok, what happened at that week (Participant I)” and “so from there I would just dig down and I can see where is this problem (Participant L)”;

Scanning through a data set for items that stand out without seeking a specific outcome, exemplified by “interestingly 6% have forgotten their password…the upgrade to standard and premium didn't get a click, then you've got the email briefings' done quite well, 11% (Participant K)”, “I just look at that as a reference, to see if I can see anything coming out, in terms of content (Participant C)” and “there are other bits, so this is quite interesting to see number of followers, so that would be how many people have started to follow @FT, which obviously is good because it means they'll then be getting the updates in their feed as well which might help with engagement; retweets are interesting because people are obviously interested in the content (Participant A)”.

Locate, a Lookup activity, occurred 39 times across all 10 participants in two recurrent contexts:

Finding the value of an item, including “so I'll be looking at this section here and I'll be looking at what the total subs is, FT.com, how many individual subs we've acquired (Participant D)”, “I would start with this, so a visualisation, how are print subs doing (Participant I)” and “so looking at this grand totals for Welcome for registered users, we've got 28% open rate, 3.64% click through (Participant K)”.

Identifying a specific item within a data set, such as “so, I know pretty much off by heart the SegIDs in mind, so I can select them (Participant C)”, “this [Sub Source] is what I'm interested in (Participant F)” and “yeah do it side-by-side, and just go through, find the actual SegID (Participant H)”.

Another Lookup activity, Monitor, occurred 39 times in 9 participants. Two recurring contexts were identified in the occurrences:

Maintaining passive awareness of the status of an item to provide a backdrop to assessed items, as seen in “the Asia and the US I'm not so worried about cuz they're nice to know but they're not for us to manage, therefore they're managed from those regions but I do take the number out of MatterSight just so I can see how they're going, just to see cross-region comparisons (Participant G)”, “then obviously we'll look at the number of subscribers that have cancelled, and payment failure, and then look at the net increase for the week so that's sort of quite good to know about but for me it's really how many acquisitions we've achieved (Participant D)” and “we do take insight from that, it's a temperature check more than anything else (Participant K)”.


Maintaining active awareness of the status of an item in order to assess it, demonstrated by “so within Splunk we have a payment dashboard, so we can see for different time spans - go to 7 days - the new payment flow, and this will populate, so we have debit successes, debit failures and debit errors, so this is when there's a technical error. So we can see that at a glance at any point, and then we've got the reasons and the error messages that are coming through here, and then some more data, and then the same for the old flow (Participant L)”, “this was the SegID we put against this tweet, that was a registration push, so I started to look at this to see how many did it actually deliver (Participant A)” and “so we track these every week, so that we can see any fluctuation, try and identify where any marketing activity that's being done is affecting these SegIDs (Participant C)”.

Analyse occurred 37 times in nine participants, with two contexts encountered repeatedly in this Investigate activity:

Determining factors contributing to the value of an item, exemplified by “hey we've achieved what looks like an extra 20, 30 subs, and in this case she came back and said that we had sent out, we had communicated with a lot more people because there'd been a glitch in previous weeks, and so that explains the large number (Participant D)”, “what channel did they come in from, why are they doing well...all of them came in from email and what letter series did they have, they didn't have a letter, they had emails so we then have a hypothesis or something to answer which is, we think people that come in through email and only respond by email are better, are retained better (Participant I)” and “try and identify where any marketing activity that's being done is affecting these SegIDs (Participant C)”.

Discovering patterns within a data set, illustrated by “I'd then go back to the pivot and see, I'd then try and see if I can break it down by currency and see, maybe put some, it's mostly ad-hoc, and say is it affecting one currency more than another (Participant L)”, “we can chart over time to see trends regionally and see whether the growth..., just by looking at their email behaviour, whether regions are growing in volume and in engagement levels as well, this is from January to August, I'd have to export this into a spreadsheet and chart it, that's what I would generally do, I take this data, export it and then play with it (Participant K)” and “the time that we push it out we might think about differently, so should we push it out more on a weekend because it's more of a video, and, it's different, apparently 9 o'clock on a Sunday night works quite well for Twitter as well (Participant A)”.

Synthesise, although only occurring 18 times, was the only code other than Evaluate and Locate to feature in all 10 participants. One recurring context was identified in this Investigate activity:

Condensing multiple items into a management report, for example “so this is the full report that I send to [Senior Manager]...a much more condensed version (Participant F)”, “I'll just then take the end figures here as well as this graph, and include it in my weekly report which is in the form of an email (Participant G)” and “so out of this I write a weekly report for the Optimisation team, which looks at these SegIDs and any barriers and numbers and any reasons why that might be (Participant C)”.

Russell-Rose et al.’s Search Modes were based on user scenarios derived from customers during the development of search and business intelligence applications on Endeca Latitude (Oracle, n.d.), an enterprise data discovery platform. Future work suggested by the researchers


included “empirical research and observation of knowledge workers in context to validate and refine the discovery modes and triggers that give rise to the observed patterns of usage”. In this research, the ‘priori codes’ occurred within the observation data of marketing professionals, who were using a mixture of data sources and analytics tools to perform regular work tasks under normal working conditions. Hence the data supports the applicability of the Search Modes within a live work environment. The implications for marketing professionals include the design of data and analytics software for optimal user experience, based on their data exploration and information retrieval patterns of behaviour (Russell-Rose and Tate, 2013, p.76). By having a clearer understanding of the Search Modes of business users, interaction designers of analytics software would be better equipped to enhance the UI with relevant usability, functionality and content features (Rogers et al., 2011, p.15).

5.2.1.2 Designing Information Systems for Search Modes The observation tasks were highly concentrated on assessing the performance of marketing campaigns for monitoring and reporting purposes, which is the raison d'être for business analytics within the marketing function (SAS, n.d.). Therefore using the three highest occurring codes – Compare, Evaluate and Explore – some insight can be gained into ways business analytics tools might be designed to support the data exploration and information retrieval processes of marketing professionals.

5.2.1.2.1 Compare – Assessing periodic performance against set targets / Assessing performance across different periods

Given that all participants described themselves as visual learners, it might be helpful to present the target and actual data in Microsoft Excel as column or bar charts to facilitate comparisons. The process could be automated by designing a dashboard-style worksheet containing the charts, which are linked by formulae to a table on a different worksheet where the data is updated periodically. Likewise, the dashboard could include a line chart to show trends over time, enabling sharp changes to be spotted at a glance. Building pivots for the tables and charts would facilitate deeper exploration and analysis, as double-clicking on any point of interest on a chart gives access to the underlying data. Other analytics tools such as QlikView might offer a richer user experience, since their dashboard design and exploratory capabilities can be more intuitive than Excel's. That said, PowerPivot – available in Excel 2010 and above – offers more advanced features that allow Excel to compete more strongly in the business analytics market. Usability would be further enhanced by connecting the analytics tool to web-based data sources where applicable, enabling direct retrieval of data from within the tool using single-click refresh (Figure 9).
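
A minimal sketch of the comparison view described above, using hypothetical weekly target and actual figures; the same layout could equally be built with native Excel charts, pivot tables or a tool such as QlikView, but the sketch makes the separation between the data table and the dashboard view explicit.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical weekly acquisition figures; in practice these would be read
# from the refreshed data source rather than typed in.
df = pd.DataFrame({
    "week":   [27, 28, 29, 30],
    "target": [950, 950, 1000, 1000],
    "actual": [980, 940, 1120, 1005],
})

fig, (bars, trend) = plt.subplots(1, 2, figsize=(10, 4))

# Column chart: actual performance against the target for each period.
df.plot(x="week", y=["target", "actual"], kind="bar", ax=bars)
bars.set_title("Actual vs target by week")

# Line chart: the trend over time, making sharp changes visible at a glance.
df.plot(x="week", y="actual", marker="o", legend=False, ax=trend)
trend.set_title("Acquisition trend")

fig.tight_layout()
fig.savefig("comparison_dashboard.png")
```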


5.2.1.2.2 Evaluate – Drawing an objective conclusion as a follow up to a comparison / Making subjective professional judgements

Evaluation is a sensemaking activity engaged in by participants, following the retrieval and analysis of an item. It may be useful to capture the objective or subjective conclusions drawn for future reference, and the ability to annotate data enables this, reducing reliance on memory recall (Russell-Rose and Tate, 2013, p.39). Excel provides the ability to add a comment to a cell (Figure 10) and although comments can be viewed easily, the process of editing, extracting or collating them is arguably unwieldy. Added functionality to view all comments on a spreadsheet might be beneficial for retrieving evaluations without having to comb through individual cells.
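
A minimal sketch of such a “view all comments” feature, using openpyxl to collate every cell comment in a workbook into a single list; the workbook name is a hypothetical example.

```python
from openpyxl import load_workbook

# Hypothetical workbook containing evaluation comments added by the analyst.
workbook = load_workbook("weekly_report.xlsx")

comments = []
for sheet in workbook.worksheets:
    for row in sheet.iter_rows():
        for cell in row:
            if cell.comment is not None:
                comments.append((sheet.title, cell.coordinate,
                                 cell.comment.author, cell.comment.text))

# Review the collated evaluations instead of combing through individual cells.
for sheet_name, coordinate, author, text in comments:
    print(f"{sheet_name}!{coordinate} ({author}): {text}")
```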

Figure 9: Workbook Connections

Figure 10: Evaluation Comments


5.2.1.2.3 Explore – Sifting through a data set to gain insight about a specific item The ease with which a data set can be explored depends on both the analytics tool deployed and the structure of the data. Building a relational data table that models the relationships between different sets of data would expedite investigations into how other items might have affected the item of interest. QlikView’s associative data model (QlikTech, n.d.) and Excel’s PowerPivot linked tables (Microsoft, n.d.) are some of the tools that support such features. By displaying related data items on an interactive dashboard that only requires clicking and selecting rather than writing queries in a technical language, serendipitous discovery of knowledge is facilitated (Russell-Rose et al., 2011, p.38). QlikView has the added benefit of displaying a ‘breadcrumb’ of selected fields in its ‘Current Selections’ pane, which enables users to keep track of the layers of selections and retrace their steps if needed (Figure 11), further encouraging exploration (Russell-Rose et al., 2011, p.79).
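
The sketch below illustrates the underlying idea with two small, hypothetical related tables and a “current selections” list applied as successive filters; dedicated tools implement this interactively, but the associative principle is the same.

```python
import pandas as pd

# Hypothetical related tables: weekly subscriptions and the campaigns behind them.
subs = pd.DataFrame({
    "seg_id":  ["A1", "A2", "B1", "B2"],
    "channel": ["SEM", "SEM", "Email", "Email"],
    "week":    [29, 30, 29, 30],
    "subs":    [140, 180, 60, 75],
})
campaigns = pd.DataFrame({
    "seg_id":   ["A1", "A2", "B1", "B2"],
    "campaign": ["Autumn push", "Autumn push", "Newsletter", "Newsletter"],
})

# Relational model: join the tables once, then explore by selecting rather than querying.
model = subs.merge(campaigns, on="seg_id")

# A breadcrumb of selections, applied in order and easy to retrace.
current_selections = [("channel", "SEM"), ("week", 30)]

view = model
for field, value in current_selections:
    view = view[view[field] == value]

print("Current selections:", " > ".join(f"{f}={v}" for f, v in current_selections))
print(view)
```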

5.2.1.2.4 Explore – Scanning through a data set for items that stand out without seeking a specific outcome

The visual learning preference of the participants suggests that a dashboard using charts or other graphical formats to visualise the data would expedite such exploration activities. This is supported by the observed behaviours of Participants K and L who made use of the visualisation features of their respective web-based data sources to identify items that stood out, and Participant I who used the chart on an Excel-based data cube to identify areas of interest for further investigation.
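
Alongside visual scanning, simple statistical highlighting can also direct attention to standout values; the sketch below flags weeks whose (hypothetical) conversion rates deviate sharply from the average.

```python
import pandas as pd

# Hypothetical weekly conversion rates being scanned for anything unusual.
weekly = pd.Series(
    [4.1, 4.0, 4.2, 3.9, 4.1, 2.5, 4.0, 4.2],
    index=range(23, 31), name="conversion_rate",
)

# Flag values more than two standard deviations from the mean.
z_scores = (weekly - weekly.mean()) / weekly.std()
standouts = weekly[z_scores.abs() > 2]

print("Weeks worth a closer look:")
print(standouts)  # the sharp drop in week 28 stands out immediately
```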

Figure 11: Current Selections


On the other hand, it is curious that the majority of Compare and Explore activities observed were conducted with tables rather than charts, despite the participants’ preference for visual learning. This could be due to the nature of the tasks performed, as reporting on the performance of marketing campaigns over a given period involves an analysis of discrete data items rather than simply gaining an overview of associations and relationships between data items (Vessey, 1991). It is also possible that the participants made do with the tools that were available to them even though they might have been better served by visualised data.


5.2.1.3 Analyses of Grounded Codes The ‘grounded codes’ (those I independently derived from the data) require in-depth analyses to determine their validity as Search Modes and the extent to which they can be generalised.

5.2.1.3.1 Collaborate Of the five ‘grounded codes’, Collaborate had the highest occurrence of 29, across 8 participants. The Investigate activity appears to occur in conjunction with other codes, with 22 occurrences found within seven codes spanning Lookup, Learn and Investigate categories:

Locate (1 occurrence) – “for example this Twitter IPO one, [the Social Media Manager] just got in contact with me earlier this week and said this one's coming up, I think it will work really well on Twitter (Participant A)”;

Verify (2 occurrences) – “these are the tweets that we have, and all tweets we have to get signed off by Editorial, so they will sign them off and say yes, that's fine, and then what we'll do is we'll take the link and then work (Participant A)”, “so I have to go back to the guys and ask them to read the SegIDs and make sure they're not double-counting with them (Participant D)”;

Compare (2 occurrences) – “so what we do is we will look at how they’re going and then I'll have a call on Thursday with the search agency and then we discuss the figures that they're getting through their tracking, Google Analytics, match it up against what we've got here, first to see how accurate it is (Participant G)”, “and he also does a weekly trading meeting and report, where we go through the numbers and we just make sure that our top performing SegIDs are performing as they should and if they're not then we'll try and look at why (Participant C)”;

Explore (8 occurrences) – “so it can be quite frustrating sometimes because sometimes we don't know why we're getting fluctuations and when that happens we tend to speak to [data analyst] from Analytics, who will look further into it (Participant C)”, “we saw this, got in touch with Amex, talked through lots of scenarios, they did lots of ad-hoc reporting for us and they had a fraud rule in place that they've now changed which is why you can see it's tailed off (Participant L)”, “[Senior Manager], he sees this and...what's going on with DM, 2 weeks late, what's happening there and I say well, you remember we had issues with the submission files (Participant F)”, “I would look to see if there was a major drop within one specific source in terms of one SegID, or if it's, you know, that would give us a better place to look and say ok, I need to go to [Product Manager] and see why is, is this sort of register button, is that, is it functional (Participant H)”, “I would have to come into the app to do that and then I would also have to go to the Enhancements team and ask them if they can do some investigation into it as well (Participant H)”, “with this one, so we saw some of these issues yesterday through Splunk reporting, but we haven't got a full overview of how many it has affected, I spoke to [Payment Service Provider] and said can you see any problems your side (Participant L)”, “so we then went to our own IT department...and they said we have a connection issue at one of our sites (Participant L)”, “so, those then will go across to [external agency], because this is quite an exploratory phase, so we're working out what types of content work and probably something around timeliness as well, so how long they do they work for (Participant C)”;


Analyse (5 occurrences) – “for example a really booming week for Upper Right Ads, and that would lead me to go back to the guys and say, hey what have you done with the upper right this week (Participant D)”, “then I usually pass a lot of the data over to [Head of Department] and then he changes into the de-duped figures, so looking at it by customer, rather than by transaction (Participant L)”, “if there was something that came out of this that I needed more info on I'd tend to go to the data team and say hey I've seen this campaign, so for example for the last few weeks, SEO has delivered quite a phenomenal amount of subs which is great, but we're trying to understand why it's been delivering more subs than usual and so we have a weekly trading meeting where we talk about the performance of the various activities and in that meeting I ...why is this performing so well (Participant D)” and “if though for example you have like this section here is iPad SEM, if there was a massive jump, yeah, I'd say SEM we're paying for that traffic, we're paying for those acquisitions, I would ask the SEM team and say has there been any sort of changes in the traffic, have we put more money towards something, have we done any optimisation, because we're seeing an increase in subscriptions, why is that the case, sometimes there are answers sometimes there aren't (Participant H)”, “hey we've achieved what looks like an extra 20, 30 subs, and in this case she came back and said that we had sent out, we had communicated with a lot more people because there'd been a glitch in previous weeks, and so that explains the large number (Participant D)”;

Synthesise (2 occurrences) – “I'll just then take the end figures here as well as this graph, and include it in my weekly report which is in the form of an email which I send out every week; [the report commentary] normally comes from the agency, so they provide the commentary around that week (Participant G)”, “so these are ones that the Social Media Manager and I work together to just sort of work up some generic tweets (Participant A)”;

Update (2 occurrences) – “the US numbers come in from this one [MatterSight] but the Asia numbers I ask for every two months or so, so in the last 4 weeks I haven't updated it but I'll ask Asia maybe next or in the next 3 weeks, they'll just update the folder and send it over (Participant G)”, “so [marketing manager] updates this every week so I can just go in and I can look at the specific numbers (Participant G)”.

The most frequent occurrences, at 8 and 5 respectively, were in conjunction with Explore and Analyse – two ‘exploratory search’ activities in the respective categories of Learn and Investigate. This suggests that the marketing professionals have a higher tendency to collaborate when conducting ‘exploratory search’ with a requirement for analysis and sensemaking. However, such a correlation is only hinted at, as the frequency of occurrences is insufficient to make a concrete observation. Neither Participant I nor Participant J had any occurrences of Collaborate (Appendix L). It is interesting that Participant J also had no occurrences of Explore and Analyse, although the ‘exploratory search’ activities of Evaluate and Synthesise occurred. The nature of the task is conceivably a contributing factor, as it was a Lookup activity to retrieve specific information for a colleague as opposed to reporting on performance. The need for analysis or sensemaking was therefore minimal. Conversely, Participant I had relatively high occurrences of Explore and Analyse along with other ‘exploratory search’ activities. The participant was the only one whose task was ad-hoc rather than regular, which may be a relevant factor. Hence any future research into


Collaborate as a Search Mode might need to consider the effect of the type of task performed, as implied by the data from both participants.

5.2.1.3.1.1 Appraising Collaborate as a Search Mode Collaborate extends data exploration and information retrieval activities beyond electronic data sources and analytic tools. Theoretically, seeking the expertise of other professionals may be classified as an analysis technique used in interpreting results (O’Day and Jeffries, 1993), that is, a Search Mode. In defining guiding principles for a Search Mode, Russell-Rose et al. (2011) proposed three key properties: ‘consistency’ (same level of abstraction), ‘orthogonality’ (operational independence) and ‘comprehensiveness’ (covering the full range of scenarios); these can be used to evaluate the appropriateness of Collaborate as a Search Mode.

5.2.1.3.1.1.1 Consistency If the ‘priori codes’ were viewed as describing the goal of a search activity, Collaborate would not meet the Consistency requirement, as it is not a goal in itself; rather it is a means of achieving a goal. This is reflected in its occurrences in the observation data in conjunction with other goals. On the other hand, if the ‘priori codes’ were viewed as describing the nature of a search activity, then Collaborate possesses a similar level of abstraction. The level of detail it represents is equivalent to codes such as Verify, Explore and Analyse, which describe the actions taken by searchers in their quest for insight.

5.2.1.3.1.1.2 Orthogonality Collaborate occurs in conjunction with other codes rather than operating independently of them. As such, it does not meet the Orthogonality requirement. However, it was noted that Explore has a higher level of abstraction than other search modes, yet this compromise of the principle of Consistency does not prevent its inclusion, since “the true value of the modes will be realised not by their conceptual purity or elegance but by their utility as a design resource” (Russell-Rose et al., 2011).

Inasmuch as the characteristics of Collaborate can be used to enhance the user interface of business analytics software with relevant usability, functionality and content features, it may still warrant the status of a search mode.

5.2.1.3.1.1.3 Comprehensiveness Collaborate occurred in 8 of the 10 participants observed and as such, it technically did not address the complete set of participants. Given the small size of the sample however, in contrast to the 104 user scenarios analysed by Russell-Rose et al., the available data is inadequate to test this requirement. This is underscored by the fact that only three search modes occurred in all 10 participants, with Verify and Explore also occurring in 8 participants and Comprehend occurring in only 6 participants.


It is noted that the recommended analytical approach of ‘propose-classify-refine’ over several iterations (Rose and Levinson, 2004, cited in Russell-Rose et al., 2011) has not been rigorously applied to the observation data and so further iterations could yield different results. In conclusion, it can be argued that the party with which a professional collaborates is the one engaged in a search mode, albeit on behalf of another. Nonetheless, seeking the expertise or input of other professionals as an integral part of exploring data and retrieving information suggests that Collaborate is a distinct technique. Hence more robust empirical studies are needed to test this hypothesis.

5.2.1.3.2 Measure Measure had 21 occurrences in 8 participants and, as a Learn activity, was conducted to obtain the quantitative value of a data item being retrieved. The occurrences were mostly observed on-screen rather than from vocal descriptions by participants. The contexts in which the code was repeatedly observed were:

Aggregating sub-components of an item (12 occurrences), demonstrated by “you can see we have data here for sends for that email, for their click through and open rates and I can aggregate together so I can see total for the day” (Participant H), “and then we've also got year-to-date averages, so we track these every week so that we can see any fluctuation” (Participant C), “32 [used autosum to see the total records on acquisition data spreadsheet]” (Participant F), “[copies formula across to calculate the weekly total] get the totals” (Participant G), “[copies the formula for the weekly total to calculate the totals of the outstanding weeks]” (Participant D), “so if we look at SEM, 140 subs [highlights the SEM figures and uses autosum to calculate the subtotal]” (Participant D) and “[mentally adds the Standard and Premium figures for a SegID]” (Participant H);

Determining the proportion of an item (5 occurrences), demonstrated by “that then tells me that 40% of the Trials...” (Participant F), “58% [calculates percentage of subs that are bundled]” (Participant F), “I would then convert that to percentage” (Participant I), “so this, what this tells you without looking, just straight down to the total 123 Actual, Forecast is 322, so 191 62% off” (Participant F) and “[types 103 in the Grand Total of Actual Gross Responses in test grid spreadsheet], 11% [percentage of Target Trials]” (Participant F).
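Both Measure contexts amount to simple aggregation and proportion calculations, which the participants carried out with autosum, copied formulae or mental arithmetic. A minimal Python sketch of the same two operations is given below; the campaign names, column names and figures are invented for illustration and do not come from the observation data.

```python
import pandas as pd

# Hypothetical email campaign figures; names and values are illustrative only,
# not taken from the study data.
emails = pd.DataFrame({
    "campaign": ["Daily Brief", "Daily Brief", "Weekend Offer"],
    "sends":    [12000, 8000, 5000],
    "opens":    [3600, 2000, 1100],
    "clicks":   [480, 260, 90],
})

# Aggregating sub-components of an item (cf. using autosum for a daily total).
daily_totals = emails[["sends", "opens", "clicks"]].sum()
print("Totals for the day:\n", daily_totals)

# Determining the proportion of an item (cf. converting a figure to a percentage).
open_rate = daily_totals["opens"] / daily_totals["sends"] * 100
print(f"Overall open rate: {open_rate:.0f}%")
```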

Measure did not occur in Participants A and L (Appendix L), but the type of task performed does not offer a conclusive explanation. While Participant A was the only one not to have used Microsoft Excel, where Measure activities were performed, Participant L used the tool extensively. Similarly, no correlation between this and other activities has been discerned in the data.

5.2.1.3.2.1 Appraising Measure as a Search Mode The degree to which Measure possesses the characteristics of a Search Mode can be evaluated using Russell-Rose et al.’s proposal.


5.2.1.3.2.1.1 Consistency The level of abstraction of Measure seems comparable to those of established search modes, in that it fulfils a discrete step in the search process.

5.2.1.3.2.1.2 Orthogonality In all its identified occurrences, Measure operates independently of other search modes and thus appears to pass the orthogonality test.

5.2.1.3.2.1.3 Comprehensiveness Since Measure occurred in 8 of the 10 participants observed, it did not address the complete set of participants and hence does not meet the comprehensiveness requirement. In essence, a valid conclusion about the suitability or otherwise of Measure as a Search Mode cannot be drawn from the limited data available in this research. The evidence of its occurrence is nonetheless interesting and may benefit from further research.

5.2.1.3.3 Update Update, featuring in 24 occurrences across 6 participants, is conceptually different from other search activities as it involves inputting data into an information system rather than interacting with a data output. 18 of those occurrences were steps within the overall search activity necessary for the user goal to be accomplished, illustrated by “so I'll just start inputting them into here [gets figures from iJento spreadsheet and types into MatterSight spreadsheet] (Participant G)”, “so take this weekly mobile, for example, what I would do is create, what I would is first take and do Registrations for example and then [opens the Weekly Report side-by-side with the Autoloader report] yeah do it side-by-side, and just go through, find the actual SegID and then input the numbers here (Participant H)” and “so we've got that many and then add those together [to the analysis spreadsheet]; and then I will just update the graphs by pressing update on them [copies data from pivot, paste or type into table in spreadsheet used for analysis, the table is reformatted on a different worksheet to drive the graphs, which are on another worksheet] (Participant L)”. Hence, while Update is not strictly a Search Mode, its role within the cycle of information retrieval, analysis and sensemaking activities is potentially worthy of consideration in optimising data sources and business analytics tools for marketing professionals.
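The Update step the participants describe – lifting figures from a source export and writing them into a reporting workbook whose tables drive the graphs – can be sketched in a few lines of Python. The file, sheet and column names below are hypothetical stand-ins for the iJento export and weekly report mentioned by the participants, and the sketch illustrates the data flow only, not the participants' actual workbooks.

```python
import pandas as pd

# Hypothetical file and sheet names standing in for the source export and
# the weekly reporting workbook described by the participants.
source = pd.read_excel("ijento_export.xlsx", sheet_name="WeeklyFigures")

# Aggregate the retrieved figures to the level used in the report,
# e.g. registrations per SegID per week (column names are assumptions).
report_rows = (source
               .groupby(["seg_id", "week"], as_index=False)["registrations"]
               .sum())

# Update: write the figures into the sheet that drives the report's
# pivot tables and charts, replacing last week's input.
with pd.ExcelWriter("weekly_report.xlsx", mode="a",
                    if_sheet_exists="replace", engine="openpyxl") as writer:
    report_rows.to_excel(writer, sheet_name="AutoloaderInput", index=False)
```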

5.2.1.3.4 Forecast Forecast featured in only 9 occurrences across 4 participants; these occurrences are too few to be considered further.


5.2.2 Code Sequences The code sequences convey the number of times one code is immediately followed by another and are indicative of the strategy combinations that tend to be used in seeking information. The data is relevant for the potential discovery of Mode Chains, which build on Search Modes to establish how particular modes are grouped together and the order in which searchers move between them (Russell-Rose et al., 2011, p.72), giving additional insight into the data exploration and information retrieval patterns followed by the marketing professionals.
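As an illustration of how such sequence counts can be derived, the minimal Python sketch below tallies immediate code-to-code transitions in a coded observation sequence. The example sequence is invented for illustration and is not taken from the study data.

```python
from collections import Counter

# Invented example of a coded observation sequence (not the study data).
codes = ["Monitor", "Compare", "Evaluate", "Explore", "Compare",
         "Evaluate", "Monitor", "Compare", "Evaluate"]

# Count how often each code is immediately followed by another; these counts
# are the kind of figures summarised in the code-sequence heat map.
transitions = Counter(zip(codes, codes[1:]))

for (first, second), count in transitions.most_common():
    print(f"{first} -> {second}: {count}")
```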

5.2.2.1 Analyses of Code Sequences The heat map helps in identifying the Search Mode pairs that co-occurred the most (Figure 12), which could be a clue to the Mode Chains of the participants. Such Mode Chains would represent sequential combinations of data exploration strategies that are deployed repeatedly. Analytics tools may thus be optimised to support this spectrum of information-seeking behaviour.

The figures denote the number of times each Search Mode in a row is immediately followed by the Search Modes in the columns, which suggests the exploration sequences followed by the participants. The codes had a combined occurrence of 445, with the highest co-occurrences highlighted in the brightest colours. Monitor was followed by Compare 13 times; Explore was followed by Evaluate 13 times; Compare was followed by Evaluate 12 times; and Explore was followed by Compare 11 times. Given that Monitor is a Lookup activity, Evaluate is an Investigate activity and both Explore and Compare are Learn activities (Chapter 4.3.2.2: Table 8), these demonstrate patterns of Lookup – Learn, Learn – Investigate and Learn – Learn.

Figure 12: Code Sequences – a heat map showing the number of times the Search Mode in each row (1) is immediately followed by the Search Mode in the intersecting column (2), with rows and columns for Locate, Verify, Monitor, Compare, Comprehend, Explore, Analyse, Evaluate, Synthesise, Forecast, Collaborate, Measure, Recognise and Update.


It is apparent that the participants’ activities were not only weighted towards Learn and Investigate, that is ‘exploratory search’, but the tendency for a sequence to end in a Lookup activity is also low. This is in line with the assertion that Lookup searches, including Monitor and Explore, only feature at the beginning of a sequence while Investigate searches, such as Synthesise and Evaluate, feature at the end (Russell-Rose et al., 2011).

A caveat here is that although the code sequences are indicative of likely Mode Chains within the data, they are an aggregation of all codes that occurred in all participants. They are not discrete sequences that precisely mark the beginning and end of a specific search activity and as such cannot be interpreted as Mode Chains, which limits their utility in testing Russell-Rose et al.’s findings. Nevertheless, the aggregation preserves the order of occurrence and thus the highest co-occurring pairs – those with double-digit frequencies – can be chained together to identify interesting combinations supported by the data. Hence Monitor – Compare (13 co-occurrences) and Compare – Evaluate (12 co-occurrences) form the chain Monitor – Compare – Evaluate; Explore – Compare (11 co-occurrences) and Compare – Evaluate also form the chain Explore – Compare – Evaluate; and likewise, Explore – Evaluate (13 co-occurrences) stands as a chain in its own right. These are suggestive of the order in which the participants explored data, and bear some similarities to the five most frequent Mode Chains found with users of enterprise search and business intelligence applications (Russell-Rose et al., 2011): Monitor – Analyse – Evaluate; Analyse – Compare – Evaluate; Explore – Analyse – Evaluate; Analyse – Comprehend – Evaluate; and Analyse – Compare – Synthesise. Analyse consistently features in the Mode Chains distilled by Russell-Rose et al. from their research, and it is noteworthy that it co-occurred relatively infrequently with other Search Modes in this study. This may be correlated with its overall frequency of 37, in contrast to Compare with 61, Evaluate with 60 and Explore with 57 (Chapter 5.2.1: Figure 7).
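The chaining rule applied above – joining two high-frequency pairs that share a middle mode – can be expressed as a short procedure. The Python sketch below applies it to the double-digit pairs reported in this chapter; the pair counts and the threshold come from the discussion, but the procedure itself is only an illustration, not the analysis method actually used in the study.

```python
# Double-digit co-occurring pairs reported above (counts from Figure 12).
pairs = {
    ("Monitor", "Compare"): 13,
    ("Explore", "Evaluate"): 13,
    ("Compare", "Evaluate"): 12,
    ("Explore", "Compare"): 11,
}

THRESHOLD = 10

# Join (a, b) and (b, c) into the candidate chain a - b - c whenever
# both pairs meet the threshold.
frequent = [pair for pair, count in pairs.items() if count >= THRESHOLD]
chains = [(a, b, d) for (a, b) in frequent for (c, d) in frequent if b == c]

for chain in chains:
    print(" - ".join(chain))
# Frequent pairs that do not extend into a triple (here Explore - Evaluate)
# stand as two-mode chains in their own right.
```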

5.2.2.2 Designing Information Systems for Mode Chains The design of business analytics tools can be taken beyond supporting individual Search Modes to also facilitate transitioning between modes that commonly occur together. Users would thus be able to move from one mode to another with less friction and enjoy a more fluid data exploration experience. The value to software vendors and their customers is reflected in Oracle’s incorporation of Mode Chain features in their analytics products to enhance “usability, engagement and value” (Lamantia, featured in Russell-Rose and Tate, 2013, p.89). As business users, marketing professionals could likewise benefit from having Mode Chain features incorporated in their analytics tools to facilitate data exploration and information retrieval.

5.2.2.2.1 Monitor – Compare – Evaluate This chain could be supported by customised dashboards for each functional area, such as Customer Acquisition or Customer Retention, displaying charts of the key metrics reported on. Maintaining a live connection between the data source and the analytics tool, which automatically refreshes when the dashboard is opened, would enable users to track the latest figures, compare them against both historical figures and forecasts, and evaluate performance.
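A minimal Python sketch of the underlying data step is shown below: it refreshes the latest weekly figures, lines them up against the year-to-date average and the forecast, and computes the variance a dashboard chart would flag. The file names, columns and metrics are hypothetical and stand in for whatever live source such a dashboard would actually connect to.

```python
import pandas as pd

# Hypothetical acquisition figures; in practice these would come from a live
# connection that refreshes when the dashboard is opened.
actuals = pd.read_csv("acquisition_weekly.csv")     # columns: week, channel, subs
forecast = pd.read_csv("acquisition_forecast.csv")  # columns: week, channel, forecast_subs

latest_week = actuals["week"].max()

# Monitor: the most recent figures per channel.
latest = (actuals[actuals["week"] == latest_week]
          .groupby("channel", as_index=False)["subs"].sum())

# Compare: against the year-to-date weekly average and the forecast.
ytd_avg = (actuals.groupby("channel", as_index=False)["subs"].mean()
           .rename(columns={"subs": "ytd_avg_subs"}))
fcast = forecast[forecast["week"] == latest_week][["channel", "forecast_subs"]]
summary = latest.merge(ytd_avg, on="channel").merge(fcast, on="channel")

# Evaluate: percentage variance to forecast, the figure a dashboard would highlight.
summary["vs_forecast_pct"] = (summary["subs"] / summary["forecast_subs"] - 1) * 100
print(summary)
```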


Figure 13 shows a dashboard sample from QlikView with similar features.

Figure 13: Dashboard Supporting Monitor-Compare-Evaluate

5.2.2.2.2 Explore – Compare – Evaluate / Explore – Evaluate Both graphical and tabular data can be explored for knowledge discovery, with the former suited to gaining an overview of associations and relationships within the data and the latter suited to analysis of discrete components of the data (Vessey, 1991). Hence this mode chain may be best served by presenting data in both formats, with the charts offering additional benefits when transitioning from Explore to Compare. The configuration would require dynamic – as opposed to static – charts that users can interact with, a feature provided in tools such as Excel PivotChart, Tableau (Figure 14) and QlikView (Figure 15).
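The same configuration can be sketched in code: a single dataset presented both as a pivot table (for discrete comparison) and as a chart (for an overview of relationships), with the grouping dimension treated as a parameter so the view can be re-pivoted, loosely mimicking a dynamic chart. The dataset, column names and figures below are hypothetical.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical subscription data; columns and values are illustrative only.
subs = pd.DataFrame({
    "week":    [1, 1, 2, 2, 3, 3],
    "channel": ["SEM", "Email", "SEM", "Email", "SEM", "Email"],
    "trials":  [140, 90, 150, 85, 160, 100],
})

def pivot_view(data: pd.DataFrame, dimension: str) -> pd.DataFrame:
    """Re-pivot the data on a chosen dimension, as a dynamic chart would."""
    return data.pivot_table(index="week", columns=dimension,
                            values="trials", aggfunc="sum")

table = pivot_view(subs, "channel")
print(table)             # tabular form: discrete comparison of components

table.plot(kind="line")  # graphical form: overview of the trend per channel
plt.ylabel("Trials")
plt.savefig("explore_compare_evaluate.png")
```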


Figure 14: Dashboard Supporting Explore-Compare-Evaluate

Figure 15: Dashboard Supporting Explore-Compare-Evaluate


6 EVALUATION, REFLECTIONS AND CONCLUSIONS This study was planned and carried out to investigate the patterns of behaviour of marketing professionals when interacting with data sources and analytics tools to perform work tasks. The knowledge developed from the investigation would subsequently inform the suggestion of UI design features that support these interactions. The overarching framework under which the study was conducted is the “dimensions of search user experience” proposed by Russell-Rose and Tate (2013), namely User Type, User Goal, Search Context and Search Mode. To achieve the objectives, ten members of the B2C Marketing department of the Financial Times were observed as they performed a regular task in their natural work environment. Data about their User Type and Search Context was collected to define the boundaries of the study and to consider possible correlation between the dimensions, but User Goal and Search Mode were the lens through which their data exploration and information retrieval activities were viewed. The four stages of the Information Journey Model (Attfield and Blandford, 2010) were selected for investigating User Goal because the model accommodates dynamic changes in goals as users engage in the exploration process, rather than assuming static goals. Investigation of Search Mode was based on the nine modes developed by Russell-Rose et al. (2011), divided into groups of Lookup, Learn and Investigate activities as defined by Marchionini (2006), with the latter two classified as ‘exploratory search’. These modes were complemented by five potential modes personally identified within the observation data, one of which was discarded due to a data preparation error. The tendency of Search Modes to form distinct clusters of ‘mode chains’, signifying typical search sequences, was also considered.

6.1 User Goal In seeking information, the participants seemed to move successively along the four stages of the Information Journey Model, following a path from the initial recognition of an information need to acquiring the information, interpreting it and finally using it to fulfil the need. Somewhat surprisingly, the results suggested the participants’ goals were largely of a static nature (Chapter 5.1.2), with limited evidence signifying dynamic changes along the way, in contrast to Gossen et al.'s (2012) definition of ‘exploratory search’ (Chapter 3.3). Yet the prevalence of Learn and Investigate activities engaged in by the participants (Chapter 5.2.1) suggests that the observed tasks were indeed of an exploratory nature. Conducting ‘exploratory search’ in fulfilment of apparently static goals does not call into question the validity of dynamic models of the participants’ activities; rather, it suggests that the debate about static versus dynamic User Goals need not be an ‘either/or’ argument, as the two are not necessarily mutually exclusive. The assertion that the Information Journey Model bears resemblance to the ‘standard model’, which assumes static information needs (Russell-Rose and Tate, 2013, pp.24-26), shows the fine line that can exist between the two. ‘Exploratory search’ may be conducted iteratively to fulfil a static need as well as in multiple directions to fulfil dynamic needs. Since ‘exploratory search’ is arguably common to both static and dynamic goals, perhaps more emphasis ought to be placed on how users acquire and interpret information than on the nature of their needs. The predominance of information acquisition and interpretation in the participants’ activities (Chapter 5.1.1) provides support for this perspective. Thus, in designing analytics tools for marketing professionals performing similar tasks, features that facilitate exploration might enable users to fulfil different types of goals that may or may not have been anticipated by the UI designers.


6.2 Search Mode The proposal to focus on ‘exploratory search’ in fulfilment of User Goals ties in with the Search Modes exhibited by the participants. As previously mentioned, the ‘exploratory search’ activities of Learn (Compare and Explore) and Investigate (Evaluate) were predominant. This was hardly surprising, given that the purpose of the study was to understand how the participants explore data and retrieve information, and this informed the selection of observation tasks. The significance, however, lies in the supporting evidence it offers for the validity of Russell-Rose et al.’s modes in a real work environment. Given that their studies were based on work scenarios rather than actual work tasks, and during application development rather than live usage, there remains a need to validate the modes by observing professionals performing real work tasks with real data and analytics tools (Russell-Rose et al., 2011). Thus Search Modes in general, and Compare, Explore and Evaluate in particular, are supported as valid information seeking strategies within the marketing function.

The discovery of other information seeking strategies outside those proposed by Russell-Rose et al. was a rather pleasant surprise, as it was not anticipated. Collaborate, Measure, Update and Forecast all emerged from the observation data, at varying frequencies of occurrence. Their validity as Search Modes is as yet unproven, however, with neither Collaborate nor Measure possessing the complete set of characteristics proposed by Russell-Rose et al. for evaluating Search Modes (Chapters 5.2.1.3.1.1; 5.2.1.3.2.1).

Furthermore, evidence alluding to Mode Chains was identified (Chapter 5.2.2.1); this also supports the applicability of Russell-Rose et al.’s findings to actual work tasks within the marketing function. It additionally suggests – albeit not uniquely, in light of Russell-Rose et al.’s work – that the occurrence of Mode Chains is not restricted to searches conducted through professional intermediaries adept at information seeking (O’Day and Jeffries, 1993); end-users conducting their own searches also have a tendency to combine different data exploration strategies to retrieve information. Consequently, designing analytics tools to support not only individual Search Modes but also Mode Chains would potentially enhance their usability for marketing professionals and contribute to higher ROI for organisations that invest in them.

6.3 Project Management Planning and executing this study was my first attempt at a project of such magnitude and subject matter, in both academic and professional contexts. It was perhaps inevitable that it would not progress exactly as planned and that errors would have to be accommodated. One of the most vital lessons learned while managing this project is the importance of scope definition – identifying clear goals that can be accomplished within the available resources and timeframe. My original scope was rather ambitious (Appendix B); plans to implement UI designs by building PowerPivot models in Microsoft Excel, and to train the participants to use and subsequently evaluate them, were shelved, as this would not have been completed before the project deadline. The participants’ evaluation of their current analytics tools, which was to serve as a benchmark for comparing the effectiveness of the UI designs, was therefore discounted from the study. In hindsight, more realistic scoping of the project would have saved the time and effort expended in designing and completing the evaluation questionnaire. Likewise, plans to categorise the participants’ data according to Level, Medium and Quantity (Cool and Belkin, 2002, cited in Russell-Rose and Tate, 2013, p.58) were cancelled as they would have added unnecessary complexity to the analysis of results; representing data as tabular or graphical was deemed sufficient, and closed off an avenue for scope creep.

The most significant deviation from the plan was the framework used in analysing the observation data. Pirolli and Card’s Sensemaking process (2005, cited in Russell-Rose and Tate, 2013, pp.32-33) was replaced by Russell-Rose et al.’s “dimensions of search user experience”. Although the former would have provided insight into the participants’ goals, it became apparent during the literature review phase that it would add only incremental insight to the Information Journey Model. Russell-Rose et al.’s framework, on the other hand, offered the opportunity to gain insight into a broader range of perspectives on search behaviour, beyond the goals of the participants. Search Mode was seen to be pertinent to the study and was subsequently adopted as its primary focus. The ability to change focus underscores the need for flexibility in adapting to changing requirements, which may not always be avoidable.

While efforts were made to get every London-based member of the B2C marketing team involved in the study, the fact that one member declined and another withdrew implies that the value proposition was insufficient. Given the speculative nature of the project outcomes before its execution, it would have been difficult to be more persuasive without the risk of over-promising, but perhaps more effective negotiation might have achieved 100% participation.

The observation sessions posed the greatest challenge, and there were aspects of them that could have been executed better. Scheduling conflicts and temporary unavailability of some participants were to be expected, and contingency plans – such as reallocating time to other activities – were made for their occurrence. What caused greater dissatisfaction were the significant variations in the length of the tasks performed; a range of 7 to 80 minutes implies huge disparities in the complexity of the tasks observed, even if they were of a similar nature. In hindsight, I could have been more involved in the selection of observation tasks to ensure greater uniformity. A more rigorous test run of the observation sessions before implementation might have helped identify the issue so that corrective steps could be taken, perhaps further enhancing confidence in the results.

On a more positive note, the discovery of the Screencast-O-Matic screen recording applet, and the successful negotiation of approval to run it on the participants’ work systems, enabled the gathering of rich audio-visual data. The opportunity to mine the data for insight, analysing both on-screen actions and verbal descriptions, would have been lost had the observation been conducted by taking notes. An added benefit was that, in my judgement, there was no longer a need for post-observation interviews, since the participants described their actions in detail while carrying them out and answered questions asked of them for further clarification.

All in all, I have learned lessons crucial to the successful management of a project over the course of this study, which will stand me in good stead in future projects, whether of an academic or professional nature. It is not possible to plan for every aspect of a project, especially in unfamiliar territory; what is important is to acknowledge this fact, recognise resource limitations and leave allowances for the unexpected. Also important is the need to conduct trials before actual execution wherever possible. The early discovery of potential problems would go some way towards ensuring project success.


6.4 Further Research A study of such potential importance to both academia and industry ideally should not rest exclusively on the shoulders of a sole, inexperienced researcher, even with the considerable guidance received from the supervisor. The design of the case study – including scope definition, number of participants, range of tasks, time scale, data gathering methods and analysis – would require revisiting to strengthen the validity and generalisability of the findings, and would certainly benefit from the skills and experience of a team of seasoned researchers. Nevertheless, further research opportunities are identified within the confines of what has been accomplished in the study.

The prospect of discovering significant correlation between the “dimensions” of User Type, User Goal, Search Context and Search Mode is intriguing. Although they are meant to be considered independently within the framework, the faint hints of correlation in the analysis of the observation data could be investigated further, and the implications of the interplay between the “dimensions” considered.

The ‘grounded codes’ of Collaborate, Measure, Update, Forecast and Recognise, which were identified in this study, would benefit from more extensive investigation to ascertain their validity, or otherwise, as Search Modes. There is always a danger of finding what one is looking for; therefore bias cannot be ruled out in the use of Russell-Rose et al.’s Search Modes as ‘a priori’ codes in this study. Hence further research into their applicability in a live work environment could take an exclusively grounded theory approach, with results subsequently compared with Russell-Rose et al.’s findings for similarities or contradictions.

As the design suggestions for User Goals and Search Modes were not implemented as originally planned, there is a research opportunity to build on the findings by implementing the designs for marketing professionals and evaluating their effects on usability, if any. In my professional capacity as Marketing Operations Executive at the Financial Times, I will be implementing, testing and refining the suggested designs as part of my work responsibilities, but this will not be in an academic context.


7 GLOSSARY

7.1 Usability The qualities of an information system that make it “easy to learn, effective to use, and enjoyable from the user’s perspective” (Rogers et al., 2011, p.19).

7.2 User Experience “How people feel about [an information system] and their pleasure and satisfaction when using it…” (Rogers et al., 2011, p.13).

7.3 User Interface The way information is presented to users of an information system to enable them to perform their tasks, incorporating design principles of usability and possibly generating a positive user experience (Rogers et al., 2011, p.157).

7.4 Cognition “Thinking, comparing and decision making” (Rogers et al., 2011, p.66).


8 REFERENCES

Attfield, S., Blandford, A., 2010. Interacting with Information. Morgan & Claypool Publishers, California.
Baeza-Yates, R., Ribeiro-Neto, B., 2011. Modern Information Retrieval: The Concepts and Technology Behind Search, 2nd ed. Addison Wesley.
Baker, J., Jones, D., Burkman, J., 2009. Using Visual Representations of Data to Enhance Sensemaking in Data Exploration Tasks. J. Assoc. Inf. Syst. 10, 533–559.
Barton, D., Court, D., 2012. Making Advanced Analytics Work for You. Harv. Bus. Rev. 90, 78–83.
Belkin, N., 1993. Interaction with Texts: Information Retrieval as Information-Seeking Behavior.
Bowling Green State University Career Center, n.d. Self-Assessment - BGSU Career Center [WWW Document]. BGSU Career Cent. URL http://www.bgsu.edu/downloads/enrollment/file74450.pdf (accessed 9.4.13).
Church, R.M., 1979. How to Look at Data: A Review of John W. Tukey’s Exploratory Data Analysis. J. Exp. Anal. Behav. 31, 433–440.
Denig, S., 2004. Multiple intelligences and learning styles: Two complementary dimensions. Teach. Coll. Rec. 106, 96–111.
Edelman, S., 1997. Curiosity and Exploration [WWW Document]. Calif. State Univ. Northridge. URL http://www.csun.edu/~vcpsy00h/students/explore.htm (accessed 9.7.13).
Exploring, n.d. Merriam-Webster Dict.
Fisher, D., DeLine, R., Czerwinski, M., Drucker, S., 2012. Interactions with Big Data Analytics. Interactions 50–59.
Gibbs, G., Taylor, C., 2010. “How and what to code”, Online QDA Web Site [WWW Document]. URL http://onlineqda.hud.ac.uk/Intro_QDA/how_what_to_code.php (accessed 11.25.13).
Gossen, T., Nitsche, M., Haun, S., Nürnberger, A., 2012. Data exploration for bisociative knowledge discovery: A brief overview of tools and evaluation methods, in: Bisociative Knowledge Discovery. Springer, pp. 287–300.
Greenberger, M. (Ed.), 1971. Computers, Communications, and the Public Interest. Johns Hopkins Press.
Isograd, n.d. TOSA by ISOGRAD [WWW Document]. URL http://www.isograd.com/EN/score.php (accessed 12.15.13).
Järvelin, K., Ingwersen, P., 2004. Information Seeking Research Needs Extension towards Tasks and Technology. Information Research, 10(1), paper 212.
Jenkins, C., Corritore, C.L., Wiedenbeck, S., 2003. Patterns of information seeking on the Web: A qualitative study of domain expertise and Web expertise. IT Soc. 1, 64–89.
Kim, K.-S., 2001. Information seeking on the Web: Effects of user and task variables. Libr. Inf. Sci. Res. 233–255.
Li, Y., Belkin, N.J., 2010. An exploration of the relationships between work task and interactive information search behavior. J. Am. Soc. Inf. Sci. Technol. 61, 1771–1789.
Loewenstein, G., 1994. The Psychology of Curiosity: A Review and Reinterpretation. Psychol. Bull. 116, 75–98.
Magal, S.R., Word, J., 2009. Essentials of Business Processes and Information Systems. Wiley.
Marchionini, G., 2006. Exploratory search: from finding to understanding. Commun. ACM 49, 41–46.
Mayer, R.E., Massa, L.J., 2003. Three Facets of Visual and Verbal Learners: Cognitive Ability, Cognitive Style, and Learning Preference. J. Educ. Psychol. 95, 833–841.
McAfee, A., Brynjolfsson, E., 2012. Big Data: The Management Revolution. Harv. Bus. Rev.
McLellan, L., 2012. By 2017 the CMO will Spend More on IT Than the CIO [WWW Document]. Gart. Webinars. URL http://my.gartner.com/portal/server.pt?open=512&objID=202&mode=2&PageID=5553&resId=1871515 (accessed 12.27.13).
Microsoft, n.d. Microsoft Excel [WWW Document]. URL http://office.microsoft.com/en-gb/excel/ (accessed 12.15.13a).
Microsoft, n.d. PowerPivot [WWW Document]. URL http://www.microsoft.com/en-us/bi/powerpivot.aspx (accessed 12.15.13b).
Microsoft, n.d. Overview of Online Analytical Processing (OLAP) [WWW Document]. URL http://office.microsoft.com/en-gb/excel-help/overview-of-online-analytical-processing-olap-HP010177437.aspx (accessed 1.2.14c).
Microsoft, n.d. PowerPivot Slicer [WWW Document]. URL http://office.microsoft.com/en-gb/videos/how-to-use-a-powerpivot-slicer-with-a-pivottable-VA101967202.aspx (accessed 12.27.13d).
Microsoft, n.d. Microsoft Word [WWW Document]. URL http://office.microsoft.com/en-gb/word/ (accessed 1.2.14e).
Microsoft, n.d. PowerPivot Window: Linked Tables Tab [WWW Document]. URL http://technet.microsoft.com/en-us/library/gg399186.aspx (accessed 1.4.14f).
Moth, D., 2013. 71% of businesses plan to spend more on digital marketing technology in 2013 [WWW Document]. Econsultancy. URL http://econsultancy.com/blog/62095-71-of-businesses-plan-to-spend-more-on-digital-marketing-technology-in-2013?utm_campaign=bloglikes&utm_medium=socialnetwork&utm_source=facebook (accessed 12.27.13).
Murphy, C., 2012. Will CMOs Outspend CIOs? Wrong Question [WWW Document]. InformationWeek. URL http://www.informationweek.com/global-cio/interviews/will-cmos-outspend-cios-wrong-question/240009521 (accessed 12.27.13).
O’Day, V.L., Jeffries, R., 1993. Orienteering in an Information Landscape: How Information Seekers Get From Here to There, in: Proceedings of the INTERACT ’93 and CHI ’93 Conference on Human Factors in Computing Systems, CHI ’93. ACM, New York, NY, USA, pp. 438–445.
Oates, B.J., 2006. Researching Information Systems and Computing. Sage Publications Limited.
Oracle, n.d. Oracle SQL [WWW Document]. URL http://docs.oracle.com/cd/E11882_01/server.112/e41084/intro.htm#SQLRF001 (accessed 1.2.14a).
Oracle, n.d. Oracle Endeca Information Discovery [WWW Document]. URL http://www.oracle.com/us/solutions/business-analytics/business-intelligence/endeca/overview/index.html (accessed 1.3.14b).
Oracle Applications [WWW Document], n.d. URL http://www.oracle.com/us/products/applications/overview/index.html (accessed 12.18.13).
Pirolli, P., Card, S., 2005. The Sensemaking Process and Leverage Points for Analyst Technology as Identified Through Cognitive Task Analysis. Presented at the International Conference on Intelligence Analysis.
Pohl, J.G., 2001. Transition from data to information. Collab. Agent Des. Res. Cent. Tech. Rep.-RESU72 1.
Provalis Research, n.d. QDA Miner [WWW Document]. Provalis Res. URL http://provalisresearch.com/products/qualitative-data-analysis-software/ (accessed 12.16.13).
QlikTech, n.d. QlikView [WWW Document]. QlikView. URL http://www.qlikview.com/us/explore/products/overview (accessed 12.15.13a).
QlikTech, n.d. The Associative Experience [WWW Document]. QlikView. URL http://www.qlikview.com/us/explore/resources/whitepapers/the-associative-experience (accessed 1.4.14b).
Researchware, n.d. HyperTRANSCRIBE [WWW Document]. URL http://www.researchware.com/products/hypertranscribe/ht-nutshell.html (accessed 12.16.13).
Rogers, Y., Sharp, H., Preece, J., 2011. Interaction Design: Beyond Human-Computer Interaction, 3rd ed. John Wiley & Sons Ltd, West Sussex.
Russell-Rose, T., Lamantia, J., Burrell, M., 2011. A taxonomy of enterprise search and discovery. Proc. HCIR 2011.
Russell-Rose, T., Tate, T., 2013. Designing the Search Experience, 1st ed. Morgan Kaufmann, Massachusetts.
SAS, n.d. Marketing Analytics: What It Is and Why It Matters.
Screencast-O-Matic, n.d. Screencast-O-Matic [WWW Document]. URL http://www.screencast-o-matic.com/ (accessed 12.16.13).
Search, n.d. Merriam-Webster Dict.
Sicular, S., 2013. Gartner’s Big Data Definition Consists of Three Parts, Not to Be Confused with Three “V”s [WWW Document]. Forbes. URL http://www.forbes.com/sites/gartnergroup/2013/03/27/gartners-big-data-definition-consists-of-three-parts-not-to-be-confused-with-three-vs/ (accessed 5.5.13).
Smith, M., 2013. Why Business Intelligence Software is Failing Business [WWW Document]. Ventana Res. URL http://marksmith.ventanaresearch.com/2013/02/16/why-business-intelligence-software-is-failing-business/ (accessed 5.6.13).
Tableau Software, n.d. Tableau Desktop [WWW Document]. Tableau Softw. URL http://www.tableausoftware.com/products/desktop (accessed 12.27.13).
Taft, D., 2012. Business Analytics Market to Reach $50.7B by 2016 on Big Data Hype: IDC [WWW Document]. eWeek. URL http://www.eweek.com/c/a/Enterprise-Applications/Business-Analytic-Market-to-Reach-507B-by-2016-on-Big-Data-Hype-IDC-179369/ (accessed 5.6.13).
The Chartered Institute of Marketing, 2009. Marketing and the 7Ps [WWW Document]. URL http://www.cim.co.uk/files/7ps.pdf (accessed 12.27.13).
The Chartered Institute of Marketing, n.d. Types of Membership [WWW Document]. Chart. Inst. Mark. URL http://www.cim.co.uk/Membership/TypesOfMembership.aspx (accessed 12.15.13).
The Technology Strategy Board, 2013. Data exploration - creating new insight and value [WWW Document]. URL https://www.innovateuk.org/competition-display-page/-/asset_publisher/RqEt2AKmEBhi/content/data-exploration-creating-new-insight-and-value (accessed 12.31.13).
Toms, E.G., Freund, L., Kopak, R., Bartlett, J.C., 2003. The effect of task domain on search, in: Proceedings of the 2003 Conference of the Centre for Advanced Studies on Collaborative Research, pp. 303–312.
Tukey, J.W., 1977. Exploratory Data Analysis. Addison-Wesley Publishing Company.
Vesset, D., McDonough, B., Wardley, M., Schubmehl, D., 2012. Worldwide Business Analytics Software 2012–2016 Forecast and 2011 Vendor Shares (Market Analysis No. 235494). IDC.
Vessey, I., 1991. Cognitive Fit: A Theory-Based Analysis of the Graphs Versus Tables Literature. Decis. Sci. 22, 219–240.
Yacci, M., 1999. The Knowledge Warehouse: Reusing Knowledge Components. Perform. Improv. Q. 12, 132–140.