Olli Ylänne
Radio Cloud Evolution Towards Memory-Driven Computing
Metropolia University of Applied Sciences
Master of Engineering
Degree Programme
Thesis
4 December 2019
Abstract
Author: Olli Ylänne
Title: Radio Cloud Evolution Towards Memory-Driven Computing
Number of Pages: 92 pages + 3 appendices
Date: 4 December 2019
Degree: Master of Engineering
Degree Programme: Information Technology
Specialisation option: Networking and Services
Instructors:
Jari Karppinen, M.Sc. (Tech.)
Antti Koivumäki, Principal Lecturer
Ville Jääskeläinen, Head of Master's Programme in IT
This thesis was done for the telecommunications equipment company Nokia. The objective was to evaluate a new technology for a low latency data storage solution in a cloud environment. Mobile phone network cloudification aims to move mobile phone network computation into cloud data centres. Cloud based mobile phone network (Telco Cloud) products consist of Virtual Machines (VMs) running on data centre servers. The Nokia Telco Cloud solution contains a data storage which can be accessed from different VMs. This data storage is known as the Shared Data Layer (SDL). Currently SDL is accessed through a TCP/IP network. Certain 5G use cases require ultra-low latency, and data access over a TCP/IP network is challenging from that latency target point of view.

Memory-Driven Computing (MDC) is a new computer architecture. The MDC architecture contains a shared memory which can be directly accessed from different physical computing nodes. The objective of this thesis was to evaluate the possibility of implementing an SDL compatible, low latency data storage in the MDC architecture shared memory. Such a data storage could help to reduce data-access latency in Nokia Telco Cloud products.

The evaluation was conducted in the following way. SDL and MDC architecture compatible shared-memory data storage solutions were designed and implemented. The performance (access latency) and reliability of the implemented solutions were evaluated in an MDC architecture type environment (the Hewlett Packard Enterprise Superdome Flex product). The evaluation was performed using the Case Study research method. A set of real-world SDL use cases from the Nokia Cloud Base Station product were identified and simulated in the evaluation environment for both the implemented and the existing SDL data storage solutions. The simulation results were analysed, and conclusions were drawn.

An SDL compatible data storage was successfully implemented in the MDC architecture shared memory. It provided notably lower latency than the existing SDL implementation in the evaluation environment. However, the scope of the evaluated subject is broad, this thesis studied it from a limited viewpoint, and further study is recommended.
Keywords Telco Cloud, Memory-Driven Computing, Data Storage
Contents
List of Abbreviations
1 Introduction 1
1.1 Business Challenge 2
1.2 Research Objective 4
1.3 Related Research 5
1.4 Thesis Structure 5
2 Cloud Based Mobile Phone Network 7
2.1 Mobile Phone Networks 7
2.2 Mobile Phone Network Base Station 9
2.3 Cloud Computing 13
2.4 Telco Cloud 16
2.4.1 Shared Data Layer in Nokia Telco Cloud Solution 18
2.4.2 Cloud Base Station 21
3 Memory-Driven Computing 24
3.1 Computer Architecture Evolution 24
3.2 MDC Architecture Main Principles 32
3.3 Enabling Technologies 34
3.4 Software in MDC Architecture 35
3.5 MDC Architecture Influences in Superdome Flex Product 37
4 Solution Development 42
4.1 Research Methods 42
4.2 Research Design 43
4.3 Provided Artefacts 44
4.4 Evaluation Environment and Methods 49
4.4.1 Superdome Flex Based Evaluation Environment 49
4.4.2 Tools Used in Evaluation 52
4.4.3 Evaluation Methodology 53
5 Solution Evaluation 56
5.1 Implemented Software Solutions 56
5.2 Simulated SDL API Use Cases 62
5.2.1 Use Case 1: Stateless Applications 62
5.2.2 Use Case 2: Stateless Applications Which Store Stable State 64
5.2.3 Use Case 3: Non-Intrusive Analytics 66
5.3 Simulated SDL API Edge Cases 67
6 Evaluation Results 69
6.1 Result Data 69
6.2 Results from SDL API Use Case Simulations 71
6.2.1 Use Case 1 Results 72
6.2.2 Use Case 2 Results 77
6.2.3 Use Case 3 Results 79
6.3 Results from SDL API Edge Case Simulations 81
6.4 Summary of the Results 83
7 Summary and Conclusions 85
References
Appendices
Appendix 1. SDL API client simulator program input and output
Appendix 2. Intel Memory Latency Checker output from Superdome Flex
Appendix 3. Asynchronous SDL C++ API storage interfaces
List of Abbreviations
1G First generation mobile phone network technology.
5G Fifth generation mobile phone network technology.
API Application Programming Interface. Set of clearly defined methods for communication between various software components.
BBU Baseband Unit. BTS functional component handling baseband processing.
BLOB Binary Large Object. Collection of binary data stored as a single entity in a database.
BSC Base Station Controller. GSM (2G) mobile phone network element.
BTS Base Station. Mobile phone network element.
CBTS Cloud-based mobile phone network Base Station.
CN Core Network. One functional entity in a mobile phone network.
COTS Commercial-Off-The-Shelf.
CPRI Common Public Radio Interface.
CPU Central Processing Unit. Hardware component of a computer system. Handles program instruction execution.
CRAN Cloud based mobile phone Radio Access Network.
CU Centralized Unit. Part of the gNB.
DU Distributed Unit. Part of the gNB.
eNB Enhanced NodeB (eNodeB). LTE (4G) mobile phone network Base Station.
gNB 5G mobile phone network Base Station.
EPC Evolved Packet Core. CN technology in the LTE (4G) mobile phone network.
ETSI European Telecommunications Standards Institute.
FAM Fabric-Attached Memory.
FAME Fabric-Attached Memory Emulator.
GSM Global System for Mobile Communications. Second generation mobile phone network technology.
IaaS Infrastructure as a Service. Cloud service model.
IoT Internet of Things.
LAN Local Area Network.
LFS Librarian File System. Linux file system which is optimized for MDC environments.
LTE Long Term Evolution. Fourth generation mobile phone network technology (4G).
MDC Memory-Driven Computing.
MEC Mobile Edge Computing.
MNO Mobile Phone Network Operator.
NCIR Nokia AirFrame Cloud Infrastructure for Real-time applications. Nokia NFV solution.
NF Network Function. A functional building block in the network infrastructure.
NFV Network Function Virtualization.
NFVI Network Functions Virtualisation Infrastructure.
NodeB UMTS (3G) mobile phone network Base Station.
NoSQL Not Only SQL. Database type which uses a different data model than traditional relational databases.
NUMA Non-Uniform Memory Access. Memory access architecture in parallel computing.
NVM Non-Volatile Memory.
OBSAI Open Base Station Architecture Initiative.
PaaS Platform as a Service. Cloud service model.
PCM Phase Change Memory. One type of Non-Volatile Memory.
QPI Quick Path Interconnect. Processor interconnect developed by Intel. Replaced by UPI.
RAN Radio Access Network. One functional entity in a mobile phone network.
RCP Radio Cloud Platform. Guest operating system and PaaS layer for Telco Cloud implemented by Nokia.
RDBMS Relational Database Management System. Traditional database type.
RNC Radio Network Controller. 3G mobile phone network element. Has the same main responsibilities as the BSC in a 2G mobile phone network.
RRH Remote Radio Head. BTS functional component handling radio related functionalities.
RTT Round-Trip Time.
SaaS Software as a Service. Cloud service model.
SDF Superdome Flex. NUMA type server computer product from HPE. Has influences from the MDC architecture.
SDL Shared Data Layer. Data storage solution in Nokia Telco Cloud.
SDL API Interface for software components to access the data storage provided by SDL.
SMP Symmetric Multiprocessor. Multiprocessor system where the operating system and memory are shared between processors.
SoC System on Chip. Combines the required electronic circuits of various computer components onto a single, integrated chip.
SOC8 SUSE OpenStack Cloud 8. SUSE Linux and OpenStack based cloud computing platform.
UE User Equipment. Device communicating using a mobile phone network. One functional entity in a mobile phone network.
UMA Uniform Memory Access. Memory access architecture in parallel computing.
UMTS Universal Mobile Telecommunication System. Third generation mobile phone network technology (3G).
UPI Ultra Path Interconnect. Processor interconnect developed by Intel. Replaces QPI technology.
UTRAN Universal Terrestrial Radio Access Network. 3G network RAN.
VM Virtual Machine.
VNF Virtualized Network Function. A functional building block in the cloud-based network infrastructure.
VPN Virtual Private Network.
1 Introduction
Today, the two main development activities in mobile phone networks are cloud based
mobile phone network (Telco Cloud) and 5th generation (5G) network technology. Both
of these impose many targets for mobile phone network equipment vendors to fulfil.
Cloud computing is attractive due to, for example, its scalability and the easy deployment
of services. Mobile phone network cloudification is taking place rapidly at present, as the
benefits of the cloud provide many opportunities for telecom operators. Resource needs
can vary significantly inside a mobile phone network. For example, cells located in office
areas can be very heavily loaded during daytime hours and in much lower use during
the evening and night, while cells located in residential areas tend to be used at the
opposite times. Thus, it is very beneficial to possess the required capacity just in time via
the scalability of the cloud. Many new services are continuously introduced in modern mobile phone networks. The cloud offers the possibility to quickly deploy and test those services, and even to remove them quickly if they are not successful. The cloud also offers means to enhance mobile phone network operation and maintenance: network updating can be handled without downtime using rolling updates, and the cloud provides new ways to implement high availability support in a mobile phone network.
5G will deliver extreme broadband, ultra-robust, low latency connectivity, and massive networking to support many different use cases and business models. Use cases demanding low latency connectivity include, for example, autonomous driving and remote health care.
Nokia Networks is a multinational data networking and telecommunications equipment company headquartered in Espoo, Finland, and a wholly owned subsidiary of Nokia Corporation. This thesis work was done for Nokia Networks and is related to data-access latency optimization in a cloud based mobile phone network. The thesis evaluates a new kind of computer hardware which could be utilized in a Telco Cloud data centre.
1.1 Business Challenge
Traditional mobile phone networks are composed of individual functional blocks called Network Functions (NF). Each NF block has a certain function in a mobile phone network, and it contains the hardware and software to implement that function. In a cloud based mobile phone network this building block is called a Virtualized Network Function (VNF). Virtualization of Network Functions is done according to the general principles of hardware virtualization: in VNFs, the software defining the network function is separated from the hardware and generic software. [1, pp. 16-17]
The Nokia Telco Cloud solution supports 5G technology, and therefore Nokia Telco Cloud products need to fulfil the ultra-low latency requirements set by 5G. As one of the primary causes of latency is distance, cloud computing must take place closer to the source to stay within delay limits. This has led to dividing the Cloud based mobile phone Radio Access Network (CRAN) into two architectural elements: an edge cloud located closer to the source, serving latency critical applications, and a centralized cloud serving less performance critical applications. Edge cloud processing needs to be handled efficiently in order to meet the low latency requirements. [2, pp. 24-25]
The Nokia Telco Cloud solution has a separate layer for data storage, the Shared Data Layer (SDL). Currently the data storage functionality in SDL is implemented using conventional database solutions. When SDL is accessed from several different computing nodes in an edge cloud data centre, it is difficult to handle it efficiently enough from the latency requirements point of view. That is mainly because most existing database solutions transfer data via a local area network (LAN) when the database is shared between different computing nodes. There is a clear need to study other possibilities for implementing SDL in edge cloud data centres.
Hewlett Packard Enterprise (HPE) is currently developing a new kind of computer architecture in its research project called "The Machine". The new architecture is called the Memory-Driven Computing (MDC) architecture. The MDC architecture optimises data sharing in a computer cluster (such as a cloud computing data centre). The main idea is that the cluster contains an abundant shared non-volatile memory, which all computing nodes can directly access very quickly without needing to contact other nodes. In clusters based on the current computer architecture, each node has its own, comparatively small, memory and data is typically shared via a LAN. Figure 1. illustrates the difference between a traditional architecture-based computer cluster and an MDC architecture-based computer cluster.
Figure 1. System on Chip (SoC) based computer clusters using traditional computer architecture (left) and Memory-Driven architecture (right) [3]
Although the MDC architecture is originally being developed in an HPE research project, it is not HPE proprietary technology but an open architecture. A consortium specifies the standards and protocols for memory access in the MDC architecture. This consortium is called the Gen-Z Consortium, and it is made up of several IT industry companies. Most of the software that HPE has already developed for the MDC architecture is open source software.
Today, the growth in computing power described by Moore's Law (see Section 3.1) is coming to an end with the current computer architecture. The computing power of individual computers is no longer increasing as quickly as before because, due to physical limits, Central Processing Units (CPUs) are not developing as rapidly as they used to. MDC is one of the promising technologies that could keep computing power growing according to Moore's Law in the future, and it is therefore important to evaluate this new technology. The problem behind this thesis provides a well-fitting, real-world use case for evaluating the MDC architecture from one viewpoint.
1.2 Research Objective
The main business goal behind the objective was to ensure low latency data access in CRAN. This thesis work evaluated whether it is possible and feasible to create a low latency shared database between edge cloud data centre computing nodes using MDC architecture shared memory as the data storage technology behind the SDL Application Programming Interface (API). Figure 2. provides a high-level illustration of the current SDL deployment and the SDL deployment targeted in this thesis. Figure 2. is similar to Figure 1., but it shows the difference between MDC and traditional architecture based clusters from the SDL point of view. In the current deployment, SDL is a database deployed to some of the data centre computing nodes and accessed through the TCP/IP LAN. In the target deployment, SDL is implemented in MDC architecture shared memory, and thus all nodes can access it via a low latency memory bus. SDL clients access SDL via the SDL API, which is identical in both deployments, so the deployment change would not affect SDL clients. Having a low latency shared database in an edge cloud data centre would provide good possibilities to reduce the overall latency of the CRAN. The main evaluation criterion was performance, as the main goal was to achieve a low latency solution. However, the suitability of the Memory-Driven Computing architecture as an edge cloud data storage technology was also briefly evaluated from a reliability viewpoint.
Figure 2. Current SDL deployment (left) and thesis objective SDL deployment (right).
1.3 Related Research
MDC architecture is a rather new technology, and it has mainly been researched by HPE, which started the development of this new computer architecture. HPE has carried out experiments on MDC in some real-world use cases, such as the Spark big data framework [4], but not with use cases very similar to the one in this thesis. There are a few cases where MDC has been evaluated outside of HPE, for example at the German Center for Neurodegenerative Diseases and the University of Bonn [5]. The use cases in those evaluations have been completely different from the one in this thesis: they have mainly concentrated on the efficient processing of enormous amounts of data. In the usage considered in this thesis the amounts of data are relatively small, but the main objective is to access that data with very low latency.
The master's thesis research done by Tero Lindfors studied the MDC architecture from the CRAN performance point of view and is therefore the research most closely related to this thesis work [6]. The main differences between this thesis and Lindfors's thesis are the following:

• Lindfors's research evaluated the MDC architecture from the whole CRAN point of view. In this thesis, the evaluation was done from the access network (see Section 2.1) point of view, and further using a specific access network product, the Cloud Base Station (CBTS), and a specific use case in the CBTS (SDL API usage).

• This thesis used different methods to implement a data storage solution on the MDC architecture than Lindfors's thesis, and the evaluation was done using a different methodology (real-world use case simulations).

In summary, Lindfors's study was an initial evaluation of the MDC architecture's suitability for CRAN. It showed that the MDC architecture is a potential alternative for a high performance data storage solution in CRAN. This thesis continued evaluating the subject further, using a different, more detailed viewpoint and methods.
1.4 Thesis Structure
This thesis has 7 chapters. Chapter 1 provides a basic introduction to the main concepts as well as the business problem and the goal behind the research. Chapter 1 also describes the research objective and related research. Chapters 2 and 3 supply basic theoretical background information about the main concepts behind this thesis. Chapter 4 shows how the research was done: what kind of design it had, what research methods were used and what kind of data was collected. Chapter 4 also introduces how the solution was developed and evaluated. Chapter 5 illustrates the developed solution and its evaluation in detail. Chapter 6 discusses the solution evaluation results. Chapter 7 gives a summary of the evaluation process and draws conclusions from the evaluation results.
2 Cloud Based Mobile Phone Network
This chapter provides theoretical background information about cloud based mobile phone networks. The chapter begins with a brief introduction to mobile phone networks and to cloud computing, which provides a basis for the terms and concepts used in the later sections. The chapter then moves on to cover Telco Cloud and CRAN, and from there to the exact scope of this thesis work: the CBTS product, mobile edge computing and the SDL data storage solution.
2.1 Mobile Phone Networks
Mobile phone networks have enabled completely new possibilities for communication. The first mobile phone networks provided voice communication on the go. Since then, mobile phone networks have evolved rapidly and have enabled wireless data communication. At first the mobile data communication possibilities were rather limited, but they have since developed into broadband wireless data communication. Major updates in mobile phone network technology have created new mobile phone network generations. 1st generation (1G) mobile phone networks were introduced in the early 1980s, and the first 5th generation (5G) mobile phone networks are currently being delivered. Figure 3. below shows the timeline of existing mobile phone network generations, the different techniques used in them, and the main features they have provided.
Figure 3. Timeline of mobile phone network generations and their main features [7]
A mobile phone network can be divided into four main functional entities:
• User Equipment (UE). A device communicating using the mobile phone network, for example a mobile phone, a laptop equipped with a mobile broadband modem, or an Internet of Things (IoT) device with a mobile broadband modem. Also called by other names, for example Mobile Station.

• Access Network. Provides the wireless network to which the UE connects. It is usually referred to as the Radio Access Network (RAN), or with a more specific name depending on the mobile network generation, like the Universal Terrestrial Radio Access Network (UTRAN) in a 3G network. The RAN consists of several radio cells which are controlled by Base Stations.

• Core Network (CN). Connects different Access Network components together. Mobile phone network functions handled in the core network include, for example, mobility management and subscriber/user databases. The CN also provides a gateway to other, external networks.

• External Network. Other networks to which the mobile phone network is connected in order to provide the needed communication possibilities for UEs. These can be, for example, a fixed phone network or another mobile phone network.
Figure 4. below shows the above-mentioned four functional entities in a GSM (2G) mobile phone network.
Figure 4. GSM Mobile Phone Network Architecture [8, p. 7]
The same four main functional entities exist in the later mobile phone network generations as well, but the details, like the exact names and the components within the functional entities, differ. Out of those four entities, two are part of the actual mobile phone network implementation: the Radio Access Network and the Core Network. The same architectural division into two main parts can also be found in the later mobile phone network generations. [8]
2.2 Mobile Phone Network Base Station
The mobile phone network element to which this thesis work is most closely related is the CBTS. Therefore, the Base Station (BTS) mobile phone network element is studied here in more detail. We first look at the architecture of traditional (non-cloudified) BTSs and how they have evolved over time. This helps to better understand the CBTS, which is introduced later in this chapter.
BTS functionality can be divided into two main parts: radio functionality and baseband processing functionality. Two different deployment architectures have been used for Base Stations:

• Traditional architecture. In this architecture, the radio functionality and baseband processing functionality components are in close proximity to each other (within a few meters), and therefore all the BTS hardware is located within the BTS site. This deployment model was popular in 2G and 3G networks.

• Remote Radio Head (RRH) architecture. In this architecture, the radio functionality and baseband processing functionality components are distributed and can be rather far away from each other (up to 40 km). The advantage of this architecture is that the baseband processing components can be placed in a location which is easily accessible for maintenance; only the radio functionality components need to be located at the BTS site, which is typically laborious to reach (for example a rooftop or a pole). This deployment model was first introduced in the 3G network, and currently most existing BTSs use it. [9, pp. 407-408]
Figure 5. below demonstrates the two BTS deployment architectures described above:
Figure 5. BTS deployment architectures (traditional on left, RRH on right) [9]
The RRH architecture is the basis for the CBTS. In the RRH architecture the BTS is divided into two main components. The component handling radio functionality is known as the Remote Radio Head; it is also known as the Radio Unit (RU) or Radio Equipment (RE). The other component handles baseband processing and is called the Baseband Unit (BBU); it is also referred to as the Radio Equipment Controller (REC). In the remainder of this thesis the terms Remote Radio Head (RRH) and Baseband Unit (BBU) are used. An RRH architecture BTS has the following three interfaces:

• Fronthaul network. Provides the BTS internal interface between the RRH and the BBU. Standards used in this interface include the Common Public Radio Interface (CPRI) and the Open Base Station Architecture Initiative (OBSAI).

• Midhaul network. Provides the interface between the BBUs of different BTSs. The commonly used standard for this interface is X2.

• Backhaul network. Provides the interface between the BTS and the Core Network. The commonly used standard for this interface is S1.

Figure 6. below shows the above-mentioned BTS components and interfaces. eNB in the figure refers to the LTE (4G) mobile phone network BTS, which is known as the enhanced NodeB (eNodeB, eNB). [10]
Figure 6. RRH architecture BTS interfaces and standards used in those interfaces [10]
Base Station in Different Mobile Phone Network Generations
The GSM (2G) mobile phone network has a network element called the Base Station Controller (BSC). This element is part of the RAN, and its function is to control the BTSs. Several BTSs are connected to one BSC, and the BSC is connected to the Core Network. [8, p. 6] This architecture is visible in Figure 4. The UMTS (3G) mobile phone network has a similar architecture, but the network element handling functions similar to those of the 2G network BSC is called the Radio Network Controller (RNC). BTS network elements are called NodeBs in a 3G network. [8, p. 9] The LTE (4G) mobile phone network introduced a major architectural change: the LTE network no longer has a centralized RAN element controlling the BTSs (like the BSC in 2G and the RNC in 3G). Instead, all BTSs are connected directly to the Core Network, and the BTSs themselves handle the functionality which the BSC and RNC handled in previous mobile phone network generations. Figure 7. below shows the LTE network architecture. The main reason behind this architectural change was to remove a centralized component which was a single point of failure in previous mobile phone network architectures. [8, pp. 18-19]
Figure 7. LTE (4G) mobile phone network architecture [8, p. 19]
The term EPC in Figure 7. refers to the Evolved Packet Core, the CN technology in the LTE (4G) network. From Figure 7. it can also be seen that in the LTE network architecture BTSs are connected to each other using the X2 interface (midhaul network). Previous mobile phone network generations (2G and 3G) did not contain such interconnected BTSs. In LTE the connection between neighbouring BTSs is needed because there is no centralized RAN element controlling several BTSs; instead, all BTSs are directly connected to the CN using the S1 interface (backhaul network). [8, pp. 18-19]
The high-level architecture of the 5G mobile phone network is similar to the 4G architecture described above. The 5G mobile phone network base station is known as the gNB. One major difference between the 5G RAN architecture and previous generations is that in 5G the RAN BTS (gNB) is split into two parts, called the CU (Centralized Unit) and the DU (Distributed Unit). [11]
Due to the differences between RAN generations discussed above, 4G and 5G BTSs contain much more functionality than BTSs from previous generations. Thus, only 4G and 5G BTSs are cloudified. In the earlier generations, a centralized component (BSC or RNC) handles the majority of the processing, and those components have been cloudified as separate products. As this thesis is done from the CBTS product point of view, it mainly concentrates on use cases from 4G and 5G mobile phone networks.
2.3 Cloud Computing
Cloud computing is a model for providing computing resources (e.g., networks, servers, storage and applications) as services. The NIST definition of cloud computing defines the essential characteristics, service models and deployment models for cloud computing. [12]

The essential characteristics define how computing resources are provided in cloud computing. For example, resources need to be available on demand over broad network access. The computing resources (hardware and software) provided according to the essential characteristics of cloud computing are referred to as the cloud infrastructure. [12]
Service models define the type of capability which is provided to the cloud computing consumer. Three official service models exist:

Software as a Service (SaaS). The capability provided to the consumer is to use an application which is running on a cloud infrastructure. The consumer has no control over the cloud infrastructure.

Platform as a Service (PaaS). The capability provided to the consumer allows the consumer to deploy applications (either self-made or acquired) to the cloud infrastructure. The consumer can only deploy applications which are supported by the underlying cloud infrastructure's software development environment (operating system, programming languages, libraries, etc.). Therefore, in this model the consumer controls only the deployed applications in the cloud infrastructure, while all other parts are controlled by the cloud service provider.

Infrastructure as a Service (IaaS). The capability provided to the consumer consists solely of fundamental computing resources. The consumer can provision these resources and deploy arbitrary software on them (including operating systems). The consumer has no control over the cloud infrastructure hardware resources but has more control over the software resources than in the PaaS model. [12]

Together the three above-mentioned cloud service models form the cloud stack. For example, SaaS and PaaS software can be deployed and developed on top of an IaaS service. [13]
Deployment models define for whom the cloud infrastructure is provisioned, who owns and manages it, and where it is located. The most common deployment models include the following:

Public Cloud. The cloud infrastructure is provisioned for open use. It is owned and managed by a business, academic or government organization and located on the cloud provider's premises.

Private Cloud. The cloud infrastructure is provisioned for the exclusive use of a certain organization. It can be owned and managed by that organization or by a third party, and it can be located on or off the organization's premises.

A typical installation site for the cloud infrastructure is a cloud data centre [12]. A single hardware resource in a cloud data centre is known as a node; an individual server, for example, can be referred to as a computing node. Nodes in a cloud data centre can be connected to form a cluster, where several nodes operate together.
Virtualization
Cloud computing utilizes virtualization technology, as providing computing hardware resources according to the essential characteristics of cloud computing requires the use of virtualized resources. Hardware virtualization refers to the creation of one or several Virtual Machines (VMs) from a single physical hardware system. The physical machine is referred to as the host machine and the VM as the guest machine. Similar terms are used for software running in a virtualized environment: for example, an operating system on a physical machine is referred to as the host operating system and an operating system in a VM as the guest operating system. The software handling the creation of VMs is known as the hypervisor. [14]
Data Storage in Cloud
The volume of data produced has increased significantly over recent years. There are several reasons for this; for example, the number of data-producing devices has grown with the spread of mobile and IoT devices. Such large and rapidly accumulating data sets, and their processing, are referred to as Big Data.
Cloud computing is commonly utilized in Big Data processing. The mobile and IoT devices producing the data typically have limited resources for processing and storing it. A distributed cluster of computing nodes in a cloud computing data centre can provide enough processing power for such large amounts of data. Processing in a distributed environment has, however, introduced challenges to data storage. For example, traditional Relational Database Management System (RDBMS) databases have not been suitable in a cloud environment from the scalability point of view, and therefore alternative technologies, like NoSQL databases, have been studied and used. Cloud based data storage is currently being heavily researched. [15] The MDC architecture also aims to provide a data storage solution suitable for cloud and Big Data.
The CAP theorem discusses the limitations of distributed storage systems. The CAP
theorem states that a networked shared data storage can only have two of the following
three desired storage properties:
• Consistency
• Availability
• Partition tolerance
Consistency means that there must be a single up-to-date copy of the data. In practice,
any read must return the value of the most recent write.
Availability means that data can always be read from any non-failing node.
Partition tolerance means that the system functions normally even if network partitions
occur.
An example of the CAP theorem can be taken from a simple networked data storage having two storage nodes. If there is a network partition and the two storage nodes cannot communicate, it is possible that data is not consistent between the storage nodes. If a client in a computing node can access both storage nodes, it must be decided whether to return data immediately (and sacrifice consistency) or to return consistent data only when the storage nodes can communicate again (and sacrifice availability). After the introduction of the CAP theorem there has been a lot of research on the subject. The author of the CAP theorem, Eric Brewer, later noted that the properties discussed in the CAP theorem are not binary; for example, there can be several levels of consistency and different kinds of partitions. Therefore, it is not always mandatory to select “two out of three” but it is possible, for example, to have limited support for all three. [16]
2.4 Telco Cloud
The cloudification of the mobile phone network is taking place rapidly. Scalability in the cloud provides Mobile Phone Network Operators (MNO) with possibilities for more efficient hardware utilization. This will be essential in the future, as both the traffic and the number of connected devices are expected to grow enormously with the coming 5G network. Cloud-based mobile phone networks are built on top of Commercial-Off-The-Shelf (COTS) servers, which utilize standard IT hardware instead of the proprietary hardware commonly used in traditional mobile phone networks. In a traditional mobile phone network, equipment hardware and software are typically tied together. The Telco Cloud, on the other hand, is a cloud computing stack-based solution where the IaaS service is independent of the PaaS and SaaS services. This offers possibilities to quickly deploy new features via standard APIs. As was discussed in the previous section, the amount of data processed in the cloud is expected to increase heavily in the future. To process this data effectively, processing needs to take place as close to the source of data as possible. One potential solution is the concept of a cloud integrated network, described in Bell Labs' future vision. A cloud integrated network would blur the line between the cloud and the network. For example, the Telco Cloud could provide APIs which make it possible to process the data already in the Telco Cloud. If the network were used only to transfer data to an external cloud for processing, the network could become a bottleneck for the cloud computing processing capability (data access would limit the cloud processing capacity). A similar bottleneck is found in traditional computing, where memory bandwidth limits the CPU in the well-known von Neumann bottleneck (described in Section 3.1). [2, pp. 18-25]
Cloud Computing Service Stack in Telco Cloud
As discussed in the paragraph above, Telco Cloud requires all the services in the cloud computing stack: IaaS, PaaS and SaaS. As those services are independent, telco operators now have new flexibility to acquire different services from different vendors (in legacy telco equipment, hardware and software are typically tied together). Telco operators also have the possibility to buy some parts of the stack as a service instead of hosting them themselves. For example, a telco operator can buy an IaaS service from a cloud service provider instead of hosting its own datacentre [17]. The product evaluated in this thesis work belongs to the IaaS layer of the cloud computing stack, as the evaluated product is part of the cloud computing datacentre hardware. However, the software used as an evaluation use case (SDL) belongs to the PaaS layer.
Telco Cloud High Level Architecture
A cloud based mobile phone network contains the same two main functional blocks as the traditional mobile phone network: the radio access network and the core network. These are referred to as CRAN and Cloud CN. CRAN has much tighter latency requirements than Cloud CN.
Network Function Virtualization (NFV) is a high-level architecture for the cloud-based mobile phone network. The objective of NFV is to create a generic Network Functions Virtualisation Infrastructure (NFVI), which defines the architecture for the functional blocks (VNFs) in a virtualized mobile phone network. NFV and NFVI are specified by the European Telecommunications Standards Institute (ETSI). VNFs are typically virtualized equivalents of existing mobile phone network elements. For example, CBTS acts as one VNF in CRAN, and CBTS is a virtualized version of the BTS NF. VNFs typically consist of several VMs, which all provide some predefined functionality. VMs in a VNF are also known as VNF Components (VNFC). [1] For example, a CBTS consists of a pool of VMs in a cloud computing data centre. CBTS VMs are VNFCs in the CBTS VNF. Figure 8. illustrates this.
The IaaS level provides the needed computing resources and their management services for provisioning Telco Cloud VNFs. For example, CBTS VMs are deployed using IaaS level services. OpenStack is a free, open-source software platform which is typically used as IaaS software providing virtualized cloud resources [18]. Nokia provides an NFV solution called Nokia AirFrame Cloud Infrastructure for Real-time applications (NCIR), which is specifically targeted for the edge cloud datacentre use case (and therefore for CBTS). NCIR utilizes OpenStack services. [19]
2.4.1 Shared Data Layer in Nokia Telco Cloud Solution
The Nokia Telco Cloud solution has a separate layer for data storage, the SDL. This layer provides a data storage interface for Telco Cloud applications and hides the actual data storage implementation from them. SDL also enables data sharing between different VNFs, between different VMs in a VNF and furthermore between applications inside a VM. [20] Figure 8. provides an example of the points above. Firstly, CBTS applications access the SDL data storage via the SDL API, which hides the actual data storage technology from the applications. Secondly, SDL is accessed from different CBTS applications, from different CBTS VMs (VNFCs) and from different CBTSs (VNFs). Sharing data between different CBTSs in SDL can be used to reduce the amount of data transferred between CBTSs via the X2 interface (discussed in Section 2.2 and in Figure 6.).
Figure 8. SDL usage through SDL API. SDL is used at different levels
The actual data storage behind the SDL API can either reside in the same VM(s) as the applications which use SDL, or it can reside in separate dedicated VM(s). That depends on how SDL is deployed and is not visible in the SDL API. To highlight this, the actual data storage in Figure 8. is not drawn into any VM(s), although in reality it is deployed in some VM(s). [21]
By storing data in SDL, data is separated from the logic. This enables stateless applications, as in stateless applications all long-lasting state must be external to the application [22, p. 40]. Similarly, SDL also enables stateless VMs and VNFs. Stateless applications and stateless VMs furthermore enable high availability, scaling and easy deployment, as new applications/VMs can be deployed and can immediately start to work with the data stored in SDL.
At present, SDL is implemented using conventional NoSQL database solutions (for example Redis). The performance of these is sufficient when SDL is used inside a single VM. But when SDL is used between different CBTS VMs, using the existing database solutions is challenging from the CBTS latency requirements point of view.
SDL API
The SDL API provides a simple C++ API for accessing the shared data storage. One of the main purposes of the SDL API is to hide the actual data storage technology. The data storage technology behind the API is not visible in the API at all, and this makes it possible to change the data storage technology behind the SDL API without affecting SDL API clients. The SDL API is a key-value storage API. Therefore, data in SDL is accessed using key-value pairs, where the key is an identifier for the data and the value is the data itself. The key is a string type identifier and the value is a byte array (more specifically a C++ vector) data structure. The SDL API does not care about the data contents; the format of the data inside the byte array can be anything (e.g. serialized JSON). The SDL client decides the data format, and the data is meaningful only to the SDL client. The methods which the SDL API provides are simple data manipulation methods. There are, for example, methods for storing, reading and deleting data.
One essential concept in the SDL API is the namespace. When SDL clients access the shared data storage using the SDL API, they provide a namespace identifier, which is a string value. Namespaces provide data isolation between different clients. Data in a certain namespace is isolated by the SDL API so that only the client(s) which use the given namespace identifier can access that data. Client(s) using some other namespace identifier cannot see that data. Figure 9. below provides an example of this. There are two clients which access the data storage via the SDL API, SDL client 1 and SDL client 2. SDL client 1 uses the “Namespace A” string as its namespace identifier whereas SDL client 2 uses “Namespace B”. As can be seen from the figure, both clients have isolated data allocations in the backend data storage, and those two SDL API clients cannot see each other's data.
Figure 9. SDL API namespace concept example
Different SDL API clients can also see each other's data if so desired. This is achieved simply by using the same namespace identifier in multiple SDL clients. If in the Figure 9. example above both clients used “Namespace A” as the namespace identifier, the two clients could see each other's data and thus have shared data.
The SDL API implementation is a shared C++ library. The SDL API provides both synchronous and asynchronous versions of its methods. Synchronous methods block the SDL client's execution until the given SDL operation is completed. Asynchronous methods, on the other hand, return immediately and therefore allow the SDL client to execute other tasks while the SDL operation is still ongoing. SDL clients need to provide a callback function when using asynchronous methods. When the asynchronous SDL operation completes, the SDL API invokes the given SDL client callback function.
Below are some code level (C++) examples of the SDL API. First, some essential data
type definitions:
using Key = std::string;
using Data = std::vector<uint8_t>;
using DataMap = std::map<Key, Data>;
Then an example of the synchronous set (data write) method:
virtual void set(const DataMap& dataMap) = 0;
The namespace is not visible in this method, as it was given earlier when the SDL API instance (C++ object) was created by the SDL client.
And an example of the asynchronous version of the same set method:
virtual void setAsync(const DataMap& dataMap, const ModifyAck& modifyAck) = 0;
As discussed earlier, the client needs to provide a callback function for asynchronous methods. In the example above, the modifyAck parameter is the callback function the client needs to provide. Its datatype (ModifyAck) looks like this:
using ModifyAck = std::function<void(const std::error_code& error)>;
As seen from the example above, the SDL API provides an error code when it invokes the callback function. The error code describes the status of the request. The synchronous version did not return an error code because it throws exceptions in error situations (instead of returning error codes).
2.4.2 Cloud Base Station
CBTS is an evolutionary step from the RRH BTS architecture described in Section 2.2. In CBTS, virtualized BBUs are centralized into a cloud data centre to form a BBU pool/hotel, and only RRHs are located at the BTS site. Figure 10. below shows an example of this in an LTE mobile phone network:
Figure 10. Cloud BTS in an LTE mobile phone network [9]
The backhaul network connects the BBU Pool with the (Cloud) CN. At a BTS site, RRHs are co-located with the antennas. RRHs are connected to the BBU Pool through low latency, high bandwidth optical transport links. The BBU pool contains the BBUs of multiple BTSs, which enables efficient resource sharing between heavily and lightly utilized BTSs and enables cost savings via more optimal resource usage. [9]
Complete baseband processing in BBU pools results in low latency and high data rate requirements for the fronthaul network. This increases the costs and energy requirements of CBTS and therefore somewhat offsets its benefits. This has led to studies on which BTS functionalities to move into centralized cloud data centres. [23] In the first CBTS deployments some part of the baseband processing (the most latency critical functionalities) is still located near the BTS site, and only part of the baseband processing is moved to the BBU pool. In the 5G mobile phone network the BTS is split into two parts, the Centralized Unit (CU) and the Distributed Unit (DU). This split helps the BTS cloudification efforts [11]. The target in the long run, however, is the architecture described in the beginning of this section, where only RRHs are located at the BTS site.
Mobile Edge Computing and CRAN
Some new use cases for mobile phone networks, like autonomous driving, require extremely low latency. One of the primary causes of latency is the data transfer distance. For example, based on the speed of light, computing should take place within a proximity of ~100 km to be able to support a latency of 1 ms round-trip time (RTT); optical data transfer is assumed in this example [2, p. 22]. Mobile Edge Computing (MEC) is a technology which aims to solve this challenge. In MEC, computing and storage resources are brought to the edge of the mobile phone network, within close proximity to UEs. Therefore, MEC is based on a highly distributed computing environment. [24] This may sound contradictory to the CBTS description in the previous section, as CBTS aims to centralize computing resources. The key here is the creation of distributed edge cloud data centres which, in addition to BTS baseband processing functionalities, also contain some of the CN functionalities. Such edge cloud data centres make CRAN more centralized but Cloud CN more distributed, and in this way the total end-to-end latency is reduced. [25] Edge cloud data centres require extremely efficient hardware in order to fulfil the lowest latency requirements of modern mobile phone networks.
3 Memory-Driven Computing
This chapter provides theoretical background information about the MDC architecture. The chapter first describes the traditional computer architecture and how it has evolved; this helps to better understand what challenges the MDC architecture aims to solve. After that, the MDC architecture itself is illustrated. The last part of this chapter contains some examples of MDC architecture usage.
3.1 Computer Architecture Evolution
von Neumann Computer Architecture
Most modern computers are based on an architecture known as the von Neumann architecture, described by the mathematician and physicist John von Neumann in 1945. In this architecture a computer contains a separate CPU, which handles instruction execution, and memory, which stores the instructions and data. Alan Turing had presented the idea of such a concept in 1936, and it can be seen that von Neumann utilized Turing's studies in his work. Both aimed at creating a universal computer, capable of handling any computable task. Earlier computers were generally capable of handling only certain pre-defined tasks which were mechanically built into the machines. Changing them to do something else required notable effort, such as mechanical rewiring. Storing computer programs into memory was seen by both Turing and von Neumann as a means to create a universal computer. The concept behind the von Neumann architecture is also known as the stored program computer. [26, pp. 181-192]
von Neumann Bottleneck
In the von Neumann architecture the CPU needs to fetch program instructions and data from memory via a bus which is commonly known as the memory bus. In this configuration it is unavoidable that the CPU needs to wait for the data to be transferred from memory via the memory bus. This creates latency which is known as the von Neumann bottleneck. The severity of this bottleneck depends on memory latency, memory bus throughput and CPU speed. In the early stored program computers this bottleneck was not severe, as the CPUs were rather slow and memory amounts were quite limited. But CPUs have developed fast and memory amounts have increased rapidly, whereas memory bus throughput and memory performance, on the other hand, have not developed as fast [27]. This has caused the von Neumann bottleneck to become more severe and to limit the overall increase in computing performance.
Moore’s Law
CPUs have developed greatly since the first von Neumann architecture computers. This development has been enabled by CPU components getting smaller, which has made it possible to fit more components into a CPU. In 1965 Gordon Moore noted in his paper that the number of components per integrated circuit had doubled each year. In 1975 he revised this to the number of components doubling every two years. This observation is known as “Moore's law”. [28]
CPU Cache
To prevent the von Neumann bottleneck from limiting the benefits of more powerful CPUs, cache memory has been included in CPU units close to the CPU core. This cache removes the need to constantly transfer data between the CPU and the main memory. Instead, frequently needed data is transferred once from memory to the CPU cache and used from the CPU cache several times. The size of the CPU cache memory is rather small compared to the main memory size, thus data in the CPU cache needs to be constantly replaced. To handle this data replacement efficiently, sophisticated algorithms have been developed. How well these algorithms work has a notable effect on CPU performance. Individual programs can also be designed and implemented so that they work efficiently from the CPU cache point of view. However, such optimizations are typically done only for the most performance critical programs. [29, pp. 344-345]
Hard Disk
Manufacturing very large main memories has proven to be costly. Main memories have also typically been volatile, meaning that data is lost when the memory loses power. Due to these limitations, a slower but non-volatile storage has been added alongside the main memory. This storage is known as the hard disk. Persistent data and data which does not fit in the main memory are typically stored on the hard disk. As the hard disk is a notably slower storage than the main memory, the interconnects from the CPU to the hard disk are also typically notably slower than the CPU main memory interconnects. Therefore, data stored on a hard disk is typically notably slower to access than data stored in the main memory, and programs have been designed accordingly. As with CPU caches, programs aim to keep constantly needed data in main memory.
Parallel Processing
Several processing units can work together to achieve greater throughput than is possible using only one processing unit. There are three main targets for parallel processing systems: high throughput, low latency and good scalability. Scalability means that adding more processing units effectively increases the performance of the whole system. [29, p. 350] There are numerous ways to provide parallel processing. Next, some of the most common high-level parallel processing architectures are introduced.
Multicore CPUs
Parallel processing is common nowadays, as most CPUs today are multicore CPUs. Multicore CPUs have become popular because it has become more and more challenging to increase the performance of single core CPUs [30, p. 1]. Figure 11. illustrates this: it has become difficult to increase CPU frequencies any further, and at that point the number of cores per CPU has started to increase.
Figure 11. Microprocessor development [3]
Typically, multicore CPUs have several levels of cache memory, some being private to each core and some being shared between cores. The memory bus between a multicore CPU and main memory is typically shared between all cores. A shared memory bus makes the von Neumann bottleneck even worse, as the bottleneck resource is shared by several consumers. [31] Having separate CPU caches per core again offsets the von Neumann bottleneck, but it brings a new challenge: cache coherence. Memory coherence means that the value returned by a read is the same as the value written by the most recent write. Therefore, if one core updates (writes) some data in the main memory which is also stored in some other core's CPU cache, the value needs to be updated in the other core's cache as well to keep that memory location coherent. Several different cache coherency schemes have been implemented. Cache coherency schemes can be hardware or software based. Multicore CPUs typically handle cache coherency in hardware, and therefore it does not need to be considered in programs running on multicore CPUs. [32, pp. 4-5]
Programming multicore CPUs has proven to be challenging. To get the performance benefits of a multicore CPU, a program needs to be able to divide its computing tasks efficiently between all cores. These different computing tasks (threads) of a single program usually need to communicate and synchronize their work, which causes overhead and creates possibilities for new kinds of error situations. [30, pp. 1-2]
Typically, the above-mentioned programming challenges get more difficult as the number of threads in a single program increases. The shared main memory bottleneck also becomes more severe as the number of cores increases. Similarly, maintaining cache coherence becomes more complicated as the number of cores (and core specific caches) increases. Due to these challenges there have been some doubts about how long CPU performance can be increased by adding more cores. Figure 11. shows that in recent years the number of cores per CPU has not increased as rapidly as before. [33]
Parallel Processing Using Multiple CPUs
In addition to multicore CPUs, parallel processing can also be done using multiple CPUs (which can themselves be multicore CPUs). Parallel processing using multiple CPUs has existed for a long time (since before multicore CPUs), as such systems provide more computing performance for tasks requiring significant processing capacity. Supercomputers based on multiple CPUs provided significant processing capacity even before multicore CPUs. Systems having multiple CPUs can be divided into two main classes: multiprocessor and multicomputer systems. Multiprocessor systems have several CPUs inside one computer which runs a single operating system instance. A multicomputer system contains several computers which all have their own operating system. These computers are connected using some interconnection method, for example a LAN, and they can operate together on the same task.
Memory Access Architectures in Multiple CPU Systems
There are two main memory access types for multiple CPU systems: Shared Memory Architecture and Distributed Memory Architecture. In the Shared Memory Architecture all the CPUs in the system share the same memory, while in the Distributed Memory Architecture each CPU has its own private memory which other CPUs cannot access. Multiprocessor systems typically use the Shared Memory Architecture and multicomputer systems typically use the Distributed Memory Architecture. The Shared Memory Architecture can be further divided into two main groups: Uniform Memory Access (UMA) and Non-Uniform Memory Access (NUMA). [34]
In the UMA architecture, the access time to any memory location is independent of which processor makes the request. UMA architecture-based systems are also known as Symmetric Multiprocessor (SMP) machines. In typical UMA systems memory access is done over the same shared memory bus. This can lead to problems similar to those described for multicore CPUs; for example, the shared memory bus becomes a bottleneck when the number of CPUs increases. In fact, multicore CPUs are also typically based on the UMA architecture. As the problems in the UMA architecture become worse when the number of CPUs increases, the UMA architecture does not scale very well to numerous CPUs. Figure 12. below illustrates the typical layout in the UMA architecture. Two CPUs are connected to the memory controller with a bus connection and both CPUs access memory through the shared bus. [34]
Figure 12. Example of UMA architecture [34]
In the NUMA architecture each CPU has its own memory (known as local memory), but it is not private memory (NUMA is a form of Shared Memory Architecture), and other CPUs can also access it (to those other CPUs it is known as remote memory). One CPU and the memory attached to it form one NUMA node in a NUMA system. In the NUMA architecture accessing local memory is faster than accessing remote memory; therefore memory access is non-uniform. Access to remote memories in NUMA systems is implemented using a scalable interconnect system between CPUs. Several different interconnect systems are available for NUMA systems. Figure 13. below shows a practical example of a NUMA system which utilizes Intel QuickPath Interconnect (QPI) technology. The CPUs are connected directly to memory (the memory controllers are embedded in the CPUs) and the CPUs are connected to each other using the QPI interconnect. A CPU can access another CPU's remote memory via the QPI interconnect and the memory controller in the other CPU. [34]
Figure 13. Example of NUMA architecture using Intel Quick Path interconnect technology [34]
NUMA provides better scalability than UMA. CPUs have their own directly accessible local memory and the interconnect topology is scalable. Therefore, NUMA does not suffer from similar shared bus contention problems as UMA. Both UMA and NUMA require careful programming, as the memory is shared between all programs and threads. Simultaneous accesses to the same memory locations need to be done in a controlled manner. To get the full benefits of a NUMA system, programs need to use local and remote memories efficiently. This adds more complexity to the programs. [34]
Both UMA and NUMA can be cache coherent (cache coherency was explained in the context of multicore CPUs) or not. Cache coherent UMA and cache coherent NUMA are referred to as CC-UMA and CC-NUMA, respectively. As with multicore CPUs, maintaining cache coherency becomes more complicated as the number of CPUs increases. If the system does not support cache coherency, it needs to be handled explicitly in programs, which makes programming such systems more complicated. [34]
In the Distributed Memory Architecture all CPUs have their own private memory, which other CPUs cannot directly access. Thus, the Distributed Memory Architecture typically consists of individual computing nodes (each having its own CPU and memory) connected together. Figure 14. below illustrates such interconnected nodes. [34]
Figure 14. Distributed Memory Architecture [34]
As memory access in the Distributed Memory Architecture is private within computing nodes, cache coherency is required only within a computing node and thus does not get more complicated as the number of computing nodes increases. Programming can be simpler, as there is no need to specifically handle access to shared memory. However, in the Distributed Memory Architecture computing nodes working together must transfer data between each other using an interconnect (which typically is a LAN). Interconnects in the Distributed Memory Architecture are notably slower than shared memory interconnects in NUMA. This can add latency to data processing operations. [34] Attempts have been made to avoid the latency problem in the Distributed Memory Architecture by reducing the amount of data transferred between the computing nodes. For example, the MapReduce programming model aims to maximize data processing in the local nodes by transferring the processing operations instead of the data. [35]
At a very high level the Distributed Memory Architecture somewhat resembles NUMA, as the CPUs/computing nodes can access data on other nodes via an interconnect and the CPU/memory controller in the other node. Therefore, access to a certain memory location can get congested in the memory controller if several computing nodes need to access the same data concurrently; such congestion is, however, less likely than in UMA/NUMA. NUMA and the Distributed Memory Architecture differ at a more detailed level: in the Distributed Memory Architecture another node's memory cannot be accessed directly (via memory read/write operations), and the interconnects are slower. Computing nodes in the Distributed Memory Architecture are also more loosely coupled than in NUMA; for example, they each have their own operating system instance (and the operating systems can differ).
Cloud data centres are typically based on Distributed Memory Architecture as explained
in Section 2.3.
3.2 MDC Architecture Main Principles
The novel idea in the MDC architecture is an abundant shared memory which all individual computing nodes can access directly, without the need to contact other CPUs (as in NUMA). The access route is shown in Figure 16. As discussed in the previous section, NUMA and the Distributed Memory Architecture resemble each other at a very high level. The MDC architecture, in turn, contains elements from both the NUMA architecture and the Distributed Memory Architecture, and it therefore brings them even closer to each other. The memory is shared directly between computing nodes as in NUMA, but the computing nodes are individual, having their own memories, CPUs and operating systems as in the Distributed Memory Architecture. A high-level illustration of this setup was shown in Figure 1. In addition to being shared between computing nodes, memory in the MDC architecture is also non-volatile and abundant. Therefore, the MDC architecture no longer needs hard disks.
Figure 15. below illustrates the high-level architecture of one computing node in an MDC architecture-based computer cluster. It can be seen from the figure that the nodes still contain traditional volatile DRAM directly attached to the CPU. This private local memory is faster to access than the Non-Volatile Memory (NVM) which is shared between all nodes. It can also be seen from the figure that, in addition to private DRAM memory, the node contains NVM which is connected to a fabric switch. The fabric switch provides a novel memory interconnect (described in Section 3.3) which makes it possible to access the NVM in a certain node without contacting the CPU in that node. Each computing node in an MDC system contains such NVM, which can be accessed from every computing node. Accessing another computing node's NVM is illustrated in Figure 16. The combined NVM of all computing nodes forms the abundant NVM memory pool of the MDC system. This shared pool of NVM memory is also known as Fabric-Attached Memory (FAM). FAM consisting of several computing nodes' NVM is also illustrated in Figure 16. [36]
Figure 15. Single MDC computing node architecture [36]
Figure 16. Accessing shared NVM location which is physically in a different computing node [36]
Currently, the MDC architecture does not provide cache coherency between different computing nodes. However, cache coherency does exist inside each computing node. The lack of cache coherency between the computing nodes means that software needs to handle cache coherency explicitly when accessing the memory which is shared between all nodes.
3.3 Enabling Technologies
The two main technologies utilized in the MDC architecture are NVM and a new optical memory interconnect technology. This section briefly describes those technologies and their usage in MDC.
Non-Volatile Memory
NVM aims to provide a new storage technology which is persistent like a hard disk but which resembles main memory (such as DRAM) in its other properties. These properties include byte addressability, high density, and DRAM-like latency and throughput. [37]
There are several different types of NVM, such as Phase Change Memory (PCM), 3D XPoint and the memristor. [37] The memristor NVM type is used in the Machine research project. However, as memristor technology is still in the development phase, the current MDC prototype does not use memristor-type memories but volatile DRAM that is guaranteed a continuous power supply. [3]
As discussed in Section 3.2, having a sufficient amount of NVM in the system removes the need for hard disks. However, this does not mean that NVM based systems would not have any volatile memory. CPUs contain cache memory (Section 3.1), and at least this memory is still volatile and thus lost when the CPU loses power. Therefore, even in systems having only NVM type main memory, programs need to ensure that data residing only in the volatile CPU cache is moved to persistent NVM when data persistence must be assured. Programs have the technical means to handle this: they can flush the data in the CPU cache back to main memory (NVM in this case). This is not a conceptually new thing either. Main memories in traditional computer systems have typically been volatile, and programs are thus designed to store data to persistent storage (typically a hard disk) when data persistence is needed. From the existing programs' point of view, a transition to an NVM environment can therefore be as simple as changing disk write operations to CPU cache flush operations. This would at least ensure that the needed data persistence is still provided. However, to achieve more notable performance benefits from the NVM environment, a more significant redesign of existing programs is likely required. When such a redesign is done, the handling of the volatile CPU caches might also need to be reconsidered. [38]
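The cache flush step described above can be sketched as follows. This is a minimal illustration, assuming an x86 CPU and a 64-byte cache line; the function name is an illustrative assumption, and on other architectures the sketch compiles to a no-op:

```cpp
#include <cstddef>
#include <cstdint>
#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#endif

// Flush the cache lines covering [addr, addr + len) back to main memory
// (which would be NVM in an MDC system). On x86 this uses the CLFLUSH
// instruction followed by a store fence; elsewhere it is a no-op.
void flush_to_memory(const void* addr, std::size_t len) {
    constexpr std::size_t kLine = 64;  // assumed cache-line size
    std::uintptr_t p = reinterpret_cast<std::uintptr_t>(addr) & ~(kLine - 1);
    std::uintptr_t end = reinterpret_cast<std::uintptr_t>(addr) + len;
#if defined(__x86_64__) || defined(__i386__)
    for (; p < end; p += kLine)
        _mm_clflush(reinterpret_cast<const void*>(p));
    _mm_sfence();  // order the flushes before subsequent stores
#else
    (void)p; (void)end;
#endif
}
```

A program targeting NVM would call such a routine where it previously issued a disk write, ensuring the data has left the volatile cache.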
Gen-Z Memory Interconnect Technology
The target in MDC is to have near-uniform memory access within the whole system. That is, accessing memory anywhere in the FAM should be possible with near-uniform latency. As seen in Figure 16., accessing memory located in a different node involves a longer physical distance and traversal of multiple bridges; the memory access technology therefore needs to be very efficient to achieve near-uniform latency. The shared memory pool access technology in MDC is based on silicon photonics data transmission, which provides low latency data transmission with vast capacity. [3]
The interconnect technology in MDC is based on the Gen-Z standard. Gen-Z is an open standard developed by the Gen-Z consortium, which is formed by several technology companies. [3] Gen-Z aims to create a hardware agnostic interconnect technology suitable for several use cases, such as a memory bus, a peripheral bus or local area networking. The currently released Gen-Z specification mainly addresses the processor-memory interconnect. Memory access via the interconnect begins from the CPU which is processing the memory access request. [39] An example of this is seen in Figure 16., where the bridge is a Gen-Z based, memory type agnostic memory controller. The new Gen-Z based interconnect technology aims to solve challenges seen in previous interconnect technologies, such as memory controller congestion and memory access latency, which were discussed in Section 3.1. The topology of the Gen-Z interconnect network needs to be designed efficiently to realize the full benefits of the new interconnect in MDC systems which contain numerous computing nodes accessing the shared NVM [3].
3.4 Software in MDC Architecture
The current MDC architecture prototype is based on the Linux operating system. Certain modifications have been made to the Linux kernel to optimize its operation in the MDC architecture environment. [40] One of the most significant modifications is the addition of a new file system, the Librarian File System (LFS), which is optimized for the MDC architecture. Even though the MDC architecture removes the need for hard disks, a file system is still provided for accessing the shared NVM. The file system offers a familiar API for applications to access the data in the NVM, and it is thus beneficial to have. LFS also provides an optimized version of the mmap Linux system call. [41]
Virtual Memory and Paging
To better understand the optimized mmap system call in LFS, the concepts of virtual memory and paging are briefly explained first. Virtual memory is utilized in most modern operating systems. When using virtual memory, processes use virtual memory addresses, and the operating system maps them into physical memory addresses. Virtual memory provides several benefits; for example, it frees processes from advanced memory management duties, as the operating system can handle those. Virtual memory also allows processes to use a larger memory address space than the system main memory provides. The larger virtual memory address space is achieved via paging. The paging technique divides the virtual address space into fixed-size blocks called pages. In addition to main memory, pages can also reside in secondary storage, which allows the virtual address space to exceed the physical address space of the main memory. Secondary storage can be, for example, a hard disk. The operating system can move a process's virtual memory pages between the different storages. The mmap system call in Linux allows a process to map a device or file into the process's virtual memory address space. When mmap is called for a file, pages of that file are added to the calling process's virtual address space. After that, the process can access the file using memory operations instead of file operations, which typically gives performance benefits, as plain memory accesses are notably faster than file operations (which typically are system calls). [42]
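The pattern of mapping a file and then accessing it with memory operations can be sketched as follows. This is a minimal Linux sketch; the file path and function name are illustrative assumptions:

```cpp
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Map a regular file into the process's virtual address space and then
// access its contents with plain memory operations instead of
// read()/write() system calls.
bool mmap_file_demo() {
    const char* path = "/tmp/mmap_demo.bin";      // illustrative path
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return false;
    if (ftruncate(fd, 4096) != 0) return false;   // give the file one page

    void* p = mmap(nullptr, 4096, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) return false;
    char* mem = static_cast<char*>(p);

    std::strcpy(mem, "hello");                    // a plain memory write
    bool ok = (std::strcmp(mem, "hello") == 0);   // ...and a memory read

    munmap(mem, 4096);
    close(fd);
    unlink(path);
    return ok;
}
```

After the mmap call, reading and writing the file no longer requires a system call per access, which is the source of the performance benefit described above.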
One implementation of the shared memory concept in Linux based systems is the /dev/shm filesystem. /dev/shm always resides in virtual memory and never in persistent storage; that is, files located in /dev/shm are never moved to secondary storage via paging. Regular files added to a process's virtual address space via the mmap call can be moved to secondary storage whenever the operating system decides to do so (for example, due to memory limitations). The filesystem under /dev/shm has a predefined, limited space. If that space is fully utilized, no more data can be added under /dev/shm, but all the existing data is kept in memory. The typical way to write programs which utilize only shared memory in systems having the /dev/shm filesystem is to first open or create a file in /dev/shm and subsequently call mmap for the opened file.
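The open-then-mmap pattern just described can be sketched as follows, with two processes sharing the mapping. This is a minimal Linux sketch; the file name and function name are illustrative assumptions:

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

// Create a file under /dev/shm, map it with MAP_SHARED, and share the
// mapping between two processes (a forked child writes, the parent reads).
bool shm_demo() {
    const char* path = "/dev/shm/sdl_demo";        // illustrative name
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return false;
    if (ftruncate(fd, sizeof(int)) != 0) return false;

    void* p = mmap(nullptr, sizeof(int), PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) return false;
    int* shared = static_cast<int*>(p);

    pid_t pid = fork();
    if (pid == 0) {            // child: write through the shared mapping
        *shared = 1234;
        _exit(0);
    }
    waitpid(pid, nullptr, 0);  // parent: wait, then read the child's write
    bool ok = (*shared == 1234);

    munmap(shared, sizeof(int));
    close(fd);
    unlink(path);
    return ok;
}
```

Because the mapping is MAP_SHARED and backed by /dev/shm, the child's write is visible to the parent purely through memory, with no network or disk involved.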
LFS allows the mmap system call to map the MDC architecture FAM into the process's virtual address space, after which the process can directly access FAM via memory operations, which is the most efficient way to access FAM. [41]
In addition to the Linux operating system modifications, HPE has also developed several other software components to support MDC architecture usage, for example the Foedus database (discussed further in Section 4.3), the Managed Data Structures programming model and the FAM emulator (FAME). [43]
3.5 MDC Architecture Influences in Superdome Flex Product
HPE has recently introduced the MDC architecture influenced Superdome Flex (SDF) product. Superdome Flex does not contain all the elements which the MDC architecture defines; however, it is designed using MDC principles. The Superdome Flex product is already available, while fully MDC architecture compliant products are still in the prototype phase. Therefore, Superdome Flex provides limited possibilities to experiment with MDC technology already today. [44]
Superdome Flex is a NUMA (described in Section 3.1) based system. The Superdome
Flex system consists of one to eight chassis. Architecture of a single Superdome Flex
chassis is illustrated in Figure 17. below:
Figure 17. Architecture of a single Superdome Flex chassis [45]
As is seen in Figure 17., each Superdome Flex chassis contains four CPUs, and each CPU has 12 DDR4 memory modules attached to it. Therefore, each chassis contains four NUMA nodes. The CPUs in the same chassis are connected using the Intel Ultra Path Interconnect (UPI) processor interconnect. UPI technology replaces the QPI technology which was illustrated in Figure 13. Each Superdome Flex chassis also contains two HPE Superdome Flex ASICs, and each of the four CPUs is connected to one of them using additional UPI connections. The HPE Superdome Flex ASICs provide the ability to connect several chassis together. It can be seen in Figure 17. that each Superdome Flex ASIC provides 16 grid ports to the chassis. These grid ports are used to connect Superdome Flex chassis using the HPE Superdome Flex Grid interconnect, which is based on copper cables. Figure 18. below illustrates the Superdome Flex Grid connections between chassis in an eight-chassis Superdome Flex system:
Figure 18. Eight Superdome Flex chassis connected using Superdome Flex Grid interconnect [45]
There are three different memory access types in Superdome Flex, each having a different memory access latency (as is typical in NUMA systems). Accessing memory connected to the same CPU is called local memory access, and it has the lowest latency. Accessing memory attached to another CPU within the same chassis (via UPI) is called direct memory access, and it has the second lowest latency. Accessing memory in another chassis (via UPI and the Superdome Flex Grid) is called indirect memory access, and it has the highest latency. Superdome Flex is a CC-NUMA system, that is, it has hardware-based cache coherency within the whole system. Even indirect memory accesses via the Superdome Flex Grid are cache coherent. [45]
Superdome Flex contains some elements from the MDC architecture. The MDC architecture-based feature in Superdome Flex is the possibility to partition the chassis of the system into individual computing nodes, each of which runs its own operating system (unlike a traditional NUMA system, which always runs only one operating system). Figure 19. below illustrates these two different ways to use Superdome Flex:
Figure 19. Eight chassis Superdome Flex as traditional NUMA system (left) and as partitioned system (right) [44]
However, Superdome Flex is not fully implemented according to the MDC architecture, even when it is used as a partitioned system. Three clear main differences can be found.
The first main difference is in the memory access details. Memory access in Superdome Flex happens in a CC-NUMA style manner; that is, access to remote memory (direct and indirect) needs to take place via the CPU which controls that memory, and the memory access latencies are therefore not uniform (as described earlier in this chapter). In the MDC architecture, all nodes can access any part of the shared memory with near-uniform latency, without the need to contact other CPUs (as illustrated in Figure 16.). This is made possible by FAM, enabled by the Gen-Z based optical memory access technology.
The second main difference is in cache coherency. Memory access in the whole Superdome Flex system is cache coherent, while in MDC there is cache coherency only within each computing node. Maintaining cache coherency between all CPUs in Superdome Flex presumably increases memory access latency, as the CPUs need to communicate the cache coherency via the CPU interconnects. On the other hand, as discussed in Sections 3.1 and 3.2, the lack of cache coherency in multiprocessor systems means that programs must handle cache coherency between the CPUs explicitly. Handling cache coherency in programs potentially makes them more complicated and can also affect performance.
The third main difference is in persistent data storage. The MDC architecture is based on upcoming NVM memories and does not have hard disks (discussed in Section 3.2). Superdome Flex, however, utilizes traditional DRAM instead of NVM and therefore still has hard disks.
4 Solution Development
This chapter provides an overview of the research methods and research design used in this thesis. It also describes the implemented solution and how the solution was evaluated.
4.1 Research Methods
This study uses a qualitative research method. The study focuses on a new technology (MDC architecture) which has not yet been researched extensively. The amount of existing data about the MDC architecture is therefore quite limited, and thus the study did not involve the analysis of large amounts of data but rather a very carefully conducted analysis of the limited data. This was the main reason for selecting a qualitative approach for this study. The research objective being tied to a company context with a very specific use case also supported selecting a qualitative approach.
Under the qualitative main research category, this study utilizes the Case Study research method. Case Study, as a research method, has been around for a very long time, and it has been used in a number of disciplines. It has a reputation as an effective methodology for investigating and understanding complex issues in real world settings. [46] The objective of this study was to evaluate a complex new technology (MDC architecture) in a company use case (SDL API in an edge cloud data centre), which closely matches the Case Study method. In this thesis, the Case Study analysis was conducted in the following way:
• A set of real-world SDL API use cases was identified.
• The MDC architecture as the SDL API backend data storage was evaluated by simulating the identified SDL API use cases on MDC architecture hardware.
• The results were analysed.
• Conclusions were drawn based on the result analysis.
The identified SDL API use cases are discussed in more detail in Section 5.2.
4.2 Research Design
The requirements for this study were provided by the company that ordered it (Nokia Networks). The following requirements were provided:
The MDC architecture should be carefully and comprehensively evaluated for the CBTS SDL API use case. The target is for this thesis work to provide reliable information about whether the MDC architecture is suitable for the CBTS SDL API use case and thereby help make a business decision about using MDC architecture-based hardware in a mobile edge cloud data centre. A fully working solution, that is, a CBTS product running on MDC architecture hardware, is not required. It is enough that the study provides information on whether the MDC architecture is suitable for the given CBTS use case (data storage behind the SDL API). The evaluation can be done with only a limited set of CBTS functionality or even with a separate testing program. The evaluation can also be done in a simulator if real MDC architecture-based hardware is not available at the time of the study.
Even though the evaluation can be done with a separate testing program, it should be done in an environment where the execution of CBTS software would be theoretically possible.
This study involved the following steps:
Existing information about Telco Cloud (especially about CBTS, SDL and the mobile
edge cloud) and MDC architecture was collected. Information was gathered through a
literature review and by interviewing relevant experts in HPE and Nokia.
The information collected was carefully analysed and, on that basis, SDL compliant data storage solutions were designed and implemented for MDC architecture hardware. The data storage solutions were not implemented from scratch; instead, suitable existing database solutions were searched for and utilized behind the SDL API (Section 4.3 discusses this).
The suitability of MDC architecture for an edge cloud data centre SDL use case was
evaluated. This evaluation was done utilizing a case study analysis method. The cases
for the case study analysis were different SDL API use cases from the real environment
which were identified during interviews with SDL API experts. Those real-world SDL API
use cases were simulated using the SDL API testing framework.
The work was carried out as an iterative process, meaning that the steps listed above were repeated several times, and learnings from previous rounds were utilized when re-executing the steps. The steps were also not followed in a strict step-by-step order; rather, work related to several steps was often carried out simultaneously.
The possibilities for generalizing the developed solution and/or results should be considered during the thesis work. New information about the possibilities of using the MDC architecture as a shared database would provide valuable knowledge also outside the company context.
4.3 Provided Artefacts
The main artefacts provided by this thesis work were SDL API compliant data storage solutions for an MDC architecture-based computer. They are software solutions implemented in the C++ programming language, and they are based on existing database implementations. In addition to the implementation of the data storage solutions, this thesis also provides a comprehensive evaluation of the implemented solutions. A secondary artefact provided by this thesis was a set of use cases modelling SDL API usage in CBTS. These use cases were utilized for evaluating the different SDL data storage solutions, and they are described in detail in Section 5.2.
The requirements for the developed solutions were derived from the study requirements
stated in the beginning of Section 4.2.
It was required that the evaluation be done in an environment where it would be possible to execute CBTS software. CBTS software currently runs in Linux VMs. Therefore, the target environment for the developed data storage solutions was Linux. Another requirement for the implemented data storage solutions was that they need to implement the SDL API which is used in CBTS. As it was not required to do the evaluation using the complete CBTS product, deploying CBTS VMs on MDC architecture-based hardware was left out of the scope of this thesis work. However, in separate work done outside this thesis, the possibilities to deploy the CBTS product on top of the used evaluation environment were studied. Based on the work done so far in that study, the possibilities to deploy CBTS to the evaluation environment look promising. Therefore, based on current knowledge, the environment used in the evaluation can fulfil the requirement of being an environment where CBTS software can be executed.
As the possibilities to use MDC architecture-based environments were limited, the development work was done almost completely in a simulated environment. Suitable data storage solutions were developed mostly in a Linux based VM. When there was a need to simulate accessing shared memory from several computing nodes, this was handled by running several processes which accessed the same shared memory database. During the development phase simulations, all processes were running in a single computing node (a Linux based VM). The final evaluation was carried out in a Superdome Flex system (described in Section 3.5) which closely resembles the MDC architecture. However, Superdome Flex has only one operating system instance running across all computing nodes (NUMA nodes). Because Superdome Flex has only one operating system instance, and because CBTS VM deployment was out of scope for this thesis work, it was possible to do the development phase simulations using only one computing node (and only one operating system instance). Section 4.4.1 discusses this subject from the evaluation point of view.
Evaluated Data Storage Solutions
As the development of a completely new SDL API compliant data storage system would have required too much effort to fit into the scope of this thesis work, suitable existing database solutions were researched during the thesis work. The requirement for the data storage solution to be SDL API compliant limited the potential databases to key-value databases. To gain maximal performance benefits from the shared memory, in-memory databases were preferred. The databases which were seen as potential alternatives were implemented as data storage behind the SDL API. All researched databases are listed below: first those which were researched but not found suitable for further evaluation, and then those which were further evaluated as SDL API data storage.
Databases only researched:
Foedus: HPE has developed a database solution named Foedus, which is optimized for MDC type environments. However, that database was not selected for further evaluation, as its implementation is based on the utilization of NVM, and the product which was used for the final evaluation (Superdome Flex) did not have NVM type memory. [47]
Redis cluster: The Redis database is the current data storage technology behind the SDL API. Redis is an open-source, key-value type, No-SQL in-memory database. A client accesses the Redis database using a TCP/IP based network connection. The Redis database provides several different deployment modes, such as standalone and cluster deployments. In a cluster deployment there are several Redis database instances (nodes) working together, and Redis automatically divides the stored data between the different cluster nodes. A client using a Redis cluster can also direct data to a certain cluster node by adding a suitable hash tag identifier into the stored key. Cluster deployment is supported from Redis version 3.0 onwards. [48] The SDL API supports both Redis standalone and Redis cluster deployments. When a Redis cluster deployment is used as the SDL API backend data storage, the SDL API directs all keys in a certain namespace to the same Redis cluster node. Due to the limited time available to use the evaluation environment, the Redis cluster deployment was not evaluated during this thesis work. Instead, only the standalone Redis deployment (discussed below) was evaluated, and it was theoretically analysed what difference a Redis cluster deployment could have made to the results; this discussion can be found in Section 6.4.
Databases evaluated as SDL API data storage:
Standalone Redis: General information about the Redis database was given above. In the standalone deployment mode there is only one Redis database instance, and all data is stored to that instance. To reliably evaluate the effect of the MDC architecture on SDL performance, a Redis based SDL solution was also evaluated in the MDC environment. Redis still used the TCP/IP network as the data sharing interconnect between computing nodes, and therefore using it in the MDC environment made it possible to evaluate the extent to which SDL use cases benefit from memory-based data sharing instead of LAN based data sharing. As discussed above, due to time constraints, the evaluation was done only using the standalone Redis deployment (the Redis cluster deployment was not evaluated).
WhiteDB:
WhiteDB is a shared memory-based, lightweight, no-SQL database. WhiteDB is implemented in the C programming language, and it provides a C language API. WhiteDB supports Linux and Windows environments. WhiteDB has its own memory allocation implementation; in the Linux environment, it utilizes the shmget system call (the so-called System V shared memory model) to allocate shared memory.
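The System V shared memory calls that WhiteDB's allocator builds on can be sketched as follows. This is a generic illustration of shmget/shmat usage, not WhiteDB code; the function name is an illustrative assumption:

```cpp
#include <cstring>
#include <sys/ipc.h>
#include <sys/shm.h>

// Allocate a System V shared memory segment with shmget(), attach it to
// the process's address space with shmat(), and use it as plain memory.
bool sysv_shm_demo() {
    // IPC_PRIVATE keeps this sketch self-contained; a real database
    // would use a fixed key so that other processes can attach too.
    int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (id < 0) return false;

    void* p = shmat(id, nullptr, 0);
    if (p == reinterpret_cast<void*>(-1)) return false;
    char* mem = static_cast<char*>(p);

    std::strcpy(mem, "record");              // data lives in the segment
    bool ok = (std::strcmp(mem, "record") == 0);

    shmdt(mem);                              // detach from this process...
    shmctl(id, IPC_RMID, nullptr);           // ...and remove the segment
    return ok;
}
```

Unlike the mmap based POSIX model used by sharedhashfile (discussed later), segments allocated this way are identified by an integer key rather than a file name.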
In WhiteDB, data is stored in records. Each record contains n fields, and the fields contain the stored data. Data stored to WhiteDB must be encoded before it is written, and likewise, when data is read from WhiteDB, it must be decoded before the client can use it. One main reason for the data encoding/decoding is that WhiteDB uses the encoding to detect the type of the stored data. WhiteDB supports storing several different data types, but when WhiteDB records are created, there is no need to specify what kind of data will be stored in them. WhiteDB is schemaless, and it therefore detects the type of the stored data from the encoded data. WhiteDB supports, for example, storing data in BLOB (Binary Large Object) format, which is compatible with the byte array data format used in the SDL API.
WhiteDB provides database locking capability which can be utilized to ensure database consistency during concurrent database access. However, WhiteDB never locks the database automatically; instead, the WhiteDB client needs to lock the database when needed, using the locking functionality provided by WhiteDB. The WhiteDB documentation recommends locking the database before each read and write operation in concurrent access environments. WhiteDB has separate locks for read and write operations. The WhiteDB locking implementation is based on database level locking, that is, the whole database is locked for both read and write locks. The difference between read and write locks is that a write lock locks the database exclusively: no other operations at all are permitted while some client holds the write lock. A read lock, however, locks the database only for write operations; other read operations are still permitted while the database is locked for reading. In the Linux environment, WhiteDB has three different locking implementations. WhiteDB has access type preferred locking implementations for both reader and writer preference, called reader-preference and writer-preference locking, respectively. The reader-preference and writer-preference locks in WhiteDB are based on the spinlock technique described by Mellor-Crummey and Scott in 1992. In addition to the access type preferred locking implementations, WhiteDB also contains a so-called task-fair locking implementation. This locking type handles the different access types (reads and writes) equally. When compared to access type preferred locks, the benefit of task-fair locking is that there is no risk of the non-preferred access type getting starved; the downside is higher overhead. The locking implementation to be used is selected when WhiteDB is compiled. By default, WhiteDB utilizes task-fair locking in the Linux environment, and it was in use during the evaluations of this thesis work. A known limitation in WhiteDB locking is that dead processes will hold their locks indefinitely. The spinlock based preference locks also have the limitation that the maximum lock wait timeout is 2000 ms. [49]
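The database-level read/write locking semantics described above can be illustrated with an analogous POSIX read-write lock. This is not WhiteDB's actual API, only a sketch of the same shared-reader/exclusive-writer pattern (the names are illustrative assumptions):

```cpp
#include <pthread.h>

// Database-level read/write locking sketched with a POSIX rwlock:
// many readers may hold the read lock at once, while the write lock
// is exclusive, as in WhiteDB.
pthread_rwlock_t db_lock = PTHREAD_RWLOCK_INITIALIZER;
int db_value = 0;  // stands in for the whole database

void db_write(int v) {
    pthread_rwlock_wrlock(&db_lock);   // exclusive: blocks readers too
    db_value = v;
    pthread_rwlock_unlock(&db_lock);
}

int db_read() {
    pthread_rwlock_rdlock(&db_lock);   // shared: other readers may enter
    int v = db_value;
    pthread_rwlock_unlock(&db_lock);
    return v;
}
```

A WhiteDB client follows the same discipline, taking the database's read or write lock around each operation as the documentation recommends.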
Sharedhashfile:
Like WhiteDB, sharedhashfile is a shared memory-based, lightweight, no-SQL database. Sharedhashfile is currently available only for the Linux environment, although support for other environments is planned. Like WhiteDB, sharedhashfile also has its own memory allocation implementation, which utilizes the mmap system call (the POSIX shared memory model) when allocating Linux shared memory. As discussed in Section 3.4, the MDC architecture provides an optimized version of the mmap system call for accessing the MDC FAM, and an mmap based implementation might therefore be more optimal in the MDC environment than a System V shared memory model based implementation (such as WhiteDB).
Unlike WhiteDB, sharedhashfile does not require clients to encode the data before storing it, and likewise, clients do not need to decode the data they read from sharedhashfile. Sharedhashfile supports only string format data storage. This limitation is not a major concern for the SDL API use case, as the SDL API provides an API only for the byte array data format (which can be converted into a string).
Like WhiteDB, sharedhashfile uses locking to provide consistency in concurrent access environments. However, its locking implementation is quite different from the one in WhiteDB. Sharedhashfile does not use a single global database-level lock like WhiteDB; instead, the stored keys are divided between 256 locks to avoid lock contention. Also, unlike WhiteDB, sharedhashfile handles locking automatically, and the locking is therefore not directly visible to sharedhashfile clients. [50]
4.4 Evaluation Environment and Methods
In addition to the developed SDL compliant data storage solutions, the evaluation of the different data storage solutions is an important part of this thesis work.
The evaluation criteria for this thesis were based on the requirements behind the study (Section 4.3) and the business objective behind the study (Section 1.2). As the target was to create a low latency shared database between Cloud BTS VMs, the main evaluation criterion was performance, and especially how the performance compares with the current SDL data storage solution (the Redis database). Performance was mainly measured as SDL operation latency. Another evaluation criterion was reliability, which was measured as the SDL operation success rate.
4.4.1 Superdome Flex Based Evaluation Environment
To acquire a reliable measurement of this performance change, the aim was to perform the evaluation as close to real MDC hardware as possible. Real MDC based hardware was not available at the time the solution was evaluated. However, HPE released the Superdome Flex product (described in Section 3.5) during the solution development phase, and there was a possibility to perform the solution evaluation using the Superdome Flex product.
Superdome Flex was seen as suitable for the evaluation purposes of this thesis work. The original requirements supplied by the party who ordered this thesis work (Section 4.3) stated that the evaluation can be done even in a simulated environment. Superdome Flex offered the opportunity to evaluate the desired high-level concept, that is, sharing data between different computing nodes in memory rather than over a network, in a real environment. Due to the differences in the memory access technologies between Superdome Flex and the MDC architecture, memory access in Superdome Flex was not expected to be as low and uniform in latency as in the MDC architecture. Compared to network-based data sharing, however, the latency was still expected to be orders of magnitude lower; an evaluation already performed by Tero Lindfors also suggested this [6]. The lack of NVM in Superdome Flex was also not critical for this evaluation case, as the SDL API did not allow storing persistent data at the time of this evaluation. Therefore, the volatile shared memory in Superdome Flex was sufficient for the use case behind this evaluation work (although it produced some limitations to the database options).
Due to the factors discussed above, the Superdome Flex was a viable alternative for the given use case. MDC-based architecture is expected to become available later. The evaluation done with the Superdome Flex provides a pre-study for MDC, as similar data storage technologies and evaluation methods should be suitable for MDC as well.
During the evaluation of this thesis work, the same Superdome Flex system was also evaluated from the CBTS product deployment point of view. The CBTS deployment evaluation was not part of this thesis work, but it provided beneficial information for this thesis work as well. As the target in this thesis was to perform the evaluation from the CBTS product SDL API usage point of view, the CBTS deployment work provided information on whether it is actually possible to deploy the CBTS product on top of Superdome Flex hardware. The CBTS deployment could also have provided a possibility to perform the evaluation of this thesis work using a real CBTS product, but the CBTS deployment was not fully completed during the time the Superdome Flex system was available for Nokia's evaluations. Therefore, the evaluation of this thesis work was not done using the complete CBTS product; only the SDL API was used and the rest of the CBTS was simulated. The CBTS deployment did, however, progress rather far, and the remaining issues were not estimated to be unsolvable. Therefore, based on the information gathered from the CBTS deployment evaluation, it is estimated that deploying the CBTS product on top of Superdome Flex hardware is possible.
For the CBTS deployment evaluation purposes, an IaaS service was installed in the evaluation environment, and the Superdome Flex system was part of the computing resources provided by the IaaS service. The IaaS service used was the SUSE OpenStack Cloud 8 (SOC8) cloud computing platform. SOC8 is an OpenStack based solution like NCIR (discussed in Section 2.4), which supported the CBTS deployment work [51]. In the CBTS deployment evaluation, the target was to deploy the CBTS product on top of the Superdome Flex system as shown in Figure 20 below:
Figure 20. CBTS cloud stack in Superdome Flex
As illustrated in Figure 20, the target was to deploy CBTS VMs on top of the SOC8/Superdome Flex based IaaS layer. VM deployment was done using the OpenStack services provided in SOC8. The CBTS VMs consisted of a guest operating system, PaaS and SaaS layers. The guest operating system and PaaS layer come from the Nokia-implemented Radio Cloud Platform (RCP). The SDL API is part of RCP. The software implementing the CBTS functionality (the CBTS applications) was part of the SaaS layer in this cloud stack.
As stated earlier, the CBTS product deployment in the Superdome Flex was not fully completed. VMs containing only RCP services were successfully deployed, but VMs containing the whole CBTS stack (including the CBTS applications) were not deployed successfully during this time. As SDL is part of the RCP services, it would have been possible to perform the evaluation for this thesis work using VMs containing only RCP services. However, the evaluation was not done in VMs but directly in the SUSE host operating system on the Superdome Flex hardware. The reason for this was that the RCP-based VMs were successfully deployed only in a quite late phase of the Superdome Flex evaluation period granted to Nokia. There was no longer enough time to perform the evaluation once the RCP-based VMs were successfully deployed, and thus the evaluation done in the SUSE host operating system in parallel with the CBTS deployment work became the final evaluation of this thesis work. This evaluation method provided sufficient information about the main evaluation target, as the shared memory was accessed from different physical computing nodes. Had the evaluation been done from VMs, the evaluation setup would have been closer to the ultimate target setup (CBTS deployed on MDC architecture). But that would have required additional work to set up access to SDF shared memory from different VMs. Access to SDF shared memory from VMs might have been possible to implement, for example, by using Inter-Virtual Machine Shared Memory [6, pp. 38-39].
The Superdome Flex system used in the evaluation contained two chassis and therefore eight NUMA nodes. Each NUMA node contained an Intel Xeon Platinum 8180 CPU, which has 28 CPU cores; thus the whole Superdome Flex system contained 224 physical CPU cores. Intel uses hyperthreading technology, which allows splitting each physical CPU core into two virtual CPU cores. Hyperthreading was enabled in this Superdome Flex system, so it had 448 virtual CPU cores. Each DDR4 memory module in this Superdome Flex system had a capacity of 64GB. As shown in Section 3.5, each Superdome Flex NUMA node contains 12 DDR memory modules. Therefore, each NUMA node had 768GB of DDR4 memory and the whole Superdome Flex system had 6144GB (6.144TB) of DDR4 memory.
In addition to the SOC8-based Superdome Flex hardware, the evaluation environment contained an HP DL560 server, which also contained Intel Xeon Platinum 8180 CPUs. This HP DL560 server was referred to as DL37 in the evaluation environment. DL37 was not part of the SOC8-based IaaS but an individual physical server. DL37 was, however, in the same network as the Superdome Flex and could thus access the Superdome Flex system. DL37 also ran the same SUSE Linux OS version as the SOC8.
4.4.2 Tools Used in Evaluation
The main tools utilized in the evaluation of this thesis were the SDL API client simulator and the test framework. Both tools were implemented at Nokia, and their implementation was not part of this thesis work. However, some minor improvements were made to the SDL API client simulator during this thesis work; those improvements are discussed in more detail in Section 5.1. In addition to the two Nokia tools mentioned above, the Intel Memory Latency Checker tool was also utilized during the evaluation.
The SDL API client simulator is a command line program implemented in the C++ programming language. It generates SDL API operations. When the simulator generates write (set) operations, it generates random data to be written. The purpose of the random data is to reduce the effect of certain optimizations (e.g. CPU prefetch) which might occur if identical or similar data were written every time. That way, SDL API operation latency can be measured more reliably, and SDL usage corresponds more closely to real-world usage. The user can specify the generated operations as command line arguments. The SDL API client simulator was used in the evaluation instead of real SDL API clients (like CBTS applications) because executing real SDL API clients in the evaluation environment would have been a complicated task (as the CBTS deployment to the evaluation environment was not fully completed during this thesis work) and was therefore left out of the scope of this thesis work. Also, it was easier to measure different kinds of SDL API usage situations in a controlled manner by using a testing program. The SDL API client simulator stores statistics in a JSON format file. These statistics include, for example, the latency and status of each performed SDL API operation. The SDL API client simulator command line arguments and statistics file content are shown in Appendix 1.
The test framework is a program implemented in the Python programming language. It contains two entities, a deployer and an executor. The test framework deployer deploys the needed computing resources (for example VMs) based on user defined configuration files. The test framework executor runs scripts on the deployed computing nodes; it likewise works based on user defined configuration files. At a high level, a single execution task in the test framework executor works by first copying a local directory to the target computing node, then executing a given script and lastly copying the directory (containing the possible result files produced by the executed scripts) back to the execution computing node. The test framework executor has broad configuration options; for example, scripts can be configured to run in parallel in multiple computing nodes or to run in sequence (one after another).
Intel Memory Latency Checker is a command line program provided by Intel; version 3.6 was utilized in this thesis work. Intel Memory Latency Checker can measure memory latencies and bandwidths. It also supports NUMA configurations and can thus measure the memory latencies between different NUMA nodes. [52] In this thesis work the tool was utilized to measure the idle memory latencies in the Superdome Flex system used as the evaluation environment. The output of this measurement can be found in Appendix 2.
4.4.3 Evaluation Methodology
The evaluation was performed as follows.
The evaluation used the Superdome Flex product discussed in detail in the previous section. The Superdome Flex system was physically located in Geneva and was used remotely over SSH/SCP connections through a Virtual Private Network (VPN).
As discussed in the previous section, the CBTS deployment was evaluated in the same environment concurrently with the evaluation work of this thesis. To minimize the risk of disturbing the CBTS deployment work, the aim was to perform the evaluation for this thesis work with minimal changes to the Superdome Flex system. Because the environment contained the support server DL37 (discussed in the previous section), which had an identical CPU and OS to the Superdome Flex, it was possible to perform the needed compilations on DL37 and transfer the compiled binaries to the Superdome Flex.
The evaluation work started by installing the needed software on the DL37 server. This included the software required for compiling the SDL API, the SDL API client simulator and the evaluated databases, as well as the software required for running the test framework executor. DL37 had the SUSE Linux OS installed, which already contained some of the needed software (like a C++ compiler). The additional software installed on DL37 is listed next. The SDL API dependencies were installed for SDL API compilation. The SDL API has rather limited dependencies, and thus installing the boost C++ libraries [53] (version 1.69.0) was enough to be able to compile the SDL API on DL37. The SDL API client simulator compilation required the boost C++ libraries as well, and in addition the CMake tool [54]; CMake version 3.13.4 was installed on DL37. Of the evaluated SDL API backend data storages, compiling WhiteDB (version 0.7.3) and sharedhashfile (latest version from GitHub on 5.2.2019) did not require any additional software to be installed on DL37. Using Redis as the SDL API backend data storage required the installation of the Redis server [48] and the C programming language API for Redis called hiredis [55]; Redis server version 5.0.3 and hiredis library version 0.13.3 were installed. Running the test framework required Python version 3.5 or higher, and therefore Python 3.6.8 was installed on DL37. For test framework installation purposes, the pip [56] (version 19.0) and virtualenv [57] (version 16.4.3) tools were also installed on DL37.
Once all the software needed for the compilations was installed on DL37, the software needed for the evaluation was compiled there. That is, the SDL API, the SDL API client simulator and the evaluated databases were compiled. The binaries needed for running the SDL API client simulator with all the evaluated data storages in the Superdome Flex were then transferred to the Superdome Flex using an SCP connection. The transferred binaries included the SDL API library, certain boost libraries, the hiredis library, the WhiteDB library, the sharedhashfile library, the redis-server binary and the SDL API client simulator binary.
The test framework was installed on DL37. It was utilized to simulate the designed SDL API use cases (described in detail in Section 5.2) in the Superdome Flex environment. The simulations were executed using the different data storage solutions behind the SDL API (the evaluated data storage solutions were described at the end of Section 4.3), and results were collected from each simulation run. The results were analysed, and based on the analysis, slight adjustments were made to the simulations. An example of such an adjustment can be found in Section 5.1: using a separate WhiteDB/sharedhashfile database for each SDL API namespace was decided based on the initial evaluation run results. The analysis of the final simulation runs is documented in detail in Chapter 6.
5 Solution Evaluation
This chapter provides a detailed description of the implemented solutions. The first part of the chapter discusses how SDL API compliant data storage solutions were implemented on top of the selected shared memory databases. The latter part of the chapter discusses the SDL API use cases utilized during the solution evaluation.
5.1 Implemented Software Solutions
As discussed in Section 4.3, the main artefact of this thesis work was the implementation of MDC architecture optimized (shared memory based) data storage solutions behind the SDL API. The SDL API was introduced in Section 2.4.1, which also listed some code level examples of the SDL C++ API interfaces. The SDL API client simulator utilizes the asynchronous SDL C++ API. A full listing of the asynchronous SDL API storage related interfaces in C++ can be found in Appendix 3.
The SDL API client simulator utilizes the following asynchronous SDL API storage interfaces: setAsync(), getAsync() and waitReadyAsync(). As can be seen from Appendix 3, the SDL API contains several other interfaces as well, such as the removeAsync() interface for deleting data and the setIfAsync() interface for conditionally writing data. To keep the scope of this thesis reasonable, only the three interfaces needed by the SDL API client simulator were implemented using the evaluated data storage solutions during this thesis work. These three interfaces are also the ones most commonly used by real SDL API clients, so implementing and measuring them gives the most beneficial results. The rest of the SDL API interfaces were not implemented; the possibilities for implementing them were analysed theoretically, and that analysis did not reveal any major obstacles. A more detailed description of the three implemented SDL API interfaces for each evaluated data storage solution is provided next.
The waitReadyAsync() interface takes one parameter, a callback function which the SDL API invokes once it is ready to process requests. The SDL API itself does not contain any long-lasting initialization tasks, so how soon the callback is invoked depends solely on the backend data storage.
The setAsync() interface writes the given data to the backend data storage. It takes the data to be written as a parameter; this parameter is a map type data structure which contains key (string), value (byte array) pairs. The setAsync() interface therefore provides a possibility to write several key/value pairs with one call. Another parameter of the setAsync() interface is a callback function which is invoked once the write operation is completed; the invoked callback function receives an error code parameter which indicates the status of the request.
The getAsync() interface reads data from the backend data storage. It takes as parameters the keys to be read and a callback function to be invoked once the operation is ready. The invoked callback function receives two parameters: a map which contains key/value pairs for all requested keys and an error code which indicates the status of the request.
As seen from the API descriptions above, all asynchronous SDL API functions take a callback function as a parameter. This is typical of asynchronous interfaces: the function returns almost immediately and the callback is invoked when the operation is completed. This allows the caller to work on other tasks while the operation is ongoing.
The WhiteDB/Sharedhashfile based asynchronous SDL API implementations do not work as typical asynchronous functions (described above). Instead, the implemented functions return only when the operation is fully completed, and the client provided callback is called at the same time. The reason for implementing it this way is that the implementation is a lot simpler, and in this use case there is no real need for a fully asynchronous API. In this evaluation, the SDL API is used only by the SDL client simulator, which does not have any other tasks to perform while the generated SDL operation is ongoing. Therefore, an SDL API call blocking SDL client simulator execution does not have any practical effect, unless execution is blocked for so long that the next operation cannot be performed within the specified interval time. Shared memory based database operations are fast, and during the evaluation this implementation choice never delayed SDL client simulator operation generation. However, this implementation choice did slightly affect the SDL API edge case simulation design; this is further discussed in Section 5.3.
WhiteDB Based SDL API Implementation
In the WhiteDB based SDL API implementation, all necessary SDL API interfaces were implemented utilizing WhiteDB through the C language API it provides [49]. On a practical level, this was done by including the header files provided by WhiteDB in the SDL code, compiling the WhiteDB library and linking the compiled WhiteDB library to the SDL API binary.
A WhiteDB database needs to be created before data can be stored into it. Therefore, the WhiteDB based SDL API implementation performs WhiteDB database initialization as its first task when it starts. The WhiteDB API contains the following functions for database initialization:
void* wg_attach_existing_database(char* dbasename);
void* wg_attach_database(char* dbasename, wg_int size);
The SDL API first tries to attach to a possibly already existing WhiteDB instance (created by some other SDL API instance). If that does not succeed, the SDL API creates a completely new WhiteDB database.
As can be seen from the API above, database creation requires a database name and size as parameters. WhiteDB supports creating several individual databases; different databases are distinguished based on the database name identifier given during database creation. In the WhiteDB based SDL API implementation, the SDL namespace identifier (illustrated in Figure 9.) was used as the database name. As data in a certain SDL namespace is isolated from other namespaces (namespaces are a kind of virtual database), there was no reason why each namespace could not use a separate WhiteDB memory database. A clear benefit of using a separate WhiteDB database for each SDL API namespace was the reduced lock contention. For evaluation purposes, an SDL version where the same database name was used for all SDL API instances was evaluated as well. But as it caused a major decrease in performance, the final evaluation was done with the version where each SDL API namespace used its own WhiteDB database. The default database size, that is 10MB, was used as it was well sufficient for the evaluation purposes.
As SDL performs the WhiteDB initialization during start-up (in the SDL C++ API constructor), the WhiteDB based SDL implementation is always ready to handle incoming requests. Thus, the WhiteDB based waitReadyAsync() interface implementation always calls the provided callback function immediately.
The WhiteDB based setAsync() interface implementation first acquires a write lock for the WhiteDB database using the wg_start_write() WhiteDB API call. Once the lock is successfully acquired, all key-value pairs provided as parameters are written to WhiteDB in a loop. Before a key-value pair is stored, the wg_find_record_str() and wg_delete_record() WhiteDB APIs are used to delete the possible old value stored for the given key. As described in Section 4.3, data is written to WhiteDB using records which can contain N fields. The WhiteDB based SDL API implementation creates records having two fields using the WhiteDB API wg_create_record() call. The key is stored in one field and the data in the other. Storing the key in its own field provides the possibility to later retrieve the data based on a given key (this is done in the getAsync() interface implementation). Before data can be stored in a WhiteDB record it needs to be encoded; the key is encoded with the WhiteDB API wg_encode_str() call and the value using the wg_encode_blob() WhiteDB API call. The encoded data is stored in the record using the WhiteDB API wg_set_field() call. Once data writing is completed (either all keys were successfully written, or some operation failed), the write lock is released using the WhiteDB API wg_end_write() call. Finally, the callback function provided as a parameter for setAsync() is called.
The WhiteDB based getAsync() interface implementation first acquires a read lock for the WhiteDB database using the wg_start_read() WhiteDB API call. Once the lock is successfully acquired, all requested keys are read from the WhiteDB database in a loop. Reading is done by first searching for the correct record in the WhiteDB database, that is, a record having the requested key as the contents of its first field. This search is done using the WhiteDB API wg_find_record_str() call. Once the correct record is found, its encoded contents need to be decoded; decoding is done using the WhiteDB API wg_decode_blob() call. Successfully decoded data is then stored in a map data structure which is later provided as a parameter for the callback function. Once data reading is completed (either all keys were successfully read, or some operation failed), the read lock is released using the WhiteDB API wg_end_read() call. Finally, the callback function provided as a parameter for getAsync() is called.
Sharedhashfile Based SDL API Implementation
The Sharedhashfile based SDL API implementation is, in its main principles, very similar to the WhiteDB based SDL API implementation. The Sharedhashfile C++ API is utilized in the SDL API by including the needed Sharedhashfile headers in the SDL API implementation and by linking the Sharedhashfile library to the SDL API library [50].
Like a WhiteDB database, a Sharedhashfile database needs to be created/initialized before it can be used. The Sharedhashfile based SDL API implementation performs this initialization as its first task when it starts (in the SDL C++ API constructor). Sharedhashfile creation/initialization is done using the following two Sharedhashfile API calls:
bool AttachExisting(const char * path, const char * name);
bool Attach(const char * path, const char * name, uint32_t delete_upon_process_exit);
The SDL API first tries to attach to an existing Sharedhashfile database (created already by some other SDL API instance) using the AttachExisting() API. If an existing database is not found, the SDL API creates a new Sharedhashfile database using the Attach() API. Both APIs take a database name and a directory path as parameters. The database name is derived from the SDL namespace name in order to create a separate database for each SDL API namespace (a similar approach was used with WhiteDB, and the reasoning for it can be found in the WhiteDB implementation section above). The directory path used in the Sharedhashfile based SDL API implementation is /dev/shm, which ensures that shared memory is always used (described in more detail in Section 3.4). The Attach() API, which creates a new Sharedhashfile database, has a delete_upon_process_exit parameter which specifies whether the database is deleted once the process which created it terminates. In the SDL API case this parameter is set to not delete the database, as the database needs to remain accessible to other SDL API instances even if the database creator exits.
As with WhiteDB, Sharedhashfile initialization is fully done during start-up, and thus the Sharedhashfile based SDL implementation is also always ready to handle incoming requests. The Sharedhashfile waitReadyAsync() implementation always calls the provided callback function immediately.
The Sharedhashfile based getAsync() and setAsync() implementations differ somewhat from the corresponding WhiteDB implementations. This is because Sharedhashfile does not require data encoding/decoding (Sharedhashfile only supports string format data) and because Sharedhashfile handles locking automatically.
The Sharedhashfile based setAsync() interface implementation processes the provided keys in a loop like the WhiteDB based implementation. For each key, the implementation first calls the MakeHash() Sharedhashfile API, giving the key as a parameter. That way, the key is assigned a unique identifier within the Sharedhashfile database. After the MakeHash() call, all following data operation API calls are by default targeted at that key identifier. Next, the DelKeyVal() Sharedhashfile API is used to delete the possible old value stored for the given key. Finally, the given value (which was passed to the SDL API setAsync() call) is stored for that key using the Sharedhashfile PutKeyVal() API (the SHF_CAST() Sharedhashfile API is utilized to convert the data into string format). Once all keys are handled, the callback function provided as a parameter for setAsync() is called.
The Sharedhashfile based getAsync() implementation processes the requested keys similarly in a loop. For each key, the MakeHash() Sharedhashfile API is first utilized to target the following read operation at the requested key. The read operation is done utilizing the GetKeyValCopy() Sharedhashfile API. If GetKeyValCopy() returned data (the key existed), the received data is stored in a map data structure which is later provided as a parameter for the getAsync() callback function. Once all keys are handled, the callback function provided as a parameter for getAsync() is called.
SDL API Client Simulator Modifications
As mentioned in Section 4.4.2, the SDL API client simulator implementation was not part of this thesis work. However, some minor improvements to the SDL API client simulator were made during this thesis work. A new command line argument, SDL namespace, was added to the SDL API client simulator. The argument specifies the SDL namespace identifier used during one simulation round. This was needed to model the designed use cases more accurately. Also, as described earlier in this section, it was necessary to evaluate whether to use a separate WhiteDB or Sharedhashfile database for each SDL namespace or one common database for all namespaces; performing this evaluation required the possibility to specify the namespace identifier in the simulations. The namespace command line parameter added during this thesis work is visible in the SDL API client simulator command line help listed in Appendix 1.
Adding the namespace parameter to the SDL API client simulator was the only functional change to it. The other improvements were related to SDL API client simulator compilation, which required some changes due to the evaluation environment used.
5.2 Simulated SDL API Use Cases
Three different SDL API use cases were utilized during the evaluation. The use cases were designed based on input received from SDL API experts. They are based on real SDL API use cases from the CBTS product, but the operation amounts in the simulated use cases are higher than in the corresponding real use cases at present, because the evaluation was done with future-proofing in mind. The three SDL API use case simulations were executed using each evaluated SDL backend data storage solution. The evaluated data storage solutions were described in Section 4.3: WhiteDB, sharedhashfile and Redis. As the Redis (existing SDL API backend data storage solution) database is accessed via a TCP/IP connection, Redis was evaluated both with the Redis server running in the Superdome Flex and with the Redis server running in DL37 (to model different Redis deployment options). As WhiteDB and sharedhashfile are shared memory based databases, they must run in the Superdome Flex system. However, the memory database can be allocated from any of the eight SDF NUMA nodes. It was desirable to know from which NUMA node the memory database was allocated (for better result evaluation possibilities); therefore the simulations were executed in such a way that the shared memory databases were always allocated from NUMA node 0. This was done by running an extra "memory allocation" simulation run before the actual use case simulation. In the simulations, one Superdome Flex NUMA node simulated one CBTS VM. In reality, there would most likely be several CBTS VMs deployed on each Superdome Flex NUMA node, but to keep the simulation setup simpler, a one-to-one mapping between CBTS VMs and Superdome Flex NUMA nodes was used.
5.2.1 Use Case 1: Stateless Applications
As discussed in Section 2.4.1, one of the targets of the SDL API is to provide support for stateless applications. In stateless applications, all state information is stored externally to the application. This enables scalability and resiliency, as an application can continue its work after a restart by reading the state information from the external storage. Alternatively, completely new applications can be deployed and similarly start working with the stored data, or even completely new computing nodes (e.g. VMs) can be deployed and the applications in the new computing nodes can start to work with the shared data (assuming that the data is stored externally to the application VMs). The SDL API provides the data storage needed to implement stateless applications.
There are several ways to implement stateless applications. For applications which have a relatively short lifetime and which process operations that are clearly separated from each other, stateless operation can be implemented so that when the application starts execution/operation processing, it reads the needed information (state data) from the external storage, and after the application has processed its work, it stores the information needed in the future to the external storage and terminates execution/processing. This use case models SDL usage by such stateless applications. Use case 2 is also related to stateless applications, but it models stateless applications implemented in another way.
According to the use case description above, in this use case each operation processed by the CBTS applications generates one read and one write SDL operation. The simulation also modelled a high traffic scenario where the CBTS applications receive incoming operations at a short interval. Therefore, the simulation executed the same amount of read and write operations with a rather short operation interval. An equal amount (40) of SDL API client simulator applications was running in each Superdome Flex NUMA node (simulated VM). 40 different SDL namespaces were used, so that each SDL API client simulator application inside one NUMA node used a different SDL namespace, but the same namespaces were used between all NUMA nodes. The SDL API client simulator applications were started (by the test framework) concurrently in all eight NUMA nodes. Inside the NUMA nodes there was a short delay between the starts of the 40 SDL API client simulator applications (the operation interval was also used as the starting delay), as otherwise all SDL operations would have been done almost concurrently, which does not correspond to real usage. The data written and read was sized to 2kB. Below is a summary of the key values of this simulation:
• 40 SDL simulator applications per numa node = 320 concurrent SDL simulator applications running in the Superdome Flex system
• 40 different SDL namespaces used (each SDL simulator application inside a single numa node used a separate namespace, but the same namespaces were shared between the 8 numa nodes)
• Each SDL simulator application performed SDL operations at a 50ms interval = 20 SDL op/s
• Each SDL simulator application performed 2kB reads and writes (an equal number of reads and writes)
• 40 SDL simulator applications * 20 SDL op/s = 800 SDL op/s per numa node = 800 * 8 = 6400 SDL op/s in total in the Superdome Flex system
• Total simulation time was 50s
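The aggregate figures above follow directly from the per-application operation interval. The derivation can be sketched as follows (a hypothetical helper for illustration only, not part of the thesis test framework):

```javascript
// Hypothetical helper (not part of the thesis tooling): derive the aggregate
// SDL operation rates from the use case 1 simulation parameters.
function aggregateRates({ appsPerNode, numaNodes, intervalMs }) {
  const opsPerApp = 1000 / intervalMs;        // 50ms interval -> 20 op/s
  const opsPerNode = appsPerNode * opsPerApp; // 40 * 20 = 800 op/s per node
  const opsTotal = opsPerNode * numaNodes;    // 800 * 8 = 6400 op/s in total
  return { opsPerApp, opsPerNode, opsTotal };
}

// Use case 1 parameters: 40 apps per node, 8 numa nodes, 50ms interval.
const uc1 = aggregateRates({ appsPerNode: 40, numaNodes: 8, intervalMs: 50 });
```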
5.2.2 Use Case 2: Stateless Applications Which Store Stable State
This use case models stateless applications which read the possibly existing state information during application start-up and, during operation processing, store state information to external storage at certain critical phases. These critical phases are points where such changes happen that the state information needs to be stored in order to be able to continue operation normally in case of an application restart. The stored state information is called the "stable state". This model is suitable for long-running applications with complicated data structures which are accessed from the processing of several different operations. Modifying these applications to read the state information only during the start of the application/processing would require notable architectural changes. The stable state method provides a way to implement stateless operation in such applications with relatively small changes. Later on, the implementation can be improved towards the model described in use case 1.
In this use case each operation processed by the CBTS applications generates several SDL write operations but no read operations at all. Read operations are generated only during application start-up, if the application needs to continue work from the stable state. This can happen, for example, if the application is spontaneously restarted, or if the whole VM is spontaneously restarted and another VM continues its work. In these scenarios multiple SDL read operations are generated, as the application needs to read complicated data structures from the external storage (SDL). The exact number of SDL read and write operations generated depends on the implementation of the applications. In this use case simulation the SDL operation amounts are estimates based on SDL expert analysis of a possible stable state application implementation.
At a high level, this use case simulation works so that in the beginning there are eight VMs: seven containing stable state applications which are processing operations and therefore performing SDL write operations, and one idle VM as backup. At a certain point, a VM switchover happens (one VM performing operations fails and the idle VM continues the operations that the failed VM was handling). The number of operations handled by CBTS was simulated to be the same as in use case 1. Therefore, there were 40 SDL API client simulator applications running in seven Superdome Flex numa nodes (VMs). These simulate stable state applications processing CBTS operations. Like in use case 1, 40 different SDL namespaces were used, so that each SDL API client simulator application inside one numa node used a different SDL namespace, but the same namespaces were used in all numa nodes. The SDL API client simulator applications were started (by the test framework) concurrently in seven numa nodes, and one numa node was left idle (simulated idle backup VM). Inside the numa nodes there was a short delay between the starting of the 40 SDL API client simulator applications (the operation interval was also used as the starting delay), as otherwise all SDL operations would have been performed almost concurrently, which does not correspond with real usage. The data written and read was sized to 2kB. After running the simulation in seven numa nodes for 10s, a switchover was simulated so that SDL API client simulator applications performing read operations were launched in the idle numa node (by the test framework). The reading period in the new VM was rather short, as the new VM needed to read only the state information. In this simulation, the new VM performed no further actions after the initial state reading (the new VM's normal processing after state reading was not simulated). Below is a summary of the key values of this simulation:
• For the first 10s there were 40 SDL simulator applications in seven numa nodes = 280 concurrent SDL simulator applications running in the Superdome Flex system
• 40 different SDL namespaces used (each SDL simulator application inside a single numa node used a separate namespace, but the same namespaces were shared between the 7 numa nodes)
• Each SDL simulator application performed 2kB SDL write operations at a 50ms interval = 20 SDL op/s
• After 10s, a switchover was simulated by starting SDL simulator applications in the idle numa node. These SDL simulator applications simulated continuing the processing by reading the stable state from SDL.
• As the idle (new active) numa node simulated a similar VM as the others, 40 SDL simulator applications were started in it as well. These 40 SDL simulator applications performed 2kB SDL read operations at a 5ms interval (200 SDL op/s). The operations were targeted to the same 40 SDL namespaces that were used in the other numa nodes.
• For the first 10s, there were 40 SDL simulator applications * 20 SDL op/s = 800 SDL op/s per numa node = 7 * 800 = 5600 SDL write op/s in total in the Superdome Flex system. After 10s, there were also 40 SDL simulator applications * 200 SDL read op/s in one numa node = 8000 SDL read op/s (in addition to the 5600 SDL write op/s ongoing in the other 7 numa nodes). Total simulation time was 25s. The data reading period in the new VM lasted for 1.25s, that is, the total number of SDL read operations in the new VM was 8000 * 1.25 = 10000.
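The switchover read burst above can be derived the same way (a minimal sketch with assumed helper names, not the thesis tooling):

```javascript
// Hypothetical helper: read-operation totals for the switchover numa node
// in use case 2 (40 apps, 5ms read interval, 1.25s reading period).
function switchoverReadTotals({ apps, intervalMs, readPeriodS }) {
  const opsPerApp = 1000 / intervalMs;       // 5ms interval -> 200 op/s
  const nodeRate = apps * opsPerApp;         // 40 * 200 = 8000 op/s
  const totalReads = nodeRate * readPeriodS; // 8000 * 1.25 = 10000 reads
  return { nodeRate, totalReads };
}

const sw = switchoverReadTotals({ apps: 40, intervalMs: 5, readPeriodS: 1.25 });
```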
5.2.3 Use Case 3: Non-Intrusive Analytics
At the beginning of Section 2.4, the cloud integrated network vision was discussed. In this vision, data analytics is performed already in the Telco Cloud in order to reduce latency. This use case models that concept. Such external analytics should be performed in a non-intrusive manner; that is, the analytics processing should not disturb the normal network operation which is processed in the same cloud. Therefore, a separate "analytics VM" is deployed to process the external analytics task. The analytics VM reads the data to process through the SDL API.
This scenario is modelled so that at first there are VMs performing the same simulation as in use case 1 (VMs running stateless applications); later a new VM is deployed, and this new VM starts to read and process data from SDL. The key values of this simulation were:
• For the first 10s there were 40 SDL simulator applications in seven numa nodes = 280 concurrent SDL simulator applications running in the Superdome Flex system
• 40 different SDL namespaces used (each SDL simulator application inside a single numa node used a separate namespace, but the same namespaces were shared between the 7 numa nodes)
• Each SDL simulator application performed SDL operations at a 50ms interval = 20 SDL op/s
• Each SDL simulator application performed 2kB reads and writes (an equal number of reads and writes)
• After 10s, adding a new analytics VM was simulated: the idle numa node was simulated to start as the analytics VM. This was done by starting (using the test framework) SDL simulator applications, which simulated analytics applications, in the idle numa node.
• The analytics applications were simulated by SDL simulator applications which read rather small amounts of data at a rather long interval (the interval simulated the data processing done by the analytics applications). There were 20 SDL simulator applications performing 1kB SDL read operations at a 200ms interval (5 SDL op/s). The operations were targeted to 20 SDL namespaces that were used in the other numa nodes as well (modelling analytics applications reading data produced by the stateless CBTS applications simulated in the other 7 numa nodes)
• For the first 10s, there were 40 SDL simulator applications * 20 SDL op/s = 800 SDL op/s per numa node = 7 * 800 = 5600 SDL op/s in total in the Superdome Flex system. After 10s, there were also 20 SDL simulator applications performing 5 SDL read op/s in one numa node = 80 SDL read op/s (in addition to the 5600 SDL op/s ongoing in the other 7 numa nodes). Total simulation time was 50s.
5.3 Simulated SDL API Edge Cases
In addition to the real-world SDL API use cases described in the previous section, two edge case simulations were also run during the evaluation. The purpose of these edge case simulations was to ensure the stability and reliability of the evaluated data storage solutions during extreme scenarios.
Edge Case Simulation 1. Extremely High Operation Frequency
In this simulation an extremely high number of SDL operations per second was simulated. The operation amount did not correspond with any current real-world usage; instead, the aim was to evaluate the data storage solutions with the highest possible operation frequency which could be reliably produced with the used evaluation tools.

The setup in this simulation was otherwise identical to use case simulation 1, but the SDL operation interval was reduced from 50ms to 5ms. This change in the SDL operation interval increased the total number of SDL operations per second in the Superdome Flex system from 6400 op/s to 64000 op/s. 5ms was selected as the operation interval because lower intervals would not have worked reliably due to the simplified SDL API implementation for the evaluated shared memory databases (this was discussed in Section 5.1). SDL API client simulator intervals are given in milliseconds, thus a 1ms interval would have been the lowest possible with the current SDL API client simulator implementation.
Edge Case Simulation 2. Long Simulation Duration
This simulation aimed to verify that the evaluated data storage solutions can perform operations successfully for extended periods of time. The setup in this simulation was otherwise identical to use case simulation 1, but the total simulation run time was increased from 50s to 2000s (33.3min). This is not an extremely long time period, but due to the rather short time the Superdome Flex system was available for this evaluation, a longer period would have been difficult to arrange.
6 Evaluation Results
This chapter first presents the results obtained from the solution evaluation which was discussed in Chapter 5. The latter part of the chapter draws conclusions from the evaluation.
6.1 Result Data
As explained in Sections 5.2 and 5.3, the simulations which were run during the evaluation started multiple (up to 40) SDL API client simulator applications in each SDF numa node. Each SDL API client simulator produced a result statistics file. An example of such a result file is shown in Appendix 1. The result file is in JSON format and contains statistics of each performed operation. Set (write) and get (read) operations are in their own sections in the result file. The meaning of all properties in the result file is explained in the table below:
Table 1. SDL API client simulator application result file properties
Property name Property description
totalTestDuration_s Total duration of the simulation run in seconds
operationInterval_ms Interval between generated SDL operations in milliseconds
scheduled Number of SDL operations started during the simulation run
completed Number of completed (acknowledgement received) SDL operations
failed Number of failed (error status received in acknowledgement) SDL operations
late Number of SDL operations which had a latency higher than 9999 μs (the delays array below contains the latencies of the other operations)
delays Array which contains all operation latencies (up to 9999 μs) measured during the simulation run
op_counts Array; each element contains the number of SDL operations which had the latency value indicated at the same position in the delays array
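The delays/op_counts pair is effectively a latency histogram: statistics of a run can be recovered from it without storing every individual sample. For example, a mean latency could be computed as in the following sketch (an illustration of the file format described in Table 1, not the actual thesis code):

```javascript
// Sketch: compute the mean latency from the delays/op_counts histogram.
// delays[i] is a latency value in microseconds and op_counts[i] is the
// number of operations observed at that latency.
function meanFromHistogram(delays, opCounts) {
  let weightedSum = 0;
  let total = 0;
  for (let i = 0; i < delays.length; i++) {
    weightedSum += delays[i] * opCounts[i];
    total += opCounts[i];
  }
  return total === 0 ? 0 : weightedSum / total;
}

// e.g. 1 op at 10µs, 2 ops at 20µs, 1 op at 30µs -> mean 20µs
const mean = meanFromHistogram([10, 20, 30], [1, 2, 1]);
```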
Result Data Post Processing
The result statistics file contains the latencies of all individual SDL API set and get operations. For result analysis purposes, values describing the statistical quality of the whole simulation run were calculated. These values included, for example, the mean latencies of all get and set operations in each numa node.
Two small JavaScript programs were implemented for result post-processing purposes.

The first JavaScript program processed all SDL client simulator result files from one SDF numa node and calculated the following values from all found latency values: mean latency, median latency, standard deviation, minimum latency and maximum latency. Get and set operation latencies were calculated separately. The results were stored to a new JSON-format result file. Below is an example of such a result file:
{
"totalTestDuration_s": 50,
"operationInterval_ms": 50,
"getStatistics": {
"scheduled": 20000,
"completed": 20000,
"failed": 0,
"late": 0,
"latencies": [],
"latency_count": 20000,
"mean_latency": 56.7932,
"median_latency": 56,
"standard_deviation": 46.16904302408704,
"min_latency": 13,
"max_latency": 3416
},
"setStatistics": {
"scheduled": 20000,
"completed": 20000,
"failed": 0,
"late": 0,
"latencies": [],
"latency_count": 20000,
"mean_latency": 72.2928,
"median_latency": 69,
"standard_deviation": 59.44963808266634,
"min_latency": 18,
"max_latency": 3735
}
}
The object named latencies is empty, because storing all found latency values from one numa node's results would have created a very large result file, and the individual latency values were eventually not needed in the result analysis.
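The first post-processing step can be sketched roughly as follows (a simplified reconstruction based on the description above; the actual thesis scripts are not reproduced here):

```javascript
// Sketch of the first post-processing step: given all latency samples (µs)
// collected from one numa node, compute the summary statistics stored in
// the intermediate JSON result file.
function latencyStatistics(latencies) {
  const n = latencies.length;
  const sorted = [...latencies].sort((a, b) => a - b);
  const mean = sorted.reduce((sum, v) => sum + v, 0) / n;
  // Population standard deviation: square root of the average of the
  // squared differences of the values from their mean.
  const variance = sorted.reduce((sum, v) => sum + (v - mean) ** 2, 0) / n;
  const median = n % 2 === 1
    ? sorted[(n - 1) / 2]
    : (sorted[n / 2 - 1] + sorted[n / 2]) / 2;
  return {
    latency_count: n,
    mean_latency: mean,
    median_latency: median,
    standard_deviation: Math.sqrt(variance),
    min_latency: sorted[0],
    max_latency: sorted[n - 1],
  };
}
```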
The second JavaScript program further processed the result files produced by the first one. This second post-processing operation was rather simple: the single numa node results (received from the first post-processing operation) were rounded if needed and stored to an array containing the results from all eight numa nodes. These results were stored to a JSON-format result file, which was the final result file used for the result analysis. Below is an example of such a result file:
{
"get_failures": 0,
"get_lates": 0,
"get_mean_latencies": [57, 59, 58, 60, 58, 57, 58, 58],
"get_median_latencies": [56, 57, 57, 57, 57, 57, 58, 57],
"get_min_latencies": [13, 12, 11, 17, 10, 10, 22, 14],
"get_max_latencies": [3416, 3040, 2188, 657, 2153, 1849, 479, 2104],
"get_standard_deviations": [46, 39, 30, 23, 31, 34, 26, 29],
"set_failures": 0,
"set_lates": 0,
"set_mean_latencies": [72, 76, 74, 76, 73, 72, 75, 72],
"set_median_latencies": [69, 70, 70, 71, 71, 70, 71, 70],
"set_min_latencies": [18, 19, 14, 24, 19, 14, 25, 24],
"set_max_latencies": [3735, 1709, 958, 2535, 2821, 740, 1758, 1897],
"set_standard_deviations": [59, 42, 30, 38, 44, 32, 37, 31]
}
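The second step described above amounts to collecting each per-node statistic into one array per property and rounding the values. A minimal sketch (assumed data shapes, not the actual script):

```javascript
// Sketch of the second post-processing step: merge per-node statistics
// (output of the first step) into per-property arrays across numa nodes.
function mergeNodeResults(nodeResults) {
  return {
    get_failures: nodeResults.reduce((s, r) => s + r.getStatistics.failed, 0),
    get_mean_latencies: nodeResults.map(r => Math.round(r.getStatistics.mean_latency)),
    set_failures: nodeResults.reduce((s, r) => s + r.setStatistics.failed, 0),
    set_mean_latencies: nodeResults.map(r => Math.round(r.setStatistics.mean_latency)),
  };
}

// Two hypothetical node results, shaped like the intermediate file above.
const merged = mergeNodeResults([
  { getStatistics: { failed: 0, mean_latency: 56.7932 },
    setStatistics: { failed: 0, mean_latency: 72.2928 } },
  { getStatistics: { failed: 0, mean_latency: 58.4 },
    setStatistics: { failed: 0, mean_latency: 75.6 } },
]);
```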
One such result file was produced for each evaluated SDL backend data storage solution in each evaluation use case. Next, the results from all evaluation use cases are discussed.
Evaluation Criteria
Based on discussions with SDL API experts, the following target values were set:

Performance: The mean SDL API operation access latency for intra-VNF operations (operations between VMs) should be less than 100 microseconds in the use case simulations.

Reliability: All operations in all simulations (both the use case and the edge case simulations) should be successful.
6.2 Results from SDL API Use Case Simulations
This section discusses the results from the SDL API use case simulations, which were described in detail in Section 5.2. A short summary of the use case SDL operation amounts is also shown before each result. The analysed results are obtained from the post-processed result files which were discussed in the previous section. Not all obtained result data is shown for every use case; only data from which some notable findings were made is shown. The result data is shown in figures, each figure containing one result property from the JSON result shown in the previous section (e.g. get-operation mean latency). As in the JSON result file, the given result property value is shown separately for each numa node (x-axis). Furthermore, each figure contains the result value for each evaluated data storage solution (y-axis): Redis deployed to SDF numa node 0, Redis deployed to DL37, sharedhashfile deployed to SDF numa node 0 and WhiteDB deployed to SDF numa node 0. All values are in microseconds, if not otherwise mentioned.
6.2.1 Use Case 1 Results
In this simulation there were 40 SDL simulator applications running in each SDF numa node. They performed SDL operations at a 50ms interval (20 op/s) = 40 * 20 op/s = 800 SDL op/s per numa node (2kB operations, an equal number of get (read) and set (write) operations). In total there were 8 numa nodes * 800 op/s = 6400 SDL op/s in the SDF system, and they were targeted to 40 different SDL namespaces. Total simulation time was 50s. The figure below shows the get-operation mean latencies:
Figure 21. SDL API Use Case 1. get-operation mean latencies (μs)
It can be immediately seen that both shared memory databases have a notably lower latency than Redis, and WhiteDB has a lower access latency than sharedhashfile. There are no significant differences between the access latencies of different numa nodes. As the shared memory databases are deployed in numa node 0, access from numa nodes 0-3 should be faster (as described in Section 3.5). However, if we look at the idle numa node memory access latencies in Appendix 2, we can see that the idle access latencies are magnitudes lower (at nanosecond level) and the latency difference between different numa nodes is at most less than 300ns. Thus, it is rather clear that the numa node differences are not visible here, because these results are at the level of hundreds of microseconds. This leads to the question of why these results have so much higher latencies than what is theoretically possible in SDF. The high amount of concurrent access, and the lock contention caused by it, is most likely the biggest reason. Also, the share of write operations is rather high (50%) in this simulation (it is high in all simulated use cases). Median latencies are not discussed, as they did not provide any significant new information compared to the mean latencies.
Redis latencies are a bit lower when Redis is deployed in SDF, but the difference is marginal. A typical Redis operation latency over a TCP/IP based connection is 200 microseconds plus the additional latency caused by the used system [58]. The Redis results here are in line with this information.
The figure below shows the set-operation mean latencies:
Figure 22. SDL API Use Case 1. set-operation mean latencies (μs)
In the Redis based solution, the set-operation latencies do not differ much from the get-operation latencies. This was expected based on measurements done with the existing, Redis based, SDL implementation, and also based on the Redis documentation [59]. The WhiteDB latency is a bit higher than in get-operations. The sharedhashfile latency is significantly higher and even exceeds the Redis latency. This is likely due to the high share of write operations in the simulation. According to the sharedhashfile documentation, sharedhashfile suffers from such an access profile; there are also some suggested optimizations for sharedhashfile write operations, but there was not enough time to try them during this evaluation [50].
The figure below shows the minimum get-operation latencies, that is, the single lowest latency value observed during the whole simulation run:
Figure 23. SDL API Use Case 1. get-operation minimum latency (μs)
The shared memory databases are at the same level, and they have a magnitude lower minimum latency than Redis. However, even these lowest latencies are not at the nanosecond level like the idle latencies. The suspected reason for this is that the database lock handling creates this amount of latency even in situations where the lock can be acquired immediately; the shared memory database based SDL API implementations do not contain any other potentially long-lasting operations than lock handling (see Section 5.1).
The figure below shows the other limit, the maximum latency values for get-operations:
Figure 24. SDL API Use Case 1. get-operation maximum latency (μs)
The Redis results draw the main attention. In this result, the Redis deployment makes a significant difference to the latency values for the first time. The conclusion from this is that the Redis latency mostly comes from the TCP/IP network: for most operations this does not cause a notable effect, but for some operations the network latency does make a difference.

For the shared memory databases, the highest latency values are also clearly lower than for Redis, though the relative difference is smaller than with the minimum values. The high operation frequency can create notable lock contention, which is the likely reason for these high maximum latency values. Of the shared memory databases, WhiteDB has a bit lower maximum latency values than sharedhashfile.
The next figure shows the standard deviation values for get-operations. The standard deviation values are calculated using the well-known formula, that is, by taking the square root of the average of the squared differences of the values from their mean.
Figure 25. SDL API Use Case 1. get-operation standard deviation (μs)
The standard deviation provides information about the amount of jitter to be expected. The shared memory databases have lower standard deviation values than Redis, and the WhiteDB values are a bit lower than the sharedhashfile ones. The results correspond with the other result properties already discussed.
Next, we will look at the set-operation minimum latencies:
Figure 26. SDL API Use Case 1. set-operation minimum latency (μs)
In addition to the already familiar pattern of the shared memory databases having notably lower latencies than Redis, and WhiteDB having a somewhat lower latency than sharedhashfile, there is one notable aspect in this result. In the set mean latencies, sharedhashfile had higher values than Redis; here the sharedhashfile values are notably lower than the Redis ones. This suggests that, as described in the sharedhashfile documentation, it is likely possible to optimize the sharedhashfile mean latencies as well.

It is also worth noticing that the difference between Redis and the shared memory databases is relatively smaller than in the get-operation minimum latencies. This was expected, as the mean latencies showed the same effect.
The set-operation maximum values are shown in the next figure:
Figure 27. SDL API Use Case 1. set-operation maximum latency (μs)
There is not much new information in this result. As was seen in the get-operation maximum values, the Redis deployment has a notable effect on this result. And as in the set-operation minimum values, sharedhashfile has lower values than Redis, although the sharedhashfile set-operation mean latencies were higher.
Finally, the set-operation standard deviation values are investigated:
Figure 28. SDL API Use Case 1. set-operation standard deviation (μs)
This result is similar to the get-operation standard deviation result. As with the set-operation minimum and maximum values, the sharedhashfile standard deviation is lower than the Redis one, although the sharedhashfile set-operation mean latencies were higher.
6.2.2 Use Case 2 Results
In this simulation there were at first 40 SDL simulator applications running in seven SDF numa nodes (nodes 0-6). They performed only set (write) operations (2kB data size) at a 50ms interval (20 op/s). In total there were 40 * 20 op/s = 800 op/s * 7 numa nodes = 5600 SDL op/s in the SDF system, and they were targeted to 40 different SDL namespaces (as in use case 1). After the initial 10 seconds, 40 new SDL simulator applications were started in the idle numa node 7. These new SDL simulator applications performed only get (read) operations (2kB data size) at a 5ms interval (200 op/s). In total there were 40 * 200 op/s * 1 numa node = 8000 op/s in numa node 7. Total simulation time was 25s and the data reading phase in numa node 7 lasted for 1s.

The figure below shows the set-operation mean latencies in numa nodes 0-6 and the get-operation mean latencies in numa node 7:
Figure 29. SDL API Use Case 2. set-operation mean latencies (μs) in numa nodes 0-6, get-operation mean latencies (μs) in numa node 7
The Redis results were again almost identical regardless of the Redis deployment; thus only one set of Redis results (Redis deployed to SDF numa node 0) is displayed here.
The set-operation results in numa nodes 0-6 are very similar to the use case 1 set-operation mean latencies (Figure 22), although the number of set-operations was much higher in this simulation. In use case 1 there were 6400 op/s (half set and half get) during the whole simulation; in use case 2 there were 5600 set op/s during the whole simulation (numa nodes 0-6), and in addition there were 8000 get op/s in numa node 7 for a one-second period in the middle of the simulation. It is not surprising that the set-operation mean latencies are similar even though the set-operation frequency was increased from 3200 set op/s to 5600 set op/s. After all, for most of the time, the total (get+set) operation frequency was lower (5600 op/s vs 6400 op/s). As seen from the use case 1 results, and from the Redis documentation, Redis performance is rather identical for both read and write operations. For the shared memory databases, it was seen from the use case 1 results that write operation performance is lower than read operation performance. That was analysed to be mostly due to the shared memory database lock handling: both read and write operations block other write operations, while read operations block only write operations (described in Section 4.3). Because both read and write operations block other write operations, it is not surprising that write operation performance is rather similar in use case 1 and use case 2 (the total operation frequency was at the same level).
The get-operation results, on the other hand, are rather surprising. They are also at the same level as in use case 1. However, during the 1 second when get-operations were performed in numa node 7, all operation frequencies were higher than in use case 1 (get: 8000 op/s vs 3200 op/s, set: 5600 op/s vs 3200 op/s, total: 13600 op/s vs 6400 op/s). For Redis this is quite expected: 13600 op/s is not even close to the operation frequency which would cause a Redis performance decrease [59]. But for the shared memory databases, the lock contention effect noticed in use case 1 would have been expected to reduce the get-operation performance even more than in the use case 1 simulation. Instead, the get-operation mean latencies are at the same level, if not a bit better, than in use case 1. One theory for this result is that the get-operations were performed during a rather short time period (1 second of the total 25s simulation), and therefore they did not yet start to create major lock contention. The short period for the get-operations was due to the nature of the simulated real-world use case: stable state restoration can be performed in a rather short period of time. Another possible factor could be that all get-operations were performed from the same numa node (in the use case 1 simulation all numa nodes performed get-operations). But this should not cause a major effect, because the get-operations performed in numa node 7 accessed the same SDL namespaces (and therefore the same databases) as the set-operations in the other numa nodes. Thus, the reason for this somewhat surprising result was not thoroughly resolved during this thesis work, and this is something which could be further investigated in the future.
The other result properties were likewise at the same level as the use case 1 results; thus they are not discussed further here, but the conclusions for the mean latencies above apply to them as well.
Improvement Possibilities for the Use Case 2 Simulation
While analysing the use case 2 results, it was noticed that there were improvement possibilities in the use case 2 simulation. The simulation did not model the real use case correctly in the sense that the failing VM's processing before the switchover was not simulated; instead, the failing VM (simulated in numa node 7) was kept idle until the switchover. The simulation would have been more realistic if the failing VM had done normal processing until the switchover. The missing processing was, however, only a small fraction of the total operations in the simulation, and thus it is estimated that this did not cause major differences to the results.
6.2.3 Use Case 3 Results
In this simulation numa nodes 0-6 performed an identical simulation as in use case 1 (simulating stateless applications). That is, in total there were 800 op/s (2kB sets and gets) in 7 numa nodes = 7 * 800 op/s = 5600 SDL op/s, and they were targeted to 40 different SDL namespaces. After the initial 10 seconds, 40 new SDL simulator applications were started in the idle numa node 7 (which simulated the analytics VM). These new SDL simulator applications performed only get (read) operations (1kB data size) at a 200ms interval (5 op/s). In total there were 40 * 5 op/s * 1 numa node = 200 op/s in numa node 7. Total simulation time was 200s and the data reading in numa node 7 was performed during the last 190s.

As with use case 2, the Redis results were again almost identical regardless of the Redis deployment; thus only one set of Redis results (Redis deployed to SDF numa node 0) is displayed here.
The figure below shows the get-operation mean latencies:
Figure 30. SDL API Use Case 3. get-operation mean latencies (μs)
The results with all data storage solutions are at the same level as in use case 1. The only minor exception is the sharedhashfile result in numa node 7 (the simulated analytics VM). The result was quite as expected based on the other results. The total operation frequency is a bit smaller than in the use case 1 simulation (the analytics VM performs operations at a lower frequency than the processing VMs). The sharedhashfile performance increase in numa node 7 could be because the numa node 7 operation frequency is much lower; in the other results it has been seen that sharedhashfile performance (especially write performance) decreases when the operation frequency is very high.
The figure below shows the set-operation mean latencies:
Figure 31. SDL API Use Case 3. set-operation mean latencies (μs)
As the analytics VM (numa node 7) performed only read (get) operations, there are no set-operation results from numa node 7.

The set-operation results from the processing VMs (numa nodes 0-6) are very similar to the use case 1 set-operation results. As discussed in the get-operation mean latency analysis above, this is expected, as the total operation frequency is a bit lower than in the use case 1 simulation.
The results from this use case simulation showed that, with all evaluated data storage technologies, the analytics VM operation did not cause extra latency to the normal data processing.

As with use case 2, the other result properties were likewise at the same level as the use case 1 results; thus they are not discussed further here, but the conclusions for the mean latencies above apply to them as well.
6.3 Results from SDL API Edge Case Simulations
This section discusses the results from the SDL API edge case simulations, which were described in detail in Section 5.3. The main purpose of these simulations was to ensure the stability of the evaluated data storage technologies. All evaluated data storage solutions performed all use case simulations without any errors; thus it was beneficial to verify stability with such extreme simulations as well.
Edge Case Simulation 1. Extremely High Operation Frequency
This simulation was like use case simulation 1, but the operation interval was decreased from 50ms to 5ms, which made the operation frequency ten times higher (64000 op/s). Both Redis and WhiteDB were able to process this simulation without any errors, but the sharedhashfile simulation produced some errors (144 get-operation failures and 141 set-operation failures). Thus, the sharedhashfile results are not analysed further here. The Redis results were again almost identical regardless of the Redis deployment; thus only one set of Redis results (Redis deployed to SDF numa node 0) is displayed here.
The figure below shows the get-operation mean latencies:
Figure 32. SDL API Edge Case simulation 1. get-operation mean latencies (μs)
The Redis result values are a bit higher than in use case simulation 1, increased by about 10-20%. The WhiteDB result values have also increased, and the relative increase is notably higher than for Redis (about 50-60%). The absolute values are, however, still notably lower than the Redis values.
The figure below shows the set-operation mean latencies:
Figure 33. SDL API Edge Case simulation 1. set-operation mean latencies (μs)
Redis result values are higher than in use case simulation 1, having increased by about
50%. WhiteDB result values have also increased, and the relative increase is much higher
than for Redis (about 300%). The absolute WhiteDB values are, however, still notably
lower than the Redis values.
According to previous performance measurements using Redis as the SDL data storage
backend, and according to the Redis documentation [59], Redis performance starts to
decrease notably at about 100000 op/s. Thus, it is not surprising that Redis performance
did not change notably from use case simulation 1. For WhiteDB there were no previous
results from a corresponding evaluation. As already discussed with the other results,
WhiteDB latency is expected to be caused mostly by lock contention.
Edge Case Simulation 2. Long Simulation Duration
This simulation was similar to use case simulation 1, but the simulation duration was
increased from 50 s to 2000 s (33.3 min) to verify that the evaluated data storage solutions
can also perform operations reliably over longer periods of time.
All evaluated data storage solutions processed this simulation without errors, and the
results were at the same level as the use case 1 simulation results.
6.4 Summary of the Results
The executed simulations evaluated the solution (Superdome Flex) from two different
aspects: performance (use case simulations) and reliability (edge case simulations). Three
different SDL API implementations were evaluated: the TCP/IP-based Redis
implementation (the existing SDL API implementation) and two shared memory-based
SDL API implementations (WhiteDB and sharedhashfile), which were implemented
during this thesis work.
At a high level, the results showed that in the Superdome Flex evaluation environment the
shared memory-based SDL API implementations have notably better performance than
the Redis-based SDL API implementation. WhiteDB had better performance than
sharedhashfile, especially in write (set) operations. However, as discussed in Section
6.2.1, there are possibilities to optimize sharedhashfile write performance, and those
optimizations were not evaluated during this thesis work. There might be possibilities to
improve Redis performance as well. In this thesis work, only a standalone Redis
deployment was evaluated. A Redis cluster deployment can increase Redis performance
because operations are divided among several Redis servers [48]. On the other hand,
during this evaluation Redis performance did not decrease notably due to operation
frequency, not even in the edge case simulations. As discussed in the edge case
simulation results (Section 6.3), this was expected, as a single Redis server can handle a
higher operation frequency than the ones generated in these simulations. Thus, it is not
likely that a Redis cluster deployment would have notably increased Redis performance
during this evaluation. Redis performance could also be increased by optimizing TCP/IP
network performance. For example, the Nokia NCIR IaaS solution has TCP/IP network
performance optimizations which were not available in the SOC8 IaaS used during this
evaluation. However, the performance difference between Redis and the shared
memory-based solutions was so large that it is not realistic that such TCP/IP network
optimizations would have made a significant difference to the conclusions.
The shared memory-based SDL API implementations (especially WhiteDB) had notably
better performance than the Redis-based SDL API implementation. At best, the operation
latency was ten times lower. However, the difference started to decrease as the operation
frequency increased. Even so, in all executed simulations the performance was notably
better when using the shared memory-based SDL API implementations. The shared
memory-based solutions have potential for even better performance, as the idle memory
latencies between different Superdome Flex NUMA nodes are at the nanosecond level
(Appendix 2). Tero Lindfors also obtained nanosecond-level latency results in
measurements where he evaluated WhiteDB on Superdome Flex in a lock-free manner [6].
The reliability evaluation performed via the edge case simulations was passed successfully
with Redis and WhiteDB. A small number of SDL API operations failed when the edge
case simulation was run using the sharedhashfile-based solution. These failures were not
investigated further during this thesis study, so there are no estimates of how severe these
errors were.
7 Summary and Conclusions
This chapter starts with a review of the thesis objective. After that the evaluation process
and outcome are discussed. Last part of this chapter suggests further study possibilities.
The objective of this thesis was to evaluate the Memory-Driven Computing architecture as
a low latency shared database between different computing nodes (VMs deployed to
different computing nodes) in the Nokia Cloud Base Station product. Nokia CBTS has an
abstract interface (SDL API), which provides key-value paradigm-based access to an
inter-VM shared database. Therefore, it was convenient to perform the evaluation by
utilizing the SDL API. That is, the SDL API interface was implemented in an
MDC-architecture based environment (the Superdome Flex product) and the
implementation was evaluated using the Case Study method (the cases were based on
real-world SDL API use cases). This thesis work provided two main artefacts:
• Two different shared memory-based implementations (WhiteDB and sharedhashfile) of the SDL API
• An evaluation of the implemented data storage solutions in an MDC-architecture based environment (Superdome Flex)
The subject of this thesis was broad, and it dealt with new technology. Only a prototype of
an MDC-architecture based computer was available when this thesis work started. Due to
this, a limited amount of existing information was available, and thus a considerable
amount of time was spent doing experiments and gathering information. There were some
uncertainties regarding the evaluation environment. In the end, a complete
MDC-architecture based computer was not available during this thesis work, but the
Superdome Flex product was released during this thesis work and provided a very suitable
evaluation environment. Superdome Flex contained the high-level features of the MDC
architecture, and thus a very beneficial evaluation could be conducted in the Superdome
Flex environment. The evaluation provided concrete measurements of SDL API
performance and reliability in the Superdome Flex environment, and thus the thesis
objective was met. Section 4.4.1 discusses the differences between the complete MDC
architecture and Superdome Flex in detail and also provides information on what to
consider if a similar evaluation were to be conducted on a complete MDC architecture in
the future.
A summary of the evaluation results can be found in Section 6.4. The outcome of the
evaluation was that the MDC architecture (in the form of the Superdome Flex product)
can reduce the SDL API access latency to the targets discussed in Section 6.1. When the
WhiteDB-based SDL API implementation was used in Superdome Flex, the access latency
was below 100 microseconds in all use case simulations, whereas the current Redis-based
SDL API implementation had an access latency of over 300 microseconds in the same
simulations. From the reliability point of view, WhiteDB met the target, which was the
successful execution of all SDL API operations in all simulations (as did Redis). Based on
the evaluation results and other measurements, most of the shared memory database
latency is caused by the locking required by strict concurrency control, and
nanosecond-level access latency might be possible without locking. Therefore, if the MDC
architecture were used only in such SDL API use cases where locking is not needed (e.g. a
use case where it is known that the same data is never written and read concurrently), the
MDC architecture could provide an ultra-low latency solution. As such, within the scope
of this thesis, the MDC architecture is a potential alternative for creating a low latency
shared data storage between several edge cloud data centre VMs. However, given the
broadness of the subject, this thesis studied it from a limited scope and further study is
recommended. Some recommended possibilities for further study are listed below:
Suggestions for Further Study
Shared memory database access latency was suspected to be caused mostly by lock
waiting. It would be good to study this further and try to optimize the lock handling.
Other shared memory databases could still be evaluated. If the evaluation were performed
in an environment with NVM, more shared memory databases would be available.
The possibilities to access shared memory from VMs need further study, as in this thesis
shared memory was only accessed directly from the SDF host (as discussed in Section
4.4.1). It would be interesting to see to what extent that would affect performance.
The evaluation in this thesis concentrated mainly on performance. Consistency was not
verified; instead, it was assumed that lock-based concurrency control provided
consistency. Further evaluation of consistency and reliability would be useful. The
evaluation could also be performed using different use cases, or even a completely
different research approach, such as quantitative research.
References
[1] ETSI Industry Specification Group, “Network Functions Virtualisation (NFV);
Infrastructure Overview,” [Online]. Available: http://www.etsi.org/deliver/etsi_gs/
NFV-INF/001_099/001/01.01.01_60/gs_nfv-inf001v010101p.pdf.
[Accessed 12 Dec 2017].
[2] M. Weldon, The Future X Network a Bell Labs Perspective, CRC press, 2015.
[3] K. Keeton, Memory-Driven Computing, 15th USENIX Conference on File and
Storage Technologies, 2016.
[4] M. Kim, J. Li, H. Volos, M. Marwah, A. Ulanov, K. Keeton, J. Tucek, L.
Cherkasova, L. Xu and P. Fernando, “Sparkle: Optimizing Spark for Large
Memory Machines and Analytics,” in arXiv.org e-Print archive, Cornell University
Library, 2017.
[5] The Machine User Group, “Webinar: DZNE and HPE: Harnessing Memory-
Driven Computing to fight Alzheimer’s,” [Online]. Available: https://community.
hpe.com/t5/Behind-the-scenes-Labs/WATCH-the-webinar-DZNE-and-HPE-
Harnessing-Memory-Driven/ba-p/6980415. [Accessed 12 Dec 2017].
[6] T. Lindfors, “Data access performance impact of a memory-centric radio
network,” 10 October 2018. [Online]. Available:
https://aaltodoc.aalto.fi/handle/123456789/34406. [Accessed 8 November 2019].
[7] Qualcomm, “The Evolution of Mobile Technologies: 1G, 2G, 3G, 4G LTE,” 2014.
[Online]. Available: https://www.qualcomm.com/media/documents/files/the-
evolution-of-mobile-technologies-1g-to-2g-to-3g-to-4g-lte.pdf.
[Accessed 19 Feb 2018].
[8] Y. Zaki, Future mobile communications: LTE optimization and mobile network,
Springer-Verlag, 2012.
[9] A. Checko, H. Christiansen, Y. Yan, L. Scolari, G. Kardaras, M. Berger and
L. Dittmann, “Cloud RAN for Mobile Networks—A Technology Overview,”
IEEE Communications surveys & tutorials, vol. 17, no. 1, pp. 405-426, 2015.
[10] E. Ryytty, “IEEE802.1CM Terminology,” [Online]. Available:
http://www.ieee802.org/1/files/public/docs2015/cm-ryytty-
terminologyconsiderations-1115.pdf. [Accessed 26 Feb 2018].
[11] sharetechnote, “5G/NR - RAN Architecture,” [Online]. Available:
http://www.sharetechnote.com/html/5G/5G_RAN_Architecture.html.
[Accessed 19 Oct 2019].
[12] National Institute of Standards and Technology, “The NIST Definition of Cloud
Computing,” September 2011. [Online]. Available: http://faculty.winthrop.edu/
domanm/csci411/Handouts/NIST.pdf. [Accessed 13 June 2018].
[13] J. Ling, “‘As a Service’: What Is A Cloud Computing Stack?,” LegalVision,
24 February 2017. [Online]. Available: https://www.lexology.com/library/
detail.aspx?g=6aaba427-d980-4938-856a-4a1308444cc2. [Accessed 31 Aug 2018].
[14] Red Hat, “What's the difference between cloud and virtualization?,” Red Hat,
[Online]. Available: https://www.redhat.com/en/topics/cloud-computing/cloud-vs-
virtualization. [Accessed 13 June 2018].
[15] I. Hashem, I. Yaqoob, N. Anuar, S. Mokhtar, A. Gani and S. Khan, “The rise of
"big data" on cloud computing,” Information Systems, vol. 47, no. C, pp. 98-115,
2015.
[16] E. Brewer, “CAP twelve years later: How the “Rules” have changed,” Computer,
vol. 45, no. 2, pp. 23-29, 2012.
[17] “The telco cloud dilemma: How to succeed in the IaaS marketplace,” 30 August
2013. [Online]. Available: https://www.cloudcomputing-news.net/news/2013/aug/
30/the-telco-cloud-dilemma-how-to-succeed-in-iaas-marketplace/.
[Accessed 7 September 2018].
[18] Openstack development team, “openstack,” 2019. [Online]. Available:
https://www.openstack.org/. [Accessed 4 May 2019].
[19] Nokia, “Nokia AirFrame Cloud Infrastructure for Real-time applications (NCIR),”
2018. [Online]. Available: https://onestore.nokia.com/asset/205140.
[Accessed 4 May 2019].
[20] Nokia, “Creating new data freedom with the Shared Data Layer. White Paper,”
[Online]. Available: https://onestore.nokia.com/asset/200238/
Nokia_Shared_Data_Layer_White_Paper_EN.pdf. [Accessed 12 Dec 2017].
[21] Nokia, RCP Shared Data Layer (SDL), Espoo: Company confidential document,
2019.
[22] K. Hoffman, Beyond the Twelve-Factor App Exploring the DNA of Highly
Scalable, Resilient Cloud Applications, O'Reilly, 2016.
[23] A. Checko, A. Avramova, M. Berger and H. Christiansen, “Evaluating C-RAN
fronthaul functional splits in terms of network level energy and cost savings,”
2016. [Online]. Available: http://ieeexplore.ieee.org/document/7487973/.
[Accessed 5 Mar 2018].
[24] ETSI, “Mobile edge computing - A key technology towards 5G. White Paper,”
[Online]. Available: http://www.etsi.org/images/files/ETSIWhitePapers/
etsi_wp11_mec_a_key_technology_towards_5g.pdf. [Accessed 5 Mar 2018].
[25] Nokia, “Building a cloud-native core for a 5G world. White Paper,” [Online].
Available: https://onestore.nokia.com/asset/200888/Nokia_AirGile_Cloud-
native_Core_White_Paper_EN.pdf. [Accessed 5 Mar 2018].
[26] M. Davis, The Universal Computer: The Road from Leibniz to Turing, W. W.
Norton & Company, 2000.
[27] J. D. McCalpin, “SC16 Invited Talk: Memory Bandwidth and System Balance in
HPC Systems,” 22 November 2016. [Online]. Available:
http://sites.utexas.edu/jdm4372/2016/11/22/sc16-invited-talk-memory-bandwidth-
and-system-balance-in-hpc-systems/. [Accessed 1 June 2018].
[28] G. Moore, Cramming more components onto integrated circuits, Morgan
Kaufmann Publishers Inc., 2000.
[29] A. Clements, Principles of Computer Hardware, OUP Oxford, 2006.
[30] M. Perrone, “Multicore Programming Challenges,” in Euro-Par 2009 Parallel
Processing, 2009.
[31] T. Simon and J. McGalliard, “Multi-Core Processor Memory Contention
Benchmark Analysis Case Study,” 2009. [Online]. Available:
https://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/20090038666.pdf.
[Accessed 8 June 2018].
[32] D. Sorin, M. Hill and D. Wood, A primer on memory consistency and cache
coherence, Morgan & Claypool Publishers, 2011.
[33] T. Mattson, “The Future of Many Core Computing: A tale of two processors,”
January 2010. [Online]. Available: https://cseweb.ucsd.edu/classes/fa12/cse291-c
/talks/SCC-80-core-cern.pdf. [Accessed 2 July 2018].
[34] N. Manchanda and A. Karan, “Non-Uniform Memory Access (NUMA),”
5 May 2010. [Online]. Available: https://cs.nyu.edu/~lerner/spring10/projects/
NUMA.pdf. [Accessed 2 July 2018].
[35] J. Dean and S. Ghemawat, “MapReduce: Simplified Data Processing on Large
Clusters,” in Proceedings of the 6th conference on Symposium on Opearting
Systems Design & Implementation, 2004.
[36] C. Mellor, “A closer look at HPE's 'The Machine',” The Register, 24 November
2016. [Online]. Available: https://www.theregister.co.uk/2016/11/24
/hpes_machinations_to_rewrite_server_design_laws/. [Accessed 14 September
2018].
[37] Y. Shan, S.-Y. Tsai and Y. Zhang, “Distributed Shared Persistent Memory,” in
Proceedings of the 2017 Symposium on Cloud Computing, Santa Clara, 2017.
[38] M. Funk, “Programming for persistent memory takes persistence,” 21 April 2016.
[Online]. Available: https://www.nextplatform.com/2016/04/21/programming-
persistent-memory-takes-persistence/. [Accessed 14 September 2018].
[39] The Gen-Z Consortium, “Gen-Z Core Specification 1.0,” 13 February 2018.
[Online]. Available: https://genzconsortium.org/specification/core-specification-
1-0/. [Accessed 21 September 2018].
[40] HPE, “Linux kernel with changes for Fabric Attached Memory,” HPE, 8 Feb 2018.
[Online]. Available: https://github.com/FabricAttachedMemory/linux-l4fame.
[Accessed 16 November 2018].
[41] HPE, “The Librarian File System (LFS) Suite,” HPE, 12 April 2018. [Online].
Available: https://github.com/FabricAttachedMemory/tm-librarian. [Accessed 16
November 2018].
[42] A. Bhattacharjee and D. Lustig, Architectural and Operating System Support for
Virtual Memory, Morgan & Claypool Publishers, 2017.
[43] HPE, “Fabric-Attached Memory,” HPE, 2018. [Online]. Available:
https://github.com/FabricAttachedMemory/. [Accessed 16 November 2018].
[44] S. Singhal, “How HPE Superdome Flex’s in-memory computing gives you a head
start on Memory-Driven Computing,” HPE, 7 November 2017. [Online]. Available:
https://community.hpe.com/t5/Behind-the-scenes-Labs/How-HPE-Superdome-
Flex-s-in-memory-computing-gives-you-a-head/ba-p/6987752. [Accessed 26
October 2018].
[45] HPE, “Superdome Flex Architecture and RAS: x86 Server Solution technical white
paper,” November 2017. [Online]. Available:
https://h20195.www2.hpe.com/V2/getpdf.aspx/A00036491ENW.pdf?. [Accessed
26 October 2018].
[46] H. Harrison, M. Birks, R. Franklin and J. Mills, “Case Study Research:
Foundations and Methodological Orientations,” [Online]. Available:
http://www.qualitative-research.net/index.php/fqs/article/view/2655/4079.
[Accessed 5 Feb 2018].
[47] H. Kimura, “FOEDUS: OLTP Engine for a Thousand Cores and NVRAM,” in
Proceedings of the ACM SIGMOD International Conference on Management of
Data, Melbourne, 2015.
[48] Redis development team, “Redis,” 2019. [Online]. Available: https://redis.io/.
[Accessed 3 May 2019].
[49] WhiteDB team, “Whitedb,” 2013. [Online]. Available: http://whitedb.org/. [Accessed
1 February 2019].
[50] S. Hardy-Francis, “sharedhashfile,” January 2019. [Online]. Available:
https://github.com/simonhf/sharedhashfile. [Accessed 3 February 2019].
[51] SUSE Linux, “Release Notes | SUSE OpenStack Cloud 8,” 22 March 2019.
[Online]. Available: https://www.suse.com/releasenotes/x86_64/SUSE-
OPENSTACK-CLOUD/8/. [Accessed 4 May 2019].
[52] Intel, “Intel Memory Latency Checker v3.6,” Intel, 3 December 2018. [Online].
Available: https://software.intel.com/en-us/articles/intelr-memory-latency-checker.
[Accessed 17 May 2019].
[53] Boost development team, “Boost C++ libraries,” 12 April 2019. [Online]. Available:
https://www.boost.org/. [Accessed 17 May 2019].
[54] CMake development team, “CMake,” [Online]. Available: https://cmake.org/.
[Accessed 17 May 2019].
[55] hiredis development team, “redis/hiredis Minimalistic C client for Redis,” 12 April
2019. [Online]. Available: https://github.com/redis/hiredis. [Accessed 17 May
2019].
[56] pip development team, “pip,” 6 May 2019. [Online]. Available:
https://pypi.org/project/pip/. [Accessed 17 May 2019].
[57] Virtualenv development team, “virtualenv,” 15 May 2019. [Online]. Available:
https://virtualenv.pypa.io/en/latest/. [Accessed 17 May 2019].
[58] Redis development team, “Redis latency problems troubleshooting,” 2019.
[Online]. Available: https://redis.io/topics/latency. [Accessed 26 Oct 2019].
[59] Redis development team, “How fast is Redis?,” 2019. [Online]. Available:
https://redis.io/topics/benchmarks. [Accessed 26 Oct 2019].
Appendix 1
1 (2)
SDL API client simulator program input and output
Input: command line arguments:
Program options:
-h [ --help ] produce help message
-i [ --interval ] arg (=1000) milliseconds between each operation
-r [ --reads ] arg (=1) max number of read operations to
perform
-w [ --writes ] arg (=1) max number of write operations to
perform
--seqreads arg (=1) do n read operations in a sequence
--seqwrites arg (=1) do n write operations in a sequence
-l [ --writelength ] arg (=15) length of the strings to be written
-t [ --timeout ] arg (=60) time after which the program stops in
seconds
-o [ --output ] arg (=statistics.json)
path to save the results in
-n [ --namespace ] arg (=tag1) SDL namespace name
Output: JSON format statistics file, example:
{
"general": {
"totalTestDuration_s": "0",
"operationInterval_ms": "5"
},
"ops": {
"get": {
"stats": {
"scheduled": "1",
"completed": "1",
"failed": "0",
"late": "0"
},
"data": {
"delays": [
"310"
],
"op_counts": [
"1"
]
}
},
"set": {
"stats": {
"scheduled": "1",
"completed": "1",
"failed": "0",
"late": "0"
},
"data": {
"delays": [
"217"
],
"op_counts": [
"1"
]
}
}
}
}
Appendix 2
1 (1)
Intel Memory Latency Checker output from Superdome Flex
Appendix 3
1 (1)
Asynchronous SDL C++ API storage interfaces
using ReadyAck = std::function<void(const std::error_code& error)>;
virtual void waitReadyAsync(const ReadyAck& readyAck) = 0;

using Key = std::string;
using Data = std::vector<uint8_t>;
using DataMap = std::map<Key, Data>;

using ModifyAck = std::function<void(const std::error_code& error)>;
virtual void setAsync(const DataMap& dataMap, const ModifyAck& modifyAck) = 0;

using ModifyIfAck = std::function<void(const std::error_code& error, bool status)>;
virtual void setIfAsync(const Key& key,
                        const Data& oldData,
                        const Data& newData,
                        const ModifyIfAck& modifyIfAck) = 0;
virtual void setIfNotExistsAsync(const Key& key,
                                 const Data& data,
                                 const ModifyIfAck& modifyIfAck) = 0;

using Keys = std::set<Key>;
using GetAck = std::function<void(const std::error_code& error, const DataMap& dataMap)>;
virtual void getAsync(const Keys& keys, const GetAck& getAck) = 0;

virtual void removeAsync(const Keys& keys, const ModifyAck& modifyAck) = 0;
virtual void removeIfAsync(const Key& key, const Data& data, const ModifyIfAck& modifyIfAck) = 0;

using GetAllAck = std::function<void(const std::error_code& error, const Keys& keys)>;
virtual void getAllAsync(const GetAllAck& getAllAck) = 0;
virtual void removeAllAsync(const ModifyAck& modifyAck) = 0;