Secure and Dependable Multi-Cloud Network Virtualization · 2020-03-20



2019

UNIVERSIDADE DE LISBOA

FACULDADE DE CIÊNCIAS

Secure and Dependable Multi-Cloud Network Virtualization

“Documento Definitivo” (Final Version)

Doctorate in Informatics

Specialty in Computer Science

Max Silva Alaluna

Thesis supervised by:

Prof. Doutor Fernando Manuel Valente Ramos

Document specially prepared for obtaining the doctoral degree

2019

UNIVERSIDADE DE LISBOA

FACULDADE DE CIÊNCIAS

Secure and Dependable Multi-Cloud Network Virtualization

Doctorate in Informatics

Specialty in Computer Science

Max Silva Alaluna

Thesis supervised by:

Prof. Doutor Fernando Manuel Valente Ramos

Jury:

President:

● Doutor Nuno Fuentecilla Maia Ferreira Neves, Full Professor and member of the Scientific Council of the Universidade de Lisboa

Members:

● Doutor Edmundo Heitor Silva Monteiro, Full Professor, Faculdade de Ciências e Tecnologia da Universidade de Coimbra

● Doutor Rui Luís Andrade Aguiar, Full Professor, Departamento de Eletrónica, Telecomunicações e Informática da Universidade de Aveiro

● Doutor João Tiago Medeiros Paulo, Assistant Researcher, Instituto de Engenharia de Sistemas e Computadores, Tecnologia e Ciência (INESC TEC)

● Doutora Sara Alexandra Cordeiro Madeira, Associate Professor, Faculdade de Ciências da Universidade de Lisboa

● Doutor Fernando Manuel Valente Ramos, Assistant Professor, Faculdade de Ciências da Universidade de Lisboa (supervisor)

Document specially prepared for obtaining the doctoral degree

Funded by CNPq and the Brazilian Army

To my family.

Abstract

Virtualization is a consolidated technology in modern computers, enabling distinct virtual machines to share the same hardware resources. This technology underpinned cloud computing, enabling infrastructure providers to extend their services with elastic computing and storage services. Today, the number of virtual servers already surpasses the number of physical servers, in a clear demonstration of the success of this technology. Unfortunately, networking has lagged behind. Traditional network primitives (e.g., VLANs) do not offer the scalability and flexibility necessary for the “as-a-service” model of cloud computing. As a result, existing cloud services do not offer network guarantees, hindering their adoption by a large class of applications.

This situation has started changing with Software-Defined Networking (SDN), a new paradigm that proposes the logical centralization of network control. Advanced network virtualization platforms use SDN to give cloud users the freedom to specify their virtual network topologies and addressing schemes, for the first time enabling complete network virtualization. These solutions were a huge step forward, but they still have limitations. First, they target a single datacenter of a cloud provider. This limits their scalability and is effectively a single point of failure for the tenant’s virtual networks. Second, the virtual network services offered are restricted to traditional services, such as L2 switching, L3 routing, or Access Control List (ACL) filtering. This makes them insufficient to support (critical) applications that need to be deployed across multiple trust domains for resiliency while enforcing diverse security requirements. In addition, most solutions that are efficient in mapping the tenant’s virtual network requests to the substrate typically do not scale to large networks. Finally, they also fail to provide the elasticity required in cloud computing, not allowing virtual networks to scale out or scale in.

In this thesis, we address these limitations by proposing Sirius: the first multi-cloud network virtualization platform. Sirius allows virtual networks to seamlessly span a substrate composed of multiple cloud infrastructures, including public clouds and private data centers. By replicating elements across different clouds, tenants avoid any single point of failure, thus addressing the first challenge. Besides enhancing the substrate, Sirius also enhances the virtual networks with security and dependability. For this purpose, in this thesis we propose novel network embedding algorithms that find efficient mappings of virtual network requests onto the substrate network while considering the security and availability of virtual resources. Specifically, we propose an optimal solution based on Mixed-Integer Linear Programming (MILP), as well as heuristics that scale to very large networks while achieving results close to optimal. These solutions enable us to address the second and third challenges. Finally, to address the last challenge, we propose new algorithms that allow virtual networks to scale out and scale in, bringing elasticity to tenants’ environments.
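To make the embedding idea concrete, the following is a deliberately simplified sketch of a greedy virtual network embedding (VNE) heuristic, not Sirius code: the function name, the data structures, and the placement policy (largest residual CPU first, BFS shortest paths for virtual links) are illustrative assumptions, and the security and availability constraints central to the thesis are omitted for brevity.

```python
# Illustrative greedy VNE sketch (assumed names/structures, not Sirius code).
from collections import deque

def embed(virtual_nodes, virtual_links, sub_cpu, sub_adj):
    """virtual_nodes: {vnode: cpu demand}; virtual_links: [(v1, v2), ...];
    sub_cpu: {snode: residual cpu}; sub_adj: {snode: set of neighbours}."""
    cpu = dict(sub_cpu)      # work on a copy of the residual capacities
    node_map = {}
    # Node mapping: place the most demanding virtual node on the substrate
    # node with the largest residual capacity (distinct node per virtual node).
    for vnode, demand in sorted(virtual_nodes.items(), key=lambda kv: -kv[1]):
        candidates = [s for s in cpu
                      if cpu[s] >= demand and s not in node_map.values()]
        if not candidates:
            return None      # request rejected: no feasible substrate node
        best = max(candidates, key=lambda s: cpu[s])
        node_map[vnode] = best
        cpu[best] -= demand
    # Link mapping: BFS shortest path between the chosen substrate nodes.
    link_map = {}
    for v1, v2 in virtual_links:
        src, dst = node_map[v1], node_map[v2]
        parent, frontier = {src: None}, deque([src])
        while frontier and dst not in parent:
            u = frontier.popleft()
            for w in sub_adj[u]:
                if w not in parent:
                    parent[w] = u
                    frontier.append(w)
        if dst not in parent:
            return None      # no substrate path between the mapped nodes
        path = []
        while dst is not None:
            path.append(dst)
            dst = parent[dst]
        link_map[(v1, v2)] = path[::-1]
    return node_map, link_map
```

An optimal MILP solves the same placement problem exactly but at much higher cost, which is why heuristics of this flavour are used to scale to large networks.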

We implemented a prototype of Sirius and evaluated all solutions using both large-scale simulations and a real testbed environment running our prototype. The latter consists of a substrate composed of a private datacenter and two public clouds (Amazon and Google). Our evaluations demonstrate that the system scales well for networks of thousands of switches employing diverse topologies, and improves on the virtual network acceptance ratio and provider profit when compared to the state of the art. In particular, the acceptance ratios are less than 1% from the optimal, and the system can provision a 10-thousand-container virtual network in approximately 2 minutes.

Overall, the evaluations demonstrate the feasibility of our proposal in achieving good trade-offs between security and performance, and are therefore a step forward in the enrichment of cloud computing services.

Keywords: Network Virtualization, Cloud Computing, Multi-cloud, Virtual Network Embedding

Resumo

Virtualization is a consolidated technology in modern computing, allowing distinct virtual machines to share the same hardware resources. It is the base technology of cloud computing, which enabled infrastructure providers to extend their services with elastic computing and storage services. Today, the number of virtual servers already surpasses the number of physical servers, in a clear demonstration of the success of this technology. Unfortunately, progress on the networking side has been very different. Traditional network primitives (for instance, VLANs) do not offer the scalability and flexibility required by the service model of cloud computing. As a result, cloud services offer limited guarantees with respect to the network, hindering their adoption by a wide variety of applications.

This situation has recently started to change with the advent of Software-Defined Networking (SDN), a new paradigm that proposes the logical centralization of network control. Advanced network virtualization platforms use SDN to offer cloud users (tenants) the possibility of specifying arbitrary virtual network topologies and addressing schemes, for the first time enabling complete network virtualization. These solutions were a great advance, but they have limitations. First, they target a single service provider. This limits their scalability and effectively results in a single point of failure for the users of the virtual network. Second, the virtual network services offered are restricted to traditional services, such as L2 switching, L3 routing, or filtering through Access Control Lists (ACLs). This approach is thus insufficient to support (critical) applications that need to be deployed across several trust domains while imposing diverse security requirements. Moreover, most solutions that are efficient in mapping the users’ virtual networks onto the substrate network typically do not scale to large topologies. Finally, they also do not offer the elasticity required in cloud computing, not allowing virtual networks to scale out or scale in.

In this thesis, we address these limitations by proposing Sirius: the first network virtualization platform for multi-cloud environments. Sirius allows virtual networks to span a substrate composed of several cloud infrastructures, including public clouds and private data centers. By replicating elements across different clouds, users avoid the existence of single points of failure, thus solving the first challenge. Besides enriching the substrate, Sirius also enriches the virtual networks with security and dependability requirements. To this end, in this thesis we propose new algorithms for mapping virtual networks onto the substrate network that consider security and availability requirements. Specifically, we propose an optimal solution based on Mixed-Integer Linear Programming (MILP) models, as well as heuristics capable of scaling to very large networks while achieving results close to the optimum. Finally, we propose new algorithms that allow virtual networks to scale out and scale in, offering the elasticity expected in this kind of environment.

Our contribution includes the development of a prototype of Sirius. The proposed solutions are evaluated through large-scale simulations and on a real network running our prototype in an environment with multiple clouds, both public and private. The results demonstrate the feasibility of the proposals, presenting good trade-offs between security and performance, which leads us to believe they are a step forward in the enrichment of the services of cloud providers.

Keywords: Network Virtualization, Cloud Computing, Multi-cloud, Virtual Network Embedding

Resumo Alargado

The modern management of computing infrastructures that led to the emergence of cloud computing became possible thanks to the advent of server virtualization. By exposing a software abstraction (a Virtual Machine, or VM) to users, instead of the physical machine itself, virtualization provided the degree of flexibility necessary for infrastructure operators to attain their operational goals while satisfying the needs of the infrastructure’s clients (the “tenants”).

Unfortunately, the current requirements of the users of cloud computing infrastructures (for instance, the ability to migrate their workloads unchanged to the cloud) cannot be met with server virtualization alone. The root of the problem is the fact that, although compute and storage virtualization are fairly advanced technologies, network virtualization is not. Complete network virtualization, that is, the total decoupling of logical network services from their physical realization (Casado et al., 2010), is not possible with traditional network virtualization primitives, such as VLANs. These primitives provide only restricted forms of isolation.

In short, the available virtualization mechanisms are very rudimentary and lack the scalability necessary to provide complete network virtualization (Yu et al., 2011). At the core of this situation is a fundamental problem: traditional networks are very hard to manage. However, the recent paradigm shift that promotes the logical centralization of network control and the decoupling of the control plane from the data plane, Software-Defined Networking (SDN) (Kreutz et al., 2015), has enabled the development of platforms for complete network virtualization. These solutions allow the creation of virtual networks, each with independent service models and arbitrary topologies and addressing schemes, sharing the same physical infrastructure (Koponen et al., 2014).

These platforms (Al-Shabibi et al., 2014; Dalton et al., 2018; Firestone et al., 2018; Koponen et al., 2014) represent the state of the art in network virtualization, but they have some limitations. First, they consider a substrate controlled by a single operator. This characteristic affects these solutions from the point of view of resilience, by creating, for example, a single point of failure, and this restriction may become an important barrier as critical applications are moved to the cloud. Another negative consequence concerns privacy. For example, compliance with certain legislation may require certain types of client workloads to remain in specific environments (be it a private cluster or a cloud infrastructure located in a specific country). This type of requirement is particularly important in the context of the new data protection legislation (namely, the General Data Protection Regulation, or GDPR), which typically demands ad-hoc approaches to fulfil the legal requirements.

These problems of the existing solutions are our motivation to extend network virtualization to a multi-cloud environment, enriching the substrate with resources from multiple cloud infrastructures, private and/or public. This enrichment of the substrate brings important benefits. First, a tenant’s services become immune to the unavailability of a given datacenter or cloud zone, by replicating the network services across several providers. This is a current concern, given the large number of incidents involving accidental and malicious failures in cloud infrastructures (Khan, 2016; Suryateja, 2018), which show that trusting a single provider may bring availability and reliability problems to cloud-based services. Second, costs for the user can also be reduced by taking advantage of the pricing plans of several cloud providers. One example is the use of Amazon EC2 spot instances (VMs), which have recently been exploited to significantly reduce the costs of certain workloads when compared with the costs of traditional instances (Zheng et al., 2015). As providers increase their support for dynamically priced computing units, additional saving opportunities arise, for example through the migration of user networks to less expensive locations. Third, performance gains can also be achieved by moving services closer to the clients, or by migrating network components that, at a given moment, need to cooperate more closely.

A second limitation of current network virtualization solutions is that their offer is restricted to conventional network services (such as L2 connectivity or L3 routing). Given our motivation of hosting critical services in the cloud, the lack of consideration for security and dependability is an important restriction.

In this thesis, we propose a multi-cloud network virtualization solution. Our proposal builds on two main ideas. First, the substrate network is enriched with resources from multiple infrastructures, including private datacenters and public clouds. Second, the virtual networks offer security, availability, and elasticity services entirely defined by the user of the virtual network.

Designing a multi-cloud network virtualization solution involves several challenges. First, it is necessary to create a single cloud abstraction for the users, who should not perceive that the substrate is shared. Second, to enable complete network virtualization, users must have the freedom to specify network topologies and addressing schemes arbitrarily, while guaranteeing the required level of isolation. Third, offering virtual networks with security and dependability properties requires the design of new network embedding algorithms. Fourth, the solutions must be efficient and scalable, making it necessary to investigate optimizations not only in the embedding, but also in the operation of the infrastructure.

In this thesis, we tackle the first challenge by creating a new network layer that runs on top of the virtualized services offered by the clouds. In this way, we mask the heterogeneity of the resources of the different providers and present the user with a homogeneous virtual infrastructure. Our solution follows an SDN approach: the proposed network layer includes network elements that are remotely controlled and configured by an SDN controller, which performs the mapping of virtual elements onto physical ones to guarantee that the virtual network is completely decoupled from the physical network. The centralized control offered by SDN allows the user to employ any (L2 and L3) network addresses and to define arbitrary network topologies, while guaranteeing isolation between virtual networks, thus addressing the second challenge.
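The address-mapping idea behind this isolation can be sketched as follows. This is a minimal illustration under assumed names, not the Sirius hypervisor API: each (tenant, virtual address) pair maps to a substrate endpoint, so tenants may reuse arbitrary L2/L3 addresses without collisions; in a real SDN deployment the controller would install the equivalent translations as flow rules in the switches.

```python
# Illustrative per-tenant address translation (assumed names, not Sirius code).
class AddressMapper:
    def __init__(self):
        self._map = {}  # (tenant_id, virtual_ip) -> substrate endpoint

    def register(self, tenant_id, virtual_ip, substrate_host):
        # The same virtual IP may be registered by different tenants:
        # the tenant id disambiguates, so address spaces never collide.
        self._map[(tenant_id, virtual_ip)] = substrate_host

    def resolve(self, tenant_id, virtual_ip):
        # A tenant can only resolve addresses in its own namespace,
        # which is what provides isolation between virtual networks.
        return self._map.get((tenant_id, virtual_ip))
```

For example, two tenants can both use the address 10.0.0.1, and each resolution returns only the substrate endpoint registered by that tenant.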

To solve the third challenge, in this thesis we investigate a central component of network virtualization: the virtual network embedder. This component includes the algorithms that efficiently map virtual network requests onto the substrate network. The main novelty of our solution lies in the consideration of security and availability requirements, allowing the user to define requirements at the level of three central resources: the virtual links between network elements, allowing, for instance, these links to be redundant; the virtual switches, through the support of several levels of security and redundancy; and the cloud infrastructures themselves, which may have different levels of security and trustworthiness. Our first solution is based on Mixed-Integer Linear Programming (MILP) techniques. We further propose a heuristic virtual network embedding algorithm that, besides guaranteeing the fulfilment of the security requirements of the virtual resources, scales to large networks while maintaining the same level of efficiency as the optimal solution.

Our last contribution tackles the elasticity challenge. To this end, we propose new primitives and algorithms that allow virtual networks to scale out and scale in, together with the reconfiguration of the substrate network, in order to increase efficiency and resource usage. Our solutions achieve high efficiency while minimizing the disruption to users’ services.
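A scale-out primitive in this spirit can be sketched as follows. This is an illustrative sketch under assumed data structures (a virtual-to-substrate node map and a residual CPU table), not the thesis’ algorithm: the policy of trying the substrate node hosting the attachment point first, and then its direct neighbours, merely illustrates how placement can keep the new virtual link short and minimize disruption.

```python
# Illustrative scale-out primitive (assumed structures, not the thesis' code).
def scale_out(node_map, cpu, sub_adj, new_vnode, demand, attach_to):
    """node_map: virtual -> substrate node; cpu: residual substrate CPU;
    sub_adj: substrate adjacency; demand: CPU required by new_vnode."""
    anchor = node_map[attach_to]
    # Prefer substrate nodes close to the anchor: the anchor itself, then
    # its direct neighbours ordered by residual capacity.
    for cand in [anchor, *sorted(sub_adj[anchor], key=lambda s: -cpu[s])]:
        if cpu[cand] >= demand:
            cpu[cand] -= demand          # commit the resources
            node_map[new_vnode] = cand   # extend the mapping in place
            return cand
    return None  # no capacity near the anchor; a re-embedding would be needed
```

Scale-in is the symmetric operation: the virtual node is removed from the map and its resources are returned to the substrate.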

Keywords: Network Virtualization, Cloud Computing, Multi-cloud, Virtual Network Embedding

Acknowledgements

First and foremost, I would like to thank God. Without Him, none of this would be possible.

I want to thank my supervisor, Professor Fernando M. V. Ramos, and Professor Nuno Fuentecilla Maia Ferreira Neves, who, while not formally my co-supervisor, acted as if he were. Their guidance and assistance were essential for the development of this work, and I highlight their acumen and patience.

I want to acknowledge the Brazilian Army, whose vision of the future led to the decision to value and encourage the improvement of the knowledge of its human resources. I also want to thank Generals Decílio Sales and Eduardo Wolski, not only for their professionalism and competence, but also for their support and help in the process of registration and enrollment in this Ph.D. course. I want to thank Professor Paulo Fernando Ferreira Rosa for his encouragement in the early years of my research, and Professor Ricardo Choren Noya for his support as my Master’s supervisor and, during the Ph.D., as my academic tutor.

I would like to express my special appreciation and thanks to my wife, Vivian Vivas, and my sons, Alex and Pedro, for encouraging me, hearing me out, and spending sleepless nights with me; they were always my support in the moments when there was no one else to turn to. A big thank you to my parents, who, even from a great physical distance, helped me a lot to overcome this important phase. I also want to thank my sister Suzuky and my aunt Sonia for all their encouragement and support.

I take this opportunity to express special thanks to Eric Vial for his support and teamwork. Moreover, my gratitude goes to all colleagues at the Large-Scale Informatics Systems Laboratory (LaSIGE) for their camaraderie. In particular, thank you to Pedro Gonçalves, Vinicius Cogo, Pedro Costa, Bruno Vavala, Ricardo Fonseca, Tulio Ribeiro, Adriano Serckumecka, and many others.

Finally, I would like to acknowledge the members of the SUPERCLOUD project (ref. H2020-643964) for all the shared experience and support.

Contents

Abstract

Resumo

Resumo Alargado

List of Figures

List of Tables

1 Introduction
  1.1 Problem and Motivation
  1.2 Objective and Challenges
  1.3 Contributions
  1.4 Structure of the Thesis

2 Background and Related work
  2.1 Software-defined networks
    2.1.1 Controllers
    2.1.2 OpenFlow and OVSDB
  2.2 Network virtualization
    2.2.1 Network virtualization for scalability and agility
    2.2.2 Virtualization of software-defined networks
    2.2.3 Network virtualization in the cloud era
    2.2.4 Summary
  2.3 Virtual network embedding
    2.3.1 Baseline VNE Solutions
    2.3.2 Dependable VNE
    2.3.3 Secure VNE
    2.3.4 Multi-domain VNE
    2.3.5 Summary
  2.4 Multi-cloud systems
  2.5 Elastic Virtual Networks
    2.5.1 Enablers for elastic VN
  2.6 Summary

3 Sirius: Multi-cloud Network Virtualization
  3.1 Motivation
  3.2 Requirements
  3.3 Design of Sirius
    3.3.1 Architecture
    3.3.2 Overview of Sirius operation
    3.3.3 The multi-cloud orchestrator
    3.3.4 Hypervisor architecture and components
    3.3.5 Virtualization runtime: achieving isolation
  3.4 Implementation and evaluation
    3.4.1 Implementation and experimental setup
    3.4.2 Evaluation results
  3.5 Summary

4 Secure Multi-Cloud Virtual Network Embedding
  4.1 Introduction
  4.2 Network model
  4.3 Secure Virtual Network Embedding Problem
  4.4 A Policy Language to Specify SecVNE
  4.5 MILP Formulation
    4.5.1 Decision variables and auxiliary parameters
    4.5.2 Objective Function
    4.5.3 Security Constraints
    4.5.4 Mapping Constraints
    4.5.5 Capacity Constraints
    4.5.6 Discussion on Security Assurances
  4.6 Evaluation
    4.6.1 Experimental Setup
    4.6.2 Metrics
    4.6.3 Evaluation Results
  4.7 Summary

5 Scalable and Secure Multi-cloud Virtual Network Embedding
  5.1 Introduction
  5.2 Enhanced Sirius Design
    5.2.1 Virtual and Substrate Networks
  5.3 Network Embedding
    5.3.1 Network model
    5.3.2 Scalable SecVNE
  5.4 Implementation
  5.5 Evaluation
    5.5.1 Testing environment
    5.5.2 Evaluation against optimal solution
    5.5.3 Large-scale simulations
    5.5.4 Prototype experiments
  5.6 Conclusions

6 Elastic Virtual Networks
  6.1 Introduction
  6.2 Motivating use cases
  6.3 Abstracting the Network
  6.4 Elastic Virtual Networks
    6.4.1 Elastic VN primitives
    6.4.2 Elastic VNE algorithms
  6.5 Evaluation
  6.6 Conclusion

7 Summary and Future Work
  7.1 Summary of Contributions
  7.2 Future work
    7.2.1 Multi-cloud Network migration
    7.2.2 Secure, dependable and scalable Sirius
    7.2.3 Programmable Virtual Networks

Bibliography

List of Figures

2.1 Simplified view of an SDN architecture (based on Figure 1 of Kreutz et al.(2015)). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

2.2 OpenFlow table pipeline. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132.3 Open vSwitch Interfaces. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142.4 An example of Clos network in VL2 with two separated address sets: LAs

and AAs (based on Figure 5 of Greenberg et al. (2009)). . . . . . . . . . . . . . 162.5 FlowVisor architecture (based on Figures 2 and 3 of Sherwood et al. (2009)). 182.6 OpenVirteX architecture (based on Figures 1 and 3 of Al-Shabibi et al. (2014)). 192.7 FlowN architecture (based on Figure 1 of Drutskoy et al. (2013)). . . . . . . 202.8 (a) Virtual Cluster and (b) Virtual Oversubscription Cluster abstractions

(based on Figure 2 and 3 of Ballani et al. (2011)). . . . . . . . . . . . . . . . . . . 222.9 Network Virtualization Platform architecture. . . . . . . . . . . . . . . . . . . . 242.10 The VFP Design (based on Figure 1 of Firestone (2017)). . . . . . . . . . . . . 252.11 The Andromeda Stack (based on Figure 1 of Dalton et al. (2018)). . . . . . . . 262.12 Example of an augmented substrate graph with clusters, meta-nodes and

meta-links (based on Figure 2 of Chowdhury et al. (2012)) . . . . . . . . . . . 292.13 Example of how topology can influence node mapping (based on Figure 2

of Cheng et al. (2011)). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292.14 A VN not capable of tolerating two link failures (solid edges) and a VN aug-

mented that tolerates them (both solid and dashed edges) (based on Figure2 of Shahriar et al. (2016)). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

2.15 Xen-Blanket overview architecture (based on Figure 1 of Williams et al.(2012)). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

2.16 The multi-cloud Chrysaor’s system model (based on Figure 1 of Costa et al.(2017)). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

2.17 Architecture of DepSky (based on Figure 3 of Bessani et al. (2013)). . . . . . . 362.18 A simple ensemble migration (based on Figure 2 of Ghorbani et al. (2014)). 39

xv

LIST OF FIGURES

3.1 Sirius architecture. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463.2 Modular architecture of the multi-cloud orchestrator. . . . . . . . . . . . . . . 483.3 Graphical User Interface of Sirius. . . . . . . . . . . . . . . . . . . . . . . . . . . . 493.4 Intra- and inter-clouds connections. . . . . . . . . . . . . . . . . . . . . . . . . . . 503.5 Modular architecture of the network hypervisor. . . . . . . . . . . . . . . . . . 513.6 (Switch port, DatapathId) = host ID . . . . . . . . . . . . . . . . . . . . . . . . . 543.7 Setup time (left: MST; right: full mesh). . . . . . . . . . . . . . . . . . . . . . . . 563.8 Control plane overhead . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573.9 Data plane overhead . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

4.1 Example substrate network encompassing resources from multiple clouds. 65
4.2 Example of the embedding of a virtual network request (top) onto a multi-cloud substrate network (bottom). The figure also illustrates the various constraints and the resulting mapping after the execution of our MILP formulation. 69
4.3 Network model when no backup is requested (left); and when at least one backup node is requested (right). 71

5.1 Virtual networks and substrate. 95
5.2 Derived network models from the user specifications for the substrate network and a single virtual network (only considering the cloud() attribute). 100
5.3 Data structures used in node mapping. 106
5.4 VNR acceptance ratio: Sirius vs optimal. 112
5.5 Acceptance ratio: ratio of successful VNRs. 113
5.6 Acceptance ratio (multi-cloud scenario) and provider revenue. 114
5.7 Embedding time for node mapping. 114
5.8 Embedding time for link mapping. 115
5.9 The effect of coupling node and link mapping with Path Contraction. 116
5.10 Container configuration in-depth. 118
5.11 Virtual network provisioning. 118
5.12 Prototype measurements: intra- and inter-cloud throughput and latencies. 119
5.13 Virtual (top) and Substrate (bottom) topologies for experiments considering three embedding algorithms: Sirius, full-greedy (Yu et al., 2008), and the optimal solution (Alaluna et al., 2017). 121


5.14 Embedding time for 9 sequential VNRs. 122

6.1 ElasticVN primitives. 127
6.2 Elastic VN system. 129
6.3 VNR acceptance ratio. 144
6.4 Resource usage. 145
6.5 Cost of Migration. 146
6.6 Path lengths. 146
6.7 Acceptance ratio: provider-driven. 147


List of Tables

2.1 Summary of network virtualization solutions. 27
2.2 Summary of the most relevant embedding approaches. 33
2.3 Summary of multi-cloud systems. 36
2.4 Elastic VNs and enablers. 38

4.1 Policy grammar to define SecVNE parameters. 72
4.2 MILP formulation variables. 74
4.3 MILP additional parameters. 75
4.4 Auxiliary sets to facilitate the description of the formulation constraints. 75
4.5 VNR configurations evaluated in the experiments. 86

5.1 VNR configurations that were evaluated. 110

6.1 Global variables employed by the algorithms. 134
6.2 VNR configurations that were evaluated in the experiments. 143

7.1 Summary of contributions of the thesis. 150


1 Introduction

The management of cloud computing infrastructures was made possible by the advent of server virtualization. By exposing a software abstraction (the Virtual Machine – VM) to cloud users instead of the physical machine itself, virtualization has given the degree of flexibility necessary for operators to achieve their operational goals while satisfying customer needs.

Unfortunately, the requirements of today's cloud users – for example, the ability to migrate unchanged workloads to the cloud – cannot be met with server virtualization alone. The root of the problem is the fact that, although compute and storage virtualization is commonplace, network virtualization is not. Complete network virtualization, as required in current cloud deployments, entails fully decoupling the logical service from its physical realization (Casado et al., 2010). Traditional network virtualization primitives, such as Virtual Local Area Networks (VLANs) and Virtual Private Networks (VPNs), do not offer this form of virtualization: they provide only restricted forms of isolation.

Moreover, these network virtualization primitives are too coarse-grained and lack the scalability to provide complete virtualization (Yu et al., 2011). This situation is rooted in a fundamental problem: networks are very hard to manage. However, a recent paradigm shift that promotes the logical centralization of network control and the separation of the network's control logic (the control plane) from the underlying routers and switches that forward the traffic (the data plane) – Software-defined networking (SDN) (Kreutz et al., 2015) – has allowed the emergence of production-quality, cloud-scale virtualization platforms that allow the creation of virtual networks, each with independent service models, topologies, and addressing architectures, over the same physical network (Dalton et al., 2018; Firestone et al., 2018; Koponen et al., 2014).

1.1 Problem and Motivation

These state-of-the-art platforms for network virtualization (Al-Shabibi et al., 2014; Dalton et al., 2018; Firestone et al., 2018; Koponen et al., 2014) show the feasibility of network virtualization, but they have been confined to a data center controlled by a single cloud operator. They thus have limitations in terms of resilience, and this restriction can become an important barrier as more critical applications are moved to the cloud. For instance, compliance with privacy legislation may demand that certain customer data remain local (either in an on-premise cluster or in a cloud facility located in a specific country). As a concrete example, to abide by the recently implemented General Data Protection Regulation (GDPR), a virtual switch connecting databases that contain user data may need to be placed at a specific private location under the tenant's control, while the rest of the network is offloaded to public cloud infrastructures, to take advantage of their elasticity and flexibility.

This offers motivation to extend network virtualization to a multi-cloud environment. Being able to leverage several cloud providers enables important benefits. First, a tenant can be made immune to any single data center or cloud availability zone outage by spreading its services across providers. Despite the highly dependable infrastructures employed in cloud facilities, the large number of incidents involving accidental and malicious faults in cloud infrastructures (Blodget, 2017; Los et al., 2013; Sharwood, 2016) shows that relying on a single provider can lead to the creation of internet-scale single points of failure for cloud-based services. Second, user costs can potentially be decreased by taking advantage of dynamic pricing plans from multiple cloud providers. Amazon's EC2 spot pricing is an example, which was recently explored to significantly reduce the costs of certain workloads when compared to traditional on-demand pricing (Zheng et al., 2015). As providers increase their support for dynamic prices, the opportunity for further savings increases with the user's ability to move virtual networks to less costly locations. Third, increased performance can also be attained by bringing services closer to clients or by migrating workloads that at a certain point in time need to closely cooperate.

Another limitation of current SDN-based network virtualization solutions is that their service offering is restricted to conventional network services (such as flat L2 or L3 routing) (Al-Shabibi et al., 2014; Dalton et al., 2018; Firestone et al., 2018; Koponen et al., 2014). In particular, they do not consider security and dependability aspects when deploying the virtual infrastructures, limiting them with respect to these important non-functional properties. This motivates the need for user-centric virtual networks that leverage a multi-cloud substrate infrastructure, entailing virtual resources with different levels of security and dependability. This brings several important benefits for users. First, it increases the resilience of virtual networks: replicating services across providers avoids single points of failure. Second, it can improve security, by exploring the interaction between different types of clouds (e.g., public vs. private) and by allowing users to choose the required security guarantees for their virtual network nodes and links.

As a demonstration of the timeliness of this problem, the leading virtualization companies have recently started to explore multi-cloud approaches for security. For example, after its acquisition of CloudCoreo, a public cloud security startup, VMware integrated CloudCoreo's multi-cloud technology to create a new brand of security services (VMwareMultiCloudSec, 2019). This type of deployment is trending upwards, with 85% of enterprises reporting that they already have a multi-cloud strategy for their business (RightScale, 2017).

1.2 Objective and Challenges

In this thesis, we propose a multi-cloud network virtualization solution that improves over the state of the art in two aspects. First, it enriches the network substrate with resources from multiple clouds, including private data centers and public clouds. Second, it enriches the virtual networks with security, dependability and elasticity, enabling, overall, more resilient and flexible virtual infrastructures.

Scaling out network virtualization to multiple clouds entails various challenges. First, it is necessary to create a single cloud abstraction from multiple heterogeneous clouds, as this aspect should be transparent to the users. This is complicated because different cloud operators expose different APIs and a different set of services. It is, therefore, necessary to create a new software layer to hide these differences while maintaining good performance.

Second, it should be possible to provide complete network virtualization, giving tenants the freedom to specify the network topologies and addressing schemes of their choosing, while guaranteeing the required level of isolation among them. As achieving this goal has proved difficult (if not impossible) with traditional network approaches, our goal is to explore SDN for this purpose. The centralization of control and the global visibility offered by SDN can be used to enforce the required properties for the virtual networks.

Third, increasing the offer of virtual networks with security and dependability requires novel algorithms for network embedding, the process of finding efficient mappings of virtual network requests onto the substrate network. The main challenge is to simultaneously fulfill three objectives: to comply with these more advanced user requirements; to make good use of the substrate resources, maximizing the provider's gains; and to guarantee that the solution scales to very large networks.

Fourth, it is important to be aligned with one of the main goals of the cloud model: elasticity. Towards this goal, we investigate algorithms that allow virtual networks to scale in and scale out. This objective should be achieved without negatively impacting the use of resources. For this purpose, it is necessary to investigate optimizations and explore recently proposed enabling solutions, including the ability to migrate network workloads.

1.3 Contributions

To fulfill the goals stated before, we make the following contributions in our work:

(i) We propose an architecture that allows network virtualization to extend across multiple cloud providers, including a tenant's own private facilities, therefore increasing the versatility of the network infrastructure. In this setting, tenants can specify the required network resources as usual, but now they can be spread over the data centers of several cloud operators, both public and private. This is achieved by creating a new network layer above the existing cloud hypervisors to hide the heterogeneity of the resources from the different providers, while providing the level of control needed to set up the required (virtual) links among the VMs. Our solution, Sirius, follows an SDN approach: the proposed new network layer contains network elements that are configured remotely by an SDN controller, in order to perform the necessary virtual-to-physical mappings and the setup of tunnels that allow the network to be virtualized. The network virtualization platform for multi-cloud environments we propose allows complete virtualization of L2 and L3 addressing, arbitrary network topologies, and isolation between tenants. This contribution is presented in Chapter 3 and enhanced in Chapter 5, and was published partially in NetSoft 2016 (Alaluna et al., 2016) and partially in the XDOM0 2017 workshop (Alaluna et al., 2017).

(ii) The second contribution is a novel virtual network embedding (VNE) solution for multi-cloud network virtualization. Our solution considers security as a first-class citizen, enabling the definition of flexible policies in three central areas: on the links, where alternative security options can be explored (e.g., encryption); on the switches, supporting various degrees of protection and redundancy if necessary; and across multiple clouds, including public and private facilities, with the associated trust levels. We formulate the problem as a Mixed Integer Linear Program (MILP) and evaluate our proposal against the most common alternatives. Our analysis gives insight into the trade-offs involved in including security demands in network virtualization, providing evidence that this notion does not preclude high acceptance rates, efficient use of resources, and increases in provider profits. This contribution is presented in Chapter 4. An article presenting this work is currently under evaluation at the Elsevier Computer Communications journal (the chapter corresponds to its latest version, which addresses a prior request for a major revision).

(iii) The MILP solution we propose makes optimal use of the substrate resources, but it has one problem: it does not scale to large networks. Faced with this challenge, the third contribution is a VNE heuristic that scales to very large networks. The evaluation of our new algorithms demonstrates that the solution scales well for networks of thousands of switches employing diverse topologies, and improves on the virtual network acceptance rate, provider revenue, and quality of service when compared with the common alternatives. As an example, our results show that provisioning a 10-thousand-container virtual network can be completed in less than 2 minutes. This contribution is presented in Chapter 5 and has been published in the Elsevier Computer Networks journal (Alaluna et al., 2019).

(iv) The fourth contribution of our thesis is a VNE solution that adds elasticity to the tenants' virtual networks. For this purpose, we propose new primitives and develop new algorithms that allow virtual networks to scale out (increase) and scale in (decrease). To guarantee that resources are used efficiently, our solution includes the possibility of migrating specific elements of the network substrate. As migration has a cost, we make parsimonious use of this technique. As a result, our solution achieves a level of efficiency that is similar to a solution that fully reconfigures the substrate network, while reducing the migration requirements by one order of magnitude. This contribution is presented in Chapter 6 and is about to be submitted to IEEE INFOCOM.

(v) We also made our system Sirius available as open source1, including all evaluation data (scripts and results).

We evaluate our solutions with large-scale simulations that consider realistic network topologies, and with our prototype Sirius in a substrate composed of private data centers and public clouds. Sirius was the core software component of the H2020 SUPERCLOUD project (SUPERCLOUD, 2019). The prototype was successfully demonstrated in the final review of the project. The setup included a substrate composed of 7 cloud services from 5 cloud providers across 5 different countries. Two non-trivial virtual networks were shown running two real distributed applications: a laboratory information system from Maxdata, and a medical imaging platform from Phillips. A video of the demonstration is available online, in DemoOfSirius (2019).

1.4 Structure of the Thesis

The thesis is organized as follows. Chapter 2 presents the context of the thesis and the related work.

Chapter 3 presents the design and implementation of Sirius, a network virtualization platform for multi-cloud environments.

Chapter 4 addresses the online network embedding problem for our context, by presenting a novel solution for this central component of network virtualization: finding efficient mappings of virtual network requests onto the substrate network.

1https://github.com/netx-ulx/Sirius


Chapter 5 presents a heuristic for virtual network embedding which fits a modern data center context, scales well, and considers the security of virtual resources.

Chapter 6 presents new primitives and heuristics for virtual network embedding with the capacity of scaling out and scaling in virtual networks.

Finally, Chapter 7 concludes the thesis and speculates on future work.


2 Background and Related Work

This chapter provides background on the problem at hand, mainly by introducing the necessary concepts and discussing relevant work done in the area. We start, in Section 2.1, by describing the Software-defined networking (SDN) paradigm. By logically centralizing network control, SDN allowed the development of solutions for complete network virtualization that fully decouple the virtual networks from the substrate. In Section 2.2, we present the state of the art of network virtualization. The role of SDN will be made clear in this section, alongside the limitations of existing approaches, which motivate this work. Then, in Section 2.3, we address the central resource allocation problem of network virtualization that we tackle in our thesis: virtual network embedding (VNE). The VNE problem consists in finding an efficient mapping of the virtual nodes and links onto the substrate network. Our thesis advances the state of the art on this topic by including security and elasticity into the problem. For the former, it considers an enriched substrate network model that includes resources from multiple clouds. As such, we present recently proposed multi-cloud systems (for storage and computing) in Section 2.4. For the latter, we propose new scaling primitives and mechanisms. We thus close this chapter with related work on elastic virtual networks, in Section 2.5.


2.1 Software-defined networks

In traditional IP networks, the control and data planes are coupled inside network switches/routers and network control is fully decentralized. This design has advantages, namely with respect to resilience, but has an important drawback: it leads to networks that are very complex to manage, operate, and control. In addition, the closed nature of networking gear makes it hard to insert new functionalities directly in the network elements, forcing the investment in middleboxes, such as firewalls, intrusion detection systems, and load balancers, that further increase infrastructure complexity. Another characteristic of traditional networks is that, in order to configure their behavior and to define their policies, it is necessary to intervene in each network element with manually-operated commands or using low-level scripts.

This lack of flexibility has slowed the innovation in, and the evolution of, networking. Ideally, a network should be programmable. The initial step in this direction was taken recently with the advent of Software-defined networking – SDN (Kreutz et al., 2015). This paradigm separates the network data and control planes and logically centralizes network control, enabling its programmability. Figure 2.1 depicts the basic view of an SDN architecture.

SDN decouples the network control and forwarding functions, enabling the former to become directly programmable, and allowing the underlying infrastructure to be abstracted for applications and network services (ONF, 2018b). A software-defined network is an architecture with four pillars (Kreutz et al., 2015; ONF, 2018b):

1. Decoupling the control plane from the data plane. Network devices become simple packet forwarding elements.

2. Forwarding decisions are flow-based, in contrast to destination-based. A flow is a set of packet field values acting as a filter criterion that defines the set of actions to be applied. The packets of a specific flow receive the same service policies at the forwarding devices.

3. The control logic is moved to an SDN controller. This entity is external and typically runs on a (cluster of) commodity server(s). It provides the main resources and abstractions to allow the programming of forwarding devices based on a logically centralized, abstract network view.

4. The network is programmable through control software applications running on the SDN controller.

Figure 2.1: Simplified view of an SDN architecture (based on Figure 1 of Kreutz et al. (2015)).

OpenFlow (ONF, 2016) is the most common southbound interface to allow the separation between control and data planes. It allows specification of the forwarding behavior desired by the network application while hiding details of the underlying hardware. OpenFlow can be seen as the equivalent of a "device driver" in an operating system. This protocol has been added as a feature to most commercial Ethernet equipment, working as a standardized hook to allow researchers to run experiments, without requiring vendors to expose the internal workings of their network devices.


2.1.1 Controllers

The controller is the fundamental element of an SDN architecture, as it is the key supporting piece for the control logic (applications) to generate the network configuration based on the policies defined by the network operator. The control platform has characteristics that resemble those of an operating system:

• it abstracts the lower-level details of the interaction with forwarding devices;
• it facilitates the creation of Application Programming Interfaces (APIs);
• it deals with topology discovery, a fundamental "tool" for network control applications; and
• it provides other basic network functionalities.

There is a diverse set of controllers with different design and architectural choices. Existing controllers can be categorized based on many aspects, such as the type of basic network functionalities provided, which southbound or northbound API is supported, etc. One relevant architectural aspect is whether the controller is centralized or distributed. A centralized controller (such as NOX (Gude et al., 2008) or Floodlight (Floodlight-Project, 2019)) is a single entity that manages all forwarding devices of the network. Naturally, it represents a single point of failure and may have scaling limitations. Contrary to a centralized design, a distributed controller (such as Onix (Koponen et al., 2010) or ONOS (Berde et al., 2014)) can be scaled up to meet the requirements of potentially any environment, from small to large-scale networks.

We chose the Floodlight controller to develop the solutions presented in this thesis, for three reasons. First, its community is very active, making available a large set of tutorials and detailed manuals about the controller. Second, it uses Java as the main language, facilitating portability. Finally, Floodlight has been widely adopted by the research community.


2.1.2 OpenFlow and OVSDB

As stated above, OpenFlow has been implemented in the equipment of most major vendors, with many OpenFlow-enabled switches now commercially available (McKeown et al., 2008; ONF, 2016). An OpenFlow switch is composed of a set of flow tables (with multiple flow entries each) used to process packets by matching on specific headers and executing pre-defined actions. To remotely control the forwarding tables, this equipment establishes TCP or TLS communication with at least one master controller (and potentially with several slaves). Controllers can insert, delete or update flows in each table to control the network. Each flow entry is composed of matching rules, counters that keep statistics of matching packets, and a set of actions to be performed on matching packets (ONF, 2018a).

Figure 2.2: OpenFlow table pipeline.

As soon as a packet arrives at an OpenFlow forwarding element, it is handled based on the data in the matching fields. The packet is processed against the flows in the first table, in priority order, until a match is found. If no such match occurs, it proceeds to the next table, until the end of the pipeline, as shown in Figure 2.2 (in fact, some actions may point the packet directly to an output port). If a match is found, an action is executed. Examples of actions include: forward to a port, drop the packet, place in a queue, push/pop a tag, set a field value, change the Time to Live (TTL), and send the packet to the controller, among others. In case of no match in any table, a table-miss happens, typically with the packet being sent to the controller.
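To make this behaviour concrete, the following minimal sketch simulates the priority-ordered, multi-table lookup just described. The entry format and the "goto-next" convention are simplifications for illustration, not the actual OpenFlow data structures.

```python
# Minimal sketch of OpenFlow-style multi-table matching (illustrative only).
# An entry matches when all its match fields equal the packet's fields; entries
# are tried in descending priority, and a miss in a table sends the packet to
# the controller, the usual table-miss behaviour.

def process(packet, tables):
    """Return the action of the first matching entry, or send to controller."""
    for table in tables:                                   # pipeline order
        for entry in sorted(table, key=lambda e: -e["priority"]):
            if all(packet.get(k) == v for k, v in entry["match"].items()):
                if entry["action"] == "goto-next":         # continue pipeline
                    break
                return entry["action"]                     # e.g. "output:2"
        else:
            return "to-controller"                         # table-miss
    return "to-controller"

# Hypothetical two-table pipeline: table 0 classifies by destination IP,
# table 1 decides the output port by TCP destination port.
pipeline = [
    [{"priority": 10, "match": {"ip_dst": "10.0.0.2"}, "action": "goto-next"},
     {"priority": 1,  "match": {},                     "action": "drop"}],
    [{"priority": 5,  "match": {"tcp_dst": 80},        "action": "output:2"}],
]
```

With this pipeline, a packet to 10.0.0.2 on TCP port 80 traverses both tables and is output on port 2, while a packet to 10.0.0.2 on an unmatched port ends in a table-miss and is sent to the controller.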

A notable example of a software-based OpenFlow switch implementation is Open vSwitch (OVS) (Pfaff et al., 2015). OVS is a software switch that operates within the hypervisor/management domain and provides connectivity between the virtual machines and the underlying physical interfaces. It implements standard Ethernet switching and, in a standalone configuration, it operates much like a basic L2 switch. However, to support integration into virtual environments, and to allow (logical) switch distribution, OVS exports interfaces for manipulating the forwarding state and managing configuration state at runtime. When OVS was originally developed, its performance was on par with the Linux Ethernet bridge. Over the past few years, its performance has been gradually optimized to match the requirements of multi-tenant data center workloads (Pfaff et al., 2015). Underlying OVS there is a flow-table forwarding model similar to that used by OpenFlow.

Since OpenFlow does not allow modifying the switch configuration (e.g., configure queues and tunnels, add/remove ports, create/destroy switches), OVS also maintains a database and exports a configuration interface that enables remote configuration of the virtual switches via the OVSDB protocol (Pfaff & Davie, 2013). OVSDB is a management protocol that uses JSON (Crockford, 2015). It has a database that holds the configuration for one Open vSwitch daemon and describes the switching behavior of a virtual switch. The protocol interface is used to accomplish the configuration operations on the OVS. Figure 2.3 depicts the interaction between the main components and the interfaces in OVSDB (Pfaff & Davie, 2013). For instance, using OVSDB it is possible to: initialize the Open vSwitch database; create OpenFlow bridges, ports, tunnels, and queues; configure controller connections; etc.

Figure 2.3: Open vSwitch Interfaces.
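To illustrate the message format, the sketch below builds an OVSDB `transact` request as a JSON-RPC message, following the general shape defined in RFC 7047. The operation shown is a simplification: a complete bridge creation would also update the root Open_vSwitch table to reference the new row.

```python
import json

def make_transact(op_list, rpc_id=0):
    """Build an OVSDB JSON-RPC 'transact' request (RFC 7047 message shape)."""
    return json.dumps({
        "method": "transact",
        "params": ["Open_vSwitch"] + op_list,  # database name, then operations
        "id": rpc_id,
    })

# Simplified operation inserting a row into the Bridge table. (A full bridge
# creation would also mutate the root Open_vSwitch table; omitted here.)
insert_bridge = {
    "op": "insert",
    "table": "Bridge",
    "row": {"name": "br0"},
}

request = make_transact([insert_bridge], rpc_id=1)
```

The resulting string would be sent to the ovsdb-server endpoint over the management connection; the server replies with one result per operation, under the same `id`.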


2.2 Network virtualization

There are many network primitives for virtualization, like VLANs (virtualized L2 domain), MPLS (virtualized path) and VPN (Virtual Private Network). However, none of these primitives can supply the full decoupling of the virtual network from the physical substrate: the main requirement for complete network virtualization. A network hypervisor should completely decouple the substrate network from the virtual networks, allowing the freedom to specify the network topologies and addressing schemes of a tenant's choosing while guaranteeing the required level of isolation among them. With the advent of SDN, it became possible to offer this form of virtualization. In this section we describe the advances in this field over the last 10 years, including the most relevant network virtualization systems, and their limitations.

2.2.1 Network virtualization for scalability and agility

With the advent of cloud computing and the need for very large scale data center infrastructures to support it, the limitations of traditional networking solutions became evident. Given the difficulty – or even the impossibility – of changing or replacing existing networks, the community started to explore advanced forms of network virtualization. Initial work had the goal of coupling the ease of configuration and management of Ethernet with the scalability advantage of IP networks. SEATTLE (Kim et al., 2008) was amongst the first systems to provide the plug-and-play functionality and the linear addressing of Ethernet, with the high scalability and efficiency of shortest-path, IP-based routing.

For this purpose, it relies on three main techniques: a one-hop, network-layer distributed hash table (DHT) for address resolution, enabling packet forwarding based on MAC addresses while avoiding the need for switches to maintain state for each host; traffic-driven location resolution and caching in switches, to forward packets using shortest paths while avoiding load on the resolution service; and a scalable cache-update protocol used to avoid Ethernet broadcasts. These techniques enable SEATTLE to overcome the scalability limitations of Ethernet networks, by avoiding flooding to locate hosts and by not relying on the Spanning Tree Protocol, which has its own scalability and dependability limitations.
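The first of these techniques can be sketched as a consistent-hashing directory: each host's MAC address hashes to the switch responsible for storing its location, so any switch can resolve a host in a single lookup. The sketch below is illustrative only (hypothetical class and naming; the real protocol adds the caching and update logic mentioned above).

```python
import hashlib
from bisect import bisect_left

def h(key: str) -> int:
    """Hash a switch id or host MAC onto a circular 32-bit id space."""
    return int(hashlib.sha256(key.encode()).hexdigest(), 16) % (2 ** 32)

class SeattleDirectory:
    """Sketch of SEATTLE's one-hop DHT: hash(MAC) -> resolver switch."""

    def __init__(self, switch_ids):
        self.ring = sorted((h(s), s) for s in switch_ids)  # consistent-hash ring
        self.tables = {s: {} for s in switch_ids}          # per-switch store

    def resolver(self, mac):
        """The switch at the clockwise successor of hash(mac) on the ring."""
        idx = bisect_left(self.ring, (h(mac), "")) % len(self.ring)
        return self.ring[idx][1]

    def publish(self, mac, location):
        # The host's access switch stores (MAC -> location) at the resolver
        # only, so no switch keeps state for every host in the network.
        self.tables[self.resolver(mac)][mac] = location

    def resolve(self, mac):
        # Any switch contacts the resolver directly: one lookup, no flooding.
        return self.tables[self.resolver(mac)].get(mac)
```

Because only the resolver stores a given host's entry, forwarding state stays proportional to the hosts a switch is responsible for, not to the whole network.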


SEATTLE has one important limitation: it requires switches to be changed. VL2 (Greenberg et al., 2009), on the other hand, takes an end-host based approach. Its starting goal was to offer each service the abstraction of having all its servers connected by a single non-interfering Ethernet switch. In addition, it should support huge data centers, allowing uniform high capacity on the servers and performance isolation between services, all while maintaining Ethernet semantics.

For this purpose, VL2 is a scale-out topology built with low-cost switch ASICs arranged in a Clos topology, as presented in Figure 2.4. This provides extensive path diversity which, coupled with the use of Valiant Load Balancing (Zhang-Shen, 2010), enables spreading traffic without any centralized coordination. VL2 uses different IP addresses to separate a host's location from its identity, an essential feature to enable service migration. There is a location-specific address (LA) for switches and interfaces (these devices run a link-state routing protocol to maintain the switch-level topology), and an application-specific address (AA) for applications (kept unaltered even if the server changes its location). As a result, the LAs and AAs create the abstraction that all servers belong to the same layer-2 network, while avoiding ARP and DHCP floods. A directory system maintains the mapping between the two sets of IP addresses.
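The AA/LA separation amounts to a directory lookup plus encapsulation at the sending host. A minimal sketch (class and field names are our own, not VL2's):

```python
class Vl2Directory:
    """Sketch of a VL2-style directory mapping application addresses (AAs)
    to the locator address (LA) of the switch hosting the server."""

    def __init__(self):
        self.aa_to_la = {}

    def register(self, aa: str, la: str):
        self.aa_to_la[aa] = la

    def migrate(self, aa: str, new_la: str):
        # The AA is unchanged by migration; only its current location moves.
        self.aa_to_la[aa] = new_la

    def encapsulate(self, packet: dict) -> dict:
        """At the sending host, wrap the AA-addressed packet in an outer
        header addressed to the destination's LA."""
        return {"outer_dst": self.aa_to_la[packet["dst_aa"]], "inner": packet}

d = Vl2Directory()
d.register("20.0.0.55", "la-tor-3")   # AA -> LA of its switch
pkt = d.encapsulate({"src_aa": "20.0.0.1", "dst_aa": "20.0.0.55"})
d.migrate("20.0.0.55", "la-tor-9")    # service migration: AA stays stable
```

The stable AA is what applications see; only the directory entry changes on migration, which is why ARP floods are unnecessary.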

Figure 2.4: An example of Clos network in VL2 with two separated address sets: LAs and AAs (based on Figure 5 of Greenberg et al. (2009)).

A related effort is PortLand (Mysore et al., 2009). Its authors propose a lightweight, scalable, fault-tolerant, Ethernet-compatible network that leverages knowledge about its baseline topology (a fat tree (Leiserson, 1985)). A solution tailored to a special topology


2.2 Network virtualization

facilitates flexibility in VM migration, but being based on a specific topology can also be considered a limitation. By contrast, TRILL (Perlman et al., 2011) supports general topologies, approaching the L2 challenges differently. Here, an Ethernet broadcast link-state protocol is used among switches to identify the network topology and host locations, and to separate addressing from location. In addition, it uses MAC-in-MAC encapsulation to limit forwarding table size in the switches. The encapsulated packets are forwarded in the network and decapsulated at the edge, so that end hosts remain unmodified.

While these works offer partial forms of virtualization, they are not network virtualization platforms per se. Namely, they do not allow tenant services to define arbitrary topologies or addressing schemes, and as such do not offer complete network virtualization.

2.2.2 Virtualization of software-defined networks

The flexibility advantage of SDNs was demonstrated from the outset with the emergence of platforms that allow virtualizing the SDN itself. FlowVisor (Sherwood et al., 2009) was the seminal work on this topic. Its basic idea is to allow multiple logical networks to share the same OpenFlow networking infrastructure. For this purpose, it provides an abstraction layer that slices a data plane based on off-the-shelf OpenFlow-enabled switches, allowing multiple and diverse virtual SDNs (vSDNs) to co-exist. In general terms, a slice is defined as a particular set of flows in the data plane. Each slice receives a minimum data rate and each guest controller gets its own virtual flow table in the switches. To achieve this goal, FlowVisor assumes control over the entire infrastructure.

FlowVisor sits between the tenants' SDN controllers and the network switches (Figure 2.5). From a system design perspective, FlowVisor is a transparent proxy that intercepts OpenFlow messages between switches and controllers. In this way, FlowVisor controls the view that the tenants' controllers have of the SDN switches.

Five slicing dimensions are considered in FlowVisor: bandwidth, topology, traffic, device CPU, and forwarding tables. Different mechanisms are used to slice over each dimension. For instance, VLAN priority bits are used for bandwidth isolation, considering all packets in a flow with a specific priority. Thus, all traffic that belongs to a given slice is mapped to the traffic class defined by the resource allocation policy. The proxy-based architecture


Figure 2.5: FlowVisor architecture (based on Figures 2 and 3 of Sherwood et al. (2009)).

allows packets to be intercepted – both from and to the controller – allowing transparent

slicing of topology, forwarding tables, OpenFlow flow counters, etc.

In FlowVisor, each network slice supports a controller, i.e., multiple SDN controllers can

co-exist on top of the same physical network infrastructure. Each controller is allowed to

act only on its own network slice.
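FlowVisor's interception logic can be sketched as follows (a simplification with hypothetical names; the real system rewrites messages over the OpenFlow wire protocol and enforces all five slicing dimensions):

```python
class Slice:
    """A slice defined as a set of allowed flowspace fields."""

    def __init__(self, name: str, allowed: dict):
        self.name, self.allowed = name, allowed

    def permits(self, match: dict) -> bool:
        # A flow match is inside the slice only if it pins every
        # field the slice constrains to the slice's value.
        return all(match.get(f) == v for f, v in self.allowed.items())

class FlowVisorProxy:
    """Sketch of FlowVisor-style interception: a flow-mod coming from a
    guest controller reaches the switch only if it stays inside that
    slice's flowspace; otherwise it is rejected."""

    def __init__(self):
        self.slices = {}

    def add_slice(self, s: Slice):
        self.slices[s.name] = s

    def handle_flow_mod(self, slice_name: str, match: dict):
        if self.slices[slice_name].permits(match):
            return ("forward_to_switch", match)
        return ("rejected", match)
```

Because the proxy sees every message in both directions, the same check can be applied to switch-to-controller events, so each guest controller only ever observes its own slice.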

Other slicing approaches based on FlowVisor appeared in the literature afterward. For instance, AutoSlice (Bozakov & Papadimitriou, 2012) focuses on the automation of the deployment and operation of vSDN topologies with minimal mediation or arbitration by the substrate network operator. Additionally, AutoSlice targets the scalability aspects of network hypervisors by optimizing resource utilization and by mitigating flow-table limitations through precise monitoring of flow traffic statistics. Similarly, AutoVFlow (Yamanaka et al., 2014) also enables multi-domain network virtualization. However, instead of having a single third party control the mapping of vSDN topologies, as is the case of AutoSlice, AutoVFlow uses a multi-proxy architecture that allows network owners to implement flow space virtualization in an autonomous way by exchanging information among the different domains.


FlowVisor-based slicing approaches do not offer complete network virtualization of an SDN. Building on the design of FlowVisor, and also acting as a proxy between the controller and the forwarding devices (Figure 2.6), OpenVirteX (OVX) (Al-Shabibi et al., 2014) provides virtual SDNs with arbitrary topologies, arbitrary L2 and L3 addresses, and control function virtualization. These are the required properties in a multi-tenant environment, where virtual networks need to be flexible to assure quick provisioning and the ability to be migrated across the entire infrastructure in response to the dynamics of the environment. To enable arbitrary topologies, OVX maps the virtual networks onto the substrate by relying on network embedding algorithms (to be introduced in Section 2.3).

Figure 2.6: OpenVirteX architecture (based on Figures 1 and 3 of Al-Shabibi et al. (2014)).

To provide each tenant with the full header space, OVX controls all switches of the physical SDN network, rewriting at the edge the virtually assigned IP and MAC addresses of the hosts, defined by the tenant, into disjoint addresses to be used in the SDN core. With this approach, in theory, the entire flow space can be provided to each virtual network. In practice, the current implementation does not allow the overlap of (virtual) MAC addresses,


meaning virtual hosts cannot use the same MAC in different virtual networks. In other

words, in the current implementation L2 addressing is not virtualized, whereas L3 is.
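This style of edge rewriting can be sketched with a simple table-based translator (the tenant-ID-based encoding below is our own illustration, not OVX's actual address scheme):

```python
class EdgeRewriter:
    """Sketch of edge address virtualization: tenants may use overlapping
    virtual IPs; the ingress edge rewrites them into disjoint core
    addresses (here, hypothetically, tenant_id << 24 | per-tenant host id),
    and the egress edge reverses the mapping."""

    def __init__(self):
        self.v2p, self.p2v, self.next_host = {}, {}, {}

    def ingress(self, tenant: int, vip: str) -> int:
        key = (tenant, vip)
        if key not in self.v2p:
            hid = self.next_host.get(tenant, 1)
            self.next_host[tenant] = hid + 1
            pip = (tenant << 24) | hid      # disjoint core address
            self.v2p[key], self.p2v[pip] = pip, key
        return self.v2p[key]

    def egress(self, pip: int):
        """Recover the tenant and its virtual address at the far edge."""
        return self.p2v[pip]
```

Since the tenant ID is baked into the core address, two tenants using the same virtual IP never collide inside the SDN core.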

FlowN (Drutskoy et al., 2013) is another solution that offers complete network virtualization of an SDN. In this platform, tenants can also specify their own L3 address space, arbitrary

topology, and control logic. Each tenant has full control over its virtual networks and is free

to deploy any network application on top of the controller platform. However, whereas

FlowVisor can be compared to traditional virtualization technology, FlowN is analogous to

container-based virtualization, i.e., it is a lightweight virtualization approach (Figure 2.7).

It is designed to be scalable by allowing a unique shared controller platform to be used for

managing multiple virtual domains in a cloud environment.

Figure 2.7: FlowN architecture (based on Figure 1 of Drutskoy et al. (2013)).

Similarly to OVX, to achieve address virtualization the solution marks incoming packets at

the edge switches using VLAN tags to identify the tenant, limiting scalability. Moreover,


FlowN uses database technology for the mapping between physical and virtual network topologies.

A related effort is the compositional SDN hypervisor (Jin et al., 2014), a solution designed with the objective of allowing the cooperative (sequential or parallel) execution of applications developed with different programming languages or conceived for diverse control platforms. It thus offers interoperability and portability in addition to the typical functions of network hypervisors.

One common aspect of all these works is that they aim to virtualize an SDN, and so they are not generic network virtualization solutions.

2.2.3 Network virtualization in the cloud era

The level of maturity of cloud computing technologies has led many companies and government agencies to migrate their IT services to the cloud, both for efficiency gains and to lower costs (Greer, 2010). As cloud services matured, the associated network requirements have grown, and network virtualization has had to evolve.

2.2.3.1 Enhanced VN abstractions

Towards this goal, Guo et al. (2010) proposed the virtual data center (VDC) abstraction as a unit of resource allocation. This abstraction is defined as a set of VMs with a customer-supplied IP address range and an associated service-level agreement that includes bandwidth requirements. To enable the VDC abstraction, the authors proposed SecondNet, a new data center network virtualization architecture. To scale, SecondNet distributes all the virtual-to-physical mapping, routing, and bandwidth reservation state.

Also realizing that cloud application performance critically depends on the network, Ballani et al. (2011) identified that the VDC abstraction was limited and proposed the extension of cloud services with two new abstractions. The first is the virtual cluster, an abstraction that provides the illusion of having all VMs connected to a single, non-oversubscribed virtual switch. This is motivated by MapReduce-like applications, which are characterized by all-to-all traffic patterns. The second, the virtual oversubscribed cluster,


emulates an oversubscribed two-tier cluster that suits applications featuring local communication patterns. The authors further designed a system that implements these abstractions: Oktopus. Figure 2.8 presents these two abstractions.

[Figure content: (a) a virtual cluster request ⟨N, B⟩ connects N VMs to a single virtual switch, each VM able to send and receive at rate B, so the switch needs bandwidth N·B; (b) a virtual oversubscribed cluster request ⟨N, S, B, O⟩ arranges the N VMs in groups of S under group virtual switches (bandwidth S·B each), connected to a root virtual switch with oversubscription factor O.]

Figure 2.8: (a) Virtual Cluster and (b) Virtual Oversubscription Cluster abstractions (based on Figures 2 and 3 of Ballani et al. (2011)).
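For the virtual cluster abstraction, the bandwidth that must be reserved on any physical link follows from a simple cut argument: traffic crossing a link is bounded by the smaller side of the cut it induces. A one-function sketch:

```python
def vc_link_bandwidth(n_vms: int, b: float, m: int) -> float:
    """Bandwidth a virtual cluster <N, B> requires on a physical link whose
    removal separates m of the N VMs from the other N - m: traffic crossing
    the link is limited by the smaller side of the cut."""
    return min(m, n_vms - m) * b

# e.g., for a <100 VMs, 10 Mbps> cluster, a link that cuts off 20 VMs
# must reserve min(20, 80) * 10 = 200 Mbps
```

This is why the virtual cluster can be embedded far more cheaply than naively reserving N·B everywhere: links near the leaves of the placement tree separate few VMs and need little bandwidth.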

Xie et al. (2012) extended the previous solution to handle dynamic traffic patterns. The motivation for their design is that the behavior of the most common jobs executed in data centers follows specific patterns. By leveraging this knowledge, they are able to increase the utilization of the data center and decrease the tenants' costs. For this purpose, this work proposes a temporally-interleaved virtual cluster abstraction, which captures the temporal variations in the network behavior of cloud applications. Rost et al. (2015) went further and proposed a polynomial-time solution to compute resource-minimal virtual cluster embeddings, proving that this special case of embedding was not an NP-hard problem, as was commonly perceived.

These works enhance virtual networks, but they do not yet provide complete network virtualization with arbitrary network topologies.


2.2.3.2 Complete Network Virtualization

Simple network abstractions are typically enough for the most basic application workloads, which require only a single logical switch connecting a few tens of VMs using a flat L2 service, with some bandwidth (or sometimes delay) guarantees. However, this leaves aside many typical workloads. For instance, large analytic workloads typically demand L3 routing, and web services often require multiple tiers. The experience from production-level environments (e.g., (Dalton et al., 2018; Firestone et al., 2018; Koponen et al., 2014)) confirms this scenario: as deployments mature, tenants migrate to more complicated workloads. This strengthens the case for offering tenants arbitrary virtual networks, with diverse network topologies.

VMware has proposed a network virtualization platform (NVP/NSX) (Koponen et al., 2014; VMWare, 2018) that provides the necessary abstractions to allow the creation of independent virtual networks for large-scale multi-tenant environments. NVP is a complete network virtualization solution that allows the creation of virtual networks, each with independent service models, topologies, and addressing architectures, over the same physical network. In NVP, tenants' applications are provided with an API to manage their virtual networks, and the network hypervisor translates the tenants' configurations and requirements into low-level instruction sets to be installed on the forwarding devices.

For this purpose, the platform uses a cluster of SDN controllers to manipulate the forwarding tables of the Open vSwitches running in the hosts' hypervisors. Forwarding decisions are therefore made exclusively at the network edge (Figure 2.9). NVP wraps the ONIX controller platform (Koponen et al., 2010), and thus inherits its distributed controller architecture, allowing it to scale. In order to virtualize the network, NVP creates logical datapaths at the sending host, simulating the virtual network in order to reach a forwarding decision. After a forwarding decision is made, the packets are tunneled over the physical network to the receiving host's hypervisor. As a result, the physical network sees nothing but ordinary IP packets.
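This edge processing can be caricatured in a few lines (names are illustrative; the real system compiles logical pipelines into Open vSwitch flow tables and uses tunnel protocols such as STT or GRE):

```python
class LogicalDatapath:
    """Sketch of NVP-style edge processing: the sending hypervisor runs
    the packet through the tenant's logical pipeline and, once a logical
    egress port is chosen, tunnels the packet to the hypervisor that
    hosts that port."""

    def __init__(self, logical_fib: dict, port_locations: dict):
        self.fib = logical_fib    # logical destination -> logical egress port
        self.loc = port_locations # logical port -> hosting hypervisor IP

    def send(self, packet: dict) -> dict:
        lport = self.fib[packet["dst"]]        # decision made at the edge
        return {"tunnel_dst": self.loc[lport], # ordinary IP packet in core
                "payload": packet}
```

The core network only ever routes the outer header, which is exactly why no SDN capability is needed beyond the edge.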

An advantage of these systems is that they do not require SDN-based equipment in the infrastructure core. The only requirement is at the edge: the compute hypervisors run software switches that are SDN-controlled. As a consequence, the existing (IP-based) infrastructure does not need to be replaced or upgraded.


Figure 2.9: Network Virtualization Platform architecture.

Following an approach similar to NVP to meet its own requirements, Microsoft (Firestone, 2017) developed VFP, a Virtual Switch Platform currently in production in the Microsoft Azure public cloud. This platform is capable of handling data centers with a large number of VMs, maintaining the high performance required in the cloud context, and providing a private network using the tenants' defined address spaces, while allowing network functions such as L4 load balancers, ACLs, etc. The VFP architecture is presented in Figure 2.10, and its design relies on Hyper-V's extensible switch.

The VFP core design is split into four parts: the filtering model; the programming model; the packet processor and flow compiler; and the switching model. For the first part, VFP uses policies to filter ingress/egress traffic, based on Match-Action Tables (MATs) in a specific port of the virtual NICs. The programming model is based on a hierarchy of objects: ports (each port holds a match-action table policy with a set of layers), layers (a basic MAT for the controllers to specify policies), groups (each with its own set of rules), and rules (responsible for executing an action upon a specific packet match). For performance, the platform includes flow caching and a central packet processor that handles only metadata, avoiding handling the entire packet until the end of the processing. Concerning the switching model,

1 Microsoft's hardware virtualization platform.


Figure 2.10: The VFP Design (based on Figure 1 of Firestone (2017)).

besides SDN filtering, the platform implements a bridge to forward traffic.
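The port/layer/group/rule hierarchy can be sketched as follows (a simplification with invented names; groups are flattened into ordered rule lists here, and real VFP layers can also transform packets, e.g., for NAT):

```python
class Rule:
    """A match plus the action to execute when a packet hits the match."""

    def __init__(self, match: dict, action: str):
        self.match, self.action = match, action

    def hits(self, pkt: dict) -> bool:
        return all(pkt.get(k) == v for k, v in self.match.items())

class Layer:
    """A basic match-action table; the first matching rule decides."""

    def __init__(self, rules):
        self.rules = rules

    def process(self, pkt: dict) -> str:
        for r in self.rules:
            if r.hits(pkt):
                return r.action
        return "allow"  # assumed default when no rule matches

class Port:
    """A port holds a stack of layers (e.g., ACL, metering, NAT); a
    packet must be allowed by every layer to be forwarded."""

    def __init__(self, layers):
        self.layers = layers

    def process(self, pkt: dict) -> str:
        for layer in self.layers:
            if layer.process(pkt) == "block":
                return "block"
        return "allow"
```

Stacking independent MAT layers per port is what lets different controllers program different concerns (filtering, load balancing, metering) without coordinating with each other.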

Microsoft identified that the CPU consumption due to VFP processing adds latency and penalizes network performance. As a follow-up to VFP, Firestone et al. (2018) proposed Azure Accelerated Networking (AccelNet) to address this problem. This solution offloads host networking to hardware: a special, customized Field-Programmable Gate Array (FPGA). Besides saving CPU cycles and maintaining VFP's programmable design, AccelNet aims to achieve high performance, making it easy to add new functionalities and enabling 100+ GbE virtual ports in the future.

In the AccelNet platform, the control plane remains the same as with VFP. However, the

data plane is divided into two packet processing units, each one with a packet buffer, a

parser, a flow lookup match, and a flow action. In addition to these, the platform may

perform flow tracking and reconciliation (if VFP policies are updated, the respective flow

actions are updated too). With these changes, including the improvements in VFP and the

processing offload to hardware, AccelNet increases the bandwidth capacity (up to 32 Gbps)

and decreases latency (average of 50µs in some cases), while monitoring more than 500

metrics to allow more accurate system diagnostics.

Similarly to Microsoft's network virtualization platform, Google developed Andromeda (Dalton et al., 2018), the Google Cloud Platform network virtualization stack. This platform


was developed anchored in goals similar to Microsoft's: tenant isolation without sacrificing the throughput or latency of the substrate hardware; simplifying the addition of new network features; a commitment to high availability and operability, including transparent live VM migration and failure resolution; and control plane scalability. While the goals are similar, the approach followed is different: the solution is entirely software-based, with heavy optimizations in the network stack.

Figure 2.11: The Andromeda Stack (based on Figure 1 of Dalton et al. (2018)).

Figure 2.11 presents the Andromeda stack. There are two main parts: the Fabric Management Layer, which exposes a high-level API to express the configuration of the underlying cluster and program the VM host switches; and the Host/Switch Layer, which comprises an extended programmable software switch and a specific packet forwarding model, the Hoverboard, which combines the characteristics of traditional On-Demand and Gateway SDN models.

To achieve high throughput, the VM host data plane was divided into two packet processing paths: the Fast Path and the Coprocessor Path. The first processes performance-critical flows, by separating the ingress/egress engines for packet processing from control threads and by performing flow-table caching with a single hash table lookup. The Coprocessor Path was added to decouple the features that may occasionally disturb the Fast Path operation and lead to high latency or low throughput due to demanding packet processing. Example functions include encryption and WAN traffic shaping. In the end, the throughput and latency results are quite similar to AccelNet's (Firestone et al., 2018), while being anchored on a software-based solution that does not depend upon offloading to hardware.
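The Fast Path/Coprocessor split can be sketched as a dispatch on the features a flow requires (hypothetical names; in Andromeda the split is between busy-polling dataplane threads and separate coprocessor threads):

```python
class HostDataplane:
    """Sketch of Andromeda-style path splitting: performance-critical
    flows hit a flow-table cache on the Fast Path (a single dict lookup
    here), while flows needing heavyweight features (e.g., encryption or
    WAN traffic shaping) are handed to a Coprocessor Path, off the
    latency-critical threads."""

    HEAVY_FEATURES = {"encrypt", "wan_shape"}

    def __init__(self, flow_features: dict):
        self.cache = flow_features  # flow key -> required feature (or None)

    def process(self, flow_key):
        feature = self.cache.get(flow_key)
        if feature in self.HEAVY_FEATURES:
            return ("coprocessor", feature)
        return ("fast_path", feature)
```

The design choice mirrors the text: expensive per-packet work is moved out of the way so that it cannot inflate tail latency for the common case.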


2.2.4 Summary

To conclude this section, in Table 2.1 we present a summary of the network virtualization solutions, contrasting them with our own (Sirius).

With the exception of FlowVisor and Oktopus, the solutions we presented offer complete network virtualization of topology and addressing, and most follow an edge-based approach. Existing solutions that provide full network virtualization have, however, three limitations that motivate this thesis. First, the virtual network services provided are limited to traditional networking (L2, L3) and do not consider security services besides ACLs and firewalls (i.e., filtering mechanisms). Second, the substrate infrastructure is restricted to that of a single operator/provider. Third, they do not allow virtual networks to scale in and out, limiting the service's elasticity. This restricts the security, dependability and flexibility of the users' virtual networks.

In this thesis, we extend the state of the art by enhancing virtual networks with security (including availability), by extending the substrate with resources from multiple cloud infrastructures, and by proposing new scaling primitives for virtual networks.

                       | FlowVisor   | OVX           | NVP/NSX    | Oktopus    | AccelNet   | Andromeda                    | Sirius
Full virtualization    | No          | Yes (limited) | Yes        | No         | Yes        | Yes                          | Yes
Type of virtualization | Proxy-based | Proxy-based   | Edge-based | Edge-based | Edge-based | Edge-based                   | Edge-based
Arbitrary topology     | No          | Yes           | Yes        | No         | Yes        | Yes                          | Yes
Arbitrary addressing   | No          | Partial       | Yes        | No         | Yes        | Yes                          | Yes
Available open-source  | Yes         | Yes           | No         | No         | No         | No                           | Yes
Security services      | No          | No            | ACL only   | No         | ACL & NAT  | Stateful firewall, ACL & NAT | Node & link security
Multi-cloud system     | No          | No            | No         | No         | No         | No                           | Yes
Elastic VNs            | No          | No            | No         | No         | No         | No                           | Yes

Table 2.1: Summary of network virtualization solutions.


2.3 Virtual network embedding

One of the fundamental algorithmic challenges in network virtualization is the Virtual Network Embedding (VNE) problem, a focus of this thesis. VNE addresses the problem of embedding the virtual networks specified by the tenants into the substrate infrastructure, while making efficient use of the shared resources. It is traditionally formulated with the objective of maximizing the network provider's profit by efficiently embedding virtual network (VN) requests. This objective is subject to constraints, such as processing capacity on the nodes and bandwidth resources on the links. The VNE literature is already abundant, although security and dependability have been relatively neglected (Fischer et al., 2013a). In the following, we present some of the most important studies on VNE.

2.3.1 Baseline VNE Solutions

One of the seminal papers on VNE is the work by Yu et al. (2008). The authors proposed an original two-phase approach to solve node and link embedding. First, a greedy approach is used for node embedding. Link embedding follows, with two approaches proposed: multi-commodity flow (MCF) and k-shortest paths. The greedy approach for node mapping aims to choose the substrate nodes with the most resources available at a given moment. The function used to measure the resources considers both a node's CPU and the bandwidth of all links connected to it. Another innovation was the assumption that path splitting was available in the substrate, making the use of MCF viable in the link embedding phase. Solving this kind of problem without path splitting is considered intractable because it is NP-hard (Andersen, 2002), and MCF decreases the time to solve the problem significantly.
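The greedy node-mapping metric can be sketched directly from this description: a substrate node's measure combines its available CPU with the bandwidth of its adjacent links (function and variable names are ours):

```python
def node_rank(cpu: dict, links: list) -> dict:
    """Greedy node-mapping metric in the spirit of Yu et al.'s two-phase
    approach: H(n) = CPU(n) * sum of the bandwidth of links adjacent to n.
    Virtual nodes are then mapped, in order, onto the substrate nodes
    with the highest H."""
    bw = {n: 0.0 for n in cpu}
    for u, v, b in links:  # each link as (endpoint, endpoint, bandwidth)
        bw[u] += b
        bw[v] += b
    return {n: cpu[n] * bw[n] for n in cpu}
```

For example, a node with 4 CPU units and 10 units of adjacent bandwidth ranks at 40, beating a node with 2 CPU units on the same link (rank 20); the greedy phase would map the most demanding virtual node there first.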

Later, Chowdhury et al. (2012) proposed two algorithms for VNE that introduced better coordination between the node and link mapping phases. In Figure 2.12, we present the main idea to enable coordination: generating an augmented substrate graph. In this solution, meta-nodes and meta-links are inserted into the original graph, considering a location requirement that connects the two phases. After the graph is augmented, a relaxed linear program is solved to obtain the node mappings. The link mapping follows the same MCF approach as Yu et al. (2008).


Figure 2.12: Example of an augmented substrate graph with clusters, meta-nodes and meta-links (based on Figure 2 of Chowdhury et al. (2012)).

Cheng et al. (2011) proposed a different approach. They used Markov chains to compute a node ranking based on CPU, bandwidth, and the importance of the node in the topology, and used this information in the two-phase approach, with node and link mappings separated as usual. Differently from other works, the node ranking is accomplished by taking into account topology attributes (first phase), with the purpose of improving the success rate and efficiency of link mapping (second phase). In Figure 2.13 we illustrate the main concept of "importance" in this context. Node A1 is more important than node A2 because the resources available in its neighbors are higher, so it is better positioned in the topology.

Figure 2.13: Example of how topology can influence node mapping (based on Figure 2 of Cheng et al. (2011)).
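A PageRank-style iteration captures this notion of importance: a node ranks high when it has resources and its neighbors also rank high. A sketch (the update rule and damping factor below are simplifications of our own, not Cheng et al.'s exact NodeRank computation):

```python
def noderank(resources: dict, adj: dict, iters: int = 50, d: float = 0.85):
    """Markov-chain node ranking sketch: each node's rank mixes its own
    share of resources with rank received from neighbors, iterated to a
    fixed point (PageRank-like)."""
    total = sum(resources.values())
    rank = {n: resources[n] / total for n in resources}
    for _ in range(iters):
        new = {}
        for n in resources:
            # Rank flowing in from every neighbor m, split over m's degree.
            incoming = sum(rank[m] / len(adj[m]) for m in adj if n in adj[m])
            new[n] = (1 - d) * resources[n] / total + d * incoming
        rank = new
    return rank
```

On a star topology with equal resources, the center node ends up ranked highest, matching the intuition of the figure: well-connected nodes are better positioned for the subsequent link-mapping phase.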


2.3.2 Dependable VNE

While important baselines, none of these works consider dependability. As failures are inevitable, it is no surprise that solutions considering this problem have been developed for VNE. One of the first instances was the work by Rahman et al. (2010), which formulates the survivable virtual network embedding (SVNE) problem to incorporate recovery from single substrate link failures in VNE. The authors propose a heuristic that solves SVNE in three separate phases: 1) before any VN request arrives, backups for each substrate path are pre-computed; 2) when a VN request arrives, node embedding is performed using existing heuristics (for instance, Chowdhury et al. (2012)'s), and link embedding is performed through a linear program based on multi-commodity flow; and 3) when a substrate link failure happens, a reactive backup detour optimization solution is invoked, which reroutes the affected bandwidth along candidate backup detours selected in the first phase. Yu et al. (2011) focus on the failure recovery of substrate nodes. The idea of this paper is to take the original non-survivable virtual network (VN) and transform it into a VN with redundant VN nodes so that, if a critical VN node fails, the failed node and its connections are migrated to a redundant node. To improve efficiency, some authors consider pooling of backup resources, such as Yeow et al. (2010).

More recently, Shahriar et al. (2016) investigated the problem of ensuring virtual network connectivity in the presence of multiple substrate link failures. The approach proposed consists of augmenting the topology with parallel virtual links between adjacent virtual nodes. This idea can be seen in Figure 2.14, where the original VN is represented by solid edges. Note that if links G-B and G-C fail in the original VN, G and H are disconnected from the network. To ensure tolerance to two link failures, three edges (dashed) are added. In this article, two solutions are proposed: one completely heuristic, and another partially heuristic, partially based on linear programming. In both solutions, the first step computes the conflict set: for a VN to be k-protected, the algorithm augments the graph to guarantee k+1 edge connectivity between every pair of virtual nodes that belong to the VN. The second step embeds the augmented VN using a heuristic (in the first solution) or by solving a MILP (in the second).


Figure 2.14: A VN not capable of tolerating two link failures (solid edges) and an augmented VN that tolerates them (both solid and dashed edges) (based on Figure 2 of Shahriar et al. (2016)).

2.3.3 Secure VNE

Fischer et al. were amongst the first to introduce security into the VNE problem, in a position paper (Fischer & Meer, 2011). The authors argue that the abstraction provided by virtualization mechanisms introduces several additional layers that may pose significant security risks, as additional attack vectors become available to adversaries. They categorize the problems into three types of attacks: attacks from the physical host to its VMs; attacks from a VM to the physical host; and side-channel attacks between VMs. The first and third types of attack reinforce the need to run virtual hosts on trustworthy hosts, something not available in existing network virtualization platforms and at which this thesis makes a first attempt. The authors proposed the security-aware mapping of virtual resources. The idea is to assign security levels and security demands to every virtual network request and to every physical resource. Then, additional constraints need to be added to the original VNE problem: virtual resources should not be mapped to physical resources that have a lower security level than the security demand of the virtual resource; physical resources should not be used to host virtual resources that have a lower security level than the security demand of the physical resource; and a virtual resource should not be co-hosted on the same physical resource as another virtual resource that has a lower security level than the security demand of the first resource. In this position paper, the authors only presented the problem and a rough sketch of the solution, but did not propose, implement or evaluate any algorithm to address it. In addition, the model is overly generic,


being both user- and provider-driven. As a result, it is inadequate for our scenario. Specifically, ours is a user-driven approach, with the user defining its virtual network resource demands and our solution mapping these into substrate resources that fulfill the request. Moreover, in this work, only node security issues are considered.
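The three security constraints above translate directly into admission checks. A sketch with our own names, treating security levels and demands as abstract integers:

```python
def secure_node_mapping_ok(v_sec_level: int, v_sec_demand: int,
                           p_sec_level: int, p_sec_demand: int,
                           cohosted_levels=()) -> bool:
    """Admission check for the three constraints of a security-aware VNE
    model: (1) the physical host must be at least as secure as the virtual
    node demands; (2) the virtual node must be at least as secure as the
    host demands; (3) every virtual resource already co-hosted must meet
    the new node's security demand."""
    if p_sec_level < v_sec_demand:
        return False  # host not trusted enough for this virtual node
    if v_sec_level < p_sec_demand:
        return False  # virtual node not trusted enough for this host
    return all(level >= v_sec_demand for level in cohosted_levels)
```

An embedding algorithm would run such a check for every candidate substrate node, pruning the search space before optimizing cost.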

Considering that links may also suffer from security threats (e.g., adversaries can influence the physical links in negative ways, such as through replay attacks), Bays et al. (2012) and Liu et al. (2014) afterward proposed VNE algorithms based on this idea. However, their specification of security is limited. Bays et al. (2012) consider only link protection, selecting specific physical paths to provide a cryptographic channel that assures confidentiality in communications. Liu et al. (2014) go further by also considering node security. However, their model makes a few assumptions that make it unfeasible for a practical network virtualization platform. First, it requires virtual nodes to fulfill a specific security level demanded by the physical host. In practice, a virtualization platform cannot assume such a level of trust about its guests. Second, it assumes that the duration of a virtual network request is known beforehand. This limits its applicability in a traditional pay-as-you-go model, as a cloud tenant typically does not know in advance the duration of its requests. As a design requirement, to be as realistic as possible, we make none of these assumptions in our work. Finally, none of these works considers availability or a multi-cloud setting with different trust domains.

2.3.4 Multi-domain VNE

The majority of the works on VNE only consider a single provider. An exception is Chowdhury et al. (2010). In this paper, the authors addressed the conflicts of interest between service providers (which are interested in satisfying their demands while minimizing their expenditure) and infrastructure providers (which strive to optimize the allocation of their resources by favoring requests that offer higher revenue while offloading unprofitable work to their competitors). The paper proposed PolyVINE, a policy-based end-to-end VNE framework that partitions a VN request into k subgraphs to be embedded onto k substrate networks, establishes inter-connections between the k subgraphs using inter-domain paths, and embeds each subgraph in each provider using an intra-domain algorithm. The goal of this work is to coordinate policies across domains, a problem different from ours.


Other examples include the works by Houidi et al. (2011) and Dietrich et al. (2015). These solutions consider multiple substrates in the embedding and address some important, albeit orthogonal, problems that could be incorporated into our Sirius (e.g., Dietrich et al. (2015) address the multi-domain VN embedding problem with limited information disclosure, which is of high relevance to VN providers).

2.3.5 Summary

To conclude this section, we present, in Table 2.2, a summary of the most relevant works on VNE. The main conclusion is that existing work focuses on specific aspects: fault tolerance of nodes or links (rarely both), security (less often) of nodes or links (rarely both), and multi-domain substrates (even rarer). This limitation of the state of the art motivated us to develop new embedding algorithms that integrate all these aspects to build a secure and dependable, user-centric solution.

One common limitation of the majority of VNE works is that they focus solely on the embedding problem and do not build a virtual network prototype, as we set out to do in this thesis.

                          Node       Link       Tolerate node   Tolerate link   Multi-domain/
                          security   security   failures        failures        Multi-cloud
Yu et al. (2008)          No         No         No              No              No
Chowdhury et al. (2012)   No         No         No              No              No
Cheng et al. (2011)       No         No         No              Yes             No
Yu et al. (2011)          No         No         Yes             No              No
Rahman et al. (2010)      No         No         No              Yes             No
Yeow et al. (2010)        No         No         Yes             Yes             No
Shahriar et al. (2016)    No         No         No              Yes             No
Liu et al. (2014)         Yes        Yes        No              No              No
Bays et al. (2012)        Yes        Yes        No              No              No
Chowdhury et al. (2010)   No         No         Yes             No              Yes
Houidi et al. (2011)      No         No         Yes             No              Yes
Dietrich et al. (2015)    No         No         Yes             No              Yes
Sirius                    Yes        Yes        Yes             Yes             Yes

Table 2.2: Summary of the most relevant embedding approaches.


2. BACKGROUND AND RELATED WORK

2.4 Multi-cloud systems


Figure 2.15: Xen-Blanket overview architecture (based on Figure 1 of Williams et al. (2012)).

The multi-cloud model has been successfully applied in the context of computation (Costa et al., 2016, 2017; Williams et al., 2012) and storage (Bessani et al., 2013, 2014). Williams et al. (2012) first proposed Xen-Blanket: a novel approach to solve the problem of heterogeneity in the virtual machine managers and the VMs of diverse cloud providers, which precludes unified cross-cloud access and control. Instead of waiting for standards to be developed to homogenize the multi-cloud environment, this work follows a user-centric approach by creating a second-layer hypervisor that runs as a guest inside the VM instances, both in public and private clouds. As depicted in Figure 2.15, Xen-Blanket uses nested virtualization as a layer that is capable of communicating with a variety of hypervisor interfaces (bottom), exposing a single VM interface to the second layer (top) that is under the user's control. Xen-Blanket allows services to be deployed in a multi-cloud environment, enabling inter-cloud VM migration, among other benefits.

From the point of view of dependability, Costa et al. (2016) proposed a system that allows MapReduce computations to scale out to multiple clouds (Medusa), with the purpose of



Figure 2.16: The multi-cloud Chrysaor's system model (based on Figure 1 of Costa et al. (2017)).

tolerating several types of faults, including arbitrary faults, malicious faults, and cloud outages. Chrysaor (Costa et al., 2017) is an improvement over Medusa that offers fine-grained replication, allowing the re-execution of only the part of the job (the tasks) affected by an issue. For this purpose, Chrysaor creates "logical jobs" that enable this fine granularity. We present the Chrysaor system model in Figure 2.16. The client submits an unmodified MapReduce job that is intercepted by Chrysaor, which acts as a proxy. Chrysaor submits the job as a set of logical jobs – to separate the Map and Reduce phases – to multiple resource managers, in the required number of clouds to meet the dependability requirements. These logical jobs are then received by an unmodified resource manager that hands job execution to the various nodes.

In the context of storage, Bessani et al. (2013) proposed DepSky: a multi-cloud storage solution whose architecture is displayed in Figure 2.17. DepSky improves the availability, integrity and confidentiality of information stored in the cloud through the encryption, encoding and replication of the data on diverse clouds. Afterwards, Bessani et al. (2014) proposed SCFS: a cloud-backed file system that uses a multi-cloud approach while providing strong consistency and near-POSIX semantics on top of eventually-consistent cloud storage services.



Figure 2.17: Architecture of DepSky (based on Figure 3 of Bessani et al. (2013)).

A contemporary approach to ours, Supercloud (Shen et al., 2017), builds on Xen-Blanket to provide users with cloud services that span multiple clouds. Similar to our work, they use an SDN-based approach for the networking component. The main differentiating factor of our work is that Supercloud does not offer network virtualization. The networking services made available are limited to connectivity between virtual nodes, guaranteeing that IP addresses do not change when VM migration is necessary. By contrast, our solution offers virtualization of the topology and of the addressing scheme. In addition, it includes security, availability, and elasticity services in the virtual networks. Another difference is that Supercloud is restricted to cloud VM instances that offer the interfaces necessary for the nested virtualization technique employed, namely HVMs that provide a PV interface. Our solution does not require that support, enabling any type of VM instance to be used as part of our substrate.

                         Compute   Storage   Network   Security   Availability   Elasticity
Williams et al. (2012)   X
Costa et al. (2016)      X                                        X
Costa et al. (2017)      X                                        X
Bessani et al. (2013)              X                   X          X
Bessani et al. (2014)              X                   X          X
Shen et al. (2017)       X
Sirius                                       X         X          X              X

Table 2.3: Summary of multi-cloud systems.


2.5 Elastic Virtual Networks

One of the main benefits of the virtualized infrastructures that enable cloud computing is the ability to request resources on-demand, as needed. This is opposed to buying hardware upfront and over-provisioning the data center in order to handle sudden increases in traffic, or even expected short-term changes in demand. This is true for compute and storage resources today, but ideally the network should also dynamically adapt to the changing application needs. Modern virtualization platforms, however, do not allow scaling virtual networks. In addition, most VNE work ignores the requirement to dynamically scale virtual networks.

Two exceptions are Kraken (Fuerst et al., 2016) and Yu & Cai (2016). Kraken supports online updates to both bandwidth and compute resources. The system also performs migration of specific VMs (the end nodes) to reconfigure the substrate. Yu & Cai (2016) also address the problem of scaling up a virtual network with bandwidth guarantees, but proposed dynamic programming algorithms for scaling, optimizing virtual cluster locality and VM migration cost. One limitation of these works is that they focus on a single topology: the virtual cluster. By contrast, in Sirius we address the problem of scaling arbitrary virtual topologies. A very recent effort by Michel et al. (2019) addresses this exact problem; however, as we will see in Chapter 6, the migration cost of this solution is very high. Sirius improves significantly over Michel et al. (2019) in this aspect, as we will see in Section 6.5.

A disjoint set of works focuses on scaling in the context of Network Function Virtualization (NFV) environments, a setting different from ours. Wang et al. (2015), for instance, solve the virtual network function (VNF) placement and scaling problem considering a pre-planned allocation that minimizes resource usage and overhead while guaranteeing the bandwidth requirements. Dräxler et al. (2017) formalized this as the template embedding problem, but in their solution the decision is made based exclusively on the data rate from different sources on the network nodes. As we said, in spite of considering scalability, these works focus on NFV chains, a different problem from ours.


                         Network     Scaling in/   Virtual
                         migration   scaling out   network
Fuerst et al. (2016)     Yes         Yes           VC only
Yu & Cai (2016)          No          Yes           VC only
Wang et al. (2015)       No          Yes           NFV chains
Dräxler et al. (2017)    No          Yes           NFV chains
Wang et al. (2008)       Yes         No            Limited
Ghorbani et al. (2014)   Yes         No            Arbitrary
Michel et al. (2019)     Yes         Yes           Arbitrary
Sirius                   Yes         Yes           Arbitrary

Table 2.4: Elastic VNs and enablers.

2.5.1 Enablers for elastic VN

One of the challenges of scaling virtual networks is fragmentation (Michel et al., 2019). As virtual networks are mapped onto the substrate, and are afterwards scaled in and out, the substrate tends to become fragmented (akin to disk fragmentation), as the resources of a particular VN become scattered across the substrate. As a consequence, path lengths increase, leading to higher latencies and inefficiencies. In Chapter 6 of this thesis we explore the use of recent network migration solutions to reduce fragmentation and improve network efficiency, so we address these enabler techniques in this section.
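The effect can be illustrated with a toy metric – the average number of substrate hops consumed by a virtual network's links – computed here with a plain BFS. The substrate, the mappings, and the metric below are our own illustration, not part of Michel et al. (2019) or of Sirius.

```python
from collections import deque

def shortest_hops(adj, src, dst):
    """BFS hop count between two substrate nodes."""
    seen, q = {src}, deque([(src, 0)])
    while q:
        node, d = q.popleft()
        if node == dst:
            return d
        for n in adj[node]:
            if n not in seen:
                seen.add(n)
                q.append((n, d + 1))
    raise ValueError("disconnected substrate")

def avg_path_length(adj, mapping, vlinks):
    """Average substrate hops consumed by the embedded virtual links."""
    hops = [shortest_hops(adj, mapping[u], mapping[v]) for u, v in vlinks]
    return sum(hops) / len(hops)

# A 6-node line substrate: 0-1-2-3-4-5.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
vlinks = [("a", "b"), ("b", "c")]

consolidated = {"a": 0, "b": 1, "c": 2}   # VN kept in one region
scattered    = {"a": 0, "b": 3, "c": 5}   # VN fragmented after scaling

print(avg_path_length(adj, consolidated, vlinks))  # 1.0
print(avg_path_length(adj, scattered, vlinks))     # 2.5
```

The scattered embedding consumes 2.5x the substrate bandwidth of the consolidated one for the same virtual links, which is precisely the inefficiency that migration-based defragmentation targets.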

While VM migration is already a consolidated technology (Clark et al., 2005), the ability

to migrate networks (both VMs and switching equipment) is not. This technique can be

tremendously useful for data center virtualization solutions as entire virtual networks can

be moved and consolidated, reducing fragmentation. Wang et al. (2008) proposed VROOM

(Virtual ROuters On the Move), a network-management primitive where virtual routers

may move between physical routers without changes to the topology. This is the first work

where the authors argue that the decoupling between logical and physical configurations

may simplify network management, but it was restricted to traditional networks.

Later, Ghorbani et al. (2014) proposed LIME, a system that enables live migration of an ensemble (a set of virtual machines and virtual switches), giving cloud providers new tools for planned maintenance, resource-usage optimization, and other tasks. LIME consists of a network abstraction layer and an API that runs between the SDN controller and the tenant's applications, enabling a transparent live migration process.

Figure 2.18: A simple ensemble migration (based on Figure 2 of Ghorbani et al. (2014)).

The main challenge is to be capable of migrating (all or part of) the virtual network without penalizing the network's normal behavior. During a migration, VMs should continue to observe packets being received and sent, and SDN controllers should continue to observe events and be able to query statistics, for instance. All of this should occur without the tenant perceiving any network change, apart from slight latency/throughput variation and perhaps occasional packet drops or out-of-order delivery. As regular networks already operate in this best-effort manner, these specific failures do not break the transparency of the migration process.

In Figure 2.18, we illustrate the live migration process in LIME, in a simple network with one switch and 3 hosts (the ensemble at the top of the figure). The goal is to move the ensemble from the left to the right. In the middle of the figure we present a naive approach that creates a tunnel between the two switches, original and new. This forces h1 to communicate with h2 via the tunnel, even though they are still connected to the same switch. At the bottom of the figure, we show the LIME approach. The idea is to clone the switch during the migration process, enabling local communications to remain local. The challenge LIME then addresses is maintaining consistency throughout the process. In Chapter 6 we explore the use of a live network migration approach such as LIME to enable efficient and elastic virtual networks.

In Table 2.4 we present a summary of some of the relevant approaches that aim to enhance virtual networking with elasticity, together with the enablers just discussed. The main problem of the most promising solutions is that they target a specific topology, so they do not address our generality requirement.

2.6 Summary

In this chapter we have presented the platforms, mechanisms and algorithms of modern network virtualization, and detailed the limitations that motivate our work.

Specifically, modern network hypervisors target a single cloud provider scenario, motivating us to explore multi-cloud approaches centred on the user. In addition, they are limited in the security, dependability, and elasticity services they provide, motivating us to innovate, particularly in the algorithmic support that underlies these solutions.


3 Sirius: Multi-cloud Network Virtualization

Recent SDN-based network virtualization solutions (Dalton et al., 2018; Firestone et al., 2018; Koponen et al., 2014) give cloud providers the opportunity to extend their "as-a-service" model with the offer of complete network virtualization. They provide tenants with the freedom to specify the network topologies and addressing schemes of their choosing while guaranteeing the required level of isolation among them. These platforms, however, have been targeting the data center of a single cloud provider with full control over the infrastructure.

Another limitation of existing multi-tenant network virtualization platforms is that they have so far focused on the offer of conventional networking services only. As such, they face limitations in terms of security and dependability, both in terms of the infrastructure itself and of the services offered to its customers. To address these challenges, in this chapter we present the design and implementation of Sirius (Alaluna et al., 2016, 2017), a network virtualization platform for multi-cloud environments. Contrary to existing solutions, Sirius considers not only connectivity and performance, but also security and dependability as first-class citizens, leveraging a substrate infrastructure composed of both public clouds and private data centers.


3.1 Motivation

The advances in computing and storage virtualization that enabled cloud computing have not been met by networking. Traditional forms of network virtualization (VLANs, etc.) do not present the scalability and flexibility that is necessary for current cloud environments. The reason lies fundamentally in the complexity of network management and control. In particular, networking has lacked unifying abstractions to enable network-wide visibility and control. As a result, network provisioning is typically orders of magnitude slower when compared to its computing and storage counterparts (Dalton et al., 2018; Firestone et al., 2018; Koponen et al., 2014).

The current state of affairs has recently started to change with the emergence and rapid adoption of Software-Defined Networking (SDN). By decoupling the networking planes and by logically centralizing control, SDN offers operators network-wide visibility and direct control over traffic in the network (Kreutz et al., 2015). These capabilities have led to the development of production-quality network virtualization platforms (Koponen et al., 2014) that allow the creation of virtual networks, each with independent service models, topologies, and addressing schemes, over the same substrate network.

Current multi-tenant network hypervisors target single-provider deployments and traditional services, such as flat L2 or L3, as their goal is to enable tenants to use their existing cloud infrastructures. Such a single-cloud paradigm has inherent limitations in terms of scalability, security, and dependability, which may dissuade critical systems from being migrated to the cloud. For instance, a tenant may want to outsource part of its compute and network infrastructure to a public cloud, but may not be willing to trust the same provider to store its confidential business data or to run sensitive services, which should stay in a more trusted environment (e.g., a private data center). To avoid cloud outages disrupting its services – a type of incident increasingly common (USA TODAY, 2017) – the tenant may also wish to spread its services across clouds, to avoid Internet-scale single points of failure.

To address this challenge, we propose Sirius, a multi-cloud network virtualization platform. Contrary to previous approaches, Sirius leverages a substrate infrastructure that entails both public clouds and private data centers. This brings with it several important benefits. First, it increases resilience. Replicating services across providers avoids single points of failure, making a tenant immune to any datacenter outage. Second, it can improve security, for instance by exploring the interaction between public and private clouds. A tenant that needs to comply with privacy legislation, such as the GDPR, may demand certain data or specific services be placed in trusted locations. In addition, it can improve performance and efficiency. For example, the placement of virtual machines may consider service affinity to reduce latencies. Specific workloads can also be migrated to clouds which consume less energy (Baliga et al., 2011). Dynamic pricing plans from multiple cloud providers can also be explored to improve cost-efficiency (Zheng et al., 2015). The multi-cloud model has recently been successfully applied in the context of computation (Williams et al., 2012) and storage (Bessani et al., 2014). To the best of our knowledge, this is the first time the model is applied to network virtualization.

3.2 Requirements

The network virtualization platform we propose leverages network infrastructure from both public cloud providers and private infrastructures (or private clouds) of the tenants. This heterogeneity impacts the level of network visibility and control that may be achieved, affecting the type of configurations that can be pushed to the network, with obvious consequences on the kind of services and guarantees that can be assured by the solution.

At one extreme, the public cloud provider gives very limited visibility and no (or extremely limited) network control, which is often the case with commercial cloud service providers (e.g., AWS). Even in this case, these clouds offer a full logical mesh among local VM instances (i.e., they provide a "big switch" abstraction), which we can use to implement logical software-defined datapaths and thus present a virtual network to the tenant.

At the other extreme, full access may be attainable if the cloud is private (i.e., the data center belongs to the tenant). This results in a flexible topology that may be (partially) SDN-enabled, where both software and hardware switching may be employed.
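These two extremes can be captured in a single substrate model: each public cloud contributes a full logical mesh (the "big switch") among its VMs, while a private cloud contributes its actual topology. The sketch below is our own illustration of this idea; the node names are placeholders.

```python
from itertools import combinations

def add_public_cloud(edges, vms):
    """Public clouds expose a full logical mesh among the local VM instances."""
    edges.update(frozenset(pair) for pair in combinations(vms, 2))

def add_private_cloud(edges, links):
    """Private clouds expose their real (possibly SDN-enabled) topology."""
    edges.update(frozenset(link) for link in links)

edges = set()
add_public_cloud(edges, ["ec2-vm1", "ec2-vm2", "ec2-vm3"])              # big switch
add_private_cloud(edges, [("dc-vm1", "dc-sw1"), ("dc-sw1", "dc-vm2")])  # real links

print(len(edges))  # 3 mesh edges + 2 private links = 5
```

Once both cloud types are folded into one annotated graph, the embedding logic can treat the substrate uniformly, regardless of how much control each provider actually grants.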

Considering this setting, we aim to fulfill three requirements in the design of our multi-cloud network hypervisor. The first requirement is to have remote, flexible control over the network elements. Traditional networks' lack of such control has been identified as the main reason for the limitations of current forms of network virtualization (Koponen et al., 2014).

The second requirement is to offer complete network virtualization, including topology and addressing abstraction, and isolation between tenants. For topology abstraction, different mappings should be created when the network is set up. For instance, a virtual link can correspond to multiple network paths connecting the two endpoints. In addition, tenants should have complete autonomy to manage the address space of their own virtual network. Lastly, isolation between users should be enforced at different degrees. A first level is attained by separating the virtual networks of the users and then hiding them from each other when they are deployed. A second level is to prevent the actions of one user from influencing the network behavior observed by the others. For example, if one of the users attempts to clog a particular link, this should not cause a significant decrease in the bandwidth available to the other users.
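Address-space autonomy implies that two tenants may pick overlapping virtual addresses. Conceptually, the hypervisor can disambiguate them by keying its mappings on (tenant, virtual address) rather than on the address alone. A minimal sketch of this idea (the table and names are ours, not Sirius's actual data structures):

```python
# Two tenants may reuse the same private address space; flows are
# disambiguated by the (tenant, virtual address) pair. Illustrative only.
location_table = {}

def register(tenant, vip, substrate_vm, port):
    """Record where a tenant's virtual IP is hosted in the substrate."""
    location_table[(tenant, vip)] = (substrate_vm, port)

def resolve(tenant, vip):
    """Look up the substrate location of a tenant's virtual IP."""
    return location_table[(tenant, vip)]

register("tenant1", "10.0.0.1", "aws-vm2", 3)
register("tenant2", "10.0.0.1", "priv-vm1", 7)   # same virtual IP, no conflict

print(resolve("tenant1", "10.0.0.1"))  # ('aws-vm2', 3)
print(resolve("tenant2", "10.0.0.1"))  # ('priv-vm1', 7)
```

Because every lookup is scoped by tenant, each virtual network sees its own address space while the substrate keeps the two flows entirely separate.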

The third requirement is to further improve over existing network virtualization solutions by allowing users to specify security and dependability requirements for all virtual resources. While specifying the virtual network, besides defining requirements related to connectivity and performance for nodes and links (for instance, bandwidth and CPU), we also allow tenants to specify security and dependability properties for all virtual elements. These requirements are enforced during embedding by laying out the virtual elements at the appropriate locations, where the substrate infrastructure still has enough resources to satisfy the particular demands. In addition, the datapaths are configured with adequate routes through the network.
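Such a request could be represented as annotated nodes and links. The sketch below is our own illustration, with hypothetical attribute names: a two-node request where the database node demands a high-security, high-trust placement and the link asks for a backup path.

```python
from dataclasses import dataclass, field

@dataclass
class VirtualNode:
    name: str
    cpu: int           # required compute capacity
    sec_level: int     # minimum security level of the hosting substrate node
    trust_level: int   # minimum trust level of the hosting cloud

@dataclass
class VirtualLink:
    ends: tuple
    bandwidth: int               # required bandwidth (Mbps)
    backup_path: bool = False    # dependability: pre-provision a fail-over path

@dataclass
class VNRequest:
    nodes: list = field(default_factory=list)
    links: list = field(default_factory=list)

req = VNRequest(
    nodes=[VirtualNode("web", cpu=2, sec_level=1, trust_level=1),
           VirtualNode("db",  cpu=4, sec_level=3, trust_level=3)],  # sensitive
    links=[VirtualLink(("web", "db"), bandwidth=100, backup_path=True)],
)
print(max(n.sec_level for n in req.nodes))  # 3
```

The embedding stage can then treat these attributes as hard constraints: the "db" node is only eligible for substrate nodes at security level 3 or above, hosted in a cloud of trust level 3 or above.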

3.3 Design of Sirius

Existing multi-tenant network virtualization platforms have so far focused on the offer of conventional networking services by a single cloud provider. As we have argued, this limits them in terms of security and dependability, both in terms of the infrastructure itself and of the services offered to its customers. To address these challenges, Sirius allows an organization to manage resources belonging to multiple clouds, which can then be transparently shared by various users (or tenants). Resources are organized as a single substrate infrastructure, effectively creating the abstraction of a cloud that spreads over several clouds, i.e., a cloud-of-clouds (Lacoste et al., 2016). The main resources are interconnected virtual machines (VMs) that are either acquired from public cloud providers or are placed in local facilities (i.e., private clouds). The Sirius prototype was also extended to include other cloud resources, namely storage services (Oliveira et al., 2017).

Users can define virtual networks composed of a number of containers interconnected according to an arbitrary topology. Sirius deploys these virtual networks in the substrate infrastructure, ensuring isolation of the traffic by setting up separated datapaths (or flows). While specifying the virtual network, it is possible to indicate several requirements for the nodes and links, for example with respect to the needed bandwidth, security properties, and fault tolerance guarantees. These requirements are enforced during embedding (to be explored in the next chapters) by laying out the containers at the appropriate locations, where the substrate infrastructure still has enough resources to satisfy the particular demands. In addition, the datapaths are configured to follow adequate routes through the network.

3.3.1 Architecture

The architecture of Sirius is displayed in Figure 3.1. The multi-cloud orchestrator is responsible for the dynamic creation of the substrate infrastructure by deploying the VMs and containers. It also configures secure tunnels between gateway modules, normally building a fully connected topology among the participating clouds. A gateway acts like an edge router, receiving local packets whose destination is in another cloud and then forwarding them to its peer gateways, allowing data to be sent securely to any container in the infrastructure. Intra-cloud communications between tenant containers use GRE tunnels set up between the local VMs, to ensure isolation.
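For the intra-cloud side, the GRE ports could be created on each VM's Open vSwitch bridge. The helper below only builds the command strings; the bridge name and IP addresses are placeholders, while the `ovs-vsctl add-port ... type=gre options:remote_ip=...` syntax follows the stock Open vSwitch CLI.

```python
def gre_port_cmds(bridge, peer_ips):
    """Build the ovs-vsctl commands that add one GRE tunnel port per peer VM.

    The bridge name and IPs here are placeholders; the command syntax
    follows the standard Open vSwitch CLI.
    """
    return [
        f"ovs-vsctl add-port {bridge} gre{i} -- "
        f"set interface gre{i} type=gre options:remote_ip={ip}"
        for i, ip in enumerate(peer_ips)
    ]

# One VM tunnelling to its two local peers over the cloud-internal network.
cmds = gre_port_cmds("br-int", ["10.0.0.2", "10.0.0.3"])
for c in cmds:
    print(c)
```

An orchestrator would run one such command list per VM, yielding a full GRE mesh inside each cloud, while the inter-cloud hops traverse the gateways' secure tunnels instead.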

The network hypervisor runs as an application on top of an SDN controller. It takes all decisions related to the placement of the virtual networks, and sets up the network paths by configuring software switches (Open vSwitch (Pfaff et al., 2015)) that are installed in all VMs (along with OpenFlow hardware switches that may exist in private clouds, not shown in the figure). The hypervisor intercepts the control messages between the substrate infrastructure and the users' virtual networks, and vice-versa, thus enabling complete network virtualization.

Figure 3.1: Sirius architecture.

The hypervisor was developed using a shared controller approach (the solution also adopted in (Koponen et al., 2014)). Alternative solutions, including OVX (Al-Shabibi et al., 2014), assume one controller per tenant. Ours is a more lightweight solution, as only one logically-centralized component is needed for all tenants. It is also simpler to implement, as it can take advantage of the high-level APIs offered by the SDN controller, instead of having to deal with "raw" OpenFlow messages when interacting with the switches. Finally, this architecture follows a fate-sharing design, as the controller and the network hypervisor reside in the same host. This facilitates replication for fault-tolerance.

3.3.2 Overview of Sirius operation

The deployment of a virtual network in the platform involves the execution of a few tasks. The first task is to assemble the substrate infrastructure. The administrator of Sirius within the organization needs to indicate the resources that are available to build the infrastructure. She interacts with a graphical interface1 offered by the cloud orchestrator that allows the selection of the cloud providers, the type and number of VMs that should be created, and the provision of the necessary access credentials. The network topology is also specified, pinpointing, for instance, the connections between clouds. For each provider, it is possible to specify a few attributes, such as the associated trust level.

Based on such data, the orchestrator constructs the substrate infrastructure by interacting with the cloud providers and by setting up the VMs. In each VM a few skeleton containers are started with minimal functionality. The gateways are also interconnected with the secure tunnels. The next step is for the hypervisor to be initialized by obtaining, from the orchestrator, information about the infrastructure. Then, it contacts each network switch to obtain data about the existing interfaces, port numbers, and connected containers. After populating the hypervisor's internal data structures, Sirius is ready to start serving the users' VN requests.

The second task is to run virtual networks on demand, whenever a user of the organization needs to run an application in the cloud. The user employs a graphical interface of the orchestrator to represent a virtual network with the various containers that implement the application. Containers are then interconnected with the desired (virtual) switches and links. Complete flexibility is given on the choice of the network topology and addressing schemes. Attributes may be associated with the containers and links, specifying particular requirements with respect to security and dependability. For example, certain links may need to have backup paths to allow for fast fail-over, while certain containers may only be deployed in clouds with the highest trust levels.

The orchestrator receives the VN request and forwards it to the hypervisor to perform the virtual network embedding. The embedding algorithm (described in the next two chapters) decides on the location of the containers and network paths considering all constraints, namely the available resources in the substrate infrastructure and the security requirements. The computed mapping is transmitted to the orchestrator so that it can be displayed upon request of the Sirius administrator. Hereafter, the orchestrator and the hypervisor work in parallel to start the VN. The orchestrator downloads and initializes the container images in the chosen VMs, and configures the IP and MAC addresses based on the tenant's request. The hypervisor enables connectivity by configuring the necessary routes, setting up the flows in the switches while enforcing isolation between tenants.

1The same sort of information can also be provided through configuration files, to simplify the use of scripts.

3.3.3 The multi-cloud orchestrator

The multi-cloud orchestrator combines three main features, as can be seen in Figure 3.2. First, it manages interactions with users through a web-based graphical interface. Users with administrator privileges can design the substrate infrastructure topology (Admin GUI), indicating the kind of VMs that should be deployed in each cloud provider. Similarly, normal users can represent virtual networks of containers (User GUI) and later request their deployment. The graphical interface also displays the mappings between the containers and links in the substrate infrastructure and the status of the various components. A view of the graphical interface is shown in Figure 3.3.


Figure 3.2: Modular architecture of the multi-cloud orchestrator.

Second, it keeps information about the topologies of the substrate and virtual networks and their mappings. This information is kept updated, as virtual networks are created and destroyed, thus offering a complete view of how the infrastructure is currently organized. In addition, it maintains in external storage a representation of the different networks that were specified, allowing their re-utilization when users want to run similar deployments.

Figure 3.3: Graphical User Interface of Sirius.

Third, it configures and bootstraps VMs in the clouds in cooperation with the network hypervisor, and sets up the tunnels for the inter-cloud connections. Apart from that, when a virtual network is started, it also initiates the containers in the VMs selected by the hypervisor. A repository of VM and container images is kept locally, in case the users prefer to build and save the images within the organization.

Figure 3.4 shows the main connections that are managed within the infrastructure. Gateways have public IPs that work as endpoints of the secure tunnels between the clouds. In our current implementation, OpenVPN with asymmetric key authentication is employed as the standard solution, as it presents the advantage of being generic and independent from the provider's gateway service (e.g., the VPC service for Amazon EC2). Links between VMs rely on GRE tunnels. We chose this approach as intra-cloud communications are expected to be performed within a controlled environment and inter-cloud traffic is protected by the secure tunnel. The containers use the IP addresses defined by the tenants (without restrictions), and isolation is achieved by the network hypervisor properly configuring the switches' flow tables (an aspect to be detailed in Section 3.3.5).



Figure 3.4: Intra- and inter-clouds connections.

3.3.4 Hypervisor architecture and components

The design of the hypervisor software follows a modular approach. We present its building

blocks in Figure 3.5.

The Embedder addresses the problem of mapping the virtual networks specified by the tenants onto the substrate infrastructure (Fischer et al., 2013a). As soon as a virtual network request arrives, the secure Virtual Network Embedding (VNE) module finds an effective and efficient mapping of the virtual nodes and links onto the substrate network, with the objectives of minimizing the cost of the provider and maximizing its revenue. This objective takes into account, firstly, constraints about the available processing capacity of the substrate nodes and the available bandwidth resources on the links. Moreover, we consider security and dependability constraints based on the requirements specified by the tenants for each virtual resource. These constraints address, for instance, concerns about attacks on virtual machines or on substrate links (e.g., replay/eavesdropping). As such, each particular node may have different security levels, to guarantee for instance that sensitive resources are not co-hosted on the same substrate resource as potentially malicious virtual resources. In addition, we consider the coexistence of resources (nodes/links) in multiple clouds, both public and private, and assume that each individual cloud may have distinct levels of trust from a user standpoint. We will further detail these algorithms in Chapters 4 and 5.


3.3 Design of Sirius

Figure 3.5: Modular architecture of the network hypervisor.

The Substrate Network (sNet) Configuration module is responsible for maintaining information about the substrate topology. It reaches its goals by performing two main functions. First, it retrieves information from the orchestrator about the substrate nodes and links, alongside their security and dependability characteristics. Second, it interacts with each switch to set itself as its master controller and to collect more detailed information, including switch identifiers, port information (e.g., which ports are connected to which containers), etc. This information is maintained in efficient data structures to speed up data access.

The Virtual Network (vNet) Configuration module is responsible for maintaining information about the virtual network topologies. This includes both storing tenant requests and the mapping that results from the embedding phase. As the embedding module outputs only the substrate topology that maps to the virtual network request, this module runs a routing algorithm to define the necessary flow rules to install in the switches (without populating them – that is left for the next module).
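As an illustration of this routing step, the following minimal sketch computes one forwarding rule per hop along a shortest path over the mapped substrate topology. It assumes a simple BFS shortest-path policy; the function and field names are illustrative, not Sirius's actual API.

```python
from collections import deque

def compute_flow_rules(adjacency, src_switch, dst_switch, dst_emac):
    """BFS shortest path over the mapped substrate topology; returns one
    (switch, match-eMAC, output-port) rule per hop. `adjacency` maps a
    switch to {neighbor: out_port}."""
    parents = {src_switch: None}
    queue = deque([src_switch])
    while queue:
        sw = queue.popleft()
        if sw == dst_switch:
            break
        for nbr in adjacency[sw]:
            if nbr not in parents:
                parents[nbr] = sw
                queue.append(nbr)
    if dst_switch not in parents:
        return []  # no path in the mapped substrate
    # walk back from destination to source, collecting one hop per edge
    path = []
    sw = dst_switch
    while parents[sw] is not None:
        path.append((parents[sw], sw))
        sw = parents[sw]
    path.reverse()
    return [{"switch": u, "match_dst": dst_emac, "out_port": adjacency[u][v]}
            for u, v in path]
```

For example, on a three-switch chain A–B–C, the rules for traffic towards a host behind C would be installed at A and B, each pointing at the next hop's port.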

The Hypervisor core module is configured as a controller module (in our case, Floodlight (Floodlight-Project, 2019)). Its first component is the virtual-substrate mapper that,



after interacting with the substrate topology and virtual topology modules, requests a specific mapping from the embedder. When the VNE returns successfully, the mapping is stored in specific data structures of the core module and this information is shared with other interested modules (namely, the vNet configuration module).

The network monitoring module is responsible for detecting changes in the substrate topology when a reconfiguration occurs (e.g., due to failures in the substrate network). It then sends requests to the virtual-substrate handler to update its data structures accordingly.

Isolation is handled by several sub-modules, including the isolation handler, the packet-in handler, and the flows handler. These components' goal is to guarantee that each tenant perceives itself as the only user of the infrastructure. We currently use four main techniques for this purpose. First, as we have control over the entire infrastructure, from the core to the edge, we uniquely identify each tenant's host by its precise location. Second, based on this unique identification and on the tenant ID, we perform address translation at the edge from the tenant's MAC to an ephemeral MAC address (eMAC) and install the required flows based on the eMAC. The flows for communication between all virtual nodes are initially installed proactively by the flow handler module in such a way as to guarantee isolation between tenants' traffic. For efficiency reasons, flows are installed with predefined timeouts. When a timeout expires (which means a particular pair of nodes has not communicated during that period), the flow is removed from the switches to save flow table resources. If communication ensues between those nodes afterward, the first packet of the flow generates a packet-in that is sent to the hypervisor, triggering the packet-in handler to install the required flows in the switches. Third, we perform traffic isolation during the initial steps of communication, namely by handling ARP requests and replies. Finally, flow table isolation is guaranteed by each virtual switch having its own virtual flow table, with a predefined size limit. We detail these techniques further in the next section.

3.3.5 Virtualization runtime: achieving isolation

The main requirement of our multi-tenant platform is to provide full network virtualization. To achieve this goal it is necessary to virtualize the topology, addressing, and service models, and to guarantee isolation between tenants' networks. Topology virtualization is achieved in our system by means of the embedding procedure already described and further detailed in subsequent chapters. In this section, we focus on the other three aspects.

Sirius allows tenants to configure their VMs with any L2 and L3 addresses. Tenants thus have complete autonomy to manage their address space. They can also retain their preferred L2 and L3 service models (for instance, they can use VLAN services). Giving tenants these options precludes the use of labeling techniques for virtualization, such as using VLAN tags to identify tenants (as this would break the L2 service model) or inserting tenant-based tags in the L2 or L3 address (as this would restrict the addressing choices). To achieve these two goals and guarantee isolation, we create a unique identifier for each tenant's hosts based on their location. We then perform edge-based translation of the host MAC address to an ephemeral MAC address that includes this ID. Finally, we set up tunnels between every Open vSwitch (i.e., between every VM of the substrate infrastructure).

An alternative solution that would also fulfill our requirements would be to set up tunnels between all tenants' hosts (in our solution, this would mean setting up tunnels between containers). This would avoid the need to maintain host location information and to perform edge-based translation. The problem with this option is scalability. The number of tunnels would grow with the number of containers (i.e., with the number of tenants' hosts), whereas our solution scales much better, as it grows with the number of provider VMs (in a production setting, each VM is expected to run hundreds or even thousands of containers).

Uniquely identifying hosts. The tenants' hosts in our solution are containers. We opted for this operating-system-level virtualization technology as it provides functionality similar to a VM but with a lighter footprint (Higgins et al., 2015). Each container (i.e., each tenant's host) has its own namespace (IP and MAC addresses, name, etc.) and its own resources (processing capacity, memory), and as such can be seen as a lightweight VM.

To uniquely identify a tenant's host, we use its network location. Each container is connected to a specific software switch (identified by a DatapathID), being attached to a unique port. As such, we use as hostID the tuple ⟨switchport, DatapathId⟩. Figure 3.6 shows an example.
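As a toy illustration of this scheme (all values below are made up), the hostID can be represented as a tuple and used as the key in the hypervisor's location table:

```python
# A host's unique ID is its network location: the tuple of the OvS switch
# port its container is attached to and the switch's DatapathID.
host_id = (3, "00:00:d1:9e:1a:d7:b8:4d")  # (switch_port, datapath_id)

# The hypervisor can then keep a location table mapping hostIDs to tenants;
# two containers on the same switch necessarily occupy different ports,
# so no two hosts ever share the same hostID.
location_table = {
    (1, "00:00:d1:9e:1a:d7:b8:4d"): "tenant-1",
    (2, "00:00:d1:9e:1a:d7:b8:4d"): "tenant-2",
    (3, "00:00:d1:9e:1a:d7:b8:4d"): "tenant-1",
}
assert location_table[host_id] == "tenant-1"
```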

Edge address translation. Packets generated in a virtual network cannot be transmitted unmodified in the substrate network. As different tenants can use the same addresses, collisions could occur. For this reason, we perform edge-based address translation to ensure isolation. We assign an ephemeral MAC address – eMAC – at the edge, to replace the host's MAC address. The translation occurs at the edge switch. Every time traffic originates from a container, its host MAC is converted to the eMAC. Before the traffic arrives at the receiving container, the reverse operation occurs at the edge switch. The eMAC is composed of a tenant ID and a shortened version of the hostID, unique per tenant.

Figure 3.6: (Switch port, DatapathId) = host ID

This mechanism guarantees isolation in the data plane. The control plane guarantees are provided by the hypervisor, as it has network-wide control and visibility. For this purpose, the hypervisor populates the flow tables with two types of rules: translation rules in the edge switches, as just explained; and forwarding rules that enable communication between all hosts of a single tenant.
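A minimal sketch of the eMAC encoding follows. The text only states that the eMAC combines a tenant ID with a shortened hostID, so the 16/14/10-bit field split and the use of the locally-administered first octet are illustrative assumptions, not the exact layout used by Sirius.

```python
def make_emac(tenant_id, switch_idx, port):
    """Pack a tenant ID and a shortened hostID (switch index + port) into a
    48-bit locally-administered MAC (first octet 0x02). The 16/14/10-bit
    split is an assumption for illustration only."""
    assert tenant_id < 2**16 and switch_idx < 2**14 and port < 2**10
    value = (tenant_id << 24) | (switch_idx << 10) | port
    raw = (0x02 << 40) | value  # 0x02 marks a locally-administered address
    return ":".join(f"{(raw >> s) & 0xFF:02x}" for s in range(40, -1, -8))

def parse_emac(emac):
    """Reverse translation, as performed at the receiving edge switch."""
    raw = int(emac.replace(":", ""), 16) & ((1 << 40) - 1)
    return raw >> 24, (raw >> 10) & 0x3FFF, raw & 0x3FF
```

Because the tenant ID is part of the address, two tenants reusing the same host MAC still map to distinct eMACs in the substrate, which is what prevents collisions.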

ARP handling. Hosts use the ARP protocol to map an IP address to an Ethernet address. As we want unmodified hosts to run in our platform, Sirius emulates the behavior of this protocol. When an ARP message arrives at a switch, it is forwarded directly to the destination host. Flooding is never needed, as the switches are configured by the hypervisor. Even in those cases where the packet arriving at the switch does not match any flow rule – because it has expired – a packet-in is sent to the hypervisor, which populates the required tables with the necessary flow rules for the packet to be forwarded to the destination.

Flow table virtualization. As forwarding tables have limited capacity, in terms of TCAM entries (hardware switches) or memory (software switches), in Sirius each tenant has a finite quota of forwarding rules in each switch. This is important because the failure to isolate forwarding entries between users might allow one tenant to overflow the number of forwarding rules in a switch and prevent others from inserting their flows. Our hypervisor maintains a counter of the number of flow entries used per tenant per switch and ensures that a preset limit is not exceeded.

The hypervisor controls the maximum number of flows allowed per tenant, in both physical and virtual switches. This control is performed using the OpenFlow cookie field (an opaque data value that allows flows to be identified (ONF, 2018a)). When the hypervisor inserts a new flow in a switch (which only occurs if the limit has not been exceeded), the cookie field is properly set to identify the tenant owner, and the counter for the number of flows in this switch that belong to this particular tenant is incremented. When a flow is removed, the hypervisor is informed, extracts from the cookie the tenant owner of the flow just removed, and decrements the corresponding counter.
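The accounting just described can be sketched as follows. This is a simplified model of the bookkeeping, not Sirius's Java code; the class and method names are illustrative.

```python
class FlowQuota:
    """Per-tenant flow counters keyed by (tenant, switch). The tenant ID is
    encoded in the OpenFlow cookie so that flow-removed notifications can be
    attributed back to their owner."""

    def __init__(self, limit):
        self.limit = limit      # preset per-tenant, per-switch quota
        self.counters = {}

    def try_install(self, tenant_id, switch):
        """Returns the cookie value to use, or None if the quota is full."""
        key = (tenant_id, switch)
        if self.counters.get(key, 0) >= self.limit:
            return None         # quota exceeded: refuse to install the flow
        self.counters[key] = self.counters.get(key, 0) + 1
        return tenant_id        # value placed in the flow's cookie field

    def on_flow_removed(self, cookie, switch):
        """Flow-removed handler: the cookie identifies the tenant owner."""
        self.counters[(cookie, switch)] -= 1
```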

3.4 Implementation and evaluation

We now present some implementation details and an initial evaluation of the prototype. We set up a controlled environment with the purpose of measuring system overheads, including the setup time and the data and control plane overheads. We leave an evaluation of the system in a real multi-cloud setting to Chapter 5.

3.4.1 Implementation and experimental setup

The Sirius network hypervisor is implemented in Java as a Floodlight controller module. This component is hosted in a virtual machine (one processor and 2.5 GB of main memory) running in an Oracle VirtualBox VM in our testbed. The orchestrator runs in an Apache Tomcat server. The client GUI is written in JavaScript/jQuery and uses vis.js (vis, 2017), an open-source library for network visualization. Communication between the HTTP client and server is performed using Servlet technology.



Figure 3.7: Setup time as a function of the number of virtual networks (left: MST; right: full mesh).

The evaluation of our solution in a controlled environment answers two main questions.

First, it shows the cost of deploying the environment by analyzing the different compo-

nents that make up the setup time. In particular, we study how the creation of tunnels and

the tunnel topology itself influence the setup time, and how this variable scales with net-

work size. Second, we evaluate the overhead introduced by our virtualization layer, both

in the control and data planes.

The experiments were run on a testbed composed of two servers, each equipped with two quad-core Intel Xeon E5520 processors (with hyper-threading, 2.27 GHz) and 32 GB of RAM. The hypervisor used is XenServer 6.5, running OvS 2.1.3. There is a router between the servers to simulate a multi-cloud environment. One of the servers hosts a VM dedicated to the Floodlight controller and another VM dedicated to Mininet 2.2.0.

3.4.2 Evaluation results

Setup time. The setup time is the time between the moment the tenant submits a virtual network request and the instant when all network components are initialized and instantiated. This time has two components: the time to populate the network state in the resilient network hypervisor, and the time to configure and initialize all tunnels.



Figure 3.8: Control plane overhead (latency as a function of the number of switches, virtualized vs. non-virtualized).

We compare two different tunnel topologies. The first is a setup with a full mesh of tunnels between all VMs, creating a one-hop tunnel between each pair of VMs, to serve as a baseline. The second is our solution: we set up a minimum spanning tree (MST) between those same VMs. The results are shown in Figure 3.7.

As expected, for the MST case the setup time grows linearly with network size. By contrast, a full mesh has an O(n²) cost, and hence the setup time grows quadratically. As can be seen in the full mesh case, tunnel creation has a visible effect on setup time as the network grows, making it a fundamental component for large-scale scenarios. This motivates the need to minimize the number of tunnels for the system to scale. In any case, these setup times are still two to three orders of magnitude below the time to provision and boot a VM in the cloud (Li et al., 2010).
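The asymptotic difference between the two topologies can be made concrete with a short sketch: a spanning tree over n VMs needs n − 1 tunnels, while a full mesh needs n(n − 1)/2.

```python
def tunnel_count(num_vms, topology):
    """Number of tunnels to create: n-1 for a spanning tree vs n(n-1)/2
    for a full mesh. This is why MST setup time grows linearly while the
    full mesh grows quadratically with network size."""
    if topology == "mst":
        return num_vms - 1
    return num_vms * (num_vms - 1) // 2  # full mesh

# At 30 VMs the full mesh already needs 15x more tunnels than the MST:
assert tunnel_count(30, "mst") == 29
assert tunnel_count(30, "mesh") == 435
```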

Control plane overhead. We measure the cost of network virtualization in the control plane using cbench, a control plane benchmarking tool that generates packet-in events for new flows. In this test, cbench is configured to spawn a number of switches equal to the number of virtual networks, each switch having 5 hosts with unique MAC addresses. The tests are run with cbench in latency mode. In this mode, cbench sends a packet-in request and waits for a response before sending the next request. This allows measuring the controller's request processing time. We consider two scenarios: one with network virtualization, and another without network virtualization. We present the results in Figure 3.8.

As can be seen, the virtualization layer adds a very small overhead of less than 0.1 ms compared to the baseline. Importantly, the latency overhead is mostly independent of network size (i.e., as the network grows the latency overhead remains relatively stable). Further,



Figure 3.9: Data plane overhead (throughput and round-trip time at 100 Mbps, 1 Gbps, and 10 Gbps, virtualized vs. non-virtualized).

for multi-cloud scenarios, the inter-cloud latency is in the order of tens to hundreds of milliseconds (Li et al., 2010), and hence this overhead is negligible.

Data plane overhead. To evaluate the data plane overhead we perform two experiments. We measure network latency by running several pings between two virtual machines executing in different servers (emulating different clouds). To measure network throughput we run netperf's TCP_STREAM test between those same virtual machines. Again, we consider two scenarios: one virtualized and one non-virtualized. The results are shown in Figure 3.9.

The results show that the virtualization layer introduces an overhead, in particular at very high bit rates. This overhead is mainly due to the use of tunnels, which motivates the need to minimize their use by increasing traffic locality as much as possible. This can be done by keeping VMs that communicate frequently close to each other. For instance, VM migration could be triggered when this type of communication pattern is detected. In any case, for multi-cloud scenarios, the inter-cloud throughput is in the order of hundreds of Mbps (Li et al., 2010). At these rates, the overhead is relatively low. The additional latency is also negligible when compared with typical inter-cloud latencies.



3.5 Summary

In this chapter we presented Sirius, our multi-cloud network virtualization platform. The design of Sirius extends the guarantees of connectivity and performance of virtual networks offered by existing solutions with security and dependability. This is achieved by leveraging multiple cloud infrastructures (both public and private) and by considering resiliency requirements from users during the virtual network embedding process.

In the next two chapters we present our solutions to the main algorithmic challenge of our system: the virtual network embedding problem.


4

Secure Multi-Cloud Virtual Network Embedding

In the last chapter we presented Sirius, our multi-cloud network hypervisor. Sirius ad-

dresses a specific limitation of modern network virtualization platforms: they target the

data center of a single provider, which is insufficient to support (critical) applications that

need to be deployed across multiple trust domains, while enforcing diverse security require-

ments.

This chapter follows up on addressing this limitation by presenting a novel solution for

the central resource allocation problem of network virtualization – the virtual network

embedding, which aims to find efficient mappings of virtual network requests onto the

substrate network. We improve over the state-of-the-art by considering security as a first-

class citizen of virtual networks, while enhancing the substrate infrastructure model with

resources from multiple cloud providers.

Our solution enables the definition of flexible policies in three core elements: on the virtual

links, where alternative security compromises can be explored (e.g., encryption); on the

virtual switches, supporting various degrees of protection and redundancy if necessary;

and on the substrate infrastructure, extending it across multiple clouds, including public

and private facilities, with their inherently diverse trust levels associated.


4. SECURE MULTI-CLOUD VIRTUAL NETWORK EMBEDDING

Here, we propose an optimal solution to this problem, formulated as a Mixed Integer Linear Program (MILP). The results of our evaluation give insight into the trade-offs associated with the inclusion of security demands in network virtualization. In particular, they provide evidence that enhancing users' virtual networks with security does not preclude high acceptance rates and an efficient use of resources, and allows providers to increase their profits.

4.1 Introduction

Existing network virtualization solutions (including VMware NSX (Koponen et al., 2014), Google Andromeda (Dalton et al., 2018) and Microsoft AccelNet (Firestone et al., 2018)) offer their tenants traditional networking services. They allow tenants to specify a virtual topology of L2 switches and L3 routers, to define arbitrary addresses for their network elements, and to set ACL filtering rules. Although this represents a formidable advance over the recent past (e.g., VLANs), it is still rather limited with respect to security and dependability. Motivated by this limitation, as explained before, we extend virtual networks with security assurances that go beyond simple ACL configurations. Specifically, we allow users to specify security constraints for each element of the virtual network. These constraints address, for instance, concerns about attacks on specific virtual hosts (e.g., covert channels or DDoS attacks) or on physical links (e.g., replay/eavesdropping). In our solution, a tenant can assign a high security level to particularly sensitive virtual nodes and/or links, mandating, respectively, their instantiation in secure VMs (e.g., an Amazon EC2 instance that provides DDoS protection) and their mapping to substrate paths that guarantee the required security properties (e.g., confidentiality). To further extend the resiliency properties of our solution, and since we support the coexistence of resources from multiple clouds, both public and private, we assume each individual infrastructure (cloud) to have distinct levels of trust from a user standpoint, enabling, for instance, virtual networks that are GDPR-compliant.

In this chapter, we tackle the central resource allocation problem that is required to materialize our network virtualization solution – the Virtual Network Embedding (VNE) – from this new perspective. VNE addresses the problem of embedding the virtual networks specified by the tenants into the substrate infrastructure. When a virtual network request arrives, the



goal is to find an efficient mapping of the virtual nodes and links onto the substrate network, while maximizing the profit of the virtualization provider. This objective is subject to various constraints, such as the processing capacity of the substrate nodes and the bandwidth of the links.

The literature on the problem is vast (Fischer et al., 2013b), but unfortunately no existing solution meets all our requirements, as we have shown in Section 3.2. In particular, the approaches that consider security are either limited in the level of protection offered, include assumptions that make them impractical for use in modern virtualization platforms, and/or do not fit an enriched multi-cloud substrate.

Motivated by this gap in the literature, in this chapter we propose an optimal VNE solution, based on Mixed Integer Linear Programming (MILP), that considers security constraints based on indications from the tenants. These constraints address the concerns about attacks on hosts and on links mentioned above, including supporting the coexistence of resources (nodes/links) in multiple clouds with distinct levels of trust. Given the limited expressiveness of a MILP formulation, we also propose a policy language to specify user requests. Our language includes conjunction ('and'), disjunction ('or'), and negation ('not') operations, enabling tenants to express alternative resource requirements.

We have evaluated our proposal against the most commonly used VNE alternative (Chowdhury et al., 2012). The results show that our solution makes a more efficient use of network resources, which translates into higher acceptance ratios and reduced costs. This demonstrates the advantage of our model for this context. Another interesting takeaway is that the performance decrease is limited even when a reasonable number of virtual network requests includes security requirements. For instance, when 20% of requests include security demands, the reduction in acceptance ratio is of only 1 percentage point. The results also illustrate the cost/revenue trade-off of including security services, shedding some light on the pricing schemes a virtualization operator should employ to benefit from offering these value-added services.

The contributions of this chapter can thus be summarized as:

(i) We formulate the SecVNE model and solve it as a Mixed Integer Linear Program (MILP). The novelty of our approach is in considering comprehensive security aspects over a multi-cloud deployment;



(ii) We propose a new policy language to specify the characteristics of the substrate network and to allow the expression of user requirements;

(iii) We evaluate our formulation against the most commonly used VNE alternative (Chowdhury et al., 2012), and analyze its various trade-offs with respect to embedding efficiency, costs, and revenues.

4.2 Network model

Our multi-cloud network virtualization platform, Sirius, leverages Software-Defined Networking (SDN) to build a substrate infrastructure that spreads across both public clouds and private data centers. As explained in the previous chapter, these resources are then transparently shared by various tenants, allowing the definition and deployment of virtual networks (VNs) composed of a number of virtual hosts (containers in our implementation) interconnected by virtual switches, arranged in an arbitrary network topology. While specifying the virtual network, it is possible to indicate several requirements for the switches and links, for example with respect to the needed bandwidth, CPU capacity, and security guarantees. These requirements are enforced during embedding by laying out the VN elements at the appropriate locations – specifically, those where the substrate infrastructure still has enough resources fulfilling the security requirements to satisfy the particular demands. In addition, the datapaths are configured by the SDN controller by installing the necessary forwarding rules in the switches. In this section we detail the network model considered.

As illustrated in Figure 4.1, virtual machines (VMs) may be acquired by the virtualization operator from specific cloud providers to run tenants' containers implementing their distributed services. In this scenario, the most relevant security aspects that may need to be assessed are the following. First, the trust level associated with a cloud provider is influenced by various factors, which may have to be taken into consideration when running critical applications. Providers are normally better regarded if they show a good past track record on breaches and failures, have been on the market for a while, and advertise Service Level Agreements (SLAs) with stronger assurances for the users. Moreover, as the virtualization operator has full control over its own data centers, it might employ protection



Figure 4.1: Example substrate network encompassing resources from multiple clouds.

features and procedures to make them compliant with regulations that have to be fulfilled

by tenants (e.g., the EU GDPR that has recently come into force).

Second, VMs can be configured with a mix of defense mechanisms, e.g., firewalls and antivirus, to build execution environments with stronger degrees of security at a premium price. These mechanisms can be selected by the operator when setting up the VMs, possibly based on the particular requirements of a group of tenants, or they could be sold ready to use by the cloud providers. Examples include cloud services provided by Amazon¹ and Azure². The latter recently extended its service range with SGX enclaves to enhance the security capabilities of particular instances³. Highly protected VMs arguably give more trustworthy conditions for the execution of the switch employed by the container manager, ensuring correct packet forwarding among the containers and the external network (e.g., without being eavesdropped on or tampered with by malicious co-located containers).

Third, the switches can also be configured with various defenses to protect network traffic. In particular, it is possible to set up tunnels between switches implementing alternative security measures. For instance, if confidentiality is not a concern, then it is possible to add message authentication codes (MACs) to packets to provide integrity without paying the

¹ For example: Trend Micro Deep Security at aws.amazon.com/marketplace.
² For example: Check Point vSEC at azuremarketplace.microsoft.com.
³ Azure confidential computing: https://azure.microsoft.com/en-us/solutions/confidential-compute/



full performance cost of encryption. Further countermeasures could also be added, such as denial-of-service detection and deep packet inspection for selected flows. In some cases, if trusted hardware is accessible (such as Intel CPUs with SGX extensions in the private cloud), one could leverage it to enforce greater isolation while performing the cryptographic operations – for example, to guarantee that secret keys are never exposed.

The reader should notice that the above discussion would also apply to other deployment scenarios. For example, the virtualization operator could offer VNs of virtual machines (instead of containers) in distinct cloud providers. This would require the use of nested virtualization mechanisms (e.g., Ben-Yehuda et al. (2010)), and in this case the relevant network appliances would be the nested hypervisor switches and their corresponding interconnections. We have instead opted for lightweight virtualization mechanisms, aligned with the increasing trend towards the use of microservices and other forms of serverless computing (Vahdat, 2017).

Substrate Network Modeling. Given the envisioned scenarios, the substrate network is modeled as a weighted undirected graph G^S = (N^S, E^S, A^S_N, A^S_E), composed of a set of nodes N^S (e.g., switches/routers) and edges E^S interconnecting them. Both the nodes and the edges have attributes that reflect their particular characteristics. The collection of attributes we specify in our model resulted from conversations with several companies from the healthcare and energy sectors that are moving their critical services to the cloud, and they represent a balance among three goals: they should be (i) expressive enough to represent the main security requirements when deploying virtual networks; (ii) easy to specify when configuring a network, requiring a limited number of options; and (iii) readily implementable with available technologies.

The following attributes are considered for substrate nodes:

A^S_N = {{cpu^S(n), sec^S(n), cloud^S(n)} | n ∈ N^S}

The total amount of CPU that can be allocated for the switching operations of node n is given by cpu^S(n) > 0. Depending on the underlying machine capacity and the division of CPU cycles among the various tasks (e.g., tenant jobs, storage, network), cpu^S(n) can take a greater or smaller value. The security level associated with the node is sec^S(n) > 0. This attribute is used to differentiate the substrate elements with respect to security: a VM instance with Intel SGX capabilities will have a higher "security level" than the typical VM instance. As per the discussion above, substrate nodes with stronger safeguards will thus have a high value for sec^S(n). The trustworthiness of a cloud provider is defined in similar terms, and is indicated by cloud^S(n) > 0. A GDPR-compliant cloud provider or a private facility will have a higher value for cloud^S(n) when compared to providers that do not offer the same guarantees.

The substrate edges have the following attributes:

$$A^S_E = \{\{bw^S(l),\, sec^S(l)\} \mid l \in E^S\}$$

The first attribute, $bw^S(l) > 0$, corresponds to the total amount of bandwidth capacity of the substrate link $l$. The security measures enforced by the link are reflected in $sec^S(l) > 0$. If the link implements tunnels that ensure integrity and confidentiality (by resorting to MACs and encryption), then it will have a higher $sec^S(l)$ than a default edge that simply forwards packets.
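To make the model concrete, the following is a minimal sketch (our own illustration, not the thesis implementation) of the substrate graph $G^S$ with its node and edge attributes, using plain dictionaries. The attribute names (cpu, sec, cloud, bw) mirror the text; the values are illustrative only.

```python
# Sketch of the substrate model G^S = (N^S, E^S, A^S_N, A^S_E).
# Node attributes: CPU capacity, security level, cloud trustworthiness.
substrate_nodes = {
    "A": {"cpu": 80, "sec": 1, "cloud": 1},   # commodity instance, public cloud
    "B": {"cpu": 80, "sec": 3, "cloud": 1},   # e.g., an SGX-capable instance
}

# Edge attributes: bandwidth capacity and link security level.
# Undirected edges are stored under a direction-insensitive key.
substrate_edges = {
    frozenset({"A", "B"}): {"bw": 100, "sec": 2},
}

def edge_attrs(u, v):
    """Attributes of the undirected substrate edge (u, v)."""
    return substrate_edges[frozenset({u, v})]

# Lookups are order-insensitive, matching the undirected graph model.
assert edge_attrs("B", "A") is edge_attrs("A", "B")
```

A real substrate description would be generated from the operator's specification (Section 4.4) rather than written by hand.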

Virtual Network Modeling. VNs have an arbitrary topology and are composed of a number of nodes and the edges that interconnect them. When a tenant wants to instantiate a VN, besides indicating the required processing capacity for the nodes and bandwidth for the links, she/he may also include security demands as requirements. These demands are defined by specifying security attribute values associated with the resources.

In terms of modeling, a VN is also modeled as a weighted undirected graph, $G^V = (N^V, E^V, A^V_N, A^V_E)$, composed of a set of nodes $N^V$ and edges (or links) $E^V$. Both the nodes and the edges have attributes that portray characteristics that need to be fulfilled when embedding is performed. Both $A^V_N$ and $A^V_E$ mimic the attributes presented for the substrate network. The only exception is an additional attribute that allows for the specification of security requirements related to availability.

This attribute, $avail^V(n)$, indicates that a particular node should have a backup replica to be used as a cold spare. This causes the embedding procedure to allocate an additional node and the necessary links to interconnect it to its neighboring nodes. These resources will only be used in case the virtualization platform detects a failure in the primary (or working) node. $avail^V(n)$ defines where the backup of virtual node $n$ should be mapped. If no backup is necessary, $avail^V(n) = 0$. If virtual node $n$ requires a backup to be placed in the same cloud, then $avail^V(n) = 1$. Otherwise, if $n$ should have a backup in another cloud (e.g., to survive cloud outages), then $avail^V(n) = 2$.
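The three-valued $avail^V$ semantics can be captured in a tiny helper. This is a hypothetical illustration of ours; the string labels are not part of the thesis model.

```python
# Encode the avail^V(n) semantics: 0 = no backup, 1 = backup in the same
# cloud, 2 = backup in a different cloud (to survive a full cloud outage).
def backup_policy(avail):
    """Map an avail^V value to the placement rule for the node's backup."""
    policies = {
        0: None,               # no backup replica requested
        1: "same-cloud",       # cold spare placed in the same cloud
        2: "different-cloud",  # cold spare placed in another cloud
    }
    return policies[avail]
```

An embedding front-end could use such a mapping to decide which placement constraints to activate for each node of a request.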

Threat Model. The multi-cloud network virtualization platform receives as input the SN specification, with the topological organization and the attributes that characterize the substrate network. Based on this information, and on the currently unoccupied resources, the platform embeds the VNs of the users fulfilling the stated constraints. It is assumed that the SN accurately represents the physical resources, namely the security safeguards implemented in the edges, nodes, and clouds. In addition, it is assumed that users fully comprehend what a particular value of $sec^S()$ or $cloud^S()$ means in terms of security/trustworthiness guarantees. Oversight in this area could cause, for example, the embedding of a critical node in a cloud with only normal protections, allowing it to become the victim of unexpected attacks. With regard to availability, upon user request, the platform allocates redundant resources in the same (or a distinct) cloud. This ensures that a service deployed in a VN remains accessible in the presence of a single node (or cloud) failure. However, further failures in the substrate network may cause the unavailability of parts (or the whole) of the service.

4.3 Secure Virtual Network Embedding Problem

Our approach to VNE enables the specification of VNs to be mapped over a multi-cloud substrate, enhancing the security and flexibility of network virtualization. More precisely, we define the Secure Virtual Network Embedding (SecVNE) problem as follows:

SecVNE problem: Given a virtual network request (VNR) with resource and security demands $G^V$, and a substrate network $G^S$ with the resources to serve incoming VNRs, can $G^V$ be mapped to $G^S$ ensuring an efficient use of resources while satisfying the following constraints? (i) Each virtual edge is mapped to the substrate network meeting the bandwidth and security constraints; (ii) Each virtual node is mapped to the substrate network meeting the CPU capacity and security constraints (including node availability and cloud trust domain requirements).


Figure 4.2: Example of the embedding of a virtual network request (top) onto a multi-cloud substrate network (bottom). The figure also illustrates the various constraints and the resulting mapping after the execution of our MILP formulation.

Our approach handles the SecVNE problem, mapping a VN onto the substrate network while respecting all constraints. When a VNR arrives, the goal of the embedding solution is to minimize the cost of the mapping while fulfilling all requirements. In situations when there are not enough substrate resources available, the incoming request is rejected. To minimize these events, and therefore increase the VN acceptance ratio, we may allocate resources of a higher security level than the ones specified in the VNR. We find this option to represent a good trade-off. For typical workloads, the alternative of a stricter mapping would reduce the acceptance ratio while causing resources to be underutilized.

Figure 4.2 shows an example of embedding a VNR (displayed on top) onto the substrate of Figure 4.1 (represented at the bottom). We assume a maximum security level of 5 for virtual nodes, and of 3 for virtual links. We also assume in this example a cloud trustworthiness of 1 for the public cloud, of 2 for the trusted public cloud, and of 3 for the trusted private cloud. These values were chosen arbitrarily for the sake of this example – the operator will set these parameters to fit the specificities of its substrate. The VNR consists of two nodes interconnected by one virtual link. Node $a$ requires 10 units of CPU ($cpu^V(a) = 10$), a medium level of security ($sec^V(a) = 3$), and a default cloud trust level ($cloud^V(a) = 1$). Besides slightly different CPU and security demands, the user requires the second node to be replicated. Further, the primary and backup nodes should be placed in different clouds ($avail^V(b) = 2$). For the virtual link, the user requires 20 units of bandwidth ($bw^V(a,b) = 20$) and a medium level of security ($sec^V(a,b) = 2$).

The resulting embedding guarantees that all requirements are satisfied. Specifically, node $a$ is mapped onto a substrate node with a security level equal to the one requested, on a public cloud ($sec^S(B) = 3$ and $cloud^S(B) = 1$). The other virtual node and its backup are embedded on different clouds that fulfill the trustworthiness request (respectively, $cloud^S(C) = 2$ and $cloud^S(E) = 3$). It is also possible to observe that one of the substrate paths (namely, the primary/working one) maps to two substrate edges ($(B,D)$ and $(D,C)$), while guaranteeing that they both have the necessary security level (2, in this case). In addition, the primary and backup paths are disjoint. As will be justified later, this is a requirement of our solution. The figure also displays meta-links connecting the virtual nodes to the substrate nodes they are mapped onto (e.g., the dash-dotted line between $a$ and $B$). This is an important artifact in our modeling that is explored in the MILP formulation (Section 4.5).

Before closing this section, we detail how we model differently the VNRs that include backup requests from those that do not have such requirements. Please refer to Figure 4.3. For VNRs where no node requests a backup, we only model the working network (Figure 4.3, left). However, when at least one node requires backup, we model two networks (Figure 4.3, right): the working network, mapping all nodes and all primary paths; and the backup network. The backup network also includes all nodes. However, nodes that require backup are mapped to a different substrate node from that of the working node, whereas nodes that do not require backup will be placed in the same substrate node. As this is an artifact of the model, no resources are reserved for this latter node. The backup network includes only the backup paths that interconnect the backups to their neighbors. As such, nodes that do not require backup and are not backup neighbors do not need to be connected to the rest of the network (one example is node $d$ in the figure).

Figure 4.3: Network model when no backup is requested (left); and when at least one backup node is requested (right).

4.4 A Policy Language to Specify SecVNE

We support two alternative ways for the virtualization operator and for the tenant to describe the substrate network and the virtual network, respectively. The first is based on a graphical interface where the users can draw their arbitrary topologies, including nodes, links, and the associated attributes. In this case, after the tenants draw their virtual network and launch a VNR, the solution runs the embedding algorithm we present in this chapter (Section 4.5) to map the request onto the substrate.

As a result of our conversations with potential users of our platform (namely, enterprises that run critical applications), we have come to realize that a common requirement was that of expressing different options. A typical example was that of a virtual node for which a high level of security is required if placed in a less trusted infrastructure (e.g., a public cloud), but that could have its security requirement lightened if placed in a highly trusted infrastructure (e.g., a private cloud). Indeed, a problem with this first approach is that it does not grant tenants this level of expressiveness.


We thus introduce a second approach based on a policy language that lets the user describe both the substrate and the VNRs in a programmatic fashion. The production rules of the grammar were kept relatively simple, but the achieved level of expressiveness is much greater than what is attained with the graphical interface alone. As the characteristics of the substrate and VNRs are distinct, we explain them separately in the following.

Substrate Specification:

    S → func^S(parameter) = value_num
    S → S & S

Virtual Network Specification:

    V → func^V(parameter) = value_num
    V → func^V(parameter) ≥ value_num
    V → !V ; (V) ; V & V ; V | V

Table 4.1: Policy grammar to define SecVNE parameters.

The substrate part of the SecVNE policy grammar (top rows of Table 4.1) enables the listing of resources that compose the substrate. The parameters represent the substrate node or link ID, and the function enables the specification of particular attributes. For example, the leftmost cloud of Figure 4.2 is specified as:

    substrate → cpu^S(A) = 80 & sec^S(A) = 1 & cloud^S(A) = 1 &
                cpu^S(B) = 80 & sec^S(B) = 3 & cloud^S(B) = 1 &
                bw^S(A,B) = 100 & sec^S(A,B) = 2 & ...

In the virtual part of the SecVNE policy grammar (bottom rows of Table 4.1), the relations dictate the requirements for each node and link of the VNR. Differently from the substrate part, certain functions specify exact values, whereas others specify a minimum demand. For instance, while the CPU requirement is an exact value, the security function specifies the minimum security level required, as a virtual node can be mapped to a substrate node with either the same or a higher security level. As the grammar supports boolean operations, including or ("|"), and ("&"), and not ("!"), it is possible to express alternative resource requirements.


As an example, consider the following VN with two nodes and one edge:

    VN → (cpu^V(a) = 10 & sec^V(a) ≥ 3 & cloud^V(a) ≥ 1 & avail^V(a) = 0) &
         (cpu^V(b) = 20 & avail^V(b) = 1 &
          ((sec^V(b) ≥ 1 & cloud^V(b) ≥ 4) | (sec^V(b) ≥ 4 & cloud^V(b) ≥ 1))) &
         (bw^V(a,b) = 20 & sec^V(a,b) ≥ 2)

Node $a$ requires a security level of at least 3, and cloud trustworthiness of 1 or higher. For node $b$, on the other hand, the tenant makes a compromise between node security and the degree of cloud trust. If the cloud has a high level of trust (at least 4), then this node's security level can be as low as 1. However, the tenant is willing to have this virtual node mapped in a less trusted cloud (with a trust level of only 1), but in that case, the node security level is increased to at least 4.

When processing a VNR containing a VN with several optional demands such as this one, we generate all possible requests that would satisfy the tenant. Then, we evaluate each one and select the solution with the lowest cost. There are two main benefits to this approach. First, as explained, the increased expressiveness allows tenants to explore different trade-offs with respect to the security and availability of their resources. Second, by evaluating several solutions we increase the number of options, reducing mapping costs and, as a consequence, enabling higher acceptance ratios.
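The expansion of disjunctive demands can be sketched as a Cartesian product over each node's alternatives, keeping the cheapest candidate. In this illustration of ours, `embedding_cost` is a hypothetical stand-in for the full MILP solve of Section 4.5.

```python
# Expand a VNR with "|" alternatives into concrete candidate requests
# and select the cheapest one, mirroring the node-b example above.
from itertools import product

node_a_options = [{"sec": 3, "cloud": 1}]
node_b_options = [
    {"sec": 1, "cloud": 4},  # low node security on a highly trusted cloud
    {"sec": 4, "cloud": 1},  # high node security on a default cloud
]

def embedding_cost(request):
    # Hypothetical stand-in: pretend stronger security/trust demands
    # translate into a higher embedding cost.
    return sum(2 * n["sec"] + n["cloud"] for n in request)

candidates = [list(r) for r in product(node_a_options, node_b_options)]
best = min(candidates, key=embedding_cost)
```

With two alternatives for node $b$, two concrete requests are generated and evaluated; in the real system each candidate would be handed to the solver and the feasible mapping of lowest cost kept.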

4.5 MILP Formulation

In this section we present our MILP formulation to solve the SecVNE problem. The section starts by explaining the decision variables used in the formulation, proceeds to present the objective function, and finally the constraints required to model the problem.


Symbol and meaning:

$wf^{i,j}_{p,q} \ge 0$ : The amount of working flow, i.e., bandwidth, on physical link $(p,q)$ for virtual link $(i,j)$.

$bf^{i,j}_{p,q} \ge 0$ : The amount of backup flow, i.e., backup bandwidth, on physical link $(p,q)$ for virtual link $(i,j)$.

$wl^{i,j}_{p,q} \in \{0,1\}$ : Denotes whether virtual link $(i,j)$ is mapped onto physical link $(p,q)$ (1 if $(i,j)$ is mapped on $(p,q)$, 0 otherwise).

$bl^{i,j}_{p,q} \in \{0,1\}$ : Denotes whether the backup of virtual link $(i,j)$ is mapped onto physical link $(p,q)$ (1 if the backup of $(i,j)$ is mapped on $(p,q)$, 0 otherwise).

$wn_{i,p} \in \{0,1\}$ : Denotes whether virtual node $i$ is mapped onto physical node $p$ (1 if $i$ is mapped on $p$, 0 otherwise).

$bn_{i,p} \in \{0,1\}$ : Denotes whether virtual node $i$'s backup is mapped onto physical node $p$ (1 if $i$'s backup is mapped on $p$, 0 otherwise).

$wc_{i,c} \in \{0,1\}$ : Denotes whether virtual node $i$ is mapped on cloud $c$ (1 if $i$ is mapped on $c$, 0 otherwise).

$bc_{i,c} \in \{0,1\}$ : Denotes whether virtual node $i$'s backup is mapped on cloud $c$ (1 if $i$'s backup is mapped on $c$, 0 otherwise).

Table 4.2: Domain constraints (decision variables) used in the MILP formulation.

4.5.1 Decision variables and auxiliary parameters

Table 4.2 presents the variables that are used in our MILP formulation. Briefly, $wf^{i,j}_{p,q}$, $bf^{i,j}_{p,q}$, $wl^{i,j}_{p,q}$ and $bl^{i,j}_{p,q}$ are related to (working and backup) links; $wn_{i,p}$ and $bn_{i,p}$ are associated with (working and backup) nodes; and $wc_{i,c}$ and $bc_{i,c}$ are related to the cloud location of a virtual node. In Table 4.3 we present a few additional parameters used in the formulation. Their importance will be made clear as we describe the solution.

The formulation also employs a few auxiliary sets whose value depends on the VNR, as shown in Table 4.4. For example, $\widetilde{N}^V$ is a set that includes all nodes that do not require backup, and $\overline{N}^V$ is its complement. As such, $\overline{N}^V = \emptyset$ means that no virtual node requires backup. We recall that, when this happens, we only model a working network. This means that every backup-related decision variable ($bf^{i,j}_{p,q}$, $bl^{i,j}_{p,q}$, $bn_{i,p}$, $bc_{i,c}$) is equal to 0. On the other hand, if $\overline{N}^V \neq \emptyset$, then we model both a working and a backup network (recall Figure 4.3). When we need to model a backup network (i.e., when at least one node requests backup), we introduce another artifact into our model. Namely, if virtual node $i$ has $avail^V(i) = 0$, indicating that it does not require replication, then both the working and the backup nodes of $i$ are placed in the same substrate node $p$ (i.e., $wn_{i,p} = bn_{i,p} = 1$).


Symbol and meaning:

$\beta_1, \beta_2, \beta_3$ : Coefficients used in the objective function to provide a weighted sum properly parameterized for each objective.

$\alpha_{p,q}$ : A weight representing the relative cost of link $(p,q)$.

$nodeLocation_{p,c}$ : Denotes the location of substrate node $p$ (1 if substrate node $p$ is located in cloud $c$; 0 otherwise).

$backupNetwork$ : Assumes value 1 if at least one of the nodes of a VNR requests backup; 0 otherwise.

$working_{p,q}$ : Auxiliary binary variable defining whether physical link $(p,q)$ is part of the working network.

$backup_{p,q}$ : Auxiliary binary variable defining whether physical link $(p,q)$ is part of the backup network.

Table 4.3: Additional parameters used in the MILP formulation.

$$\widetilde{N}^V = \{\, i \in N^V : avail^V(i) = 0 \,\} \qquad \overline{N}^V = N^V \setminus \widetilde{N}^V$$

$$\widetilde{E}^V = \{\, (i,j) \in E^V : avail^V(i) = 0 \text{ and } avail^V(j) = 0 \,\} \qquad \overline{E}^V = E^V \setminus \widetilde{E}^V$$

Table 4.4: Auxiliary sets to facilitate the description of the formulation constraints.

However, it is guaranteed that this second node, the "virtual" (i.e., not requested) backup, does not consume resources (e.g., CPU). When a virtual node $j$ has $avail^V(j) > 0$, thus requiring replication, it is necessary to map the working and backup nodes onto different substrate nodes (possibly in distinct clouds, as explained in Section 4.2). In this case, the backup will have the necessary resources reserved, to be able to substitute the primary in case of failure.

4.5.2 Objective Function

The objective function aims to minimize three aspects (see Eq. 4.1): 1) the sum of all computing costs, 2) the sum of all communication costs, and 3) the overall number of hops of the substrate paths used to map the virtual links. We resort to a weighted-sum technique, with strictly positive weighing factors, a technique commonly used in multi-objective optimization to obtain (supported) non-dominated/efficient solutions (other approaches could be used; see Steuer (Steuer, 1986)). The use of this composite function allows us to use different units. In this case, after obtaining a solution, each objective should be evaluated by the decision-maker (in this case, the operator). She should then respond to the three outcomes individually, not to the composite one, to validate the result. The composite function is merely used to obtain a (supported) non-dominated/efficient solution. The three coefficients of the weighted-sum function, $\beta_1$, $\beta_2$, and $\beta_3$, allow the virtualization operator to reasonably parameterize and check the function, for each objective.

$$\begin{aligned}
\min\quad & \beta_1 \Big[ \sum_{i \in N^V} \sum_{p \in N^S} cpu^V(i)\, sec^S(p)\, cloud^S(p)\, wn_{i,p} \;+\; \sum_{i \in \overline{N}^V} \sum_{p \in N^S} cpu^V(i)\, sec^S(p)\, cloud^S(p)\, bn_{i,p} \Big] \\
+\ & \beta_2 \Big[ \sum_{(i,j) \in E^V} \sum_{(p,q) \in E^S} \alpha_{p,q}\, sec^S(p,q)\, wf^{i,j}_{p,q} \;+\; \sum_{(i,j) \in E^V} \sum_{(p,q) \in E^S} \alpha_{p,q}\, sec^S(p,q)\, bf^{i,j}_{p,q} \Big] \\
+\ & \beta_3 \Big[ \sum_{(i,j) \in E^V} \sum_{(p,q) \in E^S} wl^{i,j}_{p,q} \;+\; \sum_{(i,j) \in E^V} \sum_{(p,q) \in E^S} bl^{i,j}_{p,q} \Big] \qquad (4.1)
\end{aligned}$$

The first part of Eq. 4.1 covers the computing costs, including both the working and backup nodes (top two lines). The second part of the equation is the sum of all working and backup link bandwidth costs (lines 3-4). The last part of the objective function is related to the number of hops of the working and backup paths.

A few aspects are worth detailing. First, note that for the node backup case we restrict the sum to the set of nodes that require backup ($\overline{N}^V$). This is needed to express that the nodes that do not require backup (the "virtual" backups referred to above) do not consume resources. The equation considers the level of security of the substrate resources and the trustworthiness of the cloud infrastructure that hosts them. This is based on the assumption that a higher level of security or trustworthiness translates into an increased cost. As the parameters that formalize these levels ($sec^S(p)$ and $cloud^S(p)$) can take as value any positive real number, this allows the virtualization operator to fine-tune its costs. To address the possibility that different substrate edges may have different costs, we have included a multiplicative parameter $\alpha_{p,q}$ in lines 3-4. This parameter is a weight that can be used to express these differences. For instance, a link connecting two clouds might have a higher (monetary, delay, or other) cost when compared to intra-cloud links.

Intuitively, this objective function attempts to economize the most "powerful" resources (e.g., those with higher security levels) for VNRs that explicitly require them. Therefore, for instance, virtual nodes with $sec^V = 1$ are expected in most cases to be mapped onto substrate nodes with $sec^S = 2$ only if there are no substrate nodes with $sec^S = 1$ available.
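The structure of Eq. 4.1 can be made concrete by evaluating it for a fixed candidate mapping, i.e., with all decision variables already determined, rather than solving the MILP. The sketch below is our own illustration; the data mirrors the Figure 4.2 example, but the weights and values are assumptions.

```python
# Evaluate the weighted-sum objective of Eq. 4.1 for a given mapping.
def objective(beta, node_assignments, flows, sec_node, cloud_node, alpha, sec_edge):
    # beta: (beta1, beta2, beta3)
    # node_assignments: [(virtual_cpu_demand, substrate_node), ...]
    # flows: {(p, q): flow_amount} for the mapped virtual links
    computing = sum(cpu * sec_node[p] * cloud_node[p]
                    for cpu, p in node_assignments)
    communication = sum(alpha[e] * sec_edge[e] * f for e, f in flows.items())
    hops = len(flows)  # one mapped substrate link per nonzero flow entry
    return beta[0] * computing + beta[1] * communication + beta[2] * hops

cost = objective(
    beta=(1.0, 1.0, 0.5),
    node_assignments=[(10, "B"), (20, "C")],      # nodes a and b of Fig. 4.2
    flows={("B", "D"): 20, ("D", "C"): 20},       # two-hop working path
    sec_node={"B": 3, "C": 2}, cloud_node={"B": 1, "C": 2},
    alpha={("B", "D"): 1, ("D", "C"): 1},
    sec_edge={("B", "D"): 2, ("D", "C"): 2},
)
```

Such an evaluator is also useful for the policy-language expansion of Section 4.4, where several candidate solutions must be compared by cost.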

4.5.3 Security Constraints

Next, we enumerate the constraints related to the security of (working and backup) nodes, edges, and clouds, respectively:

$$wn_{i,p}\, sec^V(i) \le sec^S(p), \quad \forall i \in N^V, p \in N^S \qquad (4.2)$$
$$bn_{i,p}\, sec^V(i) \le sec^S(p), \quad \forall i \in N^V, p \in N^S \qquad (4.3)$$
$$wl^{i,j}_{p,q}\, sec^V(i,j) \le sec^S(p,q), \quad \forall (i,j) \in E^V, (p,q) \in E^S \qquad (4.4)$$
$$bl^{i,j}_{p,q}\, sec^V(i,j) \le sec^S(p,q), \quad \forall (i,j) \in E^V, (p,q) \in E^S \qquad (4.5)$$
$$wn_{i,p}\, cloud^V(i) \le cloud^S(p), \quad \forall i \in N^V, p \in N^S \qquad (4.6)$$
$$bn_{i,p}\, cloud^V(i) \le cloud^S(p), \quad \forall i \in N^V, p \in N^S \qquad (4.7)$$

The first of these constraints guarantees that a virtual node is only mapped to a substrate node that has a security level equal to or greater than its demand (Eq. 4.2). The next equation guarantees the same for backup nodes. The following two equations force each virtual edge to be mapped to (one or more) physical links that provide a level of security at least as high as the one requested. This is guaranteed for links connecting the primary nodes (Eq. 4.4) and the backups (Eq. 4.5). The last constraints ensure that a virtual node $i$ is mapped to a substrate node $p$ only if the cloud where $p$ is hosted has a trust level equal to or greater than the one demanded by $i$ (Eqs. 4.6 and 4.7 for working and backup, respectively).
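Because the decision variables are binary, Eqs. 4.2 and 4.6 reduce to a simple feasibility test per (virtual node, substrate node) pair. The helpers below are our own restatement for checking candidate placements, not part of the MILP itself.

```python
# Eqs. 4.2/4.6 as a placement test: virtual node demands must not exceed
# what the substrate node (and its hosting cloud) offers.
def node_security_feasible(v_attrs, s_attrs):
    return (s_attrs["sec"] >= v_attrs["sec"]
            and s_attrs["cloud"] >= v_attrs["cloud"])

# Eq. 4.4: a substrate link may carry a virtual edge only if it is at
# least as secure as requested.
def link_security_feasible(v_sec, s_sec):
    return s_sec >= v_sec
```

A substrate node can always serve a demand weaker than its own level, which is what allows the platform to over-provision security to raise the acceptance ratio (Section 4.3).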


4.5.4 Mapping Constraints

We take security, including availability, as a first-class citizen in our solution. Therefore, when faced with a choice between security and resource efficiency, we give preference to the former, as will be made clear next.

Node Embedding: We force each virtual node to be mapped to exactly one working substrate node (Eq. 4.8) and, when a backup is requested, to a single backup substrate node (Eq. 4.9). We also guarantee that a substrate node maps at most one virtual node (working or backup) of a single tenant. This means we have opted to avoid substrate sharing within a tenant's virtual network, to improve its availability. As a result, if one substrate node fails, this will affect at most one (backup or working) virtual node. This is one example of our design choice of availability over efficiency. This is expressed in three equations (Eqs. 4.10, 4.11, and 4.13) due to the use of the "virtual" backup artifact in our model. Eq. 4.10 guarantees that one substrate node maps at most one working virtual node. The next equation guarantees the same for the backup case.

$$\sum_{p \in N^S} wn_{i,p} = 1, \quad \forall i \in N^V \qquad (4.8)$$
$$\sum_{p \in N^S} bn_{i,p} = 1, \quad \forall i \in \overline{N}^V \qquad (4.9)$$
$$\sum_{i \in N^V} wn_{i,p} \le 1, \quad \forall p \in N^S \qquad (4.10)$$
$$\sum_{i \in N^V} bn_{i,p} \le 1, \quad \forall p \in N^S \qquad (4.11)$$

Eq. 4.13 guarantees that one substrate node will not map both a backup node and a working virtual node. Note the use of the set $\overline{N}^V$ to guarantee that the "virtual" backups are not included – they are the exception to the rule. As explained in Section 4.5.1, if a virtual node requires no replication but there is a backup network (as at least one other node has requested it), its working and backup are mapped onto the same substrate node. This is formalized with the next two equations. Note they both use the set $\widetilde{N}^V$, meaning that they deal only with virtual nodes that do not require backup. Eq. 4.14 guarantees that "virtual" backup nodes can only be mapped to their corresponding working node, while Eq. 4.15 grants these substrate nodes the exception of hosting a maximum of 2 virtual nodes: the working node and the "virtual" backup node.

$$\sum_{i \in N^V} wn_{i,p} + bn_{j,p} \le 1, \quad \forall j \in \overline{N}^V, p \in N^S \qquad (4.13)$$
$$bn_{i,p} \le wn_{i,p}, \quad \forall i \in \widetilde{N}^V, p \in N^S \qquad (4.14)$$
$$\sum_{i \in N^V} wn_{i,p} + bn_{j,p} \le 2, \quad \forall j \in \widetilde{N}^V, p \in N^S \qquad (4.15)$$
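The node-embedding constraints can be verified on a candidate assignment of the binary variables $wn_{i,p}$. The helper below is a hypothetical checker of ours for Eqs. 4.8 and 4.10, useful for validating solver output.

```python
# Check Eqs. 4.8 and 4.10 on a candidate working assignment wn, given as
# a dict {(virtual_node, substrate_node): 0 or 1}.
def valid_working_mapping(wn, virtual_nodes, substrate_nodes):
    each_exactly_once = all(                      # Eq. 4.8
        sum(wn.get((i, p), 0) for p in substrate_nodes) == 1
        for i in virtual_nodes)
    no_sharing = all(                             # Eq. 4.10
        sum(wn.get((i, p), 0) for i in virtual_nodes) <= 1
        for p in substrate_nodes)
    return each_exactly_once and no_sharing

# Nodes a and b of the running example, placed on distinct substrate nodes.
wn = {("a", "B"): 1, ("b", "C"): 1}
assert valid_working_mapping(wn, ["a", "b"], ["A", "B", "C"])
```

Placing two virtual nodes of the same tenant on one substrate node would violate the no-sharing rule, reflecting the availability-over-efficiency design choice above.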

The next set of constraints creates the necessary relationships between the link mappings and the flows that traverse them.

$$wl^{i,j}_{p,q}\, bw^V(i,j) \ge wf^{i,j}_{p,q}, \quad \forall (i,j) \in E^V, (p,q) \in E^S \qquad (4.16)$$
$$bl^{i,j}_{p,q}\, bw^V(i,j) \ge bf^{i,j}_{p,q}, \quad \forall (i,j) \in E^V, (p,q) \in E^S \qquad (4.17)$$
$$wl^{i,j}_{p,q} = wl^{i,j}_{q,p}, \quad \forall (i,j) \in E^V, p,q \in N^S \cup N^V \qquad (4.18)$$
$$bl^{i,j}_{p,q} = bl^{i,j}_{q,p}, \quad \forall (i,j) \in E^V, p,q \in N^S \cup N^V \qquad (4.19)$$

Eq. 4.16 ensures that if there is a flow between nodes $p$ and $q$ for a virtual edge $(i,j)$, then $(i,j)$ is mapped to the substrate link whose end-points are $p$ and $q$. The use of the inequality in this equation is important, to allow network flows to use multiple paths. The next equation achieves the same goal, but for the backup. We also include two binary constraints to force the symmetric property for the binary variables that define the link mappings (Eqs. 4.18 and 4.19). Note that in these equations we included links from both the substrate and the virtual networks. This is due to the need to include the meta-links described in Section 4.3.

Similarly, we also need to establish the relation between the virtual nodes and the clouds they are embedded into. This is achieved with Eqs. 4.20 and 4.21, for working and backup nodes, respectively. Specifically, if virtual node $i$ is mapped onto a substrate node $p$, and $p$ is hosted in cloud $c$, then $i$ is mapped into cloud $c$. The auxiliary parameter $nodeLocation_{p,c}$ has value 1 if substrate node $p$ is hosted in cloud $c$, and 0 otherwise. We also require each virtual node to be mapped to exactly one cloud (working or backup), with Eqs. 4.22 and 4.23. The auxiliary parameter $backupNetwork$ assumes value 1 if a backup is needed for at least one of the nodes of a VNR, or value 0 otherwise.

$$\sum_{p \in N^S} (wn_{i,p}\, nodeLocation_{p,c}) \ge wc_{i,c}, \quad \forall i \in N^V, c \in C \qquad (4.20)$$
$$\sum_{p \in N^S} (bn_{i,p}\, nodeLocation_{p,c}) \ge bc_{i,c}, \quad \forall i \in N^V, c \in C \qquad (4.21)$$
$$\sum_{c \in C} wc_{i,c} = 1, \quad \forall i \in N^V \qquad (4.22)$$
$$\sum_{c \in C} bc_{i,c} = backupNetwork, \quad \forall i \in N^V \qquad (4.23)$$

We give tenants three replication options for their virtual nodes: no replication, replication in the same cloud, and replication in a different cloud. We must thus restrict the placement of the working and backup nodes to the same or to distinct clouds, depending on the value of the availability attribute ($avail^V(i)$). This is achieved with Eq. 4.24. In this equation, when $avail^V(i) = 1$, variable $bc_{i,c}$ is equal to $wc_{i,c}$: the backup is mapped to the same cloud as the working node, as required. By contrast, when $avail^V(i) = 2$, $bc_{i,c}$ has to be different from $wc_{i,c}$, so the nodes will be mapped to different clouds. Finally, when $avail^V(i) = 0$, $bc_{i,c}$ will be equal to zero (this condition needs to be considered jointly with Eq. 4.22).

$$|wc_{i,c}\, backupNetwork - bc_{i,c}| = (avail^V(i) - 1) \times (wc_{i,c}\, backupNetwork + bc_{i,c}), \quad \forall i \in N^V, c \in C \qquad (4.24)$$

This constraint includes an absolute value function that needs to be linearized. We perform this using a standard technique, replacing constraint 4.24 with constraints 4.25 and 4.26.

$$\begin{aligned}
wc_{i,c}\, backupNetwork - bc_{i,c} = \ & aux\, \big[(avail^V(i) - 1) \times (wc_{i,c}\, backupNetwork + bc_{i,c})\big] \\
- \ & (1 - aux)\, \big[(avail^V(i) - 1) \times (wc_{i,c}\, backupNetwork + bc_{i,c})\big], \quad \forall i \in N^V, c \in C \qquad (4.25)
\end{aligned}$$

$$aux \in \{0,1\} \qquad (4.26)$$
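For reference, one standard way to carry out such a linearization (our sketch of the general technique, not necessarily the exact form used in the implementation) encodes $|A| = B$ with a binary sign selector $z$ and then removes the resulting binary-times-continuous products with big-M bounds:

```latex
% |A| = B (with B >= 0) via a binary sign selector z:
A = (2z - 1)\,B, \qquad z \in \{0,1\}, \qquad B \ge 0.
% The bilinear term t = zB is replaced by linear constraints,
% for a constant M with M \ge \max(B):
t \le B, \qquad t \le M z, \qquad t \ge B - M(1 - z), \qquad t \ge 0.
```

In Eq. 4.25, $aux$ plays the role of the sign selector $z$; since all cloud-mapping variables are binary here, $M = 1$ suffices for the product terms.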

Link Embedding: The next constraints are related to the mapping of virtual links onto the substrate. They take advantage of the meta-link artifact (recall Figure 4.2), which connects a virtual node $i$ to the substrate node $p$ where it is mapped, to enforce a few restrictions.

$$wn_{i,p}\, bw^V(i,j) = wf^{i,j}_{i,p}, \quad \forall (i,j) \in E^V, p \in N^S \qquad (4.27)$$
$$wn_{j,q}\, bw^V(i,j) = wf^{i,j}_{q,j}, \quad \forall (i,j) \in E^V, q \in N^S \qquad (4.28)$$
$$bn_{i,p}\, bw^V(i,j) = bf^{i,j}_{i,p}\, backupNetwork, \quad \forall (i,j) \in E^V, p \in N^S \qquad (4.29)$$
$$bn_{j,q}\, bw^V(i,j) = bf^{i,j}_{q,j}\, backupNetwork, \quad \forall (i,j) \in E^V, q \in N^S \qquad (4.30)$$
$$\sum_{j,k \in N^V,\ j \neq i,\ k \neq i} wf^{j,k}_{i,p} + wf^{j,k}_{p,i} = 0, \quad \forall i \in N^V, p \in N^S \qquad (4.31)$$
$$\sum_{j,k \in N^V,\ j \neq i,\ k \neq i} bf^{j,k}_{i,p} + bf^{j,k}_{p,i} = 0, \quad \forall i \in N^V, p \in N^S \qquad (4.32)$$

These constraints guarantee that the working flow of a virtual link $(i,j)$ has its source in $i$ and its sink in $j$, traversing the corresponding substrate nodes ($p$ and $q$) (Eqs. 4.27 and 4.28). These equations effectively define the meta-link artifact. The next two equations formalize the same requirement for the backup nodes. Please note that even though the backup path is only used if the working substrate path fails, we reserve the necessary resources during embedding to make sure they are available when needed. Eqs. 4.31 and 4.32 force meta-links to carry only (working or backup, respectively) traffic to/from their virtual nodes.

The next equations specify flow conservation restrictions at the nodes.


$$\sum_{p \in N^S} wf^{i,j}_{i,p} - \sum_{p \in N^S} wf^{i,j}_{p,i} = bw^V(i,j), \quad \forall (i,j) \in E^V \qquad (4.33)$$
$$\sum_{p \in N^S} wf^{i,j}_{j,p} - \sum_{p \in N^S} wf^{i,j}_{p,j} = -bw^V(i,j), \quad \forall (i,j) \in E^V \qquad (4.34)$$
$$\sum_{p \in N^S \cup N^V} wf^{i,j}_{q,p} - \sum_{p \in N^S \cup N^V} wf^{i,j}_{p,q} = 0, \quad \forall (i,j) \in E^V, q \in N^S \qquad (4.35)$$
$$\sum_{p \in N^S} bf^{i,j}_{i,p} - \sum_{p \in N^S} bf^{i,j}_{p,i} = bw^V(i,j)\, backupNetwork, \quad \forall (i,j) \in E^V \qquad (4.36)$$
$$\sum_{q \in N^S} bf^{i,j}_{j,q} - \sum_{q \in N^S} bf^{i,j}_{q,j} = -bw^V(i,j)\, backupNetwork, \quad \forall (i,j) \in E^V \qquad (4.37)$$
$$\sum_{p \in N^S \cup N^V} bf^{i,j}_{q,p} - \sum_{p \in N^S \cup N^V} bf^{i,j}_{p,q} = 0, \quad \forall (i,j) \in E^V, q \in N^S \qquad (4.38)$$

Eqs. 4.33, 4.34 and 4.35 express the working flow conservation conditions, which denote that the net flow at a node is zero, except at the source and the sink nodes, respectively. In an analogous way, the following three equations express the backup flow conservation conditions (Eqs. 4.36, 4.37 and 4.38).
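Flow conservation is easy to check on a candidate set of directed flows. The helper below is our own illustration for validating solver output against Eqs. 4.33-4.35; the path data mirrors the working path of Figure 4.2.

```python
# Check flow conservation: net outflow is +demand at the source, -demand
# at the sink, and zero at every intermediate node.
from collections import defaultdict

def conserves_flow(flows, source, sink, demand):
    net = defaultdict(float)              # net outflow per node
    for (p, q), f in flows.items():
        net[p] += f
        net[q] -= f
    interior_ok = all(v == 0 for n, v in net.items()
                      if n not in (source, sink))
    return net[source] == demand and net[sink] == -demand and interior_ok

# Working path a -> B -> D -> C -> b, carrying 20 bandwidth units; the
# (a, B) and (C, b) entries are the meta-links of Section 4.3.
path = {("a", "B"): 20, ("B", "D"): 20, ("D", "C"): 20, ("C", "b"): 20}
assert conserves_flow(path, "a", "b", 20)
```

Because meta-links anchor each flow at its virtual end-points, conservation at the substrate nodes (Eq. 4.35) is what actually stitches the mapped path together.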

The next set of constraints guarantees flow symmetry, ensuring the same flow traverses both directions, for both working (Eq. 4.39) and backup (Eq. 4.40) flows. Finally, Eqs. 4.41 and 4.42 impose the same flow for the working node and its backup, in both directions.

$$wf^{i,j}_{p,q} = wf^{j,i}_{q,p}, \quad \forall (i,j) \in E^V, p,q \in N^S \cup N^V \qquad (4.39)$$
$$bf^{i,j}_{p,q} = bf^{j,i}_{q,p}, \quad \forall (i,j) \in E^V, p,q \in N^S \cup N^V \qquad (4.40)$$
$$wf^{i,j}_{p,q} = bf^{i,j}_{p,q}, \quad \forall (i,j) \in \widetilde{E}^V, p,q \in N^S \qquad (4.41)$$
$$wf^{j,i}_{p,q} = bf^{j,i}_{p,q}, \quad \forall (i,j) \in \widetilde{E}^V, p,q \in N^S \qquad (4.42)$$

Nodes and Links Disjointness: When a virtual node stops responding, it may be the case that the node has effectively faulted, or that the failure has occurred in one of the substrate links that guarantees its connectivity to other nodes. Aligned with our goal of providing high guarantees of availability, we aim to cover both cases. For this purpose, we ensure that the paths connecting the virtual nodes' backups are disjoint from the substrate resources used for the working part (otherwise, a single failure could compromise both paths). To this end, we introduce the auxiliary binary variables $working_{p,q}$ and $backup_{p,q}$, which define whether a physical link $(p,q)$ belongs to the working or to the backup network, respectively.

$$working_{p,q} \le 1 - backup_{p,q}, \quad \forall (p,q) \in E^S \qquad (4.43)$$
$$wl^{i,j}_{p,q} \le working_{p,q}, \quad \forall (i,j) \in E^V, (p,q) \in E^S \qquad (4.44)$$
$$bl^{i,j}_{p,q} \le backup_{p,q}, \quad \forall (i,j) \in \overline{E}^V, (p,q) \in E^S \qquad (4.45)$$

First, we require disjointness between the working and backup paths (Eq. 4.43). Second, we guarantee that, if the working path of a virtual edge $(i,j)$ is mapped onto a substrate link $(p,q)$, then $(p,q)$ is placed into the working network (Eq. 4.44). Finally, we define the equivalent constraint for the backup part (Eq. 4.45).
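The disjointness property of Eq. 4.43 can be restated as a set intersection test on solver output. This helper is a hypothetical illustration of ours; link keys are canonicalized because substrate links are undirected.

```python
# Eq. 4.43 as a check: the links used by working paths and by backup
# paths must not intersect, regardless of the direction they are listed in.
def paths_disjoint(working_links, backup_links):
    canon = lambda links: {frozenset(l) for l in links}
    return canon(working_links).isdisjoint(canon(backup_links))

working = [("B", "D"), ("D", "C")]   # primary path of Figure 4.2
backup = [("C", "E")]                # backup path towards a distinct cloud
assert paths_disjoint(working, backup)
assert not paths_disjoint(working, [("D", "B")])  # same link, reversed
```

This single-failure guarantee is exactly why one substrate link (or the node it touches) failing cannot take down both the working and the backup path of a virtual edge.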

4.5.5 Capacity Constraints

Node Capacity Constraints: Virtual nodes from different tenants, i.e., mapped in response to different VNRs, can be mapped to the same substrate node (in contrast to nodes from the same tenant). Let us call 𝒩^V the set of all virtual nodes that, at a certain moment, are mapped onto the substrate, and write i ↑ p to indicate that virtual node i is hosted on substrate node p. Then, the residual capacity of a substrate node, R_N(p), is defined as the CPU capacity currently available in substrate node p ∈ N^S:

R_N(p) = cpu^S(p) − ∑_{∀i↑p} cpu^V(i),  i ∈ 𝒩^V

For a substrate node, we need to ensure that we never allocate more than its residual capacity when carrying out a new embedding. This needs to take into consideration the resources consumed by both the working and the backup nodes (Eq. 4.46):

∑_{i ∈ N^V} wn_{i,p} · cpu^V(i) + ∑_{j ∈ N^V} bn_{j,p} · cpu^V(j) ≤ R_N(p),  ∀p ∈ N^S  (4.46)
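The residual capacity bookkeeping behind Eq. 4.46 can be sketched in a few lines (the dictionary layout and function names are ours, chosen for illustration):

```python
# Residual CPU capacity of a substrate node, R_N(p), and the node
# capacity check of Eq. 4.46: working plus backup demand placed on p
# must not exceed what is currently free on p.

def residual_node_capacity(cpu_s, hosted, cpu_v, p):
    """R_N(p) = cpuS(p) minus the CPU of all virtual nodes hosted on p."""
    return cpu_s[p] - sum(cpu_v[i] for i in hosted.get(p, []))

def capacity_respected(cpu_s, hosted, cpu_v, p, working_demand, backup_demand):
    """Eq. 4.46 for one substrate node p of a candidate embedding."""
    return working_demand + backup_demand <= residual_node_capacity(
        cpu_s, hosted, cpu_v, p)
```

For example, a node with cpu^S(p) = 100 already hosting 30 units of virtual CPU has R_N(p) = 70, so a new embedding may place at most 70 units of combined working and backup demand on it.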

Link Capacity Constraints: Similarly, substrate links can also map virtual edges from different tenants. Let us define ℰ^V as the set of all virtual edges that, at a certain instant, are mapped onto the substrate, and (i,j) ↑ (p,q) to denote that the flow of virtual link (i,j) traverses substrate link (p,q). The residual capacity of a substrate link, R_E(p,q), is defined as the total amount of bandwidth available on the substrate link (p,q) ∈ E^S:

R_E(p,q) = bw^S(p,q) − ∑_{∀(i,j)↑(p,q)} bw^V(i,j),  (i,j) ∈ ℰ^V

The following constraint ensures that the allocated capacity of a substrate link never exceeds the capacity of that physical link, taking into consideration both the working and backup parts:

∑_{(i,j) ∈ E^V} wf^{i,j}_{p,q} + ∑_{(i,j) ∈ E^V} bf^{i,j}_{p,q} ≤ R_E(p,q),  ∀(p,q) ∈ E^S  (4.47)

4.5.6 Discussion on Security Assurances

Our MILP formulation finds VN mappings that fulfil the requirements of the SecVNE problem. The formulation contains a number of constraints that prevent a computed embedding for a specific VN request from placing virtual nodes and edges into substrate resources that do not have the necessary levels of the security and trustworthiness attributes. In particular:

Virtual edge requirements: A virtual edge has to be mapped onto (one or more) substrate edges that have a security level larger than or equal to the one being requested. This is mandatory both for the substrate edges employed in the primary paths and for those in the backup paths (constraints (4.4) and (4.5)). With regard to availability, the formulation ensures that primary and backup paths are located in separate parts of the substrate, guaranteeing that a single failure can affect only one path, allowing the remaining one to continue to provide connectivity (constraints (4.43), (4.44) and (4.45)). At the user's request, this sort of assurance is provided within a cloud or across clouds, enabling local failures or complete cloud failures to be tolerated (constraint (4.24)).

Virtual node requirements: Virtual nodes can only be mapped onto substrate nodes that have a security level equal to or greater than the one indicated in the VN request. This is ensured both for primary resources and for backups (constraints (4.2) and (4.3)). In addition, virtual nodes are forced to be deployed onto clouds that have a trust level equal to or higher than the one being requested. Again, this is imposed on both the primary and the backup substrate nodes (constraints (4.6) and (4.7)).



4.6 Evaluation

In this section we present performance results of our solution, considering diverse (virtual and substrate) network topologies and diverse VNR settings. Our evaluation aims to answer three main questions. First, what is the performance of our solution when compared with the most common alternatives? Second, how does the richer set of services offered by our solution (namely, security and availability) affect performance? Third, can a multi-cloud virtualization provider benefit from offering these value-added services to its users?

4.6.1 Experimental Setup

The setting of our experiments follows the related literature on this problem. Specifically, we extended the simulator presented in (vin, 2012) to simulate the dynamic arrival of Virtual Network Requests (VNRs) to the system. To create the substrate networks, we resorted to the network topology generator GT-ITM (Zegura et al., 1996). Two kinds of networks were evaluated: one based on random topologies, where every pair of nodes is randomly connected with a probability between 25% and 30%; and the other employing the Waxman model to link the nodes with a probability of 50% (Naldi, 2005).
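This random-topology setup can be sketched in a few lines of Python. The snippet below is a stand-in mirroring the GT-ITM parameters described above (the actual experiments used GT-ITM; the function names are ours):

```python
import random

def random_substrate(n=25, seed=0):
    """Random topology: every pair of the n nodes is connected with a
    probability drawn uniformly between 25% and 30%."""
    rng = random.Random(seed)
    p = rng.uniform(0.25, 0.30)
    edges = [(i, j) for i in range(n) for j in range(i + 1, n)
             if rng.random() < p]
    return list(range(n)), edges

def node_attributes(nodes, seed=0):
    """CPU capacity uniform in [50, 100]; security level drawn
    uniformly from the three levels {1.0, 1.2, 5.0}."""
    rng = random.Random(seed)
    return {p: {"cpu": rng.uniform(50, 100),
                "sec": rng.choice([1.0, 1.2, 5.0])}
            for p in nodes}
```

The Waxman topologies used in the second kind of experiment additionally weight the connection probability by the distance between node pairs, which this sketch omits.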

The substrate networks have a total of 25 nodes. The CPU and bandwidth (cpu^S and bw^S) of nodes and links are uniformly distributed between 50 and 100. Each of these resources is associated with one of three levels of security (sec^S ∈ {1.0, 1.2, 5.0}), according to a uniform distribution. The rationale for the choice of these specific values in our evaluation is as follows. We opted to associate the value of the security attribute with its monetary cost, based on our initial intuition, which was afterwards backed by empirical evidence. Specifically, we analyzed the cost of similar VM instances in different cloud providers that varied in their security characteristics. We analyzed Amazon EC2 and Microsoft Azure, considering both basic and secure VM instances. The secure instances included Trend Micro Deep Security, Check Point vSEC, SafeNet ProtectV, and HyTrust DataControl, among others. It was possible to observe a wide range of values depending on the security services included, but there was a clear – and expected – pattern: higher security guarantees translate into higher costs. The specifics were, however, slightly unexpected. While an EC2 instance with content protection is around 20% more expensive than a normal instance, the cost of similar instances with more sophisticated defenses was between 4 and 15 times higher. This justifies our choice of 1 for the basic level of security (normal instance), 1.2 for the intermediate level, and (a conservative) 5 for the highest level of security. The substrate nodes are also uniformly distributed among three clouds, each one with a different trustworthiness level (cloud^S ∈ {1.0, 1.2, 5.0}) – justified along the same line of reasoning. The goal is to represent a setup that includes a public cloud (lowest level), a trusted public cloud (e.g., GDPR-compliant), and a private datacenter (assumed to offer the highest security). We should emphasize that the cloud and trust levels are abstract entities that can be instantiated in different ways. We argue that monetary cost can be a good proxy to instantiate the security coefficients here, but we make no claim that it is the only, and by no means the optimal, way.

Notation   Algorithm description
NS+NA      SecVNE with no security or availability requirements for VNs
10S+NA     SecVNE with VNRs having 10% of their resources (nodes and links) with security requirements (excluding availability)
20S+NA     Similar to 10S+NA, but with security requirements (excluding availability) for 20% of the resources
NS+10A     SecVNE with no security requirements, but with 10% of the nodes requesting replication for increased availability
NS+20A     Similar to NS+10A, but with availability requirements for 20% of the resources
20S+20A    SecVNE with 20% of the resources (nodes and links) with security requirements and 20% with replication
D-ViNE     VNE MILP solution presented in (Chowdhury et al., 2012)

Table 4.5: VNR configurations evaluated in the experiments.

VNRs have a number of virtual nodes uniformly distributed between 2 and 4¹. Pairs of virtual nodes are connected with a Waxman topology with probability 50%. The CPU and bandwidth of the virtual nodes and links are uniformly distributed between 10 and 20. Several alternative security and availability requirements are evaluated, as shown in Table 4.5. We assume that VNR arrivals are modeled as a Poisson process with an average rate of 4 VNRs per 100 time units. Each VNR has an exponentially distributed lifetime with an average of 1000 time units.

¹Please note that a node corresponds to a virtual switch, which can support hundreds to thousands of containers in a large VM (recall Figure 4.1). In fact, the setup of experiments with our prototype included running over 6 thousand containers per VM.
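The arrival and lifetime model above can be sketched directly (a minimal illustration; the function name and return layout are ours):

```python
import random

def vnr_workload(horizon=50_000, rate=4 / 100, mean_lifetime=1_000, seed=0):
    """Poisson arrivals (on average 4 VNRs per 100 time units) with
    exponentially distributed lifetimes (mean 1000 time units), as in
    the thesis workload model. Returns (arrival, departure) pairs."""
    rng = random.Random(seed)
    t, events = 0.0, []
    while True:
        t += rng.expovariate(rate)  # exponential inter-arrival gap
        if t >= horizon:
            break
        events.append((t, t + rng.expovariate(1 / mean_lifetime)))
    return events
```

Over a 50 000 time-unit run, this yields on the order of 2000 VNR arrivals, matching the batch sizes used in the experiments below.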



The MILP is solved using the open-source library GLPK (GLPK, 2008). In the objective function, we set β_1 = β_2 = β_3 = 1¹ to balance the cost components evenly (Eq. 4.1). Parameter α was also set to 1, because our pricing analysis showed negligible cost differences between intra- and inter-cloud links in most of the relevant scenarios. We set up 20 experiments, each with a different substrate topology (10 random and 10 Waxman). Every experiment ran for 50 000 time units, during which embedding is attempted for groups of VNRs (specifically, 10 sets of 2 000 VNRs each were tested). The order of arrival and the capacity requirements of each VNR are kept the same in each of the configurations of Table 4.5, ensuring that they solve equivalent problems.

In the evaluation, we compared our approach with D-ViNE (Chowdhury et al., 2012). We chose this solution because it has been considered the baseline for most VNE work (Fischer et al., 2013b), and due to the availability of its implementation as open-source software. D-ViNE introduces correlation between the node mapping and the link mapping phases. In D-ViNE, node mapping is performed in a way that facilitates the mapping of virtual links to physical paths in the subsequent phase. For this purpose, it extends the physical network graph by introducing meta-nodes for each virtual node, and by connecting the meta-nodes to a selected subset of physical nodes, enabling the required correlation. However, D-ViNE considers only CPU and bandwidth constraints. Our solution adds to these requirements security demands, namely node and link security (including availability), and cloud preferences. D-ViNE presents an optimal formulation (Section 4 of Fischer et al. (2013b)), and two solutions with LP relaxations. For a fair comparison, we evaluate our solution against D-ViNE's optimal formulation.

4.6.2 Metrics

We considered the following performance metrics in the evaluation:

– VNR acceptance ratio: the percentage of accepted requests (i.e., the ratio of the number of accepted VNRs to the total number of VNRs);

– Node stress ratio: average load on the substrate nodes (i.e., average CPU consumption over all nodes);

¹As future work we plan to explore Chebychev-like functions with the help of reference points in an interactive way (i.e., with the help of a decision-maker), in order to get the most adequate efficient solution for the problem (as in (Steuer, 1986)).



– Link stress ratio: average load on the substrate links (i.e., average bandwidth consumption over all edges);

– Average revenue by accepting VNRs: one of the main goals of VNE is to maximize the profit of the virtualization provider. For this purpose, and similarly to (Chowdhury et al., 2012; Yu et al., 2008), the revenue generated by accepting a VNR is proportional to the value of the acquired resources. As such, in our case, we take into consideration that stronger security defenses will be charged at a higher (monetary) value. Therefore, the revenue associated with a VNR is:

R(VNR) = PF · [ λ_1 ∑_{i ∈ N^V} [1 + ϕ_1(i)] · cpu^V(i) · sec^V(i) · cloud^V(i) + λ_2 ∑_{(i,j) ∈ E^V} [1 + ϕ_2(i,j)] · bw^V(i,j) · sec^V(i,j) ],

where λ_1 and λ_2 are scaling coefficients that denote the relative proportion of each revenue component in the total revenue. These parameters offer providers the flexibility to price different resources differently. The variables ϕ account for the need to have backups, either in the nodes (ϕ_1(i)) or in the edges (ϕ_2(i,j)). Specifically, ϕ_1(i) = 1 if node i requires backup, and 0 otherwise; ϕ_2(i,j) = 1 if at least one of the nodes i or j requires backup, and 0 otherwise. As the substrate nodes and links are shared by multiple tenant VNs, the provider's expenses from the acquisition and operation of each substrate element have to be compensated by the revenue generated from the virtual resources that share it. The revenue formula thus includes a profit factor (PF), which we use to represent the provider's profit margin. For instance, PF = 2 means that each CPU and each bandwidth unit of the virtual network is charged at twice the unit cost of the substrate resource it is mapped to.

This metric accounts for the average revenue obtained by embedding a VNR (i.e., the total revenue generated by accepting the VNRs divided by the number of accepted VNRs).
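Written out in code, the revenue computation is a direct sum-product over the virtual nodes and edges (a sketch; the dictionary layout is illustrative, and the weights λ_1 = 10, λ_2 = 1 passed below are the values used later in the experiments):

```python
# R(VNR): each virtual node contributes cpu * sec * cloud, each virtual
# edge contributes bw * sec; a backup flag (phi) doubles the element's
# contribution; the total is scaled by the profit factor PF.

def revenue(nodes, edges, pf, lam1, lam2):
    """nodes: dicts with cpu, sec, cloud and a boolean backup flag (phi1);
    edges: dicts with bw, sec and a boolean backup flag (phi2)."""
    node_part = sum((1 + n["backup"]) * n["cpu"] * n["sec"] * n["cloud"]
                    for n in nodes)
    edge_part = sum((1 + e["backup"]) * e["bw"] * e["sec"] for e in edges)
    return pf * (lam1 * node_part + lam2 * edge_part)
```

Note how requiring a backup for a node (ϕ_1(i) = 1) simply doubles that node's contribution to the revenue, since the tenant pays for the reserved replica as well.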

– Average cost of accepting a VNR: the cost of embedding a VNR is proportional to the total sum of substrate resources allocated to that VN. In particular, this cost has to take into consideration that certain virtual edges may end up being embedded in more than one physical link (as in the substrate path between nodes b, d and c in Figure 4.2). The cost may also increase if the VNR requires higher security for its virtual nodes and links. Thus, we define the cost of embedding a VNR as:

C(VNR) = λ_1 ∑_{i ∈ N^V} ∑_{p ∈ N^S} cpu^i_p · sec^S(p) · cloud^S(p) + λ_2 ∑_{(i,j) ∈ E^V} ∑_{(p,q) ∈ E^S} f^{i,j}_{p,q} · sec^S(p,q),

where cpu^i_p corresponds to the total amount of CPU allocated on substrate node p for virtual node i (either working or backup). Similarly, f^{i,j}_{p,q} denotes the total amount of bandwidth allocated on substrate link (p,q) for virtual link (i,j). λ_1 and λ_2 are the same weights introduced in the revenue formula, denoting the relative proportion of each cost component in the total cost.

In the experiments, we set λ_1 = 10 and λ_2 = 1. These values are chosen based on empirical evidence: a node is at least one order of magnitude more expensive than an edge. For instance, the cost of a modern Ethernet switch (tof, 2019) is on the order of tens of thousands of dollars, whereas each of its network interfaces (and cabling) is typically 10x to 100x cheaper. A similar argument can be made for cloud resources, where the cost of VM instances is on average much higher than the required VM-to-VM communication costs, for two reasons. First, intra-cloud communication typically does not incur any cost to the tenants. Second, in a multi-cloud setting, while there is a cost for inter-cloud connectivity, the number of VM instances is expected to be much higher than the number of clouds (we cannot anticipate any scenario where this assumption would not hold), so the cost of the inter-cloud link is amortized amongst the various instances. With respect to the profit factor variable, we consider multiple values: specifically, PF = {1, 5, 10}.
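The break-even behaviour discussed in the evaluation results below already follows from these two formulas: a virtual link is paid for once by the tenant, but costs the provider once per substrate hop it is mapped onto. A toy computation illustrates this (the values are illustrative, with the sec and cloud factors set to 1 for clarity):

```python
# Profit for one virtual node plus one virtual link whose flow is
# embedded on a substrate path of path_len links. Revenue charges the
# bandwidth once; cost pays for it on every hop of the path.

def profit(pf, cpu=10, bw=10, path_len=2, lam1=10, lam2=1):
    rev = pf * (lam1 * cpu + lam2 * bw)        # charged once
    cost = lam1 * cpu + lam2 * bw * path_len   # bandwidth paid per hop
    return rev - cost
```

With PF = 1 and a two-hop path, the revenue (110) falls short of the cost (120), so the provider loses money; any margin PF ≥ 2 already covers this particular path, and the thesis experiments show break-even around PF = 5 once replication is included.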

4.6.3 Evaluation Results

Figure 4.4a displays the acceptance ratio over time for one particular experiment with a random topology substrate. We can observe that, after the first few thousand time units, the acceptance ratio tends to stabilize. A similar trend also occurs in the other experiments, and for this reason the rest of the results are taken at the end of each simulation. The results in the following graphs are from the Waxman topologies only (Figures 4.4b-4.4f). We note, however, that the conclusions to be drawn are exactly the same as for the random topologies. The main conclusions are:

Figure 4.4: Average results and standard deviation (except first graph): (a) VNR acceptance ratio over time; (b) VNR acceptance ratio; (c) Node utilization; (d) Link utilization; (e) Provider profit; (f) Embedding time.

SecVNE exhibits a higher average acceptance ratio than D-ViNE, not only for the baseline case but also when security requirements are included: Figure 4.4b indicates that SecVNE can make better use of the available substrate resources to embed the arriving VNRs when compared to the most commonly employed VNE algorithm. It is interesting to note that SecVNE is better than D-ViNE even when 20% of the VNRs include security requirements, which are harder to fulfill. This does not mean that D-ViNE is a poor solution – it merely shows that its model is not the best fit for our particular problem. In particular, D-ViNE uses the geographical distance of substrate nodes as one of the variables to consider in node assignment. This parameter is less relevant in our virtualized environment but constrains D-ViNE's options. In any case, notice that the results for D-ViNE represent its best configuration with respect to geographical location – we have tested D-ViNE with the entire range of options for this parameter.

A richer set of demands decreases the acceptance ratio, but only slightly: VNRs with stronger requirements have a greater number of constraints that need to be satisfied, and therefore it becomes more difficult to find the necessary substrate resources to embed them. However, a surprising result is the small penalty in terms of acceptance ratio in the presence of security demands (see Figure 4.4b again). For instance, an increase of 20 percentage points (pp) in the resources with security needs results in a penalty of only around 1 pp in the acceptance ratio. Also interesting is the fact that the reduction in acceptance ratio is more pronounced when VNRs have availability requirements than when they have security requirements. In this case, an increase of 20 pp in the number of nodes with replication results in a penalty of around 10 pp. This is due to the higher use of substrate resources caused by the reservation of backup nodes/links.

Security demands cause only a small decrease in substrate resource utilization. Figures 4.4c and 4.4d show the substrate node and link stress ratios, respectively. We observe that the utilization of node resources is very high in all cases (over 80%), meaning that the mapping is effective. It is also possible to see that slightly more resources are allocated in the substrate network with SecVNE than with D-ViNE, which explains the higher acceptance ratio achieved: the existing resources are used more extensively in order to serve more virtual network requests. As the link stress ratio is lower (again, in all cases), the bottleneck is the node CPU. Finally, the link stress ratio of D-ViNE is lower than in our solution. This is due to D-ViNE incorporating load balancing into its formulation.

Security and availability services have the potential to increase provider profit. Figure 4.4e presents the provider profit for SecVNE considering the different VNR configurations, and also for D-ViNE for comparison purposes. We draw three main conclusions. First, the profit with PF = 1 is negative in all cases. This was expected: in this case, the revenue of a unit of virtual node CPU and virtual link bandwidth is assumed to have the same value as the cost of a unit of substrate node CPU and substrate link bandwidth, respectively. As a virtual link is often mapped to a substrate path composed of multiple substrate links, the overall provider cost is higher than its revenue. As we introduce a profit margin into the revenue, it becomes possible for the provider to attain a profit – under our assumptions, it breaks even at around PF = 5 even in the most demanding scenarios (those that require node replication; we will return to this below). Second, the results show that increases in the security requirements cause profit to grow faster, relative to the baseline. As this added-value service is priced at a premium, and the cost of providing it is not high (compared to availability, as we will see), the provider benefits even with a lower profit factor. Third, higher costs are incurred to fulfil availability requests, due to the need to reserve additional resources (both the nodes that require replication and the paths that interconnect them). Since D-ViNE does not consider security, it ends up choosing more expensive mappings, and so its profit figures are less favourable.

More restrictions speed up VNE computations. Figure 4.4f presents the embedding time for growing VNR sizes. We performed experiments with all VNR configurations, and also included D-ViNE. There are two main conclusions. First, all our configurations run faster than D-ViNE (between 20x and 35x faster). The reason is that D-ViNE has fewer restrictions, and so the search space for an optimal solution is larger. The same is true, although less pronounced, when we compare different configurations of our solution: VNRs with more restrictions typically run slightly faster.

4.7 Summary

In this chapter, we presented a new solution for the secure virtual network embedding (SecVNE) problem. Our approach fits a multi-cloud network virtualization deployment and enhances the state of the art on security, specifically by allowing users to define different security and availability demands for the nodes and links of their virtual networks. The solution further allows tenants to leverage a substrate composed of several clouds with distinct levels of trust, enabling specific node placement requirements to be set. By not relying on a single cloud provider, we avoid internet-scale single points of failure (with the support of backups in different clouds). Besides, privacy and other security concerns can be accommodated by constraining the mapping of certain virtual nodes to specific classes of clouds (e.g., sensitive workloads can be placed in a private cloud).

We formulate the SecVNE model and solve it optimally as a Mixed Integer Linear Program (MILP). In addition, we propose a new policy language to specify the characteristics of the substrate and virtual networks, improving the expressiveness of user requests. Our extensive evaluation shows the efficiency of the solution and the favorable cost-revenue trade-offs of including security services in virtual networks.

A problem of any solution based on integer linear programming is that it does not scale to large networks. For this reason, in the next chapter we explore heuristics for this problem.


5. Scalable and Secure Multi-cloud Virtual Network Embedding

In the last chapter, we proposed a MILP solution for the secure multi-cloud VNE problem. Unfortunately, that solution scales poorly. So, in this chapter, we present a scalable heuristic that still considers security as a first-class citizen, is specifically tailored to a hybrid multi-cloud domain, and scales to very large networks.

We evaluate our algorithm with large-scale simulations that consider realistic network topologies, and our prototype in a substrate composed of one private data center and two public clouds. The system scales well for networks of thousands of switches employing diverse topologies, and improves on the virtual network acceptance ratio, provider revenue, and embedding delays. Our results show that the acceptance ratios are less than 1% below those of the optimal solution presented in the previous chapter, and that the system can provision a virtual network of 10 thousand containers in approximately 2 minutes.

5.1 Introduction

The problem of embedding virtual resources in a multi-cloud setting is highly complicated. In particular, it is necessary to deal with a hybrid substrate, as private data center topologies (typically a Clos variant) differ significantly from the network offered by a public cloud (a full mesh or "big switch"). Since most network embedding algorithms (Fischer et al., 2013a) target wide-area networks and mesh topologies, they perform poorly when directly applied to this context. They are also unsuitable for any practical deployment, as they often resort to solving the Multi-Commodity Flow (MCF) problem for link mapping (Yu et al., 2008). This approach is not only too slow to be applied to large infrastructures (as we show in Sec. 5.5) – it is also impractical, as it would be necessary to adjust the resulting path splits to the granularity supported by the underlying cloud, a problem not addressed in the embedding literature. Aside from this, as our goal is to leverage an enriched substrate to enhance the security and availability of virtual networks, it is necessary to further extend the embedding algorithms with these new requirements. While the MILP solution we proposed in the previous chapter addresses this problem, it unfortunately scales very poorly (see Section 5.5.2).

To address these challenges, in this chapter we propose a novel embedding algorithm. Our solution achieves five goals – all five necessary for a production-level deployment. First, it makes efficient use of the substrate resources, achieving a very high acceptance ratio of virtual network requests and consequently increasing provider profit. Second, it is topology-agnostic, allowing it to achieve good results for radically different topologies. Third, it allows users to specify the level of security (including availability) of each element of their virtual networks and guarantees the fulfillment of their requests. Fourth, it improves application performance by significantly reducing the average path length. Finally, it scales well, making it practical for large-scale deployments.

This algorithm has been implemented in our Sirius system, supporting the management of multi-cloud substrate infrastructures and the provisioning of user-defined virtual network requests. We analyze the behavior and performance of the algorithm with large-scale simulations that consider a private data center following Google's Jupiter topology design (Singh et al., 2015), extended with hundreds of cloud resources spread across two public clouds. In addition, we evaluate our Sirius prototype in a substrate composed of three clouds: one private datacenter and two public clouds (Amazon EC2 and Google Cloud Platform). We demonstrate that our system allows virtual networks to extend across multiple clouds without significant loss of performance compared to a non-virtualized substrate. When compared with state-of-the-art approaches, our novel embedding algorithm enables multi-cloud providers to increase the virtual network acceptance ratio (with results less than 1% below the optimal), reduce path lengths, and grow provider revenue. Our evaluation also shows that virtual networks with several thousands of virtual hosts can be provisioned quickly, even if spread over several clouds.

In summary, the main contributions of this chapter are:

• An enhancement of our multi-cloud network virtualization system Sirius, extending its underlying network model to fit a hybrid multi-cloud substrate;

• A network embedding algorithm that is demonstrated to be the best fit for our scenario, increasing the multi-cloud network acceptance ratio and quality of service, alongside scaling properties that guarantee its applicability in large-scale deployments;

• A detailed evaluation of our proposal under various scenarios.

5.2 Enhanced Sirius Design

Figure 5.1: Virtual networks and substrate.

In this section, we present the enhancement of the Sirius network model to better fit a hybrid multi-cloud composed of public clouds and modern private data centers.

As explained, our solution allows users to define their virtual networks (VNs) with arbitrary topologies and addressing schemes, and guarantees isolation of all tenants that share the substrate (see Figure 5.1). On top of this foundation, we set a couple of additional goals for our platform, including the improvement of its scalability properties and the enhancement of its network model. Towards these objectives, we introduce a set of requirements – the technical innovations that materialize these requirements are the core of this chapter. These are:

– Substrate scalability: allow the network substrate to scale out, by extending it with public cloud resources.

– System scalability: handle on the order of hundreds of requests per second.

– Enhanced virtual network services: allow users to define the security and availability requirements of every element of their virtual networks, increasing their dependability.

– Provider profit: the mapping of virtual to substrate resources should maximize provider profit, through high acceptance ratios and efficient utilization of resources.

– Fit for a hybrid multi-cloud: the solution should perform well in a diverse substrate network with different topologies, including public (e.g., Amazon EC2), private (e.g., a modern data center), and hybrid clouds (e.g., a private DC extended with public cloud resources).

– Practicality: the constraints of the network substrate should be taken into account.

– Performance: the virtualization layer should not introduce significant overhead, and application performance should not degrade.

These requirements are not met by any existing solution; existing systems at most address a subset of these criteria.

5.2.1 Virtual and Substrate Networks

Enhanced virtual networks. In our platform, the user can define her virtual network (Figure 5.1, left) using a graphical user interface or through a configuration file. As explained before, she can define any arbitrary topology composed of a set of virtual hosts and a group of virtual switches, interconnected by virtual links. In this chapter we extend this model. Specifically, the virtual switches can be of one of two types: virtual edge switches, in case they have virtual hosts attached, or virtual transit switches, in case they do not. The virtual hosts can be configured with any addresses from the entire L2 and L3 address space¹. The virtual links are configured with the required bandwidths and latencies.

Similar to Chapter 4, users set the specific security requirements of each virtual host, switch, and link. It is also possible to define the level of availability of virtual hosts. In this case, our system enforces these elements to be replicated, according to their level. For instance, if the required level of availability is high, a host may be replicated in a different cloud, to tolerate large-scale cloud outages.

¹With the restriction that they are not allowed to use the same address in different hosts of their network.

Figure 5.1 illustrates a virtual network in our system, with a simplified set of requirements. In this example, the nodes within a red circle represent sensitive elements and therefore need to be located in a trusted cloud.

Substrate network. Our system allows a network provider to extend its infrastructure by enriching its substrate with resources from public cloud providers, allowing it to be shared (transparently) by various users (Figure 5.1, right). The resources are organized in such a way as to create a single multi-cloud abstraction. Our substrate is composed of one or several private infrastructures and one or several public clouds. In this infrastructure, the substrate compute elements run on top of the Sirius compute hypervisor, and are interconnected by substrate links and substrate switches.

In this chapter we enhance this model as follows. Every hypervisor runs one substrate software switch to allow communication between the substrate compute elements and between these and the outside world. But now, we consider a second type of substrate switch in our model: the substrate fabric switch. This corresponds to a physical switch under Sirius's control – typically part of a private data center (DC) fabric. Note, however, that this does not mean that the solution requires full control of all DC switches. If the provider does not have centralized control over the fabric (because it uses traditional L2/L3 networking), then the substrate topology will not include the fabric switches. In this case, the only switching elements are the software switches that run at the edge. The substrate links include tunnels (both intra-cloud and inter-cloud) and physical links. Again, physical links are only included in the substrate topology if Sirius has control over the physical switches to which they connect. Finally, every cloud includes one gateway that acts as an edge router. The inter-cloud tunnels are set up and maintained at this node (as such, this is the only element that requires a public IP address).

5.3 Network Embedding

As we enhanced the design of Sirius with new virtual network abstractions and network elements, in this section we abstract the elements of the infrastructure in a new model that captures the fundamental characteristics of the substrate and the user demands for the virtual networks. Sirius then uses these models to optimize the embedding of user requests in the clouds.

5.3.1 Network model

5.3.1.1 Substrate Network

The substrate network is modeled as a weighted undirected graph G^S = (N^S, E^S, A^S_N, A^S_E), where N^S is the set of nodes, E^S is the set of links (or edges), and A^S_N / A^S_E are the attributes of the nodes and edges. A node n^S is a network element capable of forwarding packets. It can be either a software switch or a fabric switch, which is modeled by an attribute type(n^S) with values 0 or 1, respectively.

A software switch connects several local compute elements (e.g., containers) to the infrastructure. All run in the same physical (or virtual) machine and share the local resources. Therefore, we employ attribute cpu(n^S) to aggregate the total CPU capacity available for network tasks and the processing of user applications. Similarly, these components have an equivalent set of protections and are located in the same cloud. We use attributes sec(n^S) ≥ 0 and cloud(n^S) ≥ 0 to represent the security level of the ensemble and the trust associated with the cloud, with higher values associated with stronger safeguards.

Fabric switches are internal routing devices, used for example in the access or aggregation layers of a data center. They are optional elements, because our solution can enforce all necessary traffic forwarding decisions by configuring only the software switches at the edge. However, they allow for additional flexibility when computing the paths, often leading to more efficient embeddings. Since public cloud providers disallow the configuration of internal network devices, we restrict the modeling of fabric switches to private data centers. These switches have the same attributes as above, where the values for sec(n^S) and cloud(n^S) are dictated by the risk appetite of the owner organization. Overall, the attributes of a node are A^S_N = {{type(n^S), cpu(n^S), sec(n^S), cloud(n^S)} | n^S ∈ N^S}.

The edges are characterized by the total bandwidth capacity (attribute bw(e^S) > 0) and the average latency (lat(e^S) > 0). They also have an associated security level (sec(e^S) > 0), where for example inter-cloud links may be perceived as less secure than the edges of a private datacenter, as the former have to be routed over the Internet. Overall, the edge attributes are A^S_E = {{bw(e^S), lat(e^S), sec(e^S)} | e^S ∈ E^S}.
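The substrate model above translates almost directly into code. The sketch below captures the node and edge attribute sets as Python dataclasses; all names (SubstrateNode, SubstrateEdge, the constants) are hypothetical and only mirror the notation of the text, not the actual Sirius implementation.

```python
from dataclasses import dataclass

SOFTWARE, FABRIC = 0, 1  # values of the type() attribute

@dataclass
class SubstrateNode:
    name: str
    type: int     # 0 = software switch, 1 = fabric switch
    cpu: float    # aggregate CPU capacity for network tasks and applications
    sec: float    # security level of the node ensemble (>= 0)
    cloud: float  # trust of the hosting cloud (>= 0)

@dataclass
class SubstrateEdge:
    ends: tuple   # (node_a, node_b) -- the graph is undirected
    bw: float     # total bandwidth capacity (> 0)
    lat: float    # average latency (> 0)
    sec: float    # security level of the link (> 0)

# Example: a software switch in a trusted cloud, linked to a fabric switch.
a = SubstrateNode("a", SOFTWARE, cpu=100, sec=3, cloud=3)
b = SubstrateNode("b", FABRIC, cpu=0, sec=3, cloud=3)
ab = SubstrateEdge(("a", "b"), bw=1000, lat=1.0, sec=3)
```

A virtual node or edge would carry the analogous A^V attributes (plus avail(), introduced below).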

5.3.1.2 Virtual Networks

VNs are also modeled as weighted undirected graphs G^V = (N^V, E^V, A^V_N, A^V_E), where N^V is the set of virtual nodes, E^V is the set of virtual links (or edges), and A^V_N / A^V_E are the node and edge attributes. These attributes are similar to the substrate network attributes. For example, type() classifies whether a node is a virtual edge or a virtual transit switch. In this case, and to reduce the specification effort, type(n^V) is inferred by the system using a straightforward rule: a switch with no virtual host attached is considered a virtual transit switch; otherwise, it is a virtual edge switch.

A node n^V that corresponds to a virtual edge switch models the requirements of the switch and the locally connected hosts. Regarding demanded CPU, the attribute cpu(n^V) is the sum of all requested processing capacity (for network tasks and applications). For the other attributes, we take the strictest requirement of all elements (e.g., if hosts need a security level of 4 but the switch only asks for 1, then sec(n^V) is 4). For virtual transit switches, there is a direct relation between the requirements and the attributes of the matching node.

VNs have only one extra attribute in A^V_N, to support enhanced availability. In many scenarios, hosts should be replicated so that backups can take over the computations after a failure. This means that during embedding additional substrate resources need to be allocated for the backups. The attribute avail(n^V) can take three pre-defined values: avail(n^V) = 0 means no replication; avail(n^V) = 1 requests backups in the same cloud; and avail(n^V) = 2 requests replication in another cloud (e.g., to survive a cloud outage).

Overall, the two sets of attributes are: A^V_N = {{type(n^V), cpu(n^V), sec(n^V), cloud(n^V), avail(n^V)} | n^V ∈ N^V} and A^V_E = {{bw(e^V), lat(e^V), sec(e^V)} | e^V ∈ E^V}.

Virtual Network Requests: VNRs are composed of the requested virtual network plus two extra parameters: VNR = (N^V, E^V, A^V_N, A^V_E, Time^V, Dur^V). The first corresponds to the instant when the VNR arrived at the system (Time^V), while the second is the period during which the virtual network should remain active in the substrate (Dur^V).

Figure 5.2: Derived network models from the user specifications for the substrate network and a single virtual network (only considering the cloud() attribute).

If necessary, at a later moment, the user may extend the duration to avoid eviction from the substrate. Figure 5.2 displays a model of a virtual network, where a user demands 4 interconnected virtual nodes. This VN is embedded in the substrate represented below it. For instance, virtual node 1 is mapped to the substrate node a, and virtual link 1 ↔ 3 becomes a substrate path a ↔ d ↔ g ↔ j. For simplicity, the figure only represents the cloud trust related requirements. It is possible to observe that the embedding places the virtual nodes in clouds with trust degrees higher than or equal to the requested ones.
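The invariant illustrated by the figure — each virtual node must land in a cloud whose trust is at least the level requested — can be stated as a one-line check. The sketch below uses illustrative values (the figure does not give concrete numbers) and hypothetical names:

```python
def trust_respected(node_map, v_cloud, s_cloud):
    """node_map: virtual node -> substrate node;
    v_cloud / s_cloud: cloud() trust demanded / offered."""
    return all(s_cloud[s] >= v_cloud[v] for v, s in node_map.items())

# Illustrative demands and offers (values are not taken from Figure 5.2):
v_cloud = {1: 2, 2: 1, 3: 2, 4: 1}
s_cloud = {"a": 3, "d": 3, "g": 2, "j": 2}
print(trust_respected({1: "a", 3: "j"}, v_cloud, s_cloud))  # True
```

The same predicate, applied with sec() values, covers the security-level constraint.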

5.3.2 Scalable SecVNE

Sirius instantiates the users' requests by mapping the associated VNs onto the substrate while respecting all declared attributes. Our solution handles the on-line VNE problem, where every time a VNR arrives there is an attempt to find appropriate provisioning in the substrate. While deciding on the location of the resources, we perform a heuristic mapping with the purpose of increasing the overall acceptance ratio, reducing the usage of substrate resources, and fulfilling the security needs. If we find a solution, the residual capacity of the substrate resources decreases by the amount that is going to be consumed. Otherwise, we reject the request.

While experimenting with existing embedding algorithms, we observed some limitations when applied to our multi-cloud environment, namely related to the inability to address security requirements (for instance, Chowdhury et al. (2012); Shahriar et al. (2016); Yu et al. (2008)), the incapacity to support large numbers of switches (for instance, Alaluna et al. (2017); Chowdhury et al. (2012); Rahman et al. (2010); Yu et al. (2011)), and inefficiencies in the use of resources, often because the assumed network model is not a good match for our setting (for instance, Bays et al. (2012); Chowdhury et al. (2012); Liu et al. (2014); Rahman et al. (2010); Shahriar et al. (2016); Yu et al. (2011, 2008)). Therefore, we designed a new algorithm built around the following ideas: (i) optimal embedding solutions, for instance based on linear program optimization, do not scale to our envisioned scenarios, and therefore we employ a greedy approach based on two utility functions to guide the selection of the resources; (ii) the mapping of the virtual resources to the substrate is carried out in two phases: first the nodes are chosen, and then the links; while ensuring the security constraints, we give priority to the security level over the cloud trust; (iii) the backup resources necessary to fulfill availability requirements are only reserved after the primary resources have been completely mapped, giving precedence to the common case where no failures occur. The next subsections detail the various parts of our solution.

5.3.2.1 Utility Functions

The process for selecting the substrate nodes uses two utility functions to prioritize which nodes to pick first. The first function, UResSec(), uses only information about the current resource consumption and the node security characteristics. In particular, it values more the nodes that: (i) have the highest percentage of available resources, as this contributes to an increase in the VNR acceptance ratio; and (ii) provide the lowest security assurances (but still above the demanded ones), to reduce the embedding costs (highly secure cloud instances are typically substantially more expensive than regular ones¹). The utility of a substrate node n^S is:

¹For example, the cost of a Trend Micro Deep Security instance at aws.amazon.com/marketplace is five times higher than that of a normal instance.


UResSec(n^S) = [ %R_N(n^S) × Σ_{e^S → n^S} %R_E(e^S) ] / [ sec(n^S) × cloud(n^S) ]    (5.1)

where %R_N(n^S) = R_N(n^S)/cpu(n^S) is the percentage of residual CPU capacity (or available capacity) of a substrate node. The residual capacity is computed with Equation 5.2, with n^V ∈ N^V and x ↑ y denoting that the virtual node x is hosted on the substrate node y:

R_N(n^S) = cpu(n^S) − Σ_{n^V ↑ n^S} cpu(n^V)    (5.2)

The sum in Equation 5.1 corresponds to the overall available bandwidth of the edges connected to n^S (e^S → n^S means n^S is an endpoint of e^S). The value %R_E(e^S) = R_E(e^S)/bw(e^S) is the percentage of the residual capacity of a substrate link. The residual capacity is calculated with Equation 5.3, with e^V ∈ E^V and x ↑ y denoting that the flow of the virtual link x traverses the substrate link y:

R_E(e^S) = bw(e^S) − Σ_{e^V ↑ e^S} bw(e^V)    (5.3)
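Equations 5.1-5.3 can be sketched directly in code. The data layout below (dicts keyed by node and edge identifiers) is a hypothetical choice for illustration, not the structure used by Sirius:

```python
def residual_cpu(n, cpu_cap, hosted_cpu):
    # R_N(n^S) = cpu(n^S) - sum of cpu demanded by virtual nodes hosted on n^S  (Eq. 5.2)
    return cpu_cap[n] - sum(hosted_cpu.get(n, []))

def residual_bw(e, bw_cap, carried_bw):
    # R_E(e^S) = bw(e^S) - sum of bw of virtual links whose flow traverses e^S  (Eq. 5.3)
    return bw_cap[e] - sum(carried_bw.get(e, []))

def u_res_sec(n, cpu_cap, hosted_cpu, incident, bw_cap, carried_bw, sec, cloud):
    # Eq. 5.1: favour nodes with plenty of free CPU/bandwidth but modest security cost.
    pct_cpu = residual_cpu(n, cpu_cap, hosted_cpu) / cpu_cap[n]
    pct_bw = sum(residual_bw(e, bw_cap, carried_bw) / bw_cap[e] for e in incident[n])
    return (pct_cpu * pct_bw) / (sec[n] * cloud[n])

cpu_cap = {"a": 100}; hosted = {"a": [20, 30]}              # residual CPU = 50 (50%)
bw_cap = {("a", "b"): 1000}; carried = {("a", "b"): [500]}  # residual bw = 500 (50%)
inc = {"a": [("a", "b")]}
print(u_res_sec("a", cpu_cap, hosted, inc, bw_cap, carried, {"a": 1}, {"a": 1}))  # 0.25
```

With sec = cloud = 1, the utility reduces to the product of the two free-capacity percentages; raising either security attribute lowers the utility, steering cheap requests away from expensive nodes.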

The second utility function, UPath(), contributes to decrease the distance (in number of hops) among the substrate nodes of a VN embedding (for this reason we term this function Path Contraction), improving the QoS and further decreasing the provider costs. This allows coupling node to link mapping, which is key to improving network efficiency, as we show in Section 5.5.3. When a virtual node n^V is to be mapped, the utility of selecting a substrate node n^S is computed with two factors: (i) the UResSec() of n^S; and (ii) the average distance between n^S and the substrate nodes that have already been used to place the neighbors of n^V (given by function avgDist2Neighbors()). The substrate node utility with path contraction is:

UPath(n^S, n^V) = UResSec(n^S) / avgDist2Neighbors(n^S, n^V)    (5.4)

The intuition behind dividing the (initial) substrate node utility by the average distance is to diminish the utility of substrate nodes located further away from the nodes already mapped. The effect is a lower communication delay (the average path length is shorter) and, simultaneously, a decrease in bandwidth costs.
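A minimal sketch of Equation 5.4, assuming the UResSec value and a hop-distance table are already available (all names hypothetical; the handling of the "no neighbour placed yet" case is our assumption):

```python
def u_path(n_s, n_v, u_res_sec_value, placed_neighbors, dist):
    # Eq. 5.4: discount a node's utility by its average hop distance to the
    # substrate nodes already hosting neighbours of the virtual node n_v.
    if not placed_neighbors:
        return u_res_sec_value  # no contraction term yet (assumption)
    avg = sum(dist[n_s][m] for m in placed_neighbors) / len(placed_neighbors)
    return u_res_sec_value / avg

# Node "a" is 2 hops from "g" and 4 hops from "j", which host the neighbours:
dist = {"a": {"g": 2, "j": 4}}
print(u_path("a", 1, 30.0, ["g", "j"], dist))  # 30 / 3 = 10.0
```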

5.3.2.2 SecVNE Algorithm

Our Secure Virtual Network Embedding (SecVNE) algorithm consists of three procedures: (i) receiving and processing the tenants' VN requests (main SecVNE procedure); (ii) primary node and edge mapping, employing the utility functions to steer the substrate resource selection; and (iii) backup mapping, allocating disjoint resources to avoid common mode failures.

SecVNE: VN requests are embedded into the substrate with Algorithm 1. The algorithm expects two kinds of inputs: (1) the VN graph (G^V), including all attributes; and (2) the substrate description, with the graph (G^S) and the current nodes' and edges' residual capacities (R_N, R_E). At system initialization, the algorithm assigns the residual capacities with the values of the resource capacities in the substrate graph, but as VNRs are serviced, they are updated using Equations 5.2 and 5.3. Also, when a VNR execution ends, the residual capacities are increased to reflect the release of the associated resources.

The algorithm can potentially produce two mappings. One is for the network used in normal operation, called the PMap (from primary mapping), and the other is for the backup nodes and edges to be employed in case of failure, called the BMap (from backup mapping). A mapping is a set of tuples, each with a virtual resource identifier and the corresponding substrate resource(s) where it will be placed (plus some additional information).

The algorithm logic is relatively simple. It starts by seeking a PMap (Lines 1-2). Then, it updates the residual capacities based on the resource consumption after deployment (Line 6). If at least one of the virtual nodes requires availability support (avail(n^V_i) > 0), then a backup mapping is also obtained (Lines 8-9), and the capacities are updated (Line 11).

Algorithm 1: Main SecVNE Procedure
Input: G^V, G^S, R_N, R_E
Output: PMap  // mapping for primary network
Output: BMap  // mapping for backup network

1   PMap.N ← NodeMapping(G^V, G^S, R_N, R_E);
2   PMap.L ← LinkMapping(G^V, G^S, R_N, R_E, PMap.N);
3   if (PMap.N ≠ ∅) ∧ (PMap.L ≠ ∅) then
4       RtempN ← R_N;
5       RtempE ← R_E;
6       UpdateResources(R_N, R_E, PMap);
7       if (at least one virtual node needs backup) then
8           BMap.N ← BNodeMap(G^V, G^S, R_N, PMap);
9           BMap.L ← BLinkMap(G^V, G^S, R_N, R_E, PMap);
10          if (BMap.N ≠ ∅) ∧ (BMap.L ≠ ∅) then
11              UpdateResources(R_N, R_E, BMap);
12              return (PMap, BMap);
13          else
14              R_N ← RtempN;
15              R_E ← RtempE;
16              return (∅, ∅);
17      else
18          return (PMap, ∅);
19  else
20      return (∅, ∅);

Node Mapping: Algorithm 2 implements the NodeMapping() procedure, whose goal is to find a valid embedding for the virtual nodes of the VN, taking into consideration all attribute requirements.

The procedure starts by creating a table where a score value orders the virtual nodes (Line 2, and top of Figure 5.3). The virtual node score is calculated using:

NScore(n^V) = [ cpu(n^V) × Σ_{e^V → n^V} bw(e^V) ] / [ sec(n^V) × cloud(n^V) ]    (5.5)
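Equation 5.5 has the same shape as Equation 5.1, applied to virtual demands instead of substrate residuals. A one-function sketch with illustrative numbers (not the ones from Figure 5.3):

```python
def n_score(cpu_v, incident_bw, sec_v, cloud_v):
    # Eq. 5.5: virtual nodes with large CPU/bandwidth demands and low security
    # needs get high scores, and are therefore embedded last.
    return (cpu_v * sum(incident_bw)) / (sec_v * cloud_v)

# cpu = 10, two incident links of 100 each, sec = 2, cloud = 5:
print(n_score(10, [100, 100], 2, 5))  # 2000 / 10 = 200.0
```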

Algorithm 2: NodeMapping()
Input: G^V, G^S, R_N, R_E
Output: nodeMap  // mapping for the nodes

1   nodeMap ← ∅;
2   scoreT ← getScore(G^V);
3   utilT ← getUtil(G^S, R_N, R_E);
4   forall (n^V_i ∈ G^V) do
5       virtualNodeMapped ← false;
6       candidT ← getCandidates(n^V_i, G^V, G^S, utilT);
7       forall (n^S_j ∈ candidT) do
8           if (cpu(n^V_i) ≤ R_N(n^S_j)) then
9               nodeMap ← nodeMap ∪ (n^V_i, n^S_j);
10              delUtil(n^S_j, utilT);
11              virtualNodeMapped ← true;
12              break;
13      if (virtualNodeMapped = false) then
14          return ∅;
15  return nodeMap;

where e^V → n^V means n^V is an endpoint of link e^V. In the figure, virtual node 1 has a score of 2, and virtual node 3 has a score of 9. As the processing of virtual nodes is performed in increasing score order, in this example these virtual nodes are thus the first and the last, respectively, to be processed.

Next, we calculate the UResSec utility for all substrate nodes (Line 3; middle of the figure). For instance, the UResSec score of substrate node a is equal to 15, whereas the score of c is 72 and that of e is 70. These scores are stored in utilT, a hash map indexed by the security level and cloud trust of the node, to optimize accesses by security demand. The scores of substrate nodes c and e referred to above, for instance, are stored in the table line that corresponds to a security level of 2 and a cloud trust of 2.

Virtual nodes n^V_i are then processed one at a time. For each, we select all acceptable candidate substrate nodes, i.e., the switches n^S_j that provide security assurances of at least the same level requested (sec(n^S_j) ≥ sec(n^V_i) and cloud(n^S_j) ≥ cloud(n^V_i)). As an example, if the virtual node requires a security level and cloud trust of (at least) 2, the substrate nodes included in the last utilT line are not considered, as those have a cloud trust and security level of only 1. In addition, candidates are chosen based on the type of virtual node: if n^V_i is a virtual edge switch (type(n^V_i) = 0), then only substrate software switches are acceptable (type(n^S_j) = 0); otherwise, for virtual transit switches (type(n^V_i) = 1), we allow either software or fabric substrate switches. The algorithm places these candidates in a structure called candidT, ordered by decreasing UPath utility value (i.e., not ordered using the UResSec score stored in the utilT table) – observe Line 6 of the algorithm, and the bottom of the figure. For instance, node e, with a UResSec score equal to 70, has a UPath score of 20. Next, we search candidT for the first node that has enough residual CPU capacity (Line 8) and use it (Line 9). We also remove this node from utilT to prevent further mappings to this substrate node from this VNR (i.e., from this tenant), thus avoiding situations where a single failure would compromise a significant part of the primary virtual network of a single tenant (Line 10).
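The candidate selection rules just described (security, cloud trust, and switch type filtering, followed by ordering on UPath utility) can be sketched as follows. The dict-based node representation and the precomputed utility table are hypothetical simplifications:

```python
VIRTUAL_EDGE, VIRTUAL_TRANSIT = 0, 1
SOFTWARE, FABRIC = 0, 1

def get_candidates(nv, substrate_nodes, u_path_of):
    """Filter substrate nodes on security, trust and type, then order by
    decreasing UPath utility (a sketch of getCandidates() in Algorithm 2)."""
    ok = []
    for ns in substrate_nodes:
        if ns["sec"] < nv["sec"] or ns["cloud"] < nv["cloud"]:
            continue  # insufficient security assurances or cloud trust
        if nv["type"] == VIRTUAL_EDGE and ns["type"] != SOFTWARE:
            continue  # virtual edge switches only map to software switches
        ok.append(ns)
    return sorted(ok, key=lambda ns: u_path_of[ns["name"]], reverse=True)

nodes = [
    {"name": "c", "type": SOFTWARE, "sec": 2, "cloud": 2},
    {"name": "e", "type": SOFTWARE, "sec": 2, "cloud": 2},
    {"name": "b", "type": FABRIC, "sec": 3, "cloud": 3},
]
nv = {"type": VIRTUAL_EDGE, "sec": 2, "cloud": 2}
print([n["name"] for n in get_candidates(nv, nodes, {"c": 15, "e": 20, "b": 40})])
# ['e', 'c'] -- b is excluded: a fabric switch cannot host a virtual edge switch
```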

Figure 5.3: Data structures used in node mapping.

We should note that the ordering of the search has a substantial impact on performance, namely concerning request acceptance and costs – recall that scoreT is processed in increasing score order, while candidT is in decreasing utility order. The intuition, which was confirmed by our simulations, is that this leads to embeddings where: (i) virtual nodes with modest security demands are mapped to substrate nodes that give fewer assurances ("freeing" nodes with a higher security level to serve more demanding requests); (ii) nodes end up being physically located near each other; and (iii) there is a more even distribution of the residual capacities.

Link Mapping: Algorithm 3 finds a mapping between the virtual edges and the substrate network. Each edge is processed individually, searching for a suitable network connection between the two substrate nodes where its virtual endpoints will be embedded. The approach is flexible, allowing the use of single or multiple paths.

The algorithm consists of the following steps. First, the substrate edges that do not provide the necessary security guarantees are excluded (Lines 6-9). For this purpose, we set the residual bandwidth capacity of those edges to null in an auxiliary variable RloopE (Line 9), thus preventing their selection in later steps.

Second, we obtain a set of paths that could be employed to connect the two substrate nodes where the virtual edge endpoints will be embedded (Line 10). In our implementation, we resort to the K-edge-disjoint shortest path algorithm to find these paths, using as edge weights the inverse of the residual bandwidths. This ensures that when "distance" is minimized, the algorithm picks the paths that have the most available bandwidth.
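The idea can be sketched with a simple greedy strategy: run Dijkstra with weights 1/residual bandwidth, remove the edges of the path found, and repeat. This repeat-and-remove approach is an illustration of the technique, not necessarily the exact disjoint-path algorithm used by Sirius; the adjacency-map representation (neighbour → residual bandwidth) is also a hypothetical choice.

```python
import heapq

def dijkstra(adj, src, dst):
    """Shortest path with edge weight 1/residual_bw; returns node list or None."""
    pq, seen, prev, dist = [(0.0, src)], set(), {src: None}, {src: 0.0}
    while pq:
        d, u = heapq.heappop(pq)
        if u in seen:
            continue
        seen.add(u)
        if u == dst:
            break
        for v, rbw in adj.get(u, {}).items():
            if rbw <= 0:
                continue  # excluded edge (no residual bw or failed security check)
            nd = d + 1.0 / rbw
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    if dst not in seen:
        return None
    path, u = [], dst
    while u is not None:
        path.append(u)
        u = prev[u]
    return path[::-1]

def k_edge_disjoint_paths(adj, src, dst, k):
    """Greedily find up to k edge-disjoint shortest paths by removing
    the edges of each path found before searching for the next one."""
    adj = {u: dict(nbrs) for u, nbrs in adj.items()}  # work on a copy
    paths = []
    for _ in range(k):
        p = dijkstra(adj, src, dst)
        if p is None:
            break
        paths.append(p)
        for a, b in zip(p, p[1:]):  # remove used edges (undirected graph)
            adj[a].pop(b, None)
            adj[b].pop(a, None)
    return paths

# Two disjoint s-t routes; the high-bandwidth one (via x) is found first:
adj = {"s": {"x": 500, "y": 100}, "x": {"s": 500, "t": 500},
       "y": {"s": 100, "t": 100}, "t": {"x": 500, "y": 100}}
print(k_edge_disjoint_paths(adj, "s", "t", 2))  # [['s', 'x', 't'], ['s', 'y', 't']]
```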

Third, the substrate paths to be used are chosen (Lines 11-17). Prospective paths p are an ordered sequence of substrate edges (p = (e^S_1, e^S_2, ...)), which have a certain latency (equal to the sum of the link latencies, calculated by getLatency()) and a maximum bandwidth (given by the smallest residual bandwidth of all the edges, computed by getMinBandwidth()). Eligible paths need to have a latency lower than the requested one (Line 12). We store these paths in a candidate set candP, together with the corresponding virtual edge and available bandwidth (Line 15). The set will have at most MaxPaths elements, a constant that defines the degree of multipathing (when set to 1, we use a single path) (Lines 16-17). This constant can be used to prevent an excessive level of traffic fragmentation, which is important when managing the number of entries in the packet forwarding tables of the switches. On the other hand, multipathing improves dependability, because localized link failures can be automatically tolerated with multi-path data forwarding (if enough residual capacity exists in the surviving paths).

The last step is to define how much traffic goes through each path, ensuring that together they provide the requested edge bandwidth (Lines 18-22). An edge can only be mapped if enough bandwidth is available in the paths (Line 18). In this case, we update the bandwidth in every path to an amount proportional to its maximum capacity, therefore distributing the load (Lines 19-20). Then, the residual capacities of the substrate edges are updated according to this embedding procedure (Line 21), and the set of paths is saved (Line 22).
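The proportional split with rounding up (the ⌈·⌉ in Line 19 of Algorithm 3) can be sketched in a few lines; the function name is hypothetical:

```python
import math

def split_bandwidth(path_caps, demand):
    """Distribute a virtual edge's demanded bandwidth over the candidate paths,
    proportionally to each path's available capacity (cf. Lines 18-20)."""
    total = sum(path_caps)
    if total < demand:
        return None  # not enough aggregate bandwidth: the edge cannot be mapped
    return [math.ceil(cap / total * demand) for cap in path_caps]

# Two paths with 300 and 100 units free, demand of 200:
print(split_bandwidth([300, 100], 200))  # [150, 50]
```

Because each share is rounded up, the allocations always sum to at least the demand, at the cost of reserving at most one extra unit per path.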

Backup Mapping: To improve the availability of their virtual networks, tenants can specify the replication of specific virtual nodes. For this purpose, the algorithm performs a backup embedding that satisfies the same attribute requirements as the primary mapping. The backup functions, BNodeMap() and BLinkMap(), called in Algorithm 1, operate similarly to their normal counterparts, with the exception that we exclude all resources used in the primary mappings from the embedding. We note that there are a few cases, however, where it may be impossible to enforce this objective. For instance, inside the same rack there is typically only one ToR switch, so in this specific case there is some level of sharing.

5.4 Implementation

The enhanced version of the Sirius network hypervisor is still an SDN application running on top of the Floodlight controller. The orchestrator is deployed on an Apache Tomcat server. For the compute hypervisor, we resort to Docker, with Open vSwitch Pfaff et al. (2015) as the software switch. As such, substrate compute elements are Docker containers. Users interact with the system through a GUI based on a browser-based visualization library vis (2017).

Our system was deployed in a substrate composed of two public clouds (Amazon EC2 and Google Compute Engine) and one private infrastructure (our own data center). The public clouds are managed with Apache jclouds. We acquire VMs in the public cloud and then configure Docker to support the automatic provisioning of containers. In the private cloud, we resort to VMs running in a rack server. The substrate fabric switches of our data center are Pica8 P-3297, operating at 1 Gbps. The intra-cloud edges are created using GRE, and the cloud interconnections are set up with OpenVPN tunnels.


Algorithm 3: LinkMapping()
Input: G^V, G^S, R_E, nodeMap
Output: linkMap  // link mappings

1   linkMap ← ∅;
2   RtempE ← R_E;
3   forall (e^V_i ∈ E^V) do
4       totalBw ← 0;
5       RloopE ← R_E;
6       forall (e^S_j ∈ E^S) do
7           if (sec(e^V_i) > sec(e^S_j)) then
8               RloopE(e^S_j) ← 0;
9       Paths ← getPaths(e^V_i, G^V, G^S, RloopE, nodeMap);
10      foreach (p ∈ Paths) do
11          if (lat(e^V_i) ≥ getLatency(p, G^S)) then
12              bwp ← getMinBandwidth(p, RloopE);
13              totalBw ← totalBw + bwp;
14              candP ← candP ∪ (e^V_i, bwp, p);
15              if (|candP| = MaxPaths) then
16                  break;
17      if (totalBw ≥ bw(e^V_i)) then
18          forall (mp ∈ candP) do
19              bw(mp) ← ⌈(bw(mp)/totalBw) × bw(e^V_i)⌉;
20          UpdateLinkResources(R_E, candP);
21          linkMap ← linkMap ∪ candP;
22      else
23          R_E ← RtempE;
24          return ∅;
25  return linkMap;


5.5 Evaluation

The evaluation aims to answer several questions. First, we want to determine if our solution is efficient in using the substrate resources, namely regarding the acceptance ratio of virtual network requests, which translates into profit for the multi-cloud provider. Second, we want to understand how the system scales, both concerning the enrichment of the substrate with cloud resources and the rate of arrival of VNRs. Additionally, we need to find out if Sirius handles well different kinds of topologies, including private, public, and hybrid clouds. Finally, we would like to measure the overhead introduced by the virtualization layer, and how it affects application performance.

Notation   Description
NS+NA      no security or availability demands on the VNRs
10S+NA     VNRs with 10% of resources (nodes and links) with security demands (excluding availability)
20S+NA     like 10S+NA, but with security demands for 20% of the resources
NS+10A     VNRs with no security demands, except for 10% of the nodes requesting replication
NS+20A     like NS+10A, but for 20% of the nodes
20S+20A    20% of the resources (nodes and links) with security demands and 20% of the nodes with replication

Table 5.1: VNR configurations that were evaluated.

We evaluate Sirius using large-scale simulations, comparing it with the two most commonly used heuristics: D-ViNE, which uses a relaxation of a MILP solution for node mapping and MCF for link mapping (Chowdhury et al., 2012, 2009); and the heuristics proposed by Yu et al. (2008), which follow a greedy approach for node mapping and use MCF for link mapping (we label this solution FG+MCF). As MCF has scalability limitations, for this second approach we also used the shortest path algorithm (FG+SP).

In addition, we evaluate the performance of our prototype over a multi-cloud substrate composed of a private data center and two public clouds (Amazon and Google), measuring the elapsed time to create various networks.


5.5.1 Testing environment

We extended an existing simulator vin (2012) to collect various metrics about the embedding when a VNR workload arrives at the system.

Substrate networks. We employed two types of substrate network models. For public clouds, we utilized Waxman topologies, where pairs of nodes are connected with a probability of 50% Naldi (2005) (using the GT-ITM tool Zegura et al. (1996)); for the private data center, we created networks following Google's Jupiter topology design Singh et al. (2015). For the comparison with the optimal solution (Section 5.5.2), we considered a small-scale substrate of 25 nodes. The reason is that the optimal solution we proposed in Chapter 4 does not scale to large networks. For the large-scale simulations (Section 5.5.3) we considered three substrate networks: pub_substrate - 100 nodes spread evenly over three clouds; pvt_substrate - 1900 nodes in one private data center; and multi_substrate - 2500 nodes spread over three clouds and a private data center. The CPU and bandwidth (cpu^S and bw^S) of nodes and links are uniformly distributed between 50 and 100 and between 500 and 1000, respectively. Latencies inside a data center were set small (lat^S ∈ {1.0}), and latencies between clouds were set larger (lat^S ∈ {50.0}), following empirical evidence we collected. These resources are also uniformly associated with one of three levels of security and trust (sec^S ∧ cloud^S ∈ {1.0, 1.2, 5.0}) in the public clouds, and with one level (sec^S ∧ cloud^S ∈ {6.0}) in the private cloud. These values were chosen to achieve a good balance between the diversity of security levels and their monetary cost. Again, the rationale for this choice was our analysis of the cost of Amazon EC2 instances with normal and secure VM configurations. These costs assume a wide range of values, related to the implemented defenses. For example, while an EC2 instance with content protection is around 20% more expensive than a normal instance (hence our choice of 1.2 for the intermediate level of security), the cost of instances with more sophisticated defenses is at least five times greater (our choice for the highest level of security).
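The attribute distributions above can be sampled as follows. This is a sketch only: it assumes the security level and cloud trust of a resource are drawn together (which the notation sec^S ∧ cloud^S suggests), and all function names are hypothetical.

```python
import random

rng = random.Random(7)  # fixed seed for reproducible substrate generation

def node_attrs(public):
    cpu = rng.uniform(50, 100)                  # cpu^S ~ U(50, 100)
    sec = cloud = rng.choice([1.0, 1.2, 5.0]) if public else 6.0
    return cpu, sec, cloud

def link_attrs(public, inter_cloud):
    bw = rng.uniform(500, 1000)                 # bw^S ~ U(500, 1000)
    lat = 50.0 if inter_cloud else 1.0          # inter-cloud vs intra-DC latency
    sec = rng.choice([1.0, 1.2, 5.0]) if public else 6.0
    return bw, lat, sec
```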

Virtual networks. VNRs have a number of virtual nodes uniformly distributed between 5 and 20 for the smaller-scale setups, and between 40 and 120 for the larger-scale ones. Pairs of virtual nodes are connected with a Waxman topology with probability 50%. The CPU and bandwidth of the virtual nodes and links are uniformly distributed between 10 and 20, and between 100 and 200, respectively. Several alternative security and availability requirements are evaluated, as shown in Table 5.1. We model VNR arrivals as a Poisson process with an average rate of 4 VNRs per 100 time units for the first setup and 8 VNRs for the others. Each VNR has an exponentially distributed lifetime with an average of 1000 time units.
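A Poisson arrival process is equivalent to drawing exponential inter-arrival times, which makes the workload above easy to regenerate. A minimal sketch (function name and event layout are hypothetical):

```python
import random

def vnr_workload(n, rate_per_100, mean_lifetime, seed=1):
    """Generate n VNR events as (arrival_time, lifetime) pairs: Poisson
    arrivals via exponential inter-arrival gaps, exponential lifetimes."""
    rng = random.Random(seed)
    t, events = 0.0, []
    for _ in range(n):
        t += rng.expovariate(rate_per_100 / 100.0)  # mean gap = 25 units at rate 4
        events.append((t, rng.expovariate(1.0 / mean_lifetime)))
    return events

events = vnr_workload(1000, rate_per_100=4, mean_lifetime=1000)
```

With rate 4 per 100 time units and mean lifetime 1000, roughly 40 virtual networks are active in the substrate at any time in steady state (arrival rate × mean lifetime).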

5.5.2 Evaluation against optimal solution

The goal of a heuristic is to compute good solutions fast, obtaining results that are close to the optimal while scaling to very large networks. We thus start by asking how far our heuristic is from the optimal solution we proposed in Chapter 4. For this purpose, we considered a small-scale substrate that, as explained above, consists of 25 nodes (the optimal solution does not scale to larger values). We simulated 2000 VNRs with a number of virtual nodes uniformly distributed between 2 and 4. We compared the acceptance ratio of both solutions considering the VNR configuration NS+NA (recall Table 5.1). The results are presented in Figure 5.4. The Sirius heuristic presents results that are very close to the optimal, with an acceptance ratio that is less than 1% lower than the optimal one.

Figure 5.4: VNR acceptance ratio: Sirius vs optimal

5.5.3 Large-scale simulations

We performed an extensive set of simulations to compare Sirius with the state-of-the-art VNE solutions. Figures 5.5a, 5.5b, and 5.6a present the results for the Acceptance Ratio (AR) of the three scenarios under evaluation, respectively. We consider two variants of our approach, to better assess the advantages of the mechanisms that form the overall design: Sirius(w/oPC) only employs the UResSec() utility function, and therefore does not take into consideration the length of the paths offered by UPath() (i.e., it is the version without Path Contraction – recall Section 5.3.2.1); and Sirius(wMCF) uses MCF for link mapping.

Figure 5.5: Acceptance ratio: ratio of successful VNRs. (a) Public cloud (100 nodes). (b) Private DC (1900 nodes).

Acceptance ratio. In the case of VNRs with no security demands (NS+NA) in the smaller network (Figure 5.5a), the Sirius approaches behave similarly to FG+MCF, but increase the AR by 8% over FG+SP and present a significant 3-fold improvement over D-ViNE. The poor performance of D-ViNE is a result of its underlying model not fitting our specific multi-cloud environment. For instance, this solution considers geographical distance, which is not as relevant in a virtualized environment. Notice, however, that the results for D-ViNE represent its best configuration with respect to geographical location – we have tested D-ViNE with the entire range of options for this parameter. As a first conclusion, in the network topology offered by a public cloud (a full mesh), both FG+MCF and Sirius achieve good results.

As we introduce security demands, the Sirius acceptance ratio decreases, but only slightly. For instance, when 20% of all virtual elements request a level of security above the baseline, the reduction of the acceptance ratio is of only 1%. The same is true for requests that include availability, although the decrease is more pronounced (up to 13%). We expected this result, as replication not only needs to double the node resources but also leads to an increase in the number of substrate paths (to maintain replica connectivity). The alternative solutions perform poorly with security requirements, as they do not consider these additional services. Therefore, most of the produced embeddings were rejected, because they would violate at least one of the demands. We show no results for D-ViNE because, after more than one week of running this experiment, the algorithm had not yet finished. We also do not include results for the availability tests, because the algorithms we compare against do not consider the possibility of replication.

Figure 5.6: Acceptance ratio (multi-cloud scenario) and provider revenue. (a) AR in multi-cloud scenario (2500 nodes). (b) Average revenue (100 nodes).

Figure 5.7: Embedding time for node mapping.

When observing Figures 5.5b and 5.6a, the advantage of our approach is made clear. No results are included for algorithms with MCF for link mapping, as they take an extremely long time to complete. Sirius has an acceptance ratio 18% above FG+SP in the pvt_substrate (Figure 5.5b), and of over 210% in the multi_substrate (Figure 5.6a). These results demonstrate the effectiveness of our solution in improving the acceptance ratio over the alternatives, both for virtualized data centers and, even more strikingly, for a multi-cloud scenario. The main reason is our more detailed model, which incorporates different types of nodes (software and fabric switches), increasing the options available to map virtual nodes. The conclusions with respect to security are similar to the above. One note, however, to explain why the results do not degrade with security in the pvt_substrate case, in contrast to the others: in this experiment all nodes are considered to be of the highest security level, as they are inside the private data center. Another observation is that in some cases the average acceptance ratio is higher (although still inside the confidence interval) with security demands, which can be counter-intuitive. The reason is that fulfilling security requirements sometimes leads to a slightly better balance of the substrate load.

(a) Public cloud (100 nodes). (b) Multi-cloud (2500 nodes).

Figure 5.8: Embedding time for link mapping.

Provider revenue. Next, we focus on the economic advantage for the multi-cloud provider. As in most previous work (Chowdhury et al., 2012; Yu et al., 2008), we assume the revenue of accepting a VNR is proportional to the acquired resources. However, in our case, we assume that security is charged at a premium (in line with public cloud services). We calculate the revenue per VN as:

R(VNR) = λ1 Σ_{i ∈ N^V} [1 + ϕ1(i)] cpu^V(i) sec^V(i) cloud^V(i) + λ2 Σ_{(i,j) ∈ E^V} [1 + ϕ2(i,j)] bw^V(i,j) sec^V(i,j),

(a) Number of embedded links (1900 nodes). (b) Average path length.

Figure 5.9: The effect of coupling node and link mapping with Path Contraction.

where λ1 and λ2 are scaling coefficients that denote the relative proportion of each revenue component to the total revenue. These parameters offer providers the flexibility required to price the resources differently. The variables ϕ account for the need to have backups, either in the nodes (ϕ1(i)) or in the edges (ϕ2(i,j))¹. In the experiments, we set λ1 = λ2 = 1.
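As a concrete illustration, the revenue computation can be sketched as follows (a minimal sketch with illustrative data structures and attribute names, not the actual hypervisor code; the 'backup' flag encodes the ϕ indicator variables):

```python
# Sketch of the revenue formula R(VNR). Attribute names (cpu, sec, cloud,
# bw, backup) are illustrative; 'backup' encodes the phi indicator variables.
def revenue(vnr, lambda1=1.0, lambda2=1.0):
    node_term = sum(
        (1 + (1 if attrs['backup'] else 0))            # [1 + phi1(i)]
        * attrs['cpu'] * attrs['sec'] * attrs['cloud']
        for attrs in vnr['nodes'].values())
    edge_term = sum(
        (1 + (1 if attrs['backup'] else 0))            # [1 + phi2(i,j)]
        * attrs['bw'] * attrs['sec']
        for attrs in vnr['edges'].values())
    return lambda1 * node_term + lambda2 * edge_term
```

With λ1 = λ2 = 1, as in the experiments, a VNR that requests more resources, higher security levels, or backups yields proportionally higher revenue.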

Figure 5.6b presents the average revenue generated by embedding VNRs in the pub_substrate. The main conclusion is that Sirius generally improves the profit of the multi-cloud provider. First, revenue is enhanced when we include security, which gives providers an incentive to offer value-added services. Second, availability can have an even stronger impact, because more resources are used to satisfy VNRs.

Scalability. We now turn our attention to system scalability, with a focus on embedding latency, as this metric translates into the attainable service rate for virtual network requests. The measurements use code that is equivalent to the one used in our network hypervisor (for our approach, it is the same Java implementation). Figures 5.7 and 5.8 present the time to map nodes and links, respectively. As can be seen, D-ViNE scales very poorly in both phases, while Sirius, Sirius(w/oPC), and FG+SP behave better. With mappings taking in the order of tens of ms, these solutions enable embedding hundreds to thousands of virtual elements per second. The time to embed backup elements is of the same order of magnitude (we omit the graph for space reasons). Finally, as the substrate and the size of the virtual networks grow, the embedding latency increases accordingly. Figure 5.8b displays the worst case of our experiments: the time for link mapping in the multi_substrate. For such a large-scale network, embedding increases to around 60 seconds per virtual network. In summary, Sirius, Sirius(w/oPC), and FG+SP are the only embedding solutions that scale to reasonable numbers in the context of a realistic network virtualization system.

¹ ϕ1(i) = 1 if a backup is required, and 0 otherwise; ϕ2(i,j) = 1 if at least one node needs a backup, and 0 otherwise.

To conclude, we look into the benefits brought by the Path Contraction (PC) function UPath(). By “contracting” path lengths, Sirius requires fewer substrate links. This can be confirmed in Figure 5.9a, which shows the total number of substrate links utilized in the private data center topology (similar results were obtained for the other scenarios). Even when comparing with FG+SP, a scheme that resorts to shortest paths, Sirius ends up performing better. The reason is that our heuristic couples the two phases of embedding by bringing neighboring nodes closer to each other. As a consequence, the expectation was that it would increase embedding efficiency and improve application performance by reducing network latency. Indeed, the PC mechanism enhances the embedding acceptance ratio (see Figures 5.5b and 5.6a). Moreover, UPath() decreases the distance between virtual nodes, measured as the number of hops between their corresponding substrate nodes (“path length”). Figure 5.9b illustrates, for one representative simulation run, how path lengths are significantly decreased. On average, we observed a 26% reduction in this metric.

5.5.4 Prototype experiments

We now turn to evaluating Sirius by running real experiments with our prototype. We have set up a multi-cloud substrate, composed of two public clouds (Google and Amazon) and a private data center (our cluster in Lisbon). In Section 5.5.4.1, we test the performance of the prototype, including the time to set up the substrate, the time to provision virtual networks, and data plane performance (regarding latency and throughput). Then, in Section 5.5.4.2, we compare Sirius running our embedding algorithm against the alternative of integrating other state-of-the-art algorithms into Sirius.


(a) Substrate setup time. (b) Average path length.

Figure 5.10: Container configuration in-depth.

Figure 5.11: Virtual network provisioning.

5.5.4.1 Prototype performance

Substrate setup time. Figure 5.10a shows the time to set up a substrate with VMs distributed across the three clouds. Since we perform most operations in parallel, it is possible to observe only a small increase in time when the number of VMs more than doubles, from 10 to 25. The slowest operation is VM configuration, which includes the time for software installation (e.g., Docker) and getting a basic container image. The second most relevant delay is VM provisioning by the cloud provider. Overall, the added cost of our solution is small, as it involves only the setup of the required tunnels.

(a) Throughput. (b) Latency.

Figure 5.12: Prototype measurements: intra- and inter-cloud throughput and latencies.

Virtual network provisioning. Substrate provisioning represents a one-off cost in terms of setup time, and therefore it does not have an impact on user experience. By contrast, the time to provision virtual networks represents an operating run-time cost that directly impacts the user, and so it requires particular care. In Figure 5.11 we present the provisioning time for VNs of different sizes in number of virtual hosts (containers), on substrates with distinct numbers of substrate compute nodes (VMs). As the reader can observe, we present two versions in the figure: our first implementation (“baseline”) and the final, optimized version. In both cases, the rapid increase in the elapsed time from 1k to 4k containers, and from 10k to 25k containers, is due to the rise in the number of containers deployed per VM (from 100 to 400 in the former, and from 400 to 1000 in the latter). In all cases, our embedding procedure represents a relatively insignificant fraction of the overall provisioning time (less than 3%).

The superlinear increase in provisioning time for large VNs motivated us to investigate performance optimizations. In our baseline implementation, container configuration represented by far the largest fraction of this time (over 75%). We thus made a closer inspection of this provisioning stage, shown in Figure 5.10b. Container configuration consists of running a customized version of the ovs-docker script — the Open vSwitch utility that enables Docker networking. Most of the processing is spent on three tasks: creating virtual container interfaces with ip link, setting up the networking namespace with ip netns, and linking the interfaces to a specific OVS switch port with ovs-vsctl add-port. Figure 5.10b shows that the ovs-vsctl command, in particular, scales poorly, with script execution time exhibiting a growth rate that explains the superlinear increase in configuration time referred to above. We thus realized that the bottleneck of the baseline version was the use of sequential calls to the script within a single process, a problem that becomes more severe with larger networks. As an optimization of the configuration process, we execute several instances of the ovs-docker script in parallel. To avoid connection issues (related to the maximum number of open sockets), we bound the number of concurrent calls to 10 in our implementation.

This optimization of the configuration procedure resulted in a significant gain of between 2x and 2.5x in virtual network provisioning time. As shown in Figure 5.11, the optimized version of Sirius is able to provision a virtual network of 10k containers in less than 2 minutes, more than halving the result of the baseline. As the scale of the VN increases, so does the gain, which is close to 2.5x for the larger, 25-thousand-host virtual network.

Throughput and latency. Figure 5.12 presents the cost of virtualization with respect to latency and throughput, using as baseline a VM configuration that accesses the network directly. Inter-cloud RTTs rise by around 30% (i.e., between 5 and 10 ms), and intra-cloud RTTs increase by less than 400 μs. As inter-cloud applications typically assume latencies of this magnitude in their design, and the added intra-cloud cost is small, this overhead is arguably acceptable. Throughput decreases further, with larger costs when the baseline is high, as expected (Koponen et al., 2014). We are currently investigating networking-enhanced VM instances to reduce this overhead.

5.5.4.2 Comparison against alternative embedding algorithms

In this section, we compare the embedding algorithm employed in Sirius against two alternatives: the optimal solution presented in Chapter 4, and the commonly used greedy approach with MCF for path selection, proposed in Yu et al. (2008). For this purpose, we have implemented these two embedding algorithms and integrated them into Sirius.

The goal of this experiment is to measure the time it takes for the embedding algorithm to report a solution. In addition, we also want to measure its success rate, as in some cases the algorithm may fail to find a solution.

Figure 5.13: Virtual (top) and Substrate (bottom) topologies for experiments considering three embedding algorithms: Sirius, full-greedy (Yu et al., 2008), and the optimal solution (Alaluna et al., 2017). The virtual topologies are a big switch, a virtual cluster, and an inter-network, composed of transit and edge switches; the substrate comprises the fabric switches G, A, and L and software switches a–r.

We have considered the multi-cloud substrate composed of the three clouds explained before, with the following setup (presented at the bottom of Figure 5.13). We have set up 9 VM instances at Google as substrate compute elements (recall Figure 6.2), each with the following characteristics: 16 GB RAM, four vCPUs, all running Ubuntu Server 16.04 LTS. In Sirius, one of these instances is the gateway, which we set with the role of fabric switch (node G in the figure), and to which all other VMs are connected. We recall that in our model the fabric switch cannot have any virtual host attached, and so it can only function as a virtual transit switch. The other VMs (b–i) run a software switch, with several containers running on top.

Moreover, at Amazon we run 10 VM instances, with the same characteristics as above.


Again, one (node A) is set with the role of fabric switch, while the others (j–r) run a software switch and several containers. Our private data center in Lisbon includes one OpenFlow switch (fabric switch, node L) and one bare-metal physical server (node a) with 32 GB RAM and 8 CPUs, running Ubuntu Server 14.04 LTS. The server hosts a single VM as substrate compute element, running a software switch with several containers connected. The fabric switches are all interconnected. In all clouds, we model each software switch with 100 CPU units.

Our experiment consists in emulating nine sequential VNRs, made by different tenants. We consider three virtual topologies, and three requests for each topology, done in sequence. Namely, the first request is for a big switch topology; the second, for a virtual cluster; and the third, for an inter-network (top of Figure 5.13).

Figure 5.14: Embedding time for 9 sequential VNRs.

We repeat this sequence three times. We have chosen these topologies as they are ubiquitous and representative of different applications. The first two, the big switch and the virtual cluster, are taken from Ballani et al. (2011). The first is a single switch to which several hosts connect, in a star topology. The second topology consists of multiple virtual clusters connected to a central switch, over links that are typically oversubscribed. The third topology represents a typical three-tier topology commonly used in web workloads, where a layer of load balancers connects to a layer of application servers, which then connect to another layer of database servers. In all the topologies considered, each edge switch runs a set of containers. In each VNR, one virtual switch running several virtual hosts has a CPU requirement of 50 units.
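The three request topologies can be described compactly as inter-switch link lists (an illustrative sketch; node names and sizes are arbitrary, not those used in the experiments):

```python
def big_switch():
    # Star topology: a single virtual switch; hosts attach directly to it,
    # so there are no inter-switch links.
    return []

def virtual_cluster(n_clusters):
    # Multiple cluster (edge) switches connected to one central switch 'x',
    # over typically oversubscribed links.
    return [('x', 'c%d' % i) for i in range(n_clusters)]

def inter_network(n_per_tier):
    # Three-tier web topology: load balancers -> app servers -> databases,
    # chained through two transit switches 'y' and 'z'.
    links = []
    for i in range(n_per_tier):
        links += [('lb%d' % i, 'y'), ('y', 'app%d' % i),
                  ('app%d' % i, 'z'), ('z', 'db%d' % i)]
    return links
```

In the inter-network case, the two transit switches are the ones that our model can map to fabric switches, which is where Sirius gains over the full-greedy approach.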

Figure 5.14 presents the results of our experiment. The plot shows the embedding time of the three algorithms considered. The symbol X marks the cases in which the algorithm failed to return a solution. As expected, the optimal solution is very slow. This fact is apparent in all cases, with each embedding taking many seconds to return a result, but it is made particularly evident in the inter-network case: the three VNRs requesting this topology failed due to a timeout¹. Sirius is also faster than the full-greedy approach, due to the latter using MCF for mapping the virtual links. It is also possible to observe that full-greedy failed to find a solution to the last request, whereas Sirius mapped all requests. This result illustrates the advantage of our enhanced model, which distinguishes between fabric and software switches. Our solution makes better use of the substrate resources, as it can map virtual transit switches to fabric switches. As the full-greedy approach from Yu et al. (2008) does not make this distinction, it needs to map all switches (edge and transit) to software switches. As a result, it does not have enough resources to fulfill the last request.

5.6 Conclusions

In this chapter, we presented the improved design and implementation of Sirius, our multi-cloud network virtualization platform. Specifically, we have extended the network model of private clouds to include modern data center network designs, and we have developed a new heuristic that allows the embedding process to scale to very large networks.

Evaluations of our prototype and large-scale simulations reveal that, compared with the state-of-the-art alternatives, our solution scales well and increases the acceptance ratio and the provider profit for diverse topologies, while maintaining short path lengths to guarantee application performance.

So far, our solution has been shown to improve the security and dependability of network virtualization. Next, we turn to flexibility.

¹ We set the timeout to 1 hour.


6 Elastic Virtual Networks

In this chapter, we explore the capacity of flexibly changing the topology of a virtual network by proposing a VNE solution that adds elasticity to the tenants' virtual infrastructures. For this purpose, we introduce four primitives for tenants' virtual networks – including scale in and scale out – and propose new algorithms to materialize them. The main challenge is to enable these new services while maximizing resource efficiency and without impacting service quality. Instead of further improving existing online embedding algorithms – always limited by the inability to predict future demand – we follow a radically different approach. Specifically, we leverage network migration for our embedding procedures and to introduce a new reconfiguration primitive for the infrastructure provider. As migration introduces network churn, our solution uses this technique parsimoniously, to limit the impact on running services. We show our solution to achieve efficiencies that are on par with the state-of-the-art solution that fully reconfigures the substrate network, while reducing the migration footprint by at least one order of magnitude. For the sake of simplicity, in this chapter we abstract away the multi-cloud structure of the substrate considered in this thesis and the security and dependability services available (such as the ones proposed in the previous chapters). However, the approach we propose here is general, and as such it also naturally applies to our multi-cloud context.



6.1 Introduction

The emergence of cloud services has fundamentally changed the nature of computing. By outsourcing computing to the cloud, businesses are relieved from the burden of operating and maintaining an infrastructure, while being provided with the flexibility and elasticity required to respond to the dynamic demand for their services. Until recently, the pay-as-you-go model of the cloud has been restricted to computing services (renting VMs) and storage resources (renting storage space and access). This model falls short of providing the required guarantees for modern services, however. Certain applications, such as MapReduce or HPC, have strict networking constraints, for instance with respect to latency and throughput, that traditional cloud services fail to guarantee (He et al., 2010; Schad et al., 2010; Wang & Ng, 2010). The unpredictable network performance of cloud services negatively affects user workloads and increases tenants' costs (Ballani et al., 2011; Schad et al., 2010; Zaharia et al., 2008), hindering the deployment of several classes of applications in the cloud (Ballani et al., 2011).

Some efforts have been made to improve cloud services with network guarantees (Ballani et al., 2011; Guo et al., 2010), as we alluded to in Chapter 2. First, SecondNet (Guo et al., 2010) proposed the Virtual Data Center abstraction, providing tenants with bandwidth guarantees for pairs of VMs, and a data center network virtualization architecture (SecondNet) to materialize it. As this solution provides a dense connectivity that makes it difficult to multiplex multiple tenants on the underlying infrastructure, Oktopus (Ballani et al., 2011) enhanced cloud offerings with two new virtual network abstractions: the virtual cluster, geared for data-intensive traffic, and the virtual oversubscribed cluster, suited for applications featuring local communication patterns. Other solutions have improved over these works by considering more advanced bandwidth allocations (Jeyakumar et al., 2013; Popa et al., 2012) and additional guarantees, such as bounded packet delay (Jang et al., 2015).

As argued before, these solutions based on simple network abstractions are typically enough for the most basic hosting workloads, but they leave aside many typical workloads. Whereas traditional enterprise workloads using service discovery protocols may require only a flat L2 networking service, large analytics workloads typically demand L3 routing, and web services often require multiple tiers. The experience from production-level environments (e.g., Koponen et al. (2014)) confirms this scenario: as deployments mature, tenants migrate to more complicated workloads. This strengthens the case for offering tenants arbitrary virtual networks, with diverse network topologies.

Figure 6.1: ElasticVN primitives: four elastic primitives for virtual networks (scale up, scale down, scale out, scale in), exposed to the user, and a reconfiguration primitive for the substrate network, exposed to the provider.

The requirement for arbitrary virtual networks brings, however, a difficulty. The solutions that provide only a limited set of topologies (e.g., Ballani et al. (2011); Fuerst et al. (2016); Guo et al. (2010); Popa et al. (2012); Rost et al. (2015)) use specialized embedding approaches, and as such do not work for arbitrary topologies. As we aim to offer virtual networks with arbitrary topologies, we need to address the generalized Virtual Network Embedding (VNE) problem. Unfortunately, the approaches that solve the generic VNE problem (Chowdhury et al., 2012; Yu et al., 2008) are static, and do not allow the virtual network to evolve dynamically.

Ideally, virtual networks (VNs) should provide an abstraction that makes tenant infrastructures more elastic by allowing tenants to re-scale their networks depending on demand. Toward this goal, in this chapter we introduce four network primitives for user VNs which provide the necessary elasticity in a cloud environment: scale-out, scale-up, scale-in, and scale-down. These primitives are depicted in Figure 6.1. Each node represents a virtual switch that connects multiple VMs. The first two primitives (top of the figure) are useful to enable tenants to automatically respond to “black friday” events, for instance by expanding the network with new nodes (scale-out), following an arbitrary topology, or by increasing the capacity of specific nodes to serve more VMs (scale-up). The latter two represent, respectively, the converse operations (middle of Figure 6.1), useful to save costs when service demand is low. We detail these primitives in Section 6.4.1.
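As an illustration, the four primitives could surface to tenants as a thin API over the embedding engine (a hypothetical sketch; the class, method names, and signatures are ours, not ElasticVN's actual interface):

```python
class ElasticVNAPI:
    """Hypothetical tenant-facing interface for the four elastic primitives."""

    def __init__(self, embedder):
        self.embedder = embedder  # the underlying (Elastic)VNE engine

    def scale_out(self, vn_id, new_nodes, new_links):
        # Expand the VN with new virtual nodes/links (arbitrary topology).
        return self.embedder.embed_increment(vn_id, new_nodes, new_links)

    def scale_up(self, vn_id, node, extra_cpu):
        # Increase the capacity of an existing virtual node.
        return self.embedder.grow_node(vn_id, node, extra_cpu)

    def scale_in(self, vn_id, nodes):
        # Remove virtual nodes (and their links), releasing substrate resources.
        return self.embedder.release(vn_id, nodes)

    def scale_down(self, vn_id, node, freed_cpu):
        # Reduce the capacity of an existing virtual node.
        return self.embedder.grow_node(vn_id, node, -freed_cpu)
```

In this view, scale-down is simply the inverse of scale-up, and the embedder decides, per request, whether any selective migration is worthwhile.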

A strawman approach to the problem of extending the network would consist in mapping the new nodes with traditional VNE algorithms while retaining the existing mappings. The problem is that this results in resources being fragmented across the substrate network. As we show in Section 6.5, this leads to inefficiencies and negatively impacts application performance. An alternative would be to investigate new VNE heuristics for the problem. We note, however, that any practical solution to the online VNE problem is limited by the lack of knowledge about future requests. In practice, VN requests are not known in advance; they arrive dynamically and stay in the network for an arbitrary period of time.

As such, we propose an alternative approach: to consider network migration jointly with embedding. Modern migration techniques have shown that it is possible to migrate a network of virtual machines with little to no impact on application or system performance (Ghorbani & Godfrey, 2017; Ghorbani et al., 2014). This enables – for the first time – the remapping of running virtual networks, and thus gives hope of overcoming the limitation of traditional VNE approaches. We are aware, however, that migration, if not performed judiciously, can inadvertently overload the network and result in application performance degradation (Jo et al., 2017; Shrivastava et al., 2011). As such, we opt not to perform a complete VN remapping in response to a scale-out request. A complete remap potentially requires several network elements to be migrated, resulting in an undesirably high level of churn. By contrast, our solution, ElasticVN, migrates virtual network elements selectively. The key idea is to migrate VN elements only when this procedure is estimated to have a high positive impact, in terms of substrate network efficiency (by making better use of resources) and VN performance (by reducing path lengths).

Enabling tenants to scale their networks over time leads to resource fragmentation. This has been demonstrated recently (Michel et al., 2019), and we have also observed this phenomenon (Section 6.5) by noting an increase in path lengths in the virtual networks, as resources belonging to the same virtual network are mapped across distant regions of the substrate. As a result, communication between adjacent nodes (in the virtual topology) needs to traverse additional switches, increasing latency and bandwidth variability between virtual machines. This has a negative impact on job completion times and overall application performance (Wilson et al., 2011). In addition, physical resources increasingly become unusable, with a decline of the virtual network acceptance ratio leading to a decrease in revenue for the infrastructure provider, even though the physical network technically still has sufficient capacity to accommodate a request. Toward the goal of mitigating the effects of resource fragmentation, we propose reconfiguration as a new network management primitive (bottom of Figure 6.1). The idea is to explicitly remap virtual networks, in order to optimize network performance and resource usage. To implement this primitive, we again resort, parsimoniously, to network migration solutions (Ghorbani & Godfrey, 2017; Ghorbani et al., 2014).

Figure 6.2: Elastic VN system.

Figure 6.2 illustrates the entire system, centered on our contributions. The current state of network virtualization is shown in white. Our new extensions are shown in green, and future work, beyond the scope of this chapter, is shown in blue. The user API is extended to allow users not only to create and destroy virtual networks, but also to scale them. This procedure can either be on-demand, or be triggered automatically by a VN monitor that is pre-configured by the user (e.g., when customer requests are above a certain threshold, the VN can be scaled up in a pre-configured manner). The VN requests are then embedded using ElasticVNE, resulting in a mapping of the virtual to the substrate network. The provider can also interfere, triggering a reconfiguration (again, on-demand or by means of a substrate monitor that observes key metrics, such as average path lengths). After the mappings are defined, the system implements them in the substrate. When specific network elements need to be migrated, the system runs the network migration tool (Ghorbani & Godfrey, 2017; Ghorbani et al., 2014).

To evaluate our solution we have performed large-scale simulations considering arbitrary virtual and substrate topologies. Our results show that ElasticVNE improves acceptance ratios over common VNE algorithms by over 20% for new requests and reduces path lengths by over 3x. Importantly, these results are achieved with a very small migration footprint. Our solution requires 10x fewer migrations and reduces migration time by 100x over the state-of-the-art solution (Michel et al., 2019), which has no migration restrictions. Our results go further in concluding that an unrestricted use of migration is harmful to network efficiency. While offering similar results in terms of acceptance ratio and migration footprint, network reconfiguration further increases the gains in terms of path length reduction, ultimately improving user application performance.

To summarize, we make the following contributions in this chapter:

• We present four new primitives that allow tenants to scale their networks, and a new reconfiguration primitive to assist providers.

• We propose new algorithms to materialize these primitives. The key novelty is that we leverage network migration techniques recently made available.

• We perform large-scale simulations to show that our solutions achieve higher acceptance ratios and reduced path lengths over the state-of-the-art, while limiting the migration footprint, thus avoiding excessive network churn.

6.2 Motivating use cases

We motivate the need for elastic virtual networks with four use cases: (i) scale out to satisfy customers' demand; (ii) scale in to save costs; (iii) rapid deployment; and (iv) NFV service chaining.

Diverse topologies. Users increasingly want to tailor the virtual network topology to their applications. For instance, a streaming video application provider may require a tree to distribute streaming to a group of receivers. By contrast, certain applications contain centralized services for which a hub-and-spoke topology is more suitable. In a virtualized environment, as virtual networks grow, the complexity and size of the logical networks tend to grow steadily, alongside the number of hypervisors. The experience from production environments confirms these diverse use cases. As an example, in the typical VMware NSX/NVP deployment, virtual networks have hundreds of VMs attached to a number of logical switches interconnected by a few logical routers, complemented with ACLs (Koponen et al., 2014).

Scale-out and scale-up. A network that is built to provide a fixed level of service capacity will occasionally be overwhelmed by peak loads that occur on rare but important occasions, such as on a “black friday” or during the Christmas week. A key advantage of cloud solutions is their auto-scaling ability, namely to automatically scale up (increasing the capacity of a resource) and scale out (adding instances of a resource). This enables users to satisfy increasing demands without turning away customers.

Scale-in and scale-down. On average, however, services tend to operate at far below the maximum capacity required for peak loads. As such, the ability to scale down (reducing the capacity of a resource) and scale in (removing instances of a resource) is fundamental to save costs and achieve complete elasticity.

Deployment and testing. Distributed system designers typically want to run experiments under a variety of topologies to explore how their new protocol performs in different settings. In addition, testing how distributed applications respond to scale-up and scale-down events is usually required before actual deployment.

Service chaining. The elasticity of NFV has recently been exploited in 5G networks (Sun et al., 2018) and in data centers (Gember-Jacobson et al., 2014). For instance, 5G networks are centered around providing isolated network slices featuring high throughput and low latency to their customers. Each slice contains network services composed of virtual network functions. To adapt to traffic loads in a network slice, a single network function, or even an entire service chain, may need to be scaled out and scaled in dynamically by adding or removing virtual nodes (Sun et al., 2018).

6.3 Abstracting the Network

This section explains how we abstract the infrastructure, capturing the fundamental characteristics of the elements composing the substrate network and the user virtual network requests.

Substrate network. We model the substrate network as a weighted undirected graph G^S = (N^S, E^S, A^S_N, A^S_E), where n^S ∈ N^S represents the nodes and e^S ∈ E^S the edges (or links) connecting them. A node is a network element capable of forwarding packets. Nodes and edges are characterized by attributes A^S_N and A^S_E, respectively.

In our setting, a physical/virtual machine contains a software switch n^S to interconnect the local compute elements (e.g., containers) to the data center infrastructure. Since they share the existing resources, we employ attribute cpu(n^S) to aggregate the total CPU capacity available in the substrate machine for network tasks and for user application processing. An edge connecting two switches is characterized by its bandwidth capacity (attribute bw(e^S) > 0) and its average latency (lat(e^S) > 0). Including the latter is important as inter-cloud links typically have much higher latency when compared to internal data center connections (we have empirically confirmed this, as we explain in Section 6.5). In summary, the node and edge attributes are, respectively, A^S_N = {cpu(n^S) : n^S ∈ N^S} and A^S_E = {(bw(e^S), lat(e^S)) : e^S ∈ E^S}.

Virtual network. Similar to the substrate, VNs are modeled as weighted undirected graphs G^V = (N^V, E^V, A^V_N, A^V_E), where N^V is the set of virtual nodes, E^V is the set of virtual edges (or links), and A^V_N / A^V_E correspond to the virtual node and edge attributes, respectively. The cpu(n^V) attribute of a node n^V represents a requirement associated with the virtual switch and its locally connected virtual hosts (containers in our virtualization platform). The attribute cpu(n^V) is thus the sum of all necessary processing capacity (for network and application tasks). For the virtual edges, bw(e^V) and lat(e^V) represent the required bandwidth and maximum acceptable latency. In summary, the two sets of attributes are: A^V_N = {cpu(n^V) : n^V ∈ N^V} and A^V_E = {(bw(e^V), lat(e^V)) : e^V ∈ E^V}.

Virtual Network Request (VNR). VNRs are composed of a description of the virtual network graph plus two additional parameters: VNR = (G^V, id, Type). These include a unique identifier for the request, VNR.id, and the type of request, VNR.Type, namely: (i) an arrival, for a new VNR; (ii) a departure, when the virtual network ends operation and its resources can be reclaimed; (iii) an upgrade, when the user asks for an increase in the resources associated with a VNR already in operation, including scale-out and scale-up; (iv) a downgrade, when the user requests a decrease in the resources consumed by an existing VNR, including scale-in and scale-down.
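The abstractions above can be sketched as plain data structures. This is an illustrative sketch only: the class names (Graph, VNR) and their layout are our own choices, not part of Sirius.

```python
from dataclasses import dataclass, field

@dataclass
class Graph:
    """Weighted undirected graph G = (N, E, A_N, A_E) with its attributes."""
    cpu: dict = field(default_factory=dict)    # A_N: node -> cpu(n)
    links: dict = field(default_factory=dict)  # A_E: (u, v) -> (bw(e), lat(e))

    def edge(self, u, v):
        # undirected lookup: try both orientations of the key
        return self.links.get((u, v)) or self.links.get((v, u))

@dataclass
class VNR:
    """Virtual network request: (G^V, id, Type)."""
    graph: Graph
    id: int
    type: str  # 'arrival' | 'departure' | 'upgrade' | 'downgrade'

# A tiny substrate: two software switches joined by one intra-cloud link
# (bandwidth 1000, latency 1), plus a one-node virtual network request.
substrate = Graph(cpu={'s1': 100, 's2': 80}, links={('s1', 's2'): (1000, 1)})
request = VNR(graph=Graph(cpu={'v1': 20}), id=1, type='arrival')
```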


6.4 Elastic Virtual Networks

In this section we present new network primitives for tenants’ virtual networks and for the operator’s substrate infrastructure, and the algorithms we propose to materialize them.

6.4.1 Elastic VN primitives

Figure 6.1 (top) illustrates the four scaling primitives we propose in this work, to scale virtual networks both horizontally (out/in) and vertically (up/down):

• scale-out: to add new network elements (e.g., switches) to a running virtual network (VN);

• scale-in: to remove network elements from a running VN;

• scale-up: to increase the capacity of existing resources of a VN (e.g., to connect new VMs to a switch); and

• scale-down: to decrease the capacity of existing resources of a VN.

In addition, we propose a reconfiguration primitive (bottom of Figure 6.1) for operators to be able to increase the resource efficiency of the substrate:

• reconfigure: to remap existing virtual networks in order to improve resource usage and performance.

As will be made clear in the next section, to materialize these primitives the substrate we consider must be capable of migrating network elements using existing techniques (Ghorbani & Godfrey, 2017; Ghorbani et al., 2014).

6.4.2 Elastic VNE algorithms

We divide our solutions to materialize these primitives in two sets: user-driven and provider-driven. The first corresponds to the algorithms that respond to user requests (left side of Figure 6.2), including launching new VNs, and removing, upgrading (scale-up and scale-out), or downgrading (scale-down and scale-in) running VNs. The second corresponds to the algorithms required for substrate reconfiguration (top right of the same figure), triggered by actions from the provider side. Some of the algorithms we present next are common to both sets, as will be made clear.

User-driven embedding. Our approach derives an embedding for the virtual network requests (VNRs) that arrive. The goal is to find the most appropriate mapping for the virtual nodes and edges, taking into consideration the requirements stated by the user (which appear in the form of attributes). When the available resources are not enough to fulfill the requirements, namely in the case of new requests or upgrades, mapping fails and an error is signaled.

Algorithm 4, elasticEmbedding(), responds to user requests. It resorts to a few global variables to keep information about the execution (see Table 6.1); some are also used by other algorithms. The algorithm implements an infinite loop that receives and processes every VN request that reaches the system. Four types of requests are addressed: arrival, departure, upgrade, and downgrade.

G^S: the substrate network specification (see Section 6.3).

Nets: a set that keeps the embedding for the currently deployed requests. Each element is a pair (v, m), where v is a VNR and m a mapping of the virtual network to the substrate. Initialized: Nets ← ∅.

R_N: a set that keeps, for each substrate node, the residual CPU, i.e., the amount of CPU that has not been consumed. Initialized: R_N ← {rcpu(n^S) = cpu(n^S) : n^S ∈ N^S}.

R_E: a set that keeps, for each substrate edge, the residual bandwidth, i.e., the amount of bandwidth that has not been consumed. Initialized: R_E ← {rbw(e^S) = bw(e^S) : e^S ∈ E^S}.

Table 6.1: Global variables employed by the algorithms.

The procedure starts by waiting for the next virtual network request vnr (line 2). In case it is a new VNR, it attempts to find a suitable embedding (line 4) by calling getMap(). This procedure is presented later as Algorithm 6. If successful, the mapping is stored in the Nets structure (line 6), and the vnr is deployed on the substrate (line 7). The function getMap() also modifies the global variables that hold information about the residual resources (R_N and R_E), reflecting the consumption of CPU and bandwidth¹. Otherwise, an error is signaled (line 9).

Algorithm 4: elasticEmbedding()
Input: G^S, Nets, R_N, R_E

1   while (true) do
2     vnr ← waitForVNR();
3     if (vnr.Type == arrival) then
4       map ← getMap(vnr, G^S, R_N, R_E, ∅, ∅);
5       if (map ≠ ∅) then
6         addVNR(vnr, map, Nets);
7         deploy(vnr, map, G^S, ∅, ∅);
8       else
9         error(vnr.id, “Error mapping new VNR”);
10    else
11      (v, m) ← getVNR(vnr.id, Nets);
12      if (vnr.Type == departure) then
13        terminate(v, m, G^S, R_N, R_E);
14        updateNodeLinkResources(G^S, R_N, R_E, m);
15        delVNR(vnr.id, Nets);
16      else
17        if (vnr.Type == upgrade) then
18          map ← getMap(vnr, G^S, R_N, R_E, v, m);
19        else /* vnr.Type == downgrade */
20          map ← getDwMap(vnr, G^S, R_N, R_E, v, m);
21        if (map ≠ ∅) then
22          delVNR(vnr.id, Nets);
23          addVNR(vnr, map, Nets);
24          deploy(vnr, map, G^S, v, m);
25        else
26          error(vnr.id, “Could not upgrade VN”);

The other types of requests are related to changes to running VNs. Therefore, we start by obtaining information about the existing VNR and corresponding mapping (v, m) by consulting the Nets set (line 11). In case of departure, the virtual network is terminated (line 13), the corresponding node and edge resources (R_N and R_E) are freed (line 14), and the set Nets is updated (line 15). For upgrades and downgrades, variable map gets the replacement mapping by calling getMap() (line 18) and getDwMap() (line 20), Algorithms 6 and 9, respectively. If the request can be fulfilled, the previous VN is removed from Nets, and the new mapping is stored in this set (lines 22-23). Then, the necessary changes are applied to the substrate: to add or remove nodes and links, and/or to increase or decrease their capacity. In some cases, network migration may be performed, as will be made clear in Algorithm 7.

¹ For each node n^V in map, which is mapped onto the substrate node n^S, the corresponding residual rcpu(n^S) ∈ R_N becomes rcpu(n^S) ← rcpu(n^S) − cpu(n^V). In a similar manner, the residual bandwidth of the substrate edges is modified in R_E.
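The dispatch logic of Algorithm 4 can be sketched as follows (a simplified illustration: the residual-resource bookkeeping is elided, and the helper callables merely stand in for the procedures of the same names):

```python
from types import SimpleNamespace

def elastic_embedding_step(vnr, nets, get_map, get_dw_map, deploy, terminate):
    """One iteration of the request loop in Algorithm 4 (sketch only)."""
    if vnr.type == 'arrival':
        m = get_map(vnr, prev=None)
        if m is None:
            raise RuntimeError(f'Error mapping new VNR {vnr.id}')
        nets[vnr.id] = (vnr, m)                 # addVNR
        deploy(vnr, m)
    else:
        v, m_old = nets[vnr.id]                 # getVNR
        if vnr.type == 'departure':
            terminate(v, m_old)                 # free node/link resources
            del nets[vnr.id]                    # delVNR
        else:
            # upgrade reuses getMap(); downgrade uses getDwMap()
            remap = get_map if vnr.type == 'upgrade' else get_dw_map
            m = remap(vnr, prev=(v, m_old))
            if m is None:
                raise RuntimeError(f'Could not upgrade VN {vnr.id}')
            nets[vnr.id] = (vnr, m)
            deploy(vnr, m)

# Demo with stub helpers: an arrival followed by its departure.
nets = {}
noop = lambda *args: None
elastic_embedding_step(SimpleNamespace(id=1, type='arrival'), nets,
                       lambda v, prev: {'v1': 's1'}, None, noop, noop)
elastic_embedding_step(SimpleNamespace(id=1, type='departure'), nets,
                       None, None, noop, noop)
```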

Algorithm 5: reconfiguration()
Input: G^S, Nets, R_N, R_E

1   NetsO ← orderVNs(Nets);
2   i ← 0;
3   forall ((vnrO, mapO) ∈ NetsO) do
4     tmpR_N ← R_N; tmpR_E ← R_E;
5     R_N ← removeVNodes(R_N, NetsO_i);
6     R_E ← removeVEdges(R_E, NetsO_i);
7     mNew ← getMap(vnrO_i, G^S, R_N, R_E, 0, 0);
8     if (mNew ≠ ∅) then
9       Nets ← (Nets − NetsO_i) ∪ (vnrO_i, mNew);
10      deploy(vnrO_i, mapO_i, G^S, vnrO_i, mNew);
11    else
12      R_N ← tmpR_N; R_E ← tmpR_E;
13    i++;

Provider-driven embedding. As VNs arrive and depart, are upgraded or downgraded, substrate resources become fragmented. As a consequence, nodes of a VN tend to be located further apart, increasing the length of the substrate paths (in terms of number of hops) that map the virtual edges. This negatively impacts communication latencies, and reduces efficiency as more substrate links are used. To address these problems, we provide a reconfiguration primitive to substrate providers. This enables VN redeployment with the goal of improving mapping efficiency globally, possibly by migrating selected nodes. Algorithm 5 materializes this primitive.

We first initialize the NetsO set (line 1) with the running VNs ordered by decreasing network sizes (number of virtual nodes). The average path length, avgPL(v, m), is used to break ties (higher values first). We compute this metric as follows. For each pair (v, m) ∈ Nets, where v is a VNR and m is the current mapping, avgPL(v, m) equals:

  avgPL(v, m) = ‖{e^S : ∀ e^V → {e^S}, e^V ∈ E^V ∧ e^S ∈ E^S}‖ / ‖{e^V : e^V ∈ E^V}‖    (6.1)

where ‖·‖ counts the elements in a set, and e^V → {e^S} represents edge e^V having been mapped onto a path composed of a group of substrate edges {e^S}. The formula thus outputs the average substrate path length: the ratio between the number of substrate edges used to map all virtual edges of a specific virtual network and the number of those virtual edges. Ideally, avgPL() = 1, with each virtual edge mapped to a single substrate edge. In practice, multiple substrate edges may have to be employed, and avgPL() > 1.
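Equation 6.1 and the ordering step of Algorithm 5 can be computed directly, assuming each virtual network's mapping is represented as a dict from virtual edge to the list of substrate edges embedding it (our representation, for illustration):

```python
def avg_path_length(edge_map):
    """Equation 6.1: edge_map maps each virtual edge to the list of
    substrate edges that form the path embedding it."""
    return sum(len(path) for path in edge_map.values()) / len(edge_map)

def order_vns(nets):
    """Ordering used by reconfiguration (Algorithm 5, line 1): decreasing
    number of virtual nodes, ties broken by decreasing avgPL.
    nets: vnr_id -> (num_nodes, edge_map)."""
    return sorted(nets.items(),
                  key=lambda kv: (kv[1][0], avg_path_length(kv[1][1])),
                  reverse=True)

# One virtual edge mapped onto a single substrate edge, another crossing
# three substrate edges: avgPL = (1 + 3) / 2 = 2.0.
m = {('a', 'b'): [('s1', 's2')],
     ('b', 'c'): [('s2', 's3'), ('s3', 's4'), ('s4', 's5')]}
print(avg_path_length(m))  # 2.0
```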

The algorithm then enters a loop, first storing a copy of the original residual resource sets (line 4), necessary to enable rollback in case of an unsuccessful reconfiguration. The residual resources are then updated as if the VN selected for reconfiguration were evicted from the substrate (releasing the corresponding CPU and bandwidth) (lines 5-6). Next, we try to remap the VN (line 7). If successful, we update the Nets structure with the new mapping, and redeploy the VNR in the substrate which, again, may include migration of certain nodes (lines 8-10). Otherwise, we roll back (lines 11-12).

Embedding general procedures. Next, we present the mapping procedures used by the elastic and reconfiguration algorithms. Algorithm 6 is employed to perform VN embedding, and is called for new VNRs, for reconfiguration, and to scale out a VN. Since this problem is NP-hard, we resort to a heuristic where nodes are mapped first, followed by the mapping of the virtual edges. The goals are to maximize the overall acceptance ratio of users’ requests, fulfilling all requirements, and to maximize efficiency (e.g., by minimizing the number of substrate links used).

Algorithm 6: getMap()
Input: vnr, G^S, R_N, R_E, vnrP, mapP
Output: map /* node & link mappings */

1   map ← ∅;
2   map.N ← nodeMap(vnr, G^S, R_N, R_E, vnrP, mapP);
3   if (map.N == ∅) then
4     return ∅;
5   else
6     map.L ← linkMap(vnr, G^S, R_N, R_E, map.N, vnrP, mapP);
7     if (map.L == ∅) then
8       return ∅;
9     else
10      updateResources(R_N, map, mapP);
11      return map;

The procedure starts by initializing the map variable, which will store the node (map.N) and link (map.L) mappings (line 1). Then, node mapping is performed, for each virtual node to be assigned a specific substrate node (line 2). If successful, link embedding follows, to map one substrate path for each virtual edge (line 6). If both operations succeed, the residual resources are updated (line 10). Note that as parameters of the nodeMap() and linkMap() procedures (to be presented as Algorithms 7 and 8) we include information on the present mappings (vnrP and mapP), alongside the new request (vnr), to accommodate upgrade, downgrade, and reconfiguration.

Algorithm 7 maps virtual nodes to the substrate. The main idea is to rank nodes in the substrate favoring those (i) with more resources available (to balance load), (ii) located closer to substrate nodes that map VN neighbors (to reduce path lengths), and (iii) that minimize migration cost, in case of upgrades and reconfigurations. After initializing the structure nMap that maintains the mappings, an auxiliary variable Gaux keeps a copy of the original substrate graph (lines 1-2). Next, the virtual nodes of vnr are sorted in ascending order, using as metric the resources required, according to Equation 6.2 (line 3).

  scoreV(n^V) = cpu(n^V) × ∑_{e^V ⊣ n^V} bw(e^V)    (6.2)

where e^V ⊣ n^V means edge e^V is connected to n^V. The formula basically takes into consideration the CPU and bandwidth requested for the virtual node, returning a higher score for nodes that require more resources. Processing thus starts from the least demanding nodes.
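Equation 6.2 and the ascending processing order can be sketched as follows (a minimal illustration; the dict-based representation, with edge attributes as (bw, lat) pairs as in Section 6.3, is our own):

```python
def score_v(node, cpu, edges):
    """Equation 6.2: scoreV(n^V) = cpu(n^V) * sum of bw(e^V) over the
    virtual edges incident to n^V. edges: (u, v) -> (bw, lat)."""
    return cpu[node] * sum(bw for (u, v), (bw, _lat) in edges.items()
                           if node in (u, v))

cpu = {'a': 10, 'b': 20, 'c': 15}
edges = {('a', 'b'): (200, 5), ('b', 'c'): (300, 5)}
# Processing order is ascending: least demanding virtual nodes first.
order = sorted(cpu, key=lambda n: score_v(n, cpu, edges))
print(order)  # scores: a=2000, c=4500, b=10000 -> ['a', 'c', 'b']
```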

Algorithm 7: nodeMap()
Input: vnr, G^S, R_N, R_E, vnrP, mapP
Output: nMap /* node mappings */

1   nMap ← ∅;
2   Gaux ← G^S;
3   virN ← virScore(vnr);
4   forall (n^V ∈ virN) do
5     virtualNodeMapped ← false;
6     subN ← subScore(n^V, Gaux, vnr, vnrP, mapP);
7     forall (n^S ∈ subN) do
8       cpuV ← cpu(n^V);
9       if isMapped(n^V, n^S, mapP) then
10        cpuV ← cpu(n^V) − getCPU(n^V, vnrP);
11      if (cpuV ≤ R_N(n^S)) then
12        nMap ← nMap ∪ (n^V, n^S);
13        Gaux ← Gaux − n^S;
14        virtualNodeMapped ← true;
15        break;
16    if (virtualNodeMapped == false) then
17      return ∅;
18  return nMap;

The procedure then loops to embed each virtual node. Variable virtualNodeMapped indicates whether a successful mapping was found for this node, and is thus initialized with false (line 5). Function subScore() is then used to order the substrate nodes in Gaux (line 6), using Equations 6.3 and 6.4 to compute the ranking.

  baseScore(n^S) = [ (R_N(n^S)/cpu(n^S)) × ∑_{e^S ⊣ n^S} R_E(e^S)/bw(e^S) ] / avgDist2Neighbors(n^S, N̄^V)    (6.3)

  scoreS(n^S) = baseScore(n^S) / migrationCost(n^V, n^S)    (6.4)

The first equation attempts to increase the acceptance ratio while minimizing the consumption of substrate links. The value of baseScore is higher for nodes that have a larger share of resources available: R_N(n^S)/cpu(n^S) is the percentage of available CPU at node n^S; and R_E(e^S)/bw(e^S) is the proportion of available bandwidth of a link e^S ending at n^S (e^S ⊣ n^S has a meaning equivalent to the one above). In addition, it penalizes substrate nodes proportionally to their distance from the virtual nodes already mapped. For this purpose, function avgDist2Neighbors() computes the average hop distance between n^S and N̄^V, the set of substrate nodes where the already-mapped neighbors of n^V are placed.

The second equation determines the final score. For new VNRs, it is equal to the baseScore. For upgrades and reconfigurations, however, there is a penalty based on an estimate of the cost of migration. Function migrationCost() thus returns 1 if node n^V is already mapped onto n^S, or if it is a new virtual node. Otherwise, it has a value, larger than 1, proportional to the estimated time necessary to perform the migration. To compute this penalty time, we consider the network latency and bandwidth between the source and destination nodes, and the amount of information required to be moved (i.e., the footprint of virtual hosts and switch). The substrate nodes are placed in the subN set in descending order, ensuring that the nodes with the best scores are considered first (line 6).
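Equations 6.3 and 6.4 reduce to a few arithmetic steps; the sketch below assumes avg_dist is the precomputed value of avgDist2Neighbors() and that the migration penalty is given (both stand-ins for the procedures described above):

```python
def base_score(ns, cpu, r_cpu, bw, r_bw, incident, avg_dist):
    """Equation 6.3. incident[ns] lists the substrate edges ending at ns;
    avg_dist plays the role of avgDist2Neighbors(ns, mapped neighbours)."""
    cpu_share = r_cpu[ns] / cpu[ns]                       # R_N/cpu
    bw_share = sum(r_bw[e] / bw[e] for e in incident[ns])  # sum of R_E/bw
    return (cpu_share * bw_share) / avg_dist

def score_s(base, migration_cost):
    """Equation 6.4: divide by the migration penalty (1 when no migration)."""
    return base / migration_cost

# s1 has half its CPU free, one incident link with 80% bandwidth free,
# and sits 2 hops on average from the already-mapped neighbours:
cpu, r_cpu = {'s1': 100}, {'s1': 50}
bw, r_bw = {('s1', 's2'): 1000}, {('s1', 's2'): 800}
incident = {'s1': [('s1', 's2')]}
base = base_score('s1', cpu, r_cpu, bw, r_bw, incident, avg_dist=2)
print(base)              # (0.5 * 0.8) / 2 = 0.2
print(score_s(base, 2))  # a migration penalty of 2 halves the score: 0.1
```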

The loop that follows attempts to map virtual node n^V. First, it adjusts the required CPU (lines 8-10). This is needed to accommodate scale-up requests, as the virtual node may already be mapped in the substrate node under consideration, n^S. In this case, it is only necessary to provision the additional CPU requested. If enough residual CPU is available, it stores the new mapping (line 12), and updates the auxiliary variable (line 13). By removing n^S from Gaux the algorithm guarantees that a substrate node does not map more than one virtual node from the same VN. This decreases the impact of substrate failures on the VN. If embedding fails, the algorithm returns ∅ (lines 16-17).

Algorithm 8 finds a mapping between virtual edges and substrate paths. Each edge is processed individually, searching for a suitable path between the two substrate nodes that embed its virtual endpoints. The approach is flexible, enabling either single- or multiple-path embeddings by adjusting the input maxP.

Algorithm 8: linkMap()
Input: vnr, G^S, R_N, R_E, nMap, vnrP, mapP, maxP
Output: lMap /* link mappings */

1   lMap ← ∅;
2   forall (e^V ∈ vnr.G^V.E^V) do
3     candMap ← ∅;
4     totalBw ← 0;
5     paths ← getPaths(e^V, vnr, G^S, R_E, nMap, vnrP, mapP, maxP);
6     foreach (p ∈ paths) do
7       if (lat(e^V) ≥ getLatency(p, G^S)) then
8         bwp ← getMinBandwidth(p, R_E);
9         totalBw ← totalBw + bwp;
10        candMap ← candMap ∪ (e^V, bwp, p);
11    if (totalBw ≥ bw(e^V)) then
12      forall (mp ∈ candMap) do
13        bw(mp) ← ⌈(bw(mp)/totalBw) × bw(e^V)⌉;
14      lMap ← lMap ∪ candMap;
15    else
16      return ∅;
17  return lMap;

After initializing the link mappings set lMap (line 1), the algorithm enters a loop to embed each virtual edge. First, it initializes the set that stores the candidate mapping and a variable that holds the total bandwidth of all candidate paths. Then, it obtains this set of at most maxP paths to connect the two substrate nodes (line 5). In our implementation, we resort to the K-edge disjoint shortest path algorithm to find these paths, using as edge weights the inverse of the residual bandwidths. This ensures that when “distance” is minimized, the algorithm picks the paths that have more bandwidth available.
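The weighting trick can be illustrated with a plain Dijkstra run over edge weights 1/residual bandwidth (a sketch of the idea only: the actual implementation computes K edge-disjoint paths, and we assume the destination is reachable):

```python
import heapq

def min_inverse_bw_path(adj, src, dst):
    """Shortest path where weight(e) = 1 / residual_bw(e), so the 'shortest'
    route is the one with the most spare capacity.
    adj: node -> {neighbor: residual bandwidth}."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue  # stale heap entry
        for v, rbw in adj[u].items():
            nd = d + 1.0 / rbw
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    # rebuild the path from dst back to src
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return path[::-1]

# Two 2-hop routes between a and c: via b (residuals 100, 100) and via d
# (residuals 500, 500). The route with more spare bandwidth wins.
adj = {'a': {'b': 100, 'd': 500},
       'b': {'a': 100, 'c': 100},
       'c': {'b': 100, 'd': 500},
       'd': {'a': 500, 'c': 500}}
print(min_inverse_bw_path(adj, 'a', 'c'))  # ['a', 'd', 'c']
```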

Then, each candidate path p is evaluated to check whether it fulfills the latency requirement (line 7). For this purpose, the getLatency() function returns the overall path latency (the sum of the latencies of each individual substrate link that forms the path). Each of the paths that fulfills the requirement is stored in candMap (lines 8-10), jointly with its available bandwidth. This is calculated with the getMinBandwidth() function, which returns the bandwidth of the bottleneck link in p. The set will have at most maxP paths, the constant that defines the degree of multipathing (when set to 1, a single path is used).

Finally, we define how much traffic goes through each path, ensuring that together they provide the requested edge bandwidth (lines 11-13). If enough bandwidth is available in the candidate paths, we update the bandwidth in every path to an amount proportional to its maximum capacity, therefore distributing the load. Then, if successful, the set of paths is added to the link mappings set (line 14).
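The proportional split with rounding-up of line 13 can be sketched as follows (note that the ceiling may slightly over-allocate, as in the original formula):

```python
from math import ceil

def split_bandwidth(cand_bw, demand):
    """Distribute the demanded bandwidth over the candidate paths
    proportionally to their available bandwidth (Algorithm 8, line 13):
    each path carries ceil((bw_p / totalBw) * bw(e^V))."""
    total = sum(cand_bw)
    assert total >= demand, "not enough bandwidth in the candidate paths"
    return [ceil(bwp / total * demand) for bwp in cand_bw]

# Two candidate paths with 300 and 100 units available; the virtual edge
# needs 200, so the load splits 3:1 between them.
print(split_bandwidth([300, 100], 200))  # [150, 50]
```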

Algorithm 9: getDwMap()
Input: vnr, G^S, R_N, R_E, vnrP, mapP
Output: map /* node & link mappings */

1   forall (n^V ∈ vnrP.G^V.N^V) do
2     cpuP ← cpu(n^V);
3     cpu ← getCPU(n^V, vnr);
4     if (cpu ≠ 0) then
5       map.N ← map.N ∪ (n^V, getNS(n^V, mapP.N));
6   forall (e^V ∈ vnrP.G^V.E^V) do
7     bwP ← bw(e^V);
8     bw ← getBW(e^V, vnr);
9     if (bw ≠ 0) then
10      candMap ← getMapEdge(e^V, mapP.L);
11      foreach (mp ∈ candMap) do
12        bw(mp) ← bw(mp) × (bw/bwP);
13      map.L ← map.L ∪ candMap;
14  updateNodeLinkResources(G^S, R_N, R_E, map, mapP);
15  return map;

The last procedure is Algorithm 9, necessary to downgrade (scale in and/or down) a previously deployed VNR (recall Algorithm 4). The procedure first adds to map.N the nodes that were not removed from the previous mapping (lines 1-5), and follows a similar procedure for the edges (lines 6-13). It starts by obtaining, for each node, the used CPU resources in the existing embedding (cpuP) and in the new request (cpu). Function getCPU() outputs 0 if the virtual node n^V was removed from the vnr graph (scaled in). Nodes that remain in the graph are thus added to map.N. Function getNS() outputs the node n^S where n^V was – and will remain – mapped.

For the edges, a similar procedure takes place. The present (line 7) and new (line 8) virtual edge bandwidths are obtained (getBW() returns 0 if the edge has been eliminated). Then, the bandwidth associated with each of the paths used to map the virtual edge is updated in the required proportion, in a manner similar to Algorithm 8. The procedure terminates by updating the residual sets (R_N and R_E), releasing the unnecessary resources (line 14).
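The proportional update of line 12 can be sketched in the same fashion (a minimal illustration; the function name is ours):

```python
def downscale_edge(path_bw, old_bw, new_bw):
    """Shrink the bandwidth reserved on each path mapping a virtual edge
    in the proportion new/old (Algorithm 9, line 12):
    bw(mp) <- bw(mp) * (new_bw / old_bw)."""
    assert 0 < new_bw <= old_bw, "downgrade must not increase bandwidth"
    return [bwp * new_bw / old_bw for bwp in path_bw]

# A virtual edge mapped over two paths carrying 150 and 50 units is
# downgraded from 200 to 100 units: each path keeps half its share.
print(downscale_edge([150, 50], 200, 100))  # [75.0, 25.0]
```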


Fixed: scale the VN while fixing the embedded nodes to their substrate nodes, and re-mapping only the new nodes. The state-of-the-art approach for scaling (Michel et al., 2019).

Elastic: scale the VN considering our ElasticVNE heuristic.

Renew: scale the VN by re-mapping all nodes.

Fixed+RcfE: using the Fixed baseline, and applying periodic reconfiguration using our ElasticVNE heuristic.

Fixed+RcfR: using the Fixed baseline, and applying periodic reconfiguration by re-mapping all nodes. The state-of-the-art approach for reconfiguration (Michel et al., 2019).

Table 6.2: VNR configurations that were evaluated in the experiments.

6.5 Evaluation

In this section we aim to answer the following questions. Does our elastic solution improve the efficiency and acceptance ratio over common embedding approaches and state-of-the-art solutions? If so, at what cost with respect to migration footprint? And is the link embedding approach relevant for resource consumption?

Experimental setup. For our experiments we prepared a setup similar to most VNE work (Fischer et al., 2013a). We used the GT-ITM tool (Zegura et al., 1996) to generate the substrate and the virtual networks, employing the Waxman model to link nodes with a probability of 50% (Naldi, 2005). To simulate the dynamic arrival of VNRs to the system we have extended an existing VNE simulator (vin, 2012).

Substrate networks have a total of 100 nodes. As this solution is to be integrated into the Sirius platform, we have set up the simulation parameters with realistic values for this setting. Towards this goal, we have measured, for several consecutive days, the link bandwidth in Amazon EC2 and Google Cloud Platform (both intra- and inter-cloud). The results were consistently in the hundreds of Mbps. As such, we have set the bandwidth of substrate links – bw(e^S) – as a random variable uniformly distributed between 500 and 1000. The CPU resources – cpu(n^S) – are uniformly distributed between 50 and 100. We further considered the nodes to be distributed between three clouds. This has implications for link latencies: we have set intra-cloud links with 1 unit delay, and inter-cloud links with 20 units delay. Again, these values were set based on our empirical analysis.

VNRs have a number of virtual nodes uniformly distributed between 5 and 20. As the nodes in our setting are switches connecting several virtual hosts (in Sirius, containers), we have set the node footprint to be uniformly distributed between 25 and 50 for the forwarding table, and between 250 and 500 for the sum of VM/container storage. The first values are based on the forwarding table size of switches (Miao et al., 2017). Containers’ sizes are also in this range (hub, 2018), and we assume each virtual switch to support tens of containers. In addition, pairs of virtual nodes are connected with a Waxman topology with probability 50%. The CPU of the virtual nodes is uniformly distributed between 10 and 20, and the bandwidth of virtual links between 200 and 400. We assume that VNR arrivals are modeled as a Poisson process with an average rate of 4 VNRs per 100 time units. Each VNR has an exponentially distributed lifetime (Dur) with an average of 1000 time units.
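The arrival process can be reproduced with a few lines (a sketch of the workload generator we assume, not the simulator's actual code):

```python
import random

def vnr_workload(n, rate=4 / 100, mean_lifetime=1000, seed=1):
    """Generate n (arrival_time, lifetime) pairs: Poisson arrivals
    (exponential inter-arrival times, here 4 VNRs per 100 time units)
    and exponentially distributed lifetimes, as in the simulation setup."""
    rng = random.Random(seed)
    t, events = 0.0, []
    for _ in range(n):
        t += rng.expovariate(rate)  # inter-arrival time of a Poisson process
        events.append((t, rng.expovariate(1 / mean_lifetime)))
    return events

events = vnr_workload(2000)
# Over ~2000 arrivals the empirical rate should be close to 4 per 100 units.
print(len(events) / events[-1][0] * 100)
```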

Figure 6.3: VNR acceptance ratio. (a) New VNRs. (b) Scaling VNRs.

At each 10 VNR arrivals, the simulator chooses 75% of the embedded VNs to be scaled in or scaled out, with 50% probability each. Each VNR extends (scales out) or reduces (scales in) the number of nodes by 30%. For the reconfiguration experiments, we trigger a reconfiguration after every 100 events. We have used as baseline embedding algorithm the one proposed in Yu et al. (2008). For link embedding we evaluated both shortest path (SP) and k-edge disjoint SP, with k=2 (DSP-K2). We emphasize, however, that other embedding algorithms could be used – what we aim to evaluate is the use of network migration to assist VN scaling and reconfiguration, not the underlying embedding algorithm used.


Figure 6.4: Resource usage. (a) New VNRs. (b) Scaling VNRs.

We set up 5 experiments (shown in Table 6.2), each considering the same substrate and VNR topologies. We highlight the comparison with the state-of-the-art approach proposed in (Michel et al., 2019). In this work the authors proposed similar virtual network primitives, but materialized them with the simple algorithm we here call Fixed. The authors also proposed “network defragmentation” as a new network primitive, similar to our reconfiguration. The algorithm to materialize it is the one we term Fixed+RcfR. We ran every experiment 10 times, for 50k time units each, so on average around 2000 VNRs were simulated per run. The order of arrival and the capacity requirements of each VNR are kept the same for all configurations in Table 6.2, ensuring that they solve an equivalent problem.

User-driven results. Figure 6.3 presents the acceptance ratio for new VNRs and for scaling requests using both link embedding approaches. For new requests and considering the SP link embedding, Figure 6.3a shows that with our elastic algorithm the acceptance ratio improves around 20% over the fixed state-of-the-art (Michel et al., 2019), and is very close to the renew approach; the improvement is 17% with DSP-K2. Although the baseline acceptance ratio has grown considerably with the use of DSP-K2 embedding, approximately 33%, the difference between the baseline and our algorithms has remained almost constant. In the case of scaling requests, although the acceptance ratio for fixed is already high, elastic improves it by close to 4% (with SP) and 7% (with DSP-K2), as shown in Figure 6.3b. Interestingly, the results for renew are worse. Although this may seem counterintuitive, the reason is illuminating: an excessive use of migration is harmful. We found that link mapping failures increase after node mappings that demand migration, as it becomes harder to find available substrate paths.

Figure 6.5: Cost of migration. (a) Footprint of migrated nodes. (b) Migrated nodes and migration time.

Figure 6.4 illustrates the reason for the improvements in acceptance ratios: a better use of resources. The advantage of DSP-K2 over SP is also clear, as splitting the bandwidth requirement between two paths reduces bottlenecks.

Figure 6.6: Path lengths. (a) Path length: one execution. (b) Path lengths: user-driven. (c) Path lengths: provider-driven.

Importantly, the results obtained by our elastic algorithms are achieved with a relatively small migration footprint. This can be observed in Figure 6.5b. With elastic, only around 6% of nodes are migrated, a figure more than 10x smaller than with the use of renew, the solution without migration restrictions. As a result, the migration cost, both in terms of footprint (which translates into the number of bytes exchanged) and migration time, is reduced by two orders of magnitude (Figure 6.5a).


The heuristics that use network migration also drastically reduce path lengths. As shown in Figure 6.6, paths are shortened by over 3.2x (SP) and 1.5x (DSP-K2). Figure 6.6a shows a single representative run (using SP embedding), where it is clear that path lengths are kept consistently small. This reduction translates into improved virtual network performance (e.g., lower latencies) and better resource usage.

The main conclusion is that the improvements obtained by our elastic solution, in both acceptance ratio and path lengths, are achieved with a parsimonious use of migration.

Figure 6.7: Acceptance ratio: provider-driven. (a) New VNRs. (b) Scaling VNRs.

Provider-driven results. Considering reconfiguration alone, we found that the acceptance ratio improves only slightly over the fixed baseline (Figure 6.7), with either RcfE (reconfiguration considering migration cost) or RcfR (no restrictions on migration). On the other hand, Figure 6.6c shows that reconfiguration is effective in decreasing path lengths, especially in the SP case. Provider-driven reconfiguration is nevertheless less effective when compared to the user-driven elastic algorithms. This is mainly due to the reconfiguration frequency: while the former is triggered only periodically, the elastic algorithms effectively trigger a reconfiguration for every scaling request.

6.6 Conclusion

In this chapter, we proposed new primitives to scale virtual networks and to reconfigure the underlying substrate, bringing elasticity to virtual networking. The key novelty of the algorithms we proposed was the use of network migration techniques, enabling network reconfiguration to assist in improving resource efficiency. Our simulations have shown that our solution achieves high acceptance ratios and small path lengths, while drastically limiting the migration footprint.


7 Summary and Future Work

This chapter provides an overview of the main contributions of this thesis and an outlook on future work.

7.1 Summary of Contributions

Modern network virtualization solutions share some limitations: they target a single data center of a cloud provider; they offer only traditional networking and ACL-based security services; and they fail to provide the elasticity expected in cloud environments for the tenants’ virtual networks.

We addressed these limitations by developing embedding algorithms that enable the introduction of new security, dependability, and elasticity services to virtual networks, and by implementing a prototype: Sirius, the first multi-cloud network virtualization platform. Table 7.1 summarizes the contributions of this thesis.

As the first contribution, we designed and implemented Sirius, a network virtualization platform for multi-cloud environments which, contrary to existing solutions, considers not only connectivity and performance, but also security and dependability. Sirius allows users to define virtual networks (VNs) with arbitrary topologies, while making use of the full address space. In addition, the solution leverages a substrate infrastructure composed of both public clouds and private data centers.

Contribution          Description                                            Chapter

Sirius architecture   Multi-cloud network virtualization platform that          3
                      leverages a substrate infrastructure entailing both
                      public clouds and private data centers

SecVNE MILP           An optimal solution based on MILP for the secure          4
                      virtual network embedding (VNE) problem that treats
                      security and dependability as first-class citizens

SecVNE heuristic      A heuristic for the SecVNE problem that fits the          5
                      multi-cloud network scenario and scales to
                      large-scale deployments

Elastic VNE           Primitives and algorithms to provide elasticity to        6
                      tenants' environments by enabling virtual networks
                      to scale out and scale in

Table 7.1: Summary of contributions of the thesis.

The Sirius architecture is composed of two main modules: the multi-cloud orchestrator and the network hypervisor. The first is responsible for managing interactions with users through a web-based graphical interface; keeping information about the topologies of the substrate and virtual networks and their mappings; configuring and bootstrapping VMs in the clouds in cooperation with the network hypervisor; and setting up the tunnels required for the inter-cloud connections. The second module enables complete network virtualization, running the virtual network embedding algorithms and ensuring isolation between tenants.
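The two-module split described above can be pictured with a small sketch. Class and method names are illustrative assumptions for exposition only, not the actual Sirius code:

```python
from dataclasses import dataclass, field

@dataclass
class MultiCloudOrchestrator:
    """Manages user interactions, topology/mapping state, VM bootstrapping,
    and the inter-cloud tunnels."""
    substrate: dict = field(default_factory=dict)  # cloud -> VMs
    mappings: dict = field(default_factory=dict)   # virtual net -> placement

    def bootstrap_vm(self, cloud, vm_id):
        self.substrate.setdefault(cloud, []).append(vm_id)

    def setup_tunnel(self, cloud_a, cloud_b):
        return ("tunnel", cloud_a, cloud_b)  # an inter-cloud endpoint pair

@dataclass
class NetworkHypervisor:
    """Runs the embedding algorithms and enforces tenant isolation."""
    def embed(self, orchestrator, vnet_id, cloud, vm_id):
        orchestrator.mappings[vnet_id] = (cloud, vm_id)
        return orchestrator.mappings[vnet_id]

orch = MultiCloudOrchestrator()
orch.bootstrap_vm("aws", "vm-1")
placement = NetworkHypervisor().embed(orch, "vnet-A", "aws", "vm-1")
```

The point of the split is that the orchestrator owns state and cloud-facing actions, while the hypervisor only computes and records embeddings through it.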

As the second contribution, we formulated a Mixed Integer Linear Program (MILP) for the online secure network embedding problem that considers the security and dependability requirements of the tenants. We also proposed a policy language to specify the characteristics of the substrate and virtual networks, increasing the expressiveness of the users' requirements.
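At the core of such a formulation is the constraint that a virtual node may only be mapped onto substrate nodes satisfying its security and cloud-placement demands. The filter below is a minimal illustrative sketch of that constraint, not the MILP itself; the field names are assumptions:

```python
def feasible_hosts(virtual_node, substrate_nodes):
    """Substrate nodes offering at least the demanded security level and
    not belonging to a cloud the tenant wants to avoid."""
    return [n for n in substrate_nodes
            if n["sec_level"] >= virtual_node["sec_demand"]
            and n["cloud"] not in virtual_node.get("avoid_clouds", set())]

substrate = [
    {"id": "s1", "sec_level": 3, "cloud": "private"},
    {"id": "s2", "sec_level": 1, "cloud": "aws"},
]
v = {"id": "v1", "sec_demand": 2, "avoid_clouds": {"gcp"}}
hosts = feasible_hosts(v, substrate)  # only s1 satisfies the demand
```

In the MILP this appears as binary mapping variables that are forced to zero for every (virtual node, substrate node) pair failing such checks.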

As the optimal VNE solution does not scale to large networks, our third contribution was a new heuristic for the secure virtual network embedding problem that scales to very large networks while fitting a hybrid multi-cloud setting.

Finally, our fourth contribution consisted of new primitives and algorithms to bring elasticity to virtual networks. Our solution allows scaling out, scaling in, scaling up, and scaling down of virtual networks, as well as reconfiguration of the substrate. The solution leverages network migration techniques to achieve high efficiency, but does so parsimoniously to avoid excessive network churn.
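The four scaling primitives can be pictured as operations on a toy virtual-network record; the names and structure below are illustrative only, not the Sirius API:

```python
def scale_out(vnet, node):       # add a virtual node
    vnet["nodes"][node] = {"cpu": 1}

def scale_in(vnet, node):        # remove a virtual node
    vnet["nodes"].pop(node, None)

def scale_up(vnet, node, cpu):   # grow an existing node's resources
    vnet["nodes"][node]["cpu"] += cpu

def scale_down(vnet, node, cpu): # shrink an existing node's resources
    vnet["nodes"][node]["cpu"] -= cpu

vnet = {"nodes": {"a": {"cpu": 2}}}
scale_out(vnet, "b")
scale_up(vnet, "a", 2)
scale_down(vnet, "a", 1)
scale_in(vnet, "b")
```

Scale out/in change the topology, while scale up/down resize nodes in place; only the former may require re-embedding or migration in the substrate.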



We evaluated all solutions by means of realistic large-scale simulations and real testbed environments running our prototype, including a substrate composed of a private data center and two public clouds (Amazon's and Google's). The results demonstrate the feasibility of the proposals, achieving good trade-offs between security and performance, and therefore allowing the enrichment of cloud providers' services.

7.2 Future work

All algorithms and techniques developed in this thesis were integrated into the Sirius prototype, with the exception of the elastic VNE solution we proposed in Chapter 6. An immediate line of future work is thus its integration in Sirius, along with the required network migration mechanisms. In addition, we anticipate a few interesting avenues for future work, which we leave here in closing the thesis.

7.2.1 Multi-cloud Network migration

The network migration mechanisms proposed in Ghorbani et al. (2014) allow transparent migration. While assuring correctness, they focus on controlled data center environments. As such, they may not be the best fit for a multi-cloud substrate network with a large number of switches geographically distributed across global data centers. A line of research would include investigating new techniques and algorithms that take into consideration the high latency and low throughput of inter-cloud connections, to better fit the multi-cloud context.
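As a minimal illustration of this direction, a migration target could be chosen under an explicit inter-cloud latency budget instead of treating all substrate nodes as equal, as intra-data-center schemes can; the values and names below are hypothetical:

```python
def pick_target(candidates, latency_ms, max_latency=50):
    """Prefer the lowest-latency candidate whose link stays within budget."""
    viable = [c for c in candidates if latency_ms[c] <= max_latency]
    return min(viable, key=lambda c: latency_ms[c]) if viable else None

lat = {"aws-eu": 120, "gcp-eu": 35, "private-dc": 8}
target = pick_target(lat.keys(), lat)  # the private data center wins here
```

A real algorithm would also weigh throughput, migration cost, and the security constraints of the tenant, but the latency budget already rules out placements that an intra-data-center scheme would accept.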

7.2.2 Secure, dependable and scalable Sirius

In this thesis we argued for the security and dependability of virtual networks. However, the platform that enables them, Sirius, is currently a centralized, non-replicated component that is therefore a single point of failure and an attractive target for attack.

An interesting direction for future work would be to integrate into our network virtualization platform the techniques used in SDN to improve the infrastructure's scalability (e.g., Onix (Koponen et al., 2010), ONOS (Berde et al., 2014)), dependability (e.g., SMaRtLight (Botelho et al., 2014)), and security (e.g., ANCHOR (Kreutz et al., 2019)).


7.2.3 Programmable Virtual Networks

Following the recent advances in data plane programmability (Bosshart et al., 2013, 2014), a research avenue with high potential is to investigate virtual networks that are fully programmable. Instead of being fixed to traditional L3 or L2 network processing, users could customize the packet processing of all network elements with the use of a high-level language, such as P4 (P4, 2019). This would entail addressing several unsolved challenges: new compilers from P4 to the virtual switch targets, new network embedding algorithms for a programmable data plane infrastructure, and new techniques for network orchestration, control, and isolation.


Bibliography

(2012). ViNE-Yard. http://www.mosharaf.com/ViNE-Yard.tar.gz. 85, 111, 143

(2017). VIS.JS. http://visjs.org/, accessed: 2017-02-20. 55, 108

(2018). Docker Repository. https://hub.docker.com. 144

(2019). Barefoot Tofino. https://barefootnetworks.com/products/brief-tofino/, [Online; accessed 30-May-2019]. 89

AL-SHABIBI, A., DE LEENHEER, M., GEROLA, M., KOSHIBE, A., PARULKAR, G., SALVADORI, E. & SNOW, B. (2014). Openvirtex: Make your virtual sdns programmable. In Proceedings of the Third Workshop on Hot Topics in Software Defined Networking, HotSDN ’14, 25–30, ACM, New York, NY, USA. vi, xv, 2, 3, 19, 46

ALALUNA, M., RAMOS, F.M.V. & NEVES, N. (2016). (Literally) above the clouds: Virtualizing the network over multiple clouds. In 2016 IEEE NetSoft Conference (NetSoft), 112–115. 5, 41

ALALUNA, M., VIAL, E., NEVES, N. & RAMOS, F.M.V. (2017). Secure and dependable multi-cloud network virtualization. In Proceedings of the 1st International Workshop on Security and Dependability of Multi-Domain Infrastructures, XDOMO’17, 2:1–2:6, ACM, New York, NY, USA. xvi, 5, 41, 101, 121

ALALUNA, M., VIAL, E., NEVES, N. & RAMOS, F.M. (2019). Secure multi-cloud network virtualization. Computer Networks. 5

ANDERSEN, D.G. (2002). Theoretical approaches to node assignment. Unpublished manuscript. 28

BALIGA, J., AYRE, R.W.A., HINTON, K. & TUCKER, R.S. (2011). Green cloud computing: Balancing energy in processing, storage, and transport. Proceedings of the IEEE, 99, 149–167. 43


BALLANI, H., COSTA, P., KARAGIANNIS, T. & ROWSTRON, A. (2011). Towards predictable datacenter networks. In Proceedings of the ACM SIGCOMM 2011 Conference, SIGCOMM ’11, 242–253, ACM, New York, NY, USA. xv, 21, 22, 122, 126, 127

BAYS, L.R., OLIVEIRA, R.R., BURIOL, L.S., BARCELLOS, M.P. & GASPARY, L.P. (2012). Security-aware optimal resource allocation for virtual network embedding. In 2012 8th International Conference on Network and Service Management (CNSM) and 2012 Workshop on Systems Virtualization Management (SVM), 378–384. 32, 33, 101

BEN-YEHUDA, M., DAY, M.D., DUBITZKY, Z., FACTOR, M., HAR’EL, N., GORDON, A., LIGUORI, A., WASSERMAN, O. & YASSOUR, B.A. (2010). The Turtles project: Design and implementation of nested virtualization. In Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation, OSDI’10, 423–436, USENIX Association, Berkeley, CA, USA. 66

BERDE, P., GEROLA, M., HART, J., HIGUCHI, Y., KOBAYASHI, M., KOIDE, T., LANTZ, B., O’CONNOR, B., RADOSLAVOV, P., SNOW, W. & PARULKAR, G. (2014). Onos: Towards an open, distributed sdn os. In Proceedings of the Third Workshop on Hot Topics in Software Defined Networking, HotSDN ’14, 1–6, ACM, New York, NY, USA. 12, 151

BESSANI, A., CORREIA, M., QUARESMA, B., ANDRÉ, F. & SOUSA, P. (2013). Depsky: Dependable and secure storage in a cloud-of-clouds. Trans. Storage, 9, 12:1–12:33. xv, 34, 35, 36

BESSANI, A., MENDES, R., OLIVEIRA, T., NEVES, N., CORREIA, M., PASIN, M. & VERISSIMO, P. (2014). SCFS: A shared cloud-backed file system. In 2014 USENIX Annual Technical Conference (USENIX ATC 14), 169–180, USENIX Association, Philadelphia, PA. 34, 35, 36, 43

BLODGET, H. (2017). Amazon’s Cloud Crash Disaster Permanently Destroyed Many Customers’ Data. http://www.businessinsider.com/amazon-lost-data-2011-4. 2

BOSSHART, P., GIBB, G., KIM, H.S., VARGHESE, G., MCKEOWN, N., IZZARD, M., MUJICA, F. & HOROWITZ, M. (2013). Forwarding metamorphosis: Fast programmable match-action processing in hardware for sdn. In Proceedings of the ACM SIGCOMM 2013 Conference on SIGCOMM, SIGCOMM ’13, 99–110, ACM, New York, NY, USA. 152


BOSSHART, P., DALY, D., GIBB, G., IZZARD, M., MCKEOWN, N., REXFORD, J., SCHLESINGER, C., TALAYCO, D., VAHDAT, A., VARGHESE, G. & WALKER, D. (2014). P4: Programming protocol-independent packet processors. SIGCOMM Comput. Commun. Rev., 44, 87–95. 152

BOTELHO, F.A., BESSANI, A.N., RAMOS, F.M.V. & FERREIRA, P. (2014). Smartlight: A practical fault-tolerant SDN controller. CoRR, abs/1407.6062. 151

BOZAKOV, Z. & PAPADIMITRIOU, P. (2012). Autoslice: Automated and scalable slicing for software-defined networks. In Proceedings of the 2012 ACM Conference on CoNEXT Student Workshop, CoNEXT Student ’12, 3–4, ACM, New York, NY, USA. 18

CASADO, M., KOPONEN, T., RAMANATHAN, R. & SHENKER, S. (2010). Virtualizing the network forwarding plane. In Proceedings of the Workshop on Programmable Routers for Extensible Services of Tomorrow, PRESTO ’10, 8:1–8:6, ACM, New York, NY, USA. v, 1

CHENG, X., SU, S., ZHANG, Z., WANG, H., YANG, F., LUO, Y. & WANG, J. (2011). Virtual network embedding through topology-aware node ranking. SIGCOMM Comput. Commun. Rev., 41, 38–47. xv, 28, 29, 33

CHOWDHURY, M., SAMUEL, F. & BOUTABA, R. (2010). Polyvine: Policy-based virtual network embedding across multiple domains. In Proceedings of the Second ACM SIGCOMM, VISA ’10. 32, 33

CHOWDHURY, M., RAHMAN, M.R. & BOUTABA, R. (2012). Vineyard: Virtual network embedding algorithms with coordinated node and link mapping. IEEE/ACM Transactions on Networking, 20, 206–219. xv, 28, 29, 30, 33, 63, 64, 86, 87, 88, 101, 110, 115, 127

CHOWDHURY, N.M.M.K., RAHMAN, M.R. & BOUTABA, R. (2009). Virtual network embedding with coordinated node and link mapping. In IEEE INFOCOM 2009, 783–791. 110

CLARK, C., FRASER, K., HAND, S., HANSEN, J.G., JUL, E., LIMPACH, C., PRATT, I. & WARFIELD, A. (2005). Live migration of virtual machines. In Proceedings of the 2nd Conference on Symposium on Networked Systems Design & Implementation - Volume 2, NSDI’05, 273–286, USENIX Association, Berkeley, CA, USA. 38


COSTA, P.A.R.S., BAI, X., RAMOS, F.M.V. & CORREIA, M. (2016). Medusa: An efficient cloud fault-tolerant mapreduce. In 2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), 443–452. 34, 36

COSTA, P.A.R.S., RAMOS, F.M.V. & CORREIA, M. (2017). Chrysaor: Fine-grained, fault-tolerant cloud-of-clouds mapreduce. In Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGrid ’17, 421–430, IEEE Press, Piscataway, NJ, USA. xv, 34, 35, 36

CROCKFORD, D. (2015). The application/json Media Type for JavaScript Object Notation (JSON). RFC 4627. 14

DALTON, M., SCHULTZ, D., ADRIAENS, J., AREFIN, A., GUPTA, A., FAHS, B., RUBINSTEIN, D., ZERMENO, E.C., RUBOW, E., DOCAUER, J.A., ALPERT, J., AI, J., OLSON, J., DECABOOTER, K., DE KRUIJF, M., HUA, N., LEWIS, N., KASINADHUNI, N., CREPALDI, R., KRISHNAN, S., VENKATA, S., RICHTER, Y., NAIK, U. & VAHDAT, A. (2018). Andromeda: Performance, isolation, and velocity at scale in cloud network virtualization. In 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI 18), 373–387, USENIX Association, Renton, WA. vi, xv, 2, 3, 23, 25, 26, 41, 42, 62

DEMOOFSIRIUS (2019). Demonstration of Sirius. https://www.youtube.com/watch?v=vygTlX7oTEY, accessed: 2019-06-18. 6

DIETRICH, D., RIZK, A. & PAPADIMITRIOU, P. (2015). Multi-provider virtual network embedding with limited information disclosure. IEEE Transactions on Network and Service Management, 12, 188–201. 33

DRÄXLER, S., KARL, H. & MANN, Z.A. (2017). Joint optimization of scaling and placement of virtual network services. In 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), 365–370. 37, 38

DRUTSKOY, D., KELLER, E. & REXFORD, J. (2013). Scalable network virtualization in software-defined networks. IEEE Internet Computing, 17, 20–27. xv, 20

FIRESTONE, D. (2017). Vfp: A virtual switch platform for host sdn in the public cloud. In Proceedings of the 14th USENIX Conference on Networked Systems Design and Implementation, NSDI’17, 315–328, USENIX Association, Berkeley, CA, USA. xv, 24, 25


FIRESTONE, D., PUTNAM, A., MUNDKUR, S., CHIOU, D., DABAGH, A., ANDREWARTHA, M., ANGEPAT, H., BHANU, V., CAULFIELD, A., CHUNG, E., CHANDRAPPA, H.K., CHATURMOHTA, S., HUMPHREY, M., LAVIER, J., LAM, N., LIU, F., OVTCHAROV, K., PADHYE, J., POPURI, G., RAINDEL, S., SAPRE, T., SHAW, M., SILVA, G., SIVAKUMAR, M., SRIVASTAVA, N., VERMA, A., ZUHAIR, Q., BANSAL, D., BURGER, D., VAID, K., MALTZ, D.A. & GREENBERG, A. (2018). Azure accelerated networking: Smartnics in the public cloud. In 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI 18), 51–66, USENIX Association, Renton, WA. vi, 2, 3, 23, 25, 26, 41, 42, 62

FISCHER, A. & MEER, H. (2011). Position paper: Secure virtual network embedding. 34. 31

FISCHER, A., BOTERO, J.F., BECK, M.T., DE MEER, H. & HESSELBACH, X. (2013a). Virtual network embedding: A survey. IEEE Communications Surveys Tutorials, 15, 1888–1906. 28, 50, 94, 143

FISCHER, A., BOTERO, J.F., BECK, M.T., DE MEER, H. & HESSELBACH, X. (2013b). Virtual network embedding: A survey. IEEE Communications Surveys Tutorials, 15, 1888–1906. 63, 87

FLOODLIGHT-PROJECT (2019). Floodlight Controller. Accessed: 2019-06-20. 12, 51

FUERST, C., SCHMID, S., SURESH, L. & COSTA, P. (2016). Kraken: Online and elastic resource reservations for multi-tenant datacenters. In IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications, 1–9. 37, 38, 127

GEMBER-JACOBSON, A., VISWANATHAN, R., PRAKASH, C., GRANDL, R., KHALID, J., DAS, S. & AKELLA, A. (2014). Opennf: Enabling innovation in network function control. In Proceedings of the 2014 ACM Conference on SIGCOMM, SIGCOMM ’14, 163–174, ACM, New York, NY, USA. 131

GHORBANI, S. & GODFREY, P.B. (2017). Coconut: Seamless scale-out of network elements. In Proceedings of the Twelfth European Conference on Computer Systems, EuroSys ’17, 32–47, ACM, New York, NY, USA. 128, 129, 133

GHORBANI, S., SCHLESINGER, C., MONACO, M., KELLER, E., CAESAR, M., REXFORD, J. & WALKER, D. (2014). Transparent, live migration of a software-defined network. In Proceedings of the ACM Symposium on Cloud Computing, SOCC ’14, 3:1–3:14, ACM, New York, NY, USA. xv, 38, 39, 128, 129, 130, 133, 151

GLPK (2008). GNU Linear Programming Kit. http://www.gnu.org/software/glpk/. 87

GREENBERG, A., HAMILTON, J.R., JAIN, N., KANDULA, S., KIM, C., LAHIRI, P., MALTZ, D.A., PATEL, P. & SENGUPTA, S. (2009). Vl2: A scalable and flexible data center network. In Proceedings of the ACM SIGCOMM 2009 Conference on Data Communication, SIGCOMM ’09, 51–62, ACM, New York, NY, USA. xv, 16

GREER, M. (2010). Survivability and information assurance in the cloud. In Proceedings of the 2010 International Conference on Dependable Systems and Networks Workshops (DSN-W), DSNW ’10, 194–195, IEEE Computer Society, Washington, DC, USA. 21

GUDE, N., KOPONEN, T., PETTIT, J., PFAFF, B., CASADO, M., MCKEOWN, N. & SHENKER, S. (2008). Nox: Towards an operating system for networks. SIGCOMM Comput. Commun. Rev., 38, 105–110. 12

GUO, C., LU, G., WANG, H.J., YANG, S., KONG, C., SUN, P., WU, W. & ZHANG, Y. (2010). Secondnet: A data center network virtualization architecture with bandwidth guarantees. In Proceedings of the 6th International Conference, Co-NEXT ’10, 15:1–15:12, ACM, New York, NY, USA. 21, 126, 127

HE, Q., ZHOU, S., KOBLER, B., DUFFY, D. & MCGLYNN, T. (2010). Case study for running hpc applications in public clouds. In Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, HPDC ’10, 395–401, ACM, New York, NY, USA. 126

HIGGINS, J., HOLMES, V. & VENTERS, C. (2015). Orchestrating Docker Containers in the HPC Environment. 506–513. 53

HOUIDI, I., LOUATI, W., BEN AMEUR, W. & ZEGHLACHE, D. (2011). Virtual network provisioning across multiple substrate networks. Comput. Netw., 55, 1011–1023. 33

JANG, K., SHERRY, J., BALLANI, H. & MONCASTER, T. (2015). Silo: Predictable message latency in the cloud. In Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication, SIGCOMM ’15, 435–448, ACM, New York, NY, USA. 126


JEYAKUMAR, V., ALIZADEH, M., MAZIÈRES, D., PRABHAKAR, B., KIM, C. & GREENBERG, A. (2013). Eyeq: Practical network performance isolation at the edge. In Proceedings of the 10th USENIX Conference on Networked Systems Design and Implementation, NSDI’13, 297–312, USENIX Association, Berkeley, CA, USA. 126

JIN, X., REXFORD, J. & WALKER, D. (2014). Incremental update for a compositional sdn hypervisor. In Proceedings of the Third Workshop on Hot Topics in Software Defined Networking, HotSDN ’14, 187–192, ACM, New York, NY, USA. 21

JO, C., CHO, Y. & EGGER, B. (2017). A machine learning approach to live migration modeling. In Proceedings of the 2017 Symposium on Cloud Computing, SoCC ’17, 351–364, ACM, New York, NY, USA. 128

KHAN, M.A. (2016). A survey of security issues for cloud computing. Journal of Network and Computer Applications, 71, 11–29. vi

KIM, C., CAESAR, M. & REXFORD, J. (2008). Floodless in seattle: A scalable ethernet architecture for large enterprises. In Proceedings of the ACM SIGCOMM 2008 Conference on Data Communication, SIGCOMM ’08, 3–14, ACM, New York, NY, USA. 15

KOPONEN, T., CASADO, M., GUDE, N., STRIBLING, J., POUTIEVSKI, L., ZHU, M., RAMANATHAN, R., IWATA, Y., INOUE, H., HAMA, T. & SHENKER, S. (2010). Onix: A distributed control platform for large-scale production networks. In Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation, OSDI’10, 351–364, USENIX Association, Berkeley, CA, USA. 12, 23, 151

KOPONEN, T., AMIDON, K., BALLAND, P., CASADO, M., CHANDA, A., FULTON, B., GANICHEV, I., GROSS, J., INGRAM, P., JACKSON, E., LAMBETH, A., LENGLET, R., LI, S.H., PADMANABHAN, A., PETTIT, J., PFAFF, B., RAMANATHAN, R., SHENKER, S., SHIEH, A., STRIBLING, J., THAKKAR, P., WENDLANDT, D., YIP, A. & ZHANG, R. (2014). Network virtualization in multi-tenant datacenters. In 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI 14), 203–216, USENIX Association, Seattle, WA. v, vi, 2, 3, 23, 41, 42, 44, 46, 62, 120, 126, 131

KREUTZ, D., RAMOS, F.M.V., VERISSIMO, P.E., ROTHENBERG, C.E., AZODOLMOLKY, S. & UHLIG, S. (2015). Software-defined networking: A comprehensive survey. Proceedings of the IEEE, 103, 14–76. v, xv, 1, 10, 11, 42


KREUTZ, D., YU, J., RAMOS, F. & ESTEVES, P. (2019). Anchor: Logically centralized security for software-defined networks. ACM Transactions on Information and System Security, 22, 8:1–. 151

LACOSTE, M., MIETTINEN, M., NEVES, N., RAMOS, F., VUKOLIC, M., CHARMET, F., YAICH, R., OBORZYNSKI, K., VERNEKAR, G. & SOUSA, P. (2016). User-Centric Security and Dependability in the Clouds-of-Clouds. IEEE Cloud Computing, 3. 45

LEISERSON, C.E. (1985). Fat-trees: Universal networks for hardware-efficient supercomputing. IEEE Transactions on Computers, C-34, 892–901. 16

LI, A., YANG, X., KANDULA, S. & ZHANG, M. (2010). Cloudcmp: Comparing public cloud providers. In Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement, IMC ’10, 1–14, ACM, New York, NY, USA. 57, 58

LIU, S., CAI, Z., XU, H. & XU, M. (2014). Security-aware virtual network embedding. In 2014 IEEE International Conference on Communications (ICC), 834–840. 32, 33, 101

LOS, R., SHACKLEFORD, D. & SULLIVAN, B. (2013). The notorious nine cloud computing top threats in 2013. In Cloud Security Alliance. 2

MCKEOWN, N., ANDERSON, T., BALAKRISHNAN, H., PARULKAR, G., PETERSON, L., REXFORD, J., SHENKER, S. & TURNER, J. (2008). Openflow: Enabling innovation in campus networks. SIGCOMM Comput. Commun. Rev., 38, 69–74. 13

MIAO, R., ZENG, H., KIM, C., LEE, J. & YU, M. (2017). Silkroad: Making stateful layer-4 load balancing fast and cheap using switching asics. In Proceedings of the Conference of the ACM Special Interest Group on Data Communication, SIGCOMM ’17, 15–28, ACM, New York, NY, USA. 144

MICHEL, O., KELLER, E. & RAMOS, F. (2019). Network defragmentation in virtualized data centers. Sixth IEEE International Conference on Software Defined Systems (SDS). 37, 38, 128, 130, 143, 145

MYSORE, R.N., PAMBORIS, A., FARRINGTON, N., HUANG, N., MIRI, P., RADHAKRISHNAN, S., SUBRAMANYA, V. & VAHDAT, A. (2009). Portland: A scalable fault-tolerant layer 2 data center network fabric. In Proceedings of the ACM SIGCOMM 2009 Conference on Data Communication, SIGCOMM ’09, 39–50, ACM, New York, NY, USA. 16


NALDI, M. (2005). Connectivity of Waxman Topology Models. Computer Communications, 29, 24–31. 85, 111, 143

OLIVEIRA, T., MENDES, R. & BESSANI, A. (2017). Exploring Key-Value Stores in Multi-Writer Byzantine-Resilient Register Emulations. In P. Fatourou, E. Jiménez & F. Pedone, eds., 20th International Conference on Principles of Distributed Systems (OPODIS 2016), vol. 70 of Leibniz International Proceedings in Informatics (LIPIcs), 30:1–30:17, Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany. 45

ONF (2016). OpenFlow Definition. Accessed: 11 October 2016. 11, 13

ONF (2018a). OpenFlow Switch Specification. 13, 55

ONF (2018b). Software-Defined Networking (SDN) Definition. Accessed: 06 February 2018. 10

P4 (2019). P4. https://p4.org/, accessed: 2019-06-23. 152

PERLMAN, R., EASTLAKE-3RD, D.E., DUTT, D.G., GAI, S. & GHANWANI, A. (2011). TRILL - Routing Bridges (RBridges): Base Protocol Specification. RFC 6325. 17

PFAFF, B. & DAVIE, B. (2013). The Open vSwitch Database Management Protocol. RFC 7047. 14

PFAFF, B., PETTIT, J., KOPONEN, T., JACKSON, E., ZHOU, A., RAJAHALME, J., GROSS, J., WANG, A., STRINGER, J., SHELAR, P., AMIDON, K. & CASADO, M. (2015). The design and implementation of open vswitch. In 12th USENIX Symposium on Networked Systems Design and Implementation (NSDI 15), 117–130, USENIX Association, Oakland, CA. 13, 14, 45, 108

POPA, L., KUMAR, G., CHOWDHURY, M., KRISHNAMURTHY, A., RATNASAMY, S. & STOICA, I. (2012). Faircloud: Sharing the network in cloud computing. In Proceedings of the ACM SIGCOMM 2012 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication, SIGCOMM ’12, 187–198, ACM, New York, NY, USA. 126, 127

RAHMAN, M.R., AIB, I. & BOUTABA, R. (2010). Survivable virtual network embedding. In Proceedings of the 9th IFIP TC 6 International Conference on Networking, NETWORKING’10, 40–52, Springer-Verlag, Berlin, Heidelberg. 30, 33, 101


RIGHTSCALE (2017). 2017 state of the cloud report. 3

ROST, M., FUERST, C. & SCHMID, S. (2015). Beyond the stars: Revisiting virtual cluster embeddings. SIGCOMM Comput. Commun. Rev., 45, 12–18. 22, 127

SCHAD, J., DITTRICH, J. & QUIANÉ-RUIZ, J.A. (2010). Runtime measurements in the cloud: Observing, analyzing, and reducing variance. Proc. VLDB Endow., 3, 460–471. 126

SHAHRIAR, N., AHMED, R., CHOWDHURY, S.R., KHAN, M.M.A., BOUTABA, R., MITRA, J. & ZENG, F. (2016). Connectivity-aware virtual network embedding. In 2016 IFIP Networking Conference (IFIP Networking) and Workshops, 46–54. xv, 30, 31, 33, 101

SHARWOOD, S. (2016). Salesforce.com crash caused DATA LOSS. https://www.theregister.co.uk/2016/05/13/salesforcecom-crash-caused-data-loss/. 2

SHEN, Z., JIA, Q., SELA, G.E., SONG, W., WEATHERSPOON, H. & VAN RENESSE, R. (2017). Supercloud: A library cloud for exploiting cloud diversity. ACM Trans. Comput. Syst., 35. 36

SHERWOOD, R., GIBB, G., YAP, K.K., APPENZELLER, G., CASADO, M., MCKEOWN, N. & PARULKAR, G. (2009). FlowVisor: A Network Virtualization Layer. Tech. rep., Deutsche Telekom Inc. R&D Lab, Stanford, Nicira Networks. xv, 17, 18

SHRIVASTAVA, V., ZERFOS, P., LEE, K., JAMJOOM, H., LIU, Y. & BANERJEE, S. (2011). Application-aware virtual machine migration in data centers. In 2011 Proceedings IEEE INFOCOM, 66–70. 128

SINGH, A., ONG, J., AGARWAL, A., ANDERSON, G., ARMISTEAD, A., BANNON, R., BOVING, S., DESAI, G., FELDERMAN, B., GERMANO, P., KANAGALA, A., PROVOST, J., SIMMONS, J., TANDA, E., WANDERER, J., HÖLZLE, U., STUART, S. & VAHDAT, A. (2015). Jupiter rising: A decade of clos topologies and centralized control in google’s datacenter network. In Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication, SIGCOMM ’15, 183–197, ACM, New York, NY, USA. 94, 111

STEUER, R.E. (1986). Multiple Criteria Optimization: Theory, Computation and Application. John Wiley, New York, 546 pp. 75, 87


SUN, C., BI, J., MENG, Z., YANG, T., ZHANG, X. & HU, H. (2018). Enabling nfv elasticity control with optimized flow migration. IEEE Journal on Selected Areas in Communications, 36, 2288–2303. 131

SUPERCLOUD (2019). SUPERCLOUD. https://supercloud-project.eu/, accessed: 2019-06-18. 6

SURYATEJA, P. (2018). Threats and vulnerabilities of cloud computing: A review. International Journal of Computer Sciences and Engineering, 6. vi

USA TODAY (2017). Massive amazon cloud service outage disrupts sites. 42

VAHDAT, A. (2017). Networking challenges for the next decade. Google Networking Research Summit Keynote. 66

VMWARE (2018). Nsx data center. 23

VMWAREMULTICLOUDSEC (2019). With Secure State VMware Dives Deeper Into Multi-Cloud Security. https://www.sdxcentral.com/articles/news/with-secure-state-vmware-dives-deeper-into-multi-cloud-security/2019/06/, accessed: 2019-06-18. 3

WANG, F., LING, R., ZHU, J. & LI, D. (2015). Bandwidth guaranteed virtual network function placement and scaling in datacenter networks. In 2015 IEEE 34th International Performance Computing and Communications Conference (IPCCC), 1–8. 37, 38

WANG, G. & NG, T.S.E. (2010). The impact of virtualization on network performance of amazon ec2 data center. In Proceedings of the 29th Conference on Information Communications, INFOCOM’10, 1163–1171, IEEE Press, Piscataway, NJ, USA. 126

WANG, Y., KELLER, E., BISKEBORN, B., VAN DER MERWE, J. & REXFORD, J. (2008). Virtual routers on the move: Live router migration as a network-management primitive. In Proceedings of the ACM SIGCOMM 2008 Conference on Data Communication, SIGCOMM ’08, 231–242, ACM, New York, NY, USA. 38

WILLIAMS, D., JAMJOOM, H. & WEATHERSPOON, H. (2012). The xen-blanket: Virtualize once, run everywhere. In Proceedings of the 7th ACM European Conference on Computer Systems, EuroSys ’12, 113–126, ACM, New York, NY, USA. xv, 34, 36, 43


WILSON, C., BALLANI, H., KARAGIANNIS, T. & ROWTRON, A. (2011). Better never than late: Meeting deadlines in datacenter networks. In Proceedings of the ACM SIGCOMM 2011 Conference, SIGCOMM ’11, 50–61, ACM, New York, NY, USA. 129

XIE, D., DING, N., HU, Y.C. & KOMPELLA, R. (2012). The only constant is change: Incorporating time-varying network reservations in data centers. In Proceedings of the ACM SIGCOMM 2012 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication, SIGCOMM ’12, 199–210, ACM, New York, NY, USA. 22

YAMANAKA, H., KAWAI, E., ISHII, S. & SHIMOJO, S. (2014). AutoVFlow: Autonomous virtualization for wide-area OpenFlow networks. In Third European Workshop on Software Defined Networks, EWSDN ’14. 18

YEOW, W.L., WESTPHAL, C. & KOZAT, U. (2010). Designing and embedding reliable virtual infrastructures. In Proceedings of the Second ACM SIGCOMM Workshop on Virtualized Infrastructure Systems and Architectures, VISA ’10, 33–40, ACM, New York, NY, USA. 30, 33

YU, H., ANAND, V., QIAO, C. & SUN, G. (2011). Cost efficient design of survivable virtual infrastructure to recover from facility node failures. In 2011 IEEE International Conference on Communications (ICC), 1–6. v, 1, 30, 33, 101

YU, L. & CAI, Z. (2016). Dynamic scaling of virtual clusters with bandwidth guarantee in cloud datacenters. In IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications, 1–9. 37, 38

YU, M., YI, Y., REXFORD, J. & CHIANG, M. (2008). Rethinking virtual network embedding: Substrate support for path splitting and migration. SIGCOMM Comput. Commun. Rev., 38, 17–29. xvi, 28, 33, 88, 94, 101, 110, 115, 120, 121, 123, 127, 144

ZAHARIA, M., KONWINSKI, A., JOSEPH, A.D., KATZ, R. & STOICA, I. (2008). Improving mapreduce performance in heterogeneous environments. In Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation, OSDI’08, 29–42, USENIX Association, Berkeley, CA, USA. 126

ZEGURA, E.W. et al. (1996). How to Model an Internetwork. In IEEE INFOCOM, 594–602. 85, 111, 143


ZHANG-SHEN, R. (2010). Valiant Load-Balancing: Building Networks That Can Support All Traffic Matrices, 19–30. 16

ZHENG, L., JOE-WONG, C., TAN, C.W., CHIANG, M. & WANG, X. (2015). How to bid the cloud. 71–84. vi, 2, 43
