Data manipulation techniques
Data manipulation techniques for configuration management using Ansible
Joel W. King
Distinguished Solutions Architect, World Wide Technology
@programmablenetworks
Title: Data manipulation techniques for configuration management using Ansible
What is the most important thing that people will learn from your presentation? This talk explores techniques and best practices for ingesting, manipulating, and storing configuration management data when managing multi-cloud infrastructure deployments with Ansible.
Abstract: Ansible is widely adopted as a configuration management tool for both on-premises infrastructure and multi-cloud deployments. Most learning tracks focus on procedural programming concepts and playbook syntax. This talk focuses instead on techniques to ingest, manipulate, and optimize the configuration management data that drives the process. We examine techniques for creating data sinks for audit and as input to downstream workflows. Also highlighted is the use of relational, NoSQL, and graph databases, as well as sequential files, for configuration management.
Joel W. King, Distinguished Solutions Architect, WWT
I develop and evangelize network programmability solutions.
DevNet Creator 2019
DevNet 500
CCIE 1846 (ret.)
$10,000 Phantom Contest - F5 and Cisco Meraki
Code Exchange
Find my code at: https://developer.cisco.com/codeexchange/explore/#search=joelwking
DevNet Tools & Resources
developer.cisco.com/site/aci/
[Figure: EXPECTATION versus REALITY. Labels: Optimizing ML algorithm, Collecting Data, Infrastructure Build, Integration]
How Google does Machine Learning
https://www.coursera.org/learn/google-machine-learning/
… It's all about data
Topics
Variables: Fun with variables, Databases (MongoDB, S3 / MinIO)
Ansible Facts: Device Configurations, Analytic Engines
"All we know are the facts, ma'am."
HOSTS
NETWORK DEVICES
{ API }
Variables can be defined in a bewildering variety of places in an Ansible project. (*)
Ansible facts are variables that are automatically discovered by Ansible from a managed host.
(*) Automation with Ansible - Student Workbook
Variable Precedence
Listed from least to most precedence (the last listed variables win):
1. command line values (e.g. "-u user")
2. role defaults
3. inventory file or script group vars
4. inventory group_vars/all
5. playbook group_vars/all
6. inventory group_vars/*
7. playbook group_vars/*
8. inventory file or script host vars
9. inventory host_vars/*
10. playbook host_vars/*
11. host facts / cached set_facts
12. play vars
13. play vars_prompt
14. play vars_files
15. role vars (defined in role/vars/main.yml)
16. block vars (only for tasks in block)
17. task vars (only for the task)
18. include_vars
19. set_facts / registered vars
20. role (and include_role) params
21. include params
22. extra vars (always win precedence)
https://docs.ansible.com/ansible/latest/user_guide/playbooks_variables.html#variable-precedence-where-should-i-put-a-variable
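The effect of variable precedence can be approximated as a series of dictionary merges, where each higher-precedence source overwrites keys set by lower ones. The following is a minimal sketch under that assumption, not Ansible's actual implementation. Only a subset of the sources is modeled, and the variable names and values (ntp_server, bucket) are made up for illustration.

```python
# Simplified model of variable precedence: sources are ordered from least
# to most precedence. Only a subset of Ansible's sources is listed here.
PRECEDENCE = [
    "role defaults",
    "inventory group_vars/all",
    "inventory host_vars/*",
    "play vars",
    "task vars (only for the task)",
    "extra vars (always win precedence)",
]


def resolve(sources):
    """Merge variable dictionaries so higher-precedence sources overwrite lower ones."""
    resolved = {}
    for level in PRECEDENCE:
        # dict.update() replaces any key already set by a lower level.
        resolved.update(sources.get(level, {}))
    return resolved


# Hypothetical variables defined in three different places.
sources = {
    "role defaults": {"ntp_server": "pool.ntp.org", "bucket": "devnet"},
    "play vars": {"ntp_server": "10.0.0.1"},
    "extra vars (always win precedence)": {"bucket": "lab"},
}

result = resolve(sources)
print(result)
```

In this model the play variable replaces the role default for ntp_server, and the extra variable (the kind supplied at runtime with -e) replaces the default for bucket.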
Ansible: 4GL overlay for Python
acl_action: {"ALLOW": "permit", "DROP": "deny"}

- name: Configure firewall access-list
  debug:
    msg: "access-list {{ acl_name }} line 1 extended {{ acl_action[item.action] }} {{ item.ip_protocol }}"
  loop: "{{ tnp.ansible_facts.intents }}"
Dictionaries can be defined as translators
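The same translator idea can be sketched in plain Python: a dictionary maps the vocabulary of the data source (ALLOW, DROP) to the vocabulary of the target device (permit, deny). The ACL name OUTSIDE_IN and the sample intents below are hypothetical; only the acl_action mapping and the message format come from the task shown on this slide.

```python
# Translator dictionary: source vocabulary -> device vocabulary.
ACL_ACTION = {"ALLOW": "permit", "DROP": "deny"}


def render_acl_lines(acl_name, intents):
    """Build one access-list line per intent, translating the action keyword."""
    lines = []
    for item in intents:
        action = ACL_ACTION[item["action"]]  # raises KeyError on an unknown action
        lines.append(
            f"access-list {acl_name} line 1 extended {action} {item['ip_protocol']}"
        )
    return lines


# Hypothetical intents, shaped like the items the playbook loops over.
intents = [
    {"action": "ALLOW", "ip_protocol": "tcp"},
    {"action": "DROP", "ip_protocol": "udp"},
]

acl_lines = render_acl_lines("OUTSIDE_IN", intents)
for line in acl_lines:
    print(line)
```

Keeping the mapping in data rather than in conditional logic means a new action keyword is added by editing the dictionary, not the task.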
Filters in Ansible … are used for transforming data …
https://docs.ansible.com/ansible/latest/user_guide/playbooks_filters.html
Lookup plugins allow Ansible to access data from outside sources. Lookups are an integral part of loops.
Modules are used as information gatherers: _info, _facts
Modules have modes to 'put' or 'get': aws_s3
Example of overriding playbook variable(s) at runtime
https://github.com/joelwking/csv-source-of-truth/blob/master/manage_aci_dhcp.yml
tasks:
  - name: Summarize the sheet and …
    csv_to_facts:
      src: '{{ src }}/{{ sheet }}.csv'
      vsheets:
        - DHCPentries:
            - DC
            - Tenant
            - BD
            - AppProfile
            - DHCP
            - EPG
Illustrates reading a CSV file, and returning a list of unique values of the columns specified
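The idea of summarizing a sheet can be sketched with Python's csv module: read every row, keep only the requested columns, and return each distinct combination once. This is an illustration of the technique, not the csv_to_facts module's own code. The sample data, the VLAN column, and the function name unique_rows are assumptions made for the example.

```python
import csv
import io


def unique_rows(csv_text, columns):
    """Return the distinct combinations of the requested columns, in file order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()
    unique = []
    for row in reader:
        key = tuple(row[column] for column in columns)
        if key not in seen:
            seen.add(key)
            unique.append(dict(zip(columns, key)))
    return unique


# Hypothetical sheet: two rows share the same DC, Tenant and BD values.
csv_text = (
    "DC,Tenant,BD,VLAN\n"
    "DC1,prod,bd-web,10\n"
    "DC1,prod,bd-web,11\n"
    "DC2,prod,bd-app,20\n"
)

summary = unique_rows(csv_text, ["DC", "Tenant", "BD"])
print(summary)
```

Three input rows collapse to two entries because the VLAN column, which is the only difference between the first two rows, is not among the requested columns.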
rows, columns
Ansible Facts: Device Configurations, Analytic Engines
Analytic Engines
https://github.com/joelwking/pensando/blob/master/playbooks/workflow_use_case.yml
[Diagram labels: Python module, Ansible variable, Write API, output]
Variables: Databases
MongoDB Python module
Load a JSON file into a database collection
Return a JSON object based on a query
mongodb_query: {"_id": "{{ _id }}"}
A key, value filter (dictionary) used to query the collection within the database
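How a key, value filter selects a document can be shown with an in-memory stand-in for a collection. The sketch below handles only equality on top-level fields, whereas MongoDB's query language also supports operators and nested fields. The sample documents and the find_one helper are hypothetical. With PyMongo, a dictionary of the same shape would be passed to Collection.find_one.

```python
def find_one(collection, query):
    """Return the first document whose fields equal every key/value pair in query."""
    for document in collection:
        if all(document.get(key) == value for key, value in query.items()):
            return document
    return None


# Hypothetical documents standing in for a database collection.
collection = [
    {"_id": "fw-group-1", "type": "firewall group", "members": ["10.0.0.1"]},
    {"_id": "lb-pool-1", "type": "load balancer pool", "members": ["10.0.1.1"]},
]

# Equivalent to the rendered filter {"_id": "{{ _id }}"} with _id set to lb-pool-1.
mongodb_query = {"_id": "lb-pool-1"}

match = find_one(collection, mongodb_query)
print(match)
```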
MongoDB Compass
RFC 3139: provide expiration time and effective time
Firewall ‘group’
Load balancer ‘pool’
Load balancer ‘group’
MinIO
https://developer.cisco.com/codeexchange/github/repo/nsthompson/minio-ansible-pyats-sandbox
https://min.io/product
MinIO is a high-performance, distributed, open source object storage system.
It provides private cloud object storage compatible with the Amazon S3 API.
Object storage enables storing any type of file or data in its native format.
Binary files, like Excel files or IOS images, are as easily stored and referenced as structured data like CSV or JSON objects.
- name: Data Manipulation DevNet Create 2020
  hosts: localhost
  connection: local
  gather_facts: no

  module_defaults:
    aws_s3:
      s3_url: 'http://{{ minio.host }}:{{ minio.port }}'
      aws_access_key: '{{ minio.access_key }}'
      aws_secret_key: '{{ minio.secret_key }}'
      validate_certs: False
      encrypt: False
      bucket: '{{ bucket | default("devnet") }}'

  vars_files:
    - '{{ playbook_dir }}/passwords.yml'

  vars:
    input:
      - name: '{{ playbook_dir }}/files/aci/DHCPRelay.csv'
        state: put
        meta_data: 'type=csv'

  tasks:
    - name: PUT/upload with metadata
      aws_s3:
        object: '{{ item.name | basename }}'
        src: '{{ item.name }}'
        mode: '{{ item.state }}'
        metadata: '{{ item.meta_data }}'
      loop: '{{ input }}'
https://gitlab.com/joelwking/cisco_dc_community_of_interest/-/tree/master/demos/minio
Key points
✓ Playbooks are data-driven with variables from directories on the file system
✓ Facts are variables discovered from the host or external sources, like APIs or a device configuration
✓ Databases can be used to store or populate configuration data