EDB Guide


description

From data to actionable insights. EDB, or Enterprise Data Bag, helps you get the most out of your internal data and open world data. The EDB guide gives brief highlights of our Data Analytics platform and services.

Transcript of EDB Guide

Page 1: EDB Guide

 

Unlock the potential of Data for better Business

Inside the Guide

Enterprise Data Bag (EDB)

EDB: Data Access & Management ..... 5

EDB: Data Integration & Administration ..... 5

EDB: Data Security ..... 5

EDB: Big Data Cluster ..... 5

EDB: Cluster Deployment ..... 5

Big Data Processing & Analytics

Batch Processing ..... 6

SQL like Query ..... 6

Data Search ..... 6

Streaming Data ..... 6

Predictive Analytics ..... 6

   

Power of Apache™ Hadoop® to the Enterprise

Enterprise Data Bag (EDB) combines the world’s best open components from Apache Hadoop and enterprise-grade software to deliver a complete Big Data solution to our data consumers.

Realize the true potential of your data and achieve new business insights.

Bringing Big Data to the Enterprise

 

The Big Data Team

Data Architect ..... 7

Data Scientist ..... 7

Data Analyst ..... 7

Developer ..... 7

System Administrator ..... 7

     

 

ENTERPRISE DATA BAG

BIG DATA FOR THE ENTERPRISE

 

Connect your data to your Business

AGGREGATE YOUR DATA

ANALYSE EVERY PROCESS

ADAPT TO CHANGE

 

 

Page 2: EDB Guide

 

   

Harnessing the power of Data

Our custom-built integration modules make it easier to augment your existing application and data landscape. With faster rollout and rapid integration, you can have your Big Data solution up and running in no time.

“With exploding data size, Big Data is a necessity and EDB gives a robust platform to do it in the right manner”

 

 

   

Apache Hadoop started with two open source components:

1. Hadoop Distributed File System, or HDFS (the storage file system for Big Data in a Hadoop cluster)

2. MapReduce (the programming framework to process and analyze data stored in HDFS)

From just these two components, the Hadoop ecosystem has expanded to a multitude of components that ease data processing and analytics and make monitoring and management of a Big Data cluster much easier.
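The MapReduce model named above can be illustrated with a minimal single-process sketch in Python. It is not Hadoop code and uses no Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample records are assumptions made for illustration. On a real cluster, the map and reduce steps would run in parallel across nodes on data stored in HDFS.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle and reduce in sequence over in-memory records."""
    pairs = [pair for record in records for pair in map_phase(record)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, used only to exercise the sketch.
    sample = ["big data for the enterprise", "connect your data to your business"]
    print(word_count(sample))
```

The three steps are kept as separate functions because that separation is what lets a framework distribute the work: map calls are independent per record, and reduce calls are independent per key.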

The Hadoop ecosystem can be divided into the following types of components:

Data Access

Data Management

Data Integration

Data Administration

Data Security

Cluster Operations

 

 

 

Enterprise Data Bag

Enterprise Data Bag combines the industry’s most reliable open source components with Apache Hadoop to give an enterprise-grade Big Data solution that is open at heart. The individual components are chosen to cover all the Big Data processing and analytics needs of an enterprise.

 

 

Page 3: EDB Guide

Data Access

Access the data stored in the Big Data cluster in many ways, such as SQL-like queries, batch and real-time.

Data Management

Process data and store all your enterprise data sets in the Big Data cluster.

Data Integration & Administration

Load your data into the Big Data cluster and manage it according to custom-defined rules and policies for data access and processing.

Data Security

User authentication and authorization to manage data security and make data accessible to the right set of data consumers.

Operations

Easily deploy a Big Data cluster and effectively manage all aspects of Big Data operations.

 

 

 

 

   

Enterprise Data Bag

Inheriting the power of Hadoop, Enterprise Data Bag is a completely integrated and tested platform that is ready for any enterprise.

EDB seamlessly integrates with your current application and data landscape to leverage your current technology and augment it with a Big Data platform.

 

Custom Made

EDB provides custom configurations out of the box that make your system ready to adapt right from deployment.

For those who do not find the right configuration in our list, our Big Data team can help you get up and running in no time.

Fully Integrated

EDB is designed with the most robust, reliable and useful software components, with custom-made APIs. EDB easily integrates with your data center, enterprise applications, data warehouse or any other deployment already present in your environment.

Our cluster operations and management platform is easy to manage and can integrate even non-standard solutions with your Big Data cluster.

Solution Partner

As a solution partner, we help our data consumers at every phase of the Big Data project, whether it is:

Deployment

Integration

Creating custom business jobs

 

Page 4: EDB Guide

 

ESM

Enterprise System Manager

ESM is the operations hub of Enterprise Data Bag. It provides a single customizable console where users can manage the Big Data cluster.

ESM provides multi-tenant capability where users have different access rights. For example, an Admin user has control over resources, while a Business user has control over insights and report snapshots.

 

AppConnect Manager

AppConnect Manager provides custom APIs designed to connect with various standard applications such as ERP solutions, Business Intelligence software, asset management software and many more.

Our team works round the clock to design custom APIs and integrate Big Data with applications.

ADAM

Advanced Data Analytics Module

ADAM shortens the learning curve from deployment of the Big Data platform to designing data processing and analytics jobs.

ADAM provides a GUI-based data processing and analytics engine where users can generate data jobs using our custom-made programming templates to get accurate output in a shorter time frame.

EVE

Effective Volume Enhancer

EVE lets the user save terabytes of space with compression technology. EVE also tiers less critical data to dormant parts of the cluster.

SLM

Smart Learning Manager

SLM incorporates machine-learning algorithms to understand data processing and analytics jobs over time and helps identify critical insights. SLM's self-learning gets better with time and helps reach business goals faster.

 

Page 5: EDB Guide

Data Access

EDB provides multiple engines to interact with and access data, based on the actual requirement and the type of data accessed.

Data Integration

Import and integrate data silos into a single Big Data cluster using built-in components.

Data Security

Manage all security policies from a single location.

Authenticate users accessing the Big Data cluster.

Authorize users against the access control list according to the type of authorization.

Encrypt data during import and export, irrespective of whether the cluster is on-premises or in the cloud.
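The authenticate-then-authorize flow listed under Data Security can be sketched in a few lines of Python. This is an illustrative model only, not EDB's security API: the `USERS` table, the `ACL` mapping, the sample credentials and all function names are hypothetical, and a real deployment would rely on the cluster's own authentication and policy services.

```python
import hashlib
import hmac

# Hypothetical user store: user name -> SHA-256 hash of the password.
# Illustration only; a production system should use a salted, slow password hash.
USERS = {
    "alice": hashlib.sha256(b"alice-secret").hexdigest(),
    "bob": hashlib.sha256(b"bob-secret").hexdigest(),
}

# Hypothetical access control list: data set -> user -> allowed actions.
ACL = {
    "sales": {"alice": {"read", "write"}, "bob": {"read"}},
}


def authenticate(user, password):
    """Return True if the user exists and the password hash matches."""
    expected = USERS.get(user)
    if expected is None:
        return False
    supplied = hashlib.sha256(password.encode("utf-8")).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, supplied)


def authorize(user, dataset, action):
    """Return True if the ACL grants the user this action on the data set."""
    return action in ACL.get(dataset, {}).get(user, set())


def can_access(user, password, dataset, action):
    """Authenticate first, then check the access control list."""
    return authenticate(user, password) and authorize(user, dataset, action)
```

Keeping authentication (who the user is) separate from authorization (what the user may do) means policies can be managed from one place, as the text describes, without touching the credential check.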

Cluster Operations

Manage the complete cluster from one console using EDB manager. It manages hardware resources, operational parameters and dead nodes.

Provision resources with ease.

Monitor running jobs and their resource consumption to easily identify cluster needs and upgrades. It also monitors failed jobs and the cause of failure.

Operate your Big Data cluster with the data processing and analytics scheduler.

Data Management

Manage replication of your data sets and set policies for streaming data. EVE simplifies the job for the user through easily accessible configurations.

Data Administration

Manage your data import and export jobs from a GUI-based console, whether your Big Data solution is on-premises or hosted in the cloud.

 

Page 6: EDB Guide

 

Batch Processing

A Hadoop cluster can ingest massive amounts of data, whether structured or unstructured. The EDB batch processing engine creates small, manageable units of data. Familiar tools such as BI software can use this batch data for more accurate analysis.

SQL like Query

SQL is the most widely used data query language. Hive uses a paradigm for querying data that is very similar to SQL and can run the same sequential queries at a much larger scale.

Data Search

EDB search can query data sets in different formats. A single query can run the search using the GUI-based search module.
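The idea of creating small, manageable units of data, described under Batch Processing on this page, can be sketched in Python as follows. This does not reflect how the EDB batch engine is implemented; the `batches` helper and the batch size used here are assumptions for illustration.

```python
from itertools import islice


def batches(records, batch_size):
    """Yield successive lists of at most batch_size records from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    # Hypothetical input: seven records split into units of three.
    for batch in batches(range(7), 3):
        print(batch)
```

Because the helper is a generator, only one batch is held in memory at a time, which is the property that makes batching useful for inputs too large to load at once.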

Streaming Data

Streaming data from various sources can be imported into the EDB cluster. The data can come from sensors, network elements, RFID, field instruments and live data. This data is combined to give a 360-degree view.

Predictive Analysis

EDB includes machine-learning libraries that are used for predictive analytics of data. This is critical for industries such as Oil & Gas, Telecom, Pharma, Manufacturing and many more.

Custom Analytics

EDB provides custom-made templates for generating data processing and analytics jobs. This saves significant resource and consulting costs for the enterprise.

 

     

Page 7: EDB Guide

A reliable team powers great Big Data operations. Our Big Data team has specific roles and definitions for:

Data Architect

Outlines the plan, architecture and deployment strategy of a Big Data deployment. They are the stepping stone for all Big Data projects.

Data Scientist

They explore, identify and analyze data to find relevant insights. From writing scripts to SQL-like queries, it is their job to extract information out of data.

Data Analyst

They are similar to Data Scientists but look at the business aspect of insights. They become the bridge between information and its implementation.

Developer

Developing new and useful jobs around the Big Data ecosystem is their prime responsibility. They use technologies like Java and Python to develop the apps that make our life easy.

Expert Consulting

For data consumers exploring Big Data for their business, we provide expert consulting services. We help our customers to:

Identify Big Data opportunity

Deploy Big Data solution

Integrate Big Data solution

Run smooth Big Data operations

     

   

 

 

 

 

 

   

Page 8: EDB Guide

 

 

 

 

 

 

Connect with us

Visit: www.alethelabs.com

 

 

 

 

©2014. Alethe Labs. All rights reserved. Alethe Labs and the Alethe Labs logo are trademarks or registered trademarks of Alethe Labs. All other trademarks are the property of their respective companies. Information is subject to change without notice.