Data Warehouses

  • Date posted: 14-Jan-2016

Data Warehouses. Advanced Data Technologies. Genči. Contents: Literature; The concept of INFORMATION; Motivation for a DWH; A closer look at the DWH; Structure of the DWH; Metadata; DWH components; Tools.

Transcript of Data Warehouses

  • Data Warehouses. Advanced Data Technologies. Genči

  • Contents: Literature; The concept of INFORMATION; Motivation for a DWH; A closer look at the DWH; Structure of the DWH; Metadata; DWH components; Tools

  • Literature: [1] Lacko L.: Datové sklady, analýza OLAP a dolování dat s příklady. Computer Press, Brno, 2003.

    [2] Paulraj Ponniah: Data Warehousing Fundamentals: A Comprehensive Guide for IT Professionals. John Wiley & Sons, Inc., 2001. ISBN 0-471-41254-6 (hardback); 0-471-22162-7 (electronic).

  • Literature (cont.): [3] Ralph Kimball, Margy Ross: The Data Warehouse Toolkit. Second Edition. Wiley Computer Publishing, 2002.

    [4] W. H. Inmon: Building the Data Warehouse. Third Edition. John Wiley & Sons, Inc., 2002.

  • The concept of INFORMATION [1]: Data become information if: we have the data; we know that we have the data; we know where the data are; we have access to them; we can trust the data source.

  • Hierarchy of information levels: Data; Information; Knowledge; Wisdom

  • Motivation for a DWH: Executives need information, for example to decide: where to build the next warehouse; which product line to develop; which market segment should be strengthened

    That is, they need to make strategic decisions, and for those they need strategic information

  • Strategic information: OLTP systems cannot provide it. It does not serve the day-to-day running of the company

  • Desired characteristics of strategic information

  • Data input

  • Information output

  • Resulting contradictions: Organizations have large volumes of data

    but

    IT resources and systems are unable to turn this volume of data into strategic information effectively

  • Information crisis: Not because of a lack of data, but because the data are not usable for strategic decision making

    Reasons: data in companies are spread across many types of incompatible structures and systems

  • Operational systems (order processing, inventory control, billing, ...) are not designed to provide strategic information. If we need to provide strategic information, we must process data stored in different types of systems. Only specially designed DSS or IS can provide strategic information.

  • Differences

  • Data warehouse concept: Take all the data you have in the organization, clean and transform it, and then provide useful strategic information

  • Data warehouse concept

  • A closer look at the DWH

  • Inmon's definition of the DWH: William (Bill) Inmon, regarded as the father of data warehousing, defined the DWH as follows: A Data Warehouse is a subject-oriented, integrated, nonvolatile, and time-variant collection of data in support of management's decisions.

  • Subject-oriented

  • Integrated data

  • Nonvolatile data

  • Time-variant data: Operational systems hold the current values of data. Data in the data warehouse are meant for analysis and decision support. A data warehouse must, by its very nature, contain historical data and not only current values. Data are stored as snapshots of past and present periods.

    Every data structure in the data warehouse contains a time element.
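
The snapshot idea can be illustrated with a minimal Python sketch. The row layout, the names `PriceSnapshot` and `value_as_of`, and the sample data are hypothetical and do not come from the slides; the point is only that every stored row carries a time element, so past values remain queryable.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PriceSnapshot:
    """One warehouse row: a value plus the date it was captured (hypothetical layout)."""
    product: str
    snapshot_date: date
    price: float


def value_as_of(rows: list[PriceSnapshot], product: str, when: date) -> Optional[float]:
    """Return the price from the latest snapshot taken on or before `when`."""
    candidates = [r for r in rows if r.product == product and r.snapshot_date <= when]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.snapshot_date).price


# An operational system would keep only the latest row; the warehouse keeps all of them.
history = [
    PriceSnapshot("widget", date(2015, 1, 1), 10.0),
    PriceSnapshot("widget", date(2015, 7, 1), 12.0),
    PriceSnapshot("widget", date(2016, 1, 1), 11.5),
]
```

Calling `value_as_of(history, "widget", date(2015, 8, 15))` looks up the value that was valid in mid-2015, a question an operational system holding only current values could not answer.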

  • DWH: a blend of technologies

  • Structure of the data warehouse

  • Overall structure of the DWH

  • Source data

    Production systems; Internal data (spreadsheets); Archived data (tapes); External data (stocks, interest rates, exchange rates, ...)

  • Temporary storage (data staging)

    Data Extraction; Data Transformation; Data Loading
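
The three staging steps can be sketched as three small Python functions. Everything here is an illustrative assumption: the two source record layouts, the field names, and the in-memory `warehouse` list that stands in for a real repository.

```python
# Hypothetical raw records from two source systems with different conventions.
ORDERS_SYSTEM = [{"cust": "a1", "amount": "100.50"}]
BILLING_SYSTEM = [{"customer_id": "A1", "total_cents": 2050}]


def extract():
    """Extraction: pull raw records out of each source, tagging their origin."""
    for row in ORDERS_SYSTEM:
        yield "orders", row
    for row in BILLING_SYSTEM:
        yield "billing", row


def transform(source, row):
    """Transformation: map both source layouts onto one common record format."""
    if source == "orders":
        return {"customer": row["cust"].upper(), "amount": float(row["amount"]), "source": source}
    return {"customer": row["customer_id"].upper(), "amount": row["total_cents"] / 100, "source": source}


def load(records, warehouse):
    """Loading: append the prepared records to the warehouse store."""
    warehouse.extend(records)
    return len(records)


warehouse = []
loaded = load([transform(src, row) for src, row in extract()], warehouse)
```

Keeping the steps as separate functions mirrors the slide: all reconciliation of source differences happens in `transform`, so `load` only ever sees records in the common format.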

  • Moving data into the data warehouse

  • Information delivery

  • METADATA in the data warehouse

  • Importance of METADATA: Users who want to compose and run a query may have several important questions:

    Are there any predefined queries I can look at?
    What are the various elements of data in the warehouse?
    Is there information about unit sales and unit costs by product?
    How can I browse and see what is available?
    From where did they get the data for the warehouse? From which source systems?
    How did they merge the data from the telephone orders system and the mail orders system?
    How old is the data in the warehouse?
    When was the last time fresh data was brought in?
    Are there any summaries by month and product?

  • Metadata in the data warehouse contain the answers to questions about the data in the data warehouse

  • Metadata in OLTP: In operational systems we do not need to know the nature of the stored data. There is no requirement for a user-friendly interface for accessing the contents of the database. The data dictionary or system catalog is used only for system (IT) purposes.

  • Metadata in the DWH: Users need sufficient information to be able to browse and explore the contents of the data warehouse. Users need to know the meaning of the individual data items. Users must be prevented from drawing incorrect conclusions from analyses as a result of misinterpreting the semantics of the data. Without adequate metadata support, users of large data warehouses are completely lost!

  • Types of metadata: Metadata in the data warehouse fall into three categories: Operational metadata; Extraction and transformation metadata; End-user metadata

  • Operational metadata: Data for the data warehouse comes from several operational systems of the enterprise. These source systems contain different data structures. The data elements selected for the data warehouse have various field lengths and data types. In selecting data from the source systems for the data warehouse, you split records, combine parts of records from different source files, and deal with multiple coding schemes and field lengths. When you deliver information to the end-users, you must be able to tie that back to the original source data sets. Operational metadata contain all of this information about the operational data sources.

  • Extraction and Transformation Metadata: Extraction and transformation metadata contain data about the extraction of data from the source systems, namely, the extraction frequencies, extraction methods, and business rules for the data extraction. Also, this category of metadata contains information about all the data transformations that take place in the data staging area.

  • End-User Metadata: The end-user metadata is the navigational map of the data warehouse. It enables the end-users to find information from the data warehouse. The end-user metadata allows the end-users to use their own business terminology and look for information in those ways in which they normally think of the business.
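
To make the three categories concrete, the sketch below attaches one record of each kind to a single warehouse column. The class names, fields, and example values are hypothetical; real metadata repositories are considerably richer.

```python
from dataclasses import dataclass


@dataclass
class OperationalMetadata:
    source_system: str      # where the data element originated
    source_field: str
    source_type: str        # original data type and field length


@dataclass
class ExtractionTransformationMetadata:
    frequency: str          # how often the extract runs
    method: str             # e.g. full dump or incremental capture
    rule: str               # business rule applied in the staging area


@dataclass
class EndUserMetadata:
    business_name: str      # the term users actually search for
    description: str


@dataclass
class ColumnMetadata:
    column: str
    operational: OperationalMetadata
    etl: ExtractionTransformationMetadata
    end_user: EndUserMetadata


# Hypothetical catalog with a single documented column.
catalog = [
    ColumnMetadata(
        column="unit_sales",
        operational=OperationalMetadata("telephone orders", "QTY", "CHAR(6)"),
        etl=ExtractionTransformationMetadata("daily", "incremental", "cast to integer"),
        end_user=EndUserMetadata("Units sold", "Number of units sold per order line"),
    ),
]


def find_by_business_term(term: str) -> list[str]:
    """Let a user locate warehouse columns using business vocabulary."""
    return [m.column for m in catalog if term.lower() in m.end_user.business_name.lower()]
```

A lookup such as `find_by_business_term("units")` plays the role of the navigational map, while the `operational` and `etl` fields are what let a delivered figure be tied back to its source.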

  • THE ARCHITECTURAL COMPONENTS

  • ARCHITECTURAL FRAMEWORK: Flow of data

  • ARCHITECTURAL FRAMEWORK: Control

  • Data Acquisition: Data acquisition covers the entire process of extracting data from the data sources, moving all the extracted data to the staging area, and preparing the data for loading into the data warehouse repository. The two major architectural components are source data and data staging.

  • Data Acquisition (2)

  • List of Functions and Services: Data Extraction

    Select data sources and determine the types of filters to be applied to individual sources;
    Generate automatic extract files from operational systems using replication and other techniques;
    Create intermediary files to store selected data to be merged later;
    Transport extracted files from multiple platforms;
    Provide automated job control services for creating extract files;
    Reformat input from outside sources;
    Reformat input from departmental data files, databases, and spreadsheets;
    Generate common application code for data extraction;
    Resolve inconsistencies for common data elements from multiple sources.

  • List of Functions and Services (2): Data Transformation

    Map input data to data for data warehouse repository;
    Clean data, deduplicate, and merge/purge;
    Denormalize extracted data structures as required by the dimensional model of the data warehouse;
    Convert data types;
    Calculate and derive attribute values;
    Check for referential integrity;
    Aggregate data as needed;
    Resolve missing values;
    Consolidate and integrate data.
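
A few of the transformation functions listed above (type conversion, resolving missing values, deduplication, aggregation) can be sketched in plain Python. The record layout, the default of `0` for a missing quantity, and the function names are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical extracted rows: strings everywhere, one duplicate, one missing value.
extracted = [
    {"order_id": "1", "product": "widget", "qty": "3"},
    {"order_id": "1", "product": "widget", "qty": "3"},   # duplicate
    {"order_id": "2", "product": "widget", "qty": ""},    # missing quantity
    {"order_id": "3", "product": "gadget", "qty": "5"},
]


def convert_and_resolve(row):
    """Convert data types; resolve a missing quantity to 0 (an assumed business rule)."""
    return {
        "order_id": int(row["order_id"]),
        "product": row["product"],
        "qty": int(row["qty"]) if row["qty"] else 0,
    }


def deduplicate(rows):
    """Keep the first row seen for each order_id."""
    seen, unique = set(), []
    for row in rows:
        if row["order_id"] not in seen:
            seen.add(row["order_id"])
            unique.append(row)
    return unique


def aggregate_by_product(rows):
    """Aggregate: total quantity per product."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["product"]] += row["qty"]
    return dict(totals)


clean = deduplicate([convert_and_resolve(r) for r in extracted])
totals = aggregate_by_product(clean)
```

The order of the steps matters in this sketch: converting types first means deduplication compares integer keys, and aggregation never sees an empty string.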

  • List of Functions and Services (3): Data Staging

    Provide backup and recovery for staging area repositories;
    Sort and merge files;
    Create files as input to make changes to dimension tables;
    If data staging storage is a relational database, create and populate database;
    Preserve audit trail to relate each data item in the data warehouse to input source;
    Resolve and create primary and foreign keys for load tables;
    Consolidate datasets and create flat files for loading through DBMS utilities;
    If staging area storage is a relational database, extract load files.
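
Two of the staging functions above, creating keys for load tables and preserving an audit trail, might look roughly like this in Python. The `SurrogateKeyGenerator` class, the `stage` function, and the row layout are hypothetical names introduced for this sketch; the slides do not prescribe a key scheme.

```python
from itertools import count


class SurrogateKeyGenerator:
    """Assign a stable integer key to each (source system, natural key) pair."""

    def __init__(self):
        self._next = count(1)
        self._keys = {}

    def key_for(self, source, natural_key):
        pair = (source, natural_key)
        if pair not in self._keys:
            self._keys[pair] = next(self._next)
        return self._keys[pair]


def stage(rows, keygen):
    """Build load-ready rows that carry a warehouse key plus an audit reference."""
    staged = []
    for source, natural_key, payload in rows:
        staged.append({
            "customer_key": keygen.key_for(source, natural_key),
            "payload": payload,
            # Audit trail: relates the warehouse row back to its input source.
            "audit": {"source": source, "source_key": natural_key},
        })
    return staged


keygen = SurrogateKeyGenerator()
staged = stage(
    [("orders", "C-17", "first"), ("billing", "9001", "second"), ("orders", "C-17", "third")],
    keygen,
)
```

Rows that share a source and natural key receive the same `customer_key`, and the `audit` entry keeps enough information to trace each staged row to where it came from.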

  • Data Storage: Data storage covers the process of loading the data from the staging area into the data warehouse repository. All functions for transforming and integrating the data are completed in the data staging area. The prepared data in the data warehouse is like the finished product that is ready to be stacked in an industrial warehouse.

  • Data Storage (2)

  • Data Storage (3): List of Functions and Services

    Load data for full refreshes of data warehouse tables;
    Perform incremental loads at regular prescribed intervals;
    Support loading into multiple tables at the detailed and summarized levels;
    Optimize the loading process;
    Provide automated job control services for loading the data warehouse;
    Provide backup and recovery for the data warehouse database;
    Provide security;
    Monitor and fine-tune the database;
    Periodically archive data from the database according to preset conditions.
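
The difference between a full refresh and an incremental load can be shown with Python's built-in `sqlite3` module. The `sales` table, its columns, and the rule "load only rows with a higher `sale_id` than already stored" are assumptions chosen for the sketch; production loads typically rely on DBMS bulk-load utilities and more careful change detection.

```python
import sqlite3


def full_refresh(conn, rows):
    """Full refresh: discard the table contents and reload everything."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM sales")
        conn.executemany("INSERT INTO sales (sale_id, amount) VALUES (?, ?)", rows)


def incremental_load(conn, rows):
    """Incremental load: add only rows newer than what is already stored."""
    last_id = conn.execute("SELECT COALESCE(MAX(sale_id), 0) FROM sales").fetchone()[0]
    new_rows = [r for r in rows if r[0] > last_id]
    with conn:
        conn.executemany("INSERT INTO sales (sale_id, amount) VALUES (?, ?)", new_rows)
    return len(new_rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, amount REAL)")

full_refresh(conn, [(1, 10.0), (2, 20.0)])
added = incremental_load(conn, [(2, 20.0), (3, 30.0), (4, 40.0)])
row_count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
```

Here the incremental call skips the row that is already present and inserts only the two new ones, which is why incremental loads can run at regular intervals without rebuilding the table.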

  • Information Delivery: Information delivery spans a broad spectrum of many different methods of making information available to users. For users, the information delivery component is the data warehouse.

  • Information Delivery (2): The information delivery component makes it easy for the users to access the information either directly from the enterprise-wide data warehouse,