Amazon Redshift
Ahasan Habib
Technical Project Manager, Ixora Solution
Dhaka, Bangladesh
Data Warehouse Concept
What is a data warehouse?
●Relational database
●Query & analysis
●Transaction processing data => Historical data
●Transaction workload => Analysis work load
●Extract, Transform, Load
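The Extract, Transform, Load step listed above can be sketched as follows. This is a minimal illustration, not part of the original slides: the order records, field names, and the in-memory `warehouse` list are hypothetical stand-ins for an operational source and a historical store.

```python
from datetime import date

# Hypothetical transactional records (the "extract" source).
orders = [
    {"id": 1, "customer": " alice ", "amount": "19.90", "day": "2015-01-03"},
    {"id": 2, "customer": "Bob", "amount": "5.10", "day": "2015-01-03"},
    {"id": 3, "customer": "alice", "amount": "7.00", "day": "2015-01-04"},
]

def extract(source):
    """Read raw rows from the operational system."""
    return list(source)

def transform(rows):
    """Clean values and convert types so they are ready for analysis."""
    cleaned = []
    for row in rows:
        year, month, day = (int(part) for part in row["day"].split("-"))
        cleaned.append({
            "customer": row["customer"].strip().title(),
            "amount_cents": round(float(row["amount"]) * 100),
            "day": date(year, month, day),
        })
    return cleaned

def load(rows, warehouse):
    """Append the cleaned rows to the historical store."""
    warehouse.extend(rows)
    return len(rows)

warehouse = []
loaded = load(transform(extract(orders)), warehouse)
```

In a real pipeline each step would talk to external systems; the shape (read, clean, append) is the part this sketch is meant to show.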
Data warehouse architecture
Big Data
Data sets so large and complex that traditional data processing applications are not adequate.
Characteristics:
●Volume
The quantity of data.
●Velocity
The rate at which data is created.
●Variety
The different types of data.
Big data Architecture
Operational Data
●Transactional data.
●Event data.
●Real-time data.
●Helps to run day-to-day system/business operations.
Analytical Data
●Historical data
●Numerical values, measures, metrics (numerical measurements)
●Business intelligence & decision making
Row vs. Columnar Database
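A minimal sketch of the difference between the two layouts, using plain Python lists and dicts rather than a real storage engine. The sample records and the `to_columns` helper are hypothetical and exist only for illustration.

```python
# Hypothetical sales records used only for illustration.
rows = [
    {"product": "A", "country": "BD", "net_sales": 120},
    {"product": "B", "country": "US", "net_sales": 80},
    {"product": "A", "country": "US", "net_sales": 200},
]

def to_columns(records):
    """Pivot a row-oriented list of dicts into one list per column."""
    columns = {name: [] for name in records[0]}
    for record in records:
        for name, value in record.items():
            columns[name].append(value)
    return columns

columns = to_columns(rows)

# Row layout: every full record is visited to read a single field.
total_from_rows = sum(record["net_sales"] for record in rows)

# Columnar layout: only the net_sales column is read.
total_from_columns = sum(columns["net_sales"])
```

Both totals are the same; the point is that an analytical query touching one column out of many reads far less data when values are stored column by column.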
What is Redshift?
●A data warehouse management tool.
●Developed and managed by Amazon.
●Cloud hosted large data management system.
●Distributed data management system.
●Columnar data storage.
Redshift Specialities
1.Extremely fast.
2.Web service API based communication.
3.Massively parallel processing.
4.ANSI SQL support.
5.Columnar database.
6.Easy to learn.
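The massively parallel processing item can be pictured as "split the data, aggregate each part independently, then combine". The sketch below is only an illustration of that shape, with threads standing in for compute slices; the function names are hypothetical and it says nothing about how Redshift schedules work internally.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_slices(values, slice_count):
    """Deal values round-robin onto slices (a stand-in for data distribution)."""
    return [values[i::slice_count] for i in range(slice_count)]

def partial_sum(slice_values):
    """Work done independently on one slice."""
    return sum(slice_values)

def parallel_total(values, slice_count=4):
    """Aggregate each slice separately, then combine the partial results."""
    slices = split_into_slices(values, slice_count)
    with ThreadPoolExecutor(max_workers=slice_count) as pool:
        partials = list(pool.map(partial_sum, slices))
    # A leader step combines the partial results into the final answer.
    return sum(partials)

total = parallel_total(list(range(1, 101)))
```

Python threads do not give real CPU parallelism here, so this shows the split/combine structure rather than the speed-up.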
Redshift Product History
●November 2012: Beta release
●February 2013: Initial release
●Based on PostgreSQL 8.0.2
Redshift Architecture
Advantages of Using Redshift
●Extremely fast for analytical data processing.
●Supports ANSI SQL syntax.
●Cloud-based solution.
●Highly secure (in terms of data and system access).
Redshift Data Warehouse Design
1.Star schema
2.Snowflake schema
3.Denormalized fact table
Example columns:
Customer Id
Customer Name
Customer Address
State
City
Country
Product Id
Product Name
Product Category
Gross Sales Amount
Net Sales Amount
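To make the design options concrete, here is a small sketch that builds a star schema and an equivalent denormalized fact table, then runs the same report against both. It uses an in-memory SQLite database purely as a runnable stand-in, not Redshift; the table names (`dim_customer`, `dim_product`, `fact_sales`, `sales_flat`) and sample rows are hypothetical, and only a subset of the columns listed above is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star schema: one fact table referencing dimension tables.
CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT, country TEXT);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT, product_category TEXT);
CREATE TABLE fact_sales (customer_id INTEGER, product_id INTEGER,
                         gross_sales_amount REAL, net_sales_amount REAL);

INSERT INTO dim_customer VALUES (1, 'Rahim', 'Bangladesh'), (2, 'Jane', 'USA');
INSERT INTO dim_product VALUES (10, 'Laptop', 'Electronics'), (11, 'Desk', 'Furniture');
INSERT INTO fact_sales VALUES (1, 10, 1000, 900), (2, 10, 1100, 1000), (2, 11, 300, 250);

-- Denormalized fact table: dimension attributes copied into every row.
CREATE TABLE sales_flat AS
SELECT c.customer_id, c.customer_name, c.country,
       p.product_id, p.product_name, p.product_category,
       f.gross_sales_amount, f.net_sales_amount
FROM fact_sales f
JOIN dim_customer c ON c.customer_id = f.customer_id
JOIN dim_product p ON p.product_id = f.product_id;
""")

# Same report, once with a join (star) and once without (denormalized).
star_query = """
SELECT p.product_category, SUM(f.net_sales_amount)
FROM fact_sales f JOIN dim_product p ON p.product_id = f.product_id
GROUP BY p.product_category ORDER BY p.product_category
"""
flat_query = """
SELECT product_category, SUM(net_sales_amount)
FROM sales_flat GROUP BY product_category ORDER BY product_category
"""
star_result = conn.execute(star_query).fetchall()
flat_result = conn.execute(flat_query).fetchall()
```

The denormalized table trades repeated dimension values for join-free queries; the star schema keeps each attribute in one place at the cost of a join.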
Index and Constraints
1.Sort key
2.Distribution key
3.Primary key/foreign key (informational only, not enforced)
4.Triggers (not supported)
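The roles of the distribution key and the sort key can be sketched with a small simulation. This is an illustration under stated assumptions, not Redshift's implementation: the CRC32 hash, the block size, and all function names are hypothetical choices made for the example. The distribution key decides which slice a row lands on; the sort key orders rows so that per-block minimum and maximum values allow whole blocks to be skipped during a range scan.

```python
import zlib

def slice_for(key, slice_count):
    """Pick a slice from a stable hash of the distribution key value."""
    return zlib.crc32(str(key).encode("utf-8")) % slice_count

def distribute(rows, dist_key, slice_count):
    """Place each row on the slice chosen by its distribution key."""
    slices = [[] for _ in range(slice_count)]
    for row in rows:
        slices[slice_for(row[dist_key], slice_count)].append(row)
    return slices

def build_blocks(rows, sort_key, block_size):
    """Sort by the sort key, cut into blocks, and record min/max per block."""
    ordered = sorted(rows, key=lambda r: r[sort_key])
    blocks = []
    for start in range(0, len(ordered), block_size):
        chunk = ordered[start:start + block_size]
        blocks.append({"min": chunk[0][sort_key],
                       "max": chunk[-1][sort_key],
                       "rows": chunk})
    return blocks

def range_scan(blocks, sort_key, low, high):
    """Read only blocks whose min/max range overlaps [low, high]."""
    scanned, hits = 0, []
    for block in blocks:
        if block["max"] < low or block["min"] > high:
            continue  # block skipped without reading its rows
        scanned += 1
        hits.extend(r for r in block["rows"] if low <= r[sort_key] <= high)
    return hits, scanned

rows = [{"customer_id": i % 5, "sale_day": i} for i in range(1, 21)]
slices = distribute(rows, "customer_id", slice_count=2)
blocks = build_blocks(rows, "sale_day", block_size=5)
hits, scanned = range_scan(blocks, "sale_day", 6, 10)
```

Rows with the same `customer_id` always land on the same slice, which is what lets joins on that key run without moving data between slices.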
Data Types
Data Type          Alias              Description
SMALLINT           INT2               Signed 2-byte integer
INTEGER            INT4               Signed 4-byte integer
BIGINT             INT8               Signed 8-byte integer
DECIMAL            NUMERIC            Selectable precision
REAL               FLOAT4             Single precision (32-bit)
DOUBLE PRECISION   FLOAT8             Double precision (64-bit)
CHAR               CHARACTER, NCHAR   Fixed length (max 4096)
VARCHAR            NVARCHAR, TEXT     Variable length (max 65535)
DATE               -                  Calendar date
TIMESTAMP          -                  Date & time (UTC)
BOOLEAN            BOOL               True/False
Data Loading
●S3
●COPY command
●Data Pipeline
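As a rough sketch of how loading from S3 with the COPY command looks, the helper below only builds the statement text; it does not connect to a cluster. The `build_copy_statement` function is hypothetical, and the table name, bucket path, and IAM role ARN are placeholders, not values from the slides.

```python
def build_copy_statement(table, s3_path, iam_role, delimiter=","):
    """Return a Redshift COPY statement as text; nothing is executed here."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_path}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        f"DELIMITER '{delimiter}';"
    )

# Placeholder values: replace with a real table, bucket, and role.
statement = build_copy_statement(
    table="sales",
    s3_path="s3://example-bucket/sales/",
    iam_role="arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
```

The resulting text would normally be sent through any PostgreSQL-compatible client or driver. Because it is built by string formatting, it should only be used with trusted values.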
Query
●CRUD
●Dynamic query
●Metadata Query
●Query execution Plan
Other Database Objects
●Built-in functions
●User-defined functions
●Stored Procedures
●Transactions
Security
●User Management
●Role Management
●Schema Management
Client Development Tools
●Navicat
●SQL Server Management Studio
●Various Drivers:
Linux
Visual Studio
Scala
Python
Q & A