Big data

BIG DATA By Sakshi Chawla

Transcript of Big data

Page 1: Big data

BIG DATA

By Sakshi Chawla

Page 2: Big data

What is BIG DATA?

• The Oxford English Dictionary (OED) defines big data as “data of a very large size, typically to the extent that its manipulation and management present significant logistical challenges.”

• Big data is an evolving term that describes any voluminous amount of structured, semi-structured and unstructured data that has the potential to be mined for information. Although big data doesn't refer to any specific quantity, the term is often used when speaking about petabytes and exabytes of data.

Page 3: Big data


• Big Data generates value from the storage and processing of very large quantities of digital information that cannot be analyzed with traditional computing techniques.

Page 4: Big data

Characteristics of Big Data: The 5 Vs

Page 5: Big data

Volume:

• Volume is the size of the data. It determines the value and potential of the data under consideration, and whether it can be considered Big Data at all. The name ‘Big Data’ itself refers to size, hence this characteristic.

• Big data implies enormous volumes of data.

• Now that data is generated by machines, networks and human interaction on systems like social media, the volume of data to be analyzed is massive.

Page 6: Big data

Velocity:

• In this context, ‘velocity’ refers to how fast data is generated and how fast it must be processed to meet the demands and challenges of growth and development.

• Big Data Velocity deals with the pace at which data flows in from sources like business processes, machines, networks and human interaction with things like social media sites, mobile devices, etc.

Page 7: Big data

Variety:

• The next aspect of Big Data is its variety: the category to which the data belongs (structured, semi-structured or unstructured). Data analysts need to know this category.

• Knowing it helps the people who analyze the data closely to use it effectively to their advantage, which upholds the importance of Big Data.

Page 8: Big data

Veracity:

• Big Data Veracity refers to the biases, noise and abnormality in data. The question is whether the data being stored and mined is meaningful to the problem being analyzed.

• The quality of the data being captured can vary greatly. Accuracy of analysis depends on the veracity of the source data.
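As a rough illustration of a veracity check, the sketch below screens a batch of readings for missing values and extreme outliers before analysis. The function name `screen_readings`, the median-based rule and the threshold of 3 are assumptions chosen for this example, not a method taken from the slides.

```python
from statistics import median

def screen_readings(readings, max_deviation=3.0):
    """Split readings into (clean, rejected).

    A reading is rejected if it is missing (None) or lies more than
    max_deviation median-absolute-deviations (MAD) from the median.
    """
    present = [r for r in readings if r is not None]
    if not present:
        # Nothing usable: every reading is rejected.
        return [], list(readings)

    med = median(present)
    # Fall back to 1.0 when all values are identical, to avoid dividing by zero.
    mad = median([abs(r - med) for r in present]) or 1.0

    clean, rejected = [], []
    for r in readings:
        if r is None or abs(r - med) / mad > max_deviation:
            rejected.append(r)
        else:
            clean.append(r)
    return clean, rejected

# Hypothetical sensor readings with one gap and one implausible spike.
clean, rejected = screen_readings([20.1, 19.8, None, 20.4, 95.0, 20.0])
```

Here the missing value and the 95.0 spike are set aside, so later analysis runs only on the four plausible readings.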

Page 9: Big data

Variability:

• Variability can be a problem for those who analyse the data. It refers to the inconsistency that the data can show at times, which hampers the ability to handle and manage the data effectively.

Page 10: Big data
Page 11: Big data

Storage And Architecture:

• Recent studies show that a multiple-layer architecture is one option for dealing with big data. A distributed parallel architecture spreads data across multiple processing units, and processing in parallel delivers results much faster.

• This type of architecture inserts data into a parallel DBMS, which uses the MapReduce and Hadoop frameworks. The framework aims to make the processing power transparent to the end user by means of a front-end application server.
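The MapReduce idea mentioned above can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle and reduce steps using a word count; it does not use the Hadoop API, and the function names are invented for the example. In a real cluster, each document would be mapped on a different machine.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)

def word_count(documents):
    mapped = []
    for doc in documents:          # each call could run on a separate node
        mapped.extend(map_phase(doc))
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())

counts = word_count(["big data is big", "data is everywhere"])
```

Because each map call looks at only one document and each reduce call at only one key, both steps can be spread across many processing units without coordination, which is the point of the distributed parallel architecture.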

Page 12: Big data

Hadoop

• Hadoop is an open-source software framework, written in Java, for distributed storage and distributed processing of very large data sets (Big Data) on computer clusters built from commodity hardware.

• It is designed to scale up from a single server to thousands of machines, with a very high degree of fault tolerance.

• Hadoop changes the economics and the dynamics of large-scale computing.
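One way to picture fault tolerance on commodity hardware is block replication: each block of a file is stored on several machines, so losing a machine does not lose the data. The sketch below is a simplified model, not Hadoop's actual placement policy. The round-robin rule, the node names and both function names are assumptions made for illustration; the replication factor of 3 matches the HDFS default.

```python
def place_blocks(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block: [nodes[(block + i) % len(nodes)] for i in range(replication)]
        for block in range(num_blocks)
    }

def readable_blocks(placement, failed):
    """Return the blocks that still have at least one replica on a live node."""
    return [
        block
        for block, replicas in placement.items()
        if any(node not in failed for node in replicas)
    ]

# Hypothetical four-node cluster storing a six-block file.
nodes = ["node-a", "node-b", "node-c", "node-d"]
placement = place_blocks(6, nodes)

# With three replicas per block, any two node failures leave every block readable.
survivors = readable_blocks(placement, failed={"node-a", "node-b"})
```

In this model data is lost only when all three nodes holding the same block fail together, which is why cheap, individually unreliable servers can still form a dependable cluster.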

Page 13: Big data

Applications:

Smart Healthcare

Homeland Security

Traffic control

Manufacturing

Multi-channel Sales

Telecom

Trading analytics

Search quality

Page 14: Big data
Page 15: Big data

Government:

The use and adoption of Big Data within governmental processes is beneficial and allows efficiencies in terms of cost, productivity and innovation. That said, the process is not without its flaws. Data analysis often requires multiple parts of government (central and local) to work in collaboration and create new and innovative processes to deliver the desired outcome. Below are leading examples from the governmental Big Data space.

India:

• Big data analysis was partly responsible for the BJP and its allies winning the 2014 Indian general election.

• The Indian Government utilises numerous techniques to ascertain how the Indian electorate is responding to government action, as well as ideas for policy augmentation.

Page 16: Big data

Risks of Big Data:

#1: Loss of agility

In a typical large-scale organization, data is housed on multiple platforms. There is transactional data, email data, analytics data, etc. Management wants people to be able to locate, analyze, and make decisions based on this data quickly. But if the data isn’t evaluated, organized, and stored properly, critical information can be either difficult or impossible to find – slowing a business down at the exact moment when speed is essential.

#2: Loss of compliance

Laws are getting more and more complex with regard to how long companies need to retain data, how they need to retain it, and where they need to retain it. There are general regulations in place as well as state- or industry-specific regulations that may apply. It is not uncommon for regulators to perform random audits to examine a company’s policies regarding data and their actual management of that data. A compliance failure can result in significant fines or reputational damage.

Page 17: Big data

#3: Loss of security

With more data located in and moving between more places than ever before, there is also a vastly increased number of ways to hack into that data. A security breach can result in theft, fraud, fines … and, of course, reputational loss.

#4: Loss of money

A server may seem inexpensive at first glance – but never assume that storage is cheap.

Page 18: Big data
Page 19: Big data

Benefits:

• Cost reduction

Big data technologies like Hadoop and cloud-based analytics can provide substantial cost advantages. While comparisons between big data technology and traditional architectures (data warehouses and marts in particular) are difficult because of differences in functionality, a price comparison alone can suggest order-of-magnitude improvements.

• Faster, better decision making

Analytics has always involved attempts to improve decision making, and big data doesn’t change that.

Page 20: Big data

• New products and services

Perhaps the most interesting use of big data analytics is to create new products and services for customers. Online companies have done this for a decade or so, but now predominantly offline firms are doing it too. GE, for example, has made a major investment in new service models for its industrial products using big data analytics.

Page 21: Big data

Thank You