What Is Big Data and Why Does It Matter

Presentation Transcript

What Is Big Data and Why Does It Matter?:

What Is Big Data and Why Does It Matter?
By Benyamin Allah Gholi Zadeh, for an academic class presentation (Dr. Azadeh Asgari)

What Is Big Data?:

What Is Big Data?
There is no consensus on how to define big data.
“Big data exceeds the reach of commonly used hardware environments and software tools to capture, manage, and process it within a tolerable elapsed time for its user population.” - Teradata Magazine article, 2011
“Big data refers to data sets whose size is beyond the ability of typical database software tools to capture, store, manage and analyze.” - The McKinsey Global Institute, 2011

What Is Big Data?:

What Is Big Data?
IOPS (Input/Output Operations Per Second)

3 Vs of Big Data:

3 Vs of Big Data
The “BIG” in big data isn’t just about volume.

Big Data Analysis Example: Product arrangement:

Big Data Analysis Example: Product arrangement
How does location tracking work?
Recognize the dead zone

Big Data Analysis Example:

Big Data Analysis Example
Big data can generate significant financial value across sectors.

The big data opportunities:

The big data opportunities

ML Algorithms:

ML Algorithms
(Diagram labels) Applications; Examples; Recommenders; Clustering; Classification; Frequent Pattern Mining; Genetic; Math (Vectors/Matrices/SVD); Utilities (Lucene/Vectorizer); Collections (primitives); Apache Hadoop

How Is Big Data Different?:

How Is Big Data Different?
1) Automatically generated by a machine (e.g. a sensor embedded in an engine)
2) Typically an entirely new source of data (e.g. use of the internet)
3) Not designed to be friendly (e.g. text streams)
4) May not have much value, so you need to focus on the important part

How Is Big Data More of the Same?:

How Is Big Data More of the Same?
Most new data sources were considered big and difficult when they first appeared.
Big data is just the next wave of new, bigger data.
(Timeline labels) The past; The present; The future

Risks of Big Data:

Risks of Big Data
An organization can be so overwhelmed by the data that it gets nowhere: it needs the right people solving the right problems.
Costs can escalate too fast: it isn’t necessary to capture 100% of the data.
Many sources of big data raise privacy concerns: self-regulation, legal regulation.

Why You Need to Tame Big Data:

Why You Need to Tame Big Data
Analyzing big data is already standard in some industries (e.g. ecommerce).
Organizations that ignore it will be left behind in a few years.
So far, they have only missed the chance to be on the bleeding edge.
Capturing data and using analysis to make decisions is just an extension of what you are already doing today.

The Structure of Big Data:

The Structure of Big Data
Structured: most traditional data sources
Semi-structured: many sources of big data
Unstructured: video data, audio data

Exploring Big Data:

Exploring Big Data
The time for developing an analysis: gathering & preparing data (70~80%), analyzing data (20~30%)
The time for developing an analysis (initially working with big data): gathering & preparing data (95%), analyzing data (5%)

Most Big Data Doesn’t Matter:

Most Big Data Doesn’t Matter
Some pieces have long-term strategic use.
Some pieces have short-term tactical use.
Some pieces don’t matter at all.

The Example of RFID Tags:

The Example of RFID Tags
Short-term value: e.g. the responses at 10-second intervals between tags and readers
Long-term value: the entry and exit times of the pallet
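
The RFID slide can be illustrated with a small Python sketch that collapses the periodic tag reads into entry and exit times. This is only an illustration under stated assumptions: the `Read` record, the `entry_exit_events` helper, the tag and reader names, and the 30-second gap threshold are all hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Read:
    """One tag/reader response (hypothetical record layout)."""
    tag: str      # pallet tag id
    reader: str   # reader location
    t: int        # seconds since the start of the feed


def entry_exit_events(reads, gap=30):
    """Collapse periodic reads into (tag, reader, entered, exited) visits.

    A new visit starts when a tag has not been seen at a reader for more
    than `gap` seconds (the threshold is an assumption for this sketch).
    """
    visits = []
    open_visits = {}  # (tag, reader) -> index of the latest visit in `visits`
    for r in sorted(reads, key=lambda r: r.t):
        key = (r.tag, r.reader)
        i = open_visits.get(key)
        if i is not None and r.t - visits[i][3] <= gap:
            # Same visit: only push the exit time forward.
            tag, reader, entered, _ = visits[i]
            visits[i] = (tag, reader, entered, r.t)
        else:
            # First sighting, or the tag came back after a long absence.
            open_visits[key] = len(visits)
            visits.append((r.tag, r.reader, r.t, r.t))
    return visits


# Nine raw reads at 10-second intervals, in two separate stays at one dock.
reads = [Read("pallet-1", "dock-A", t) for t in range(0, 60, 10)]
reads += [Read("pallet-1", "dock-A", t) for t in range(300, 330, 10)]

visits = entry_exit_events(reads)
print(len(reads), "reads ->", len(visits), "visits")
```

The individual reads only matter while the pallet is being tracked; the two visit records are the part worth keeping for long-term analysis.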

Filtering Big Data Effectively:

Filtering Big Data Effectively
The extract, transform, and load (ETL) process takes a raw feed of data, reads it, and produces a usable set of output.

Filtering Big Data Effectively:

Filtering Big Data Effectively
Sipping from the hose
Focus on the important pieces of the data
It makes big data easier to handle
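
A minimal Python sketch of the two filtering slides: an extract, transform, and load pipeline that reads a raw feed one record at a time ("sipping") and keeps only the useful pieces. The feed format, the field names, and the filtering rule are assumptions made for illustration, not part of the presentation.

```python
import csv
import io
import json

# Hypothetical raw web-log feed: timestamp, user, path, HTTP status.
RAW_FEED = """\
2011-05-01T10:00:00,u1,/home,200
2011-05-01T10:00:02,u2,/product/42,200
2011-05-01T10:00:03,u1,/favicon.ico,404
2011-05-01T10:00:05,u1,/checkout,200
"""


def extract(stream):
    """Yield raw rows one at a time so the whole feed never sits in memory."""
    yield from csv.reader(stream)


def transform(rows):
    """Keep only successful page views and only the fields worth analyzing."""
    for timestamp, user, path, status in rows:
        if status == "200" and not path.endswith(".ico"):
            yield {"user": user, "path": path, "time": timestamp}


def load(records, sink):
    """Write one JSON object per line to the sink and return the count."""
    count = 0
    for record in records:
        sink.write(json.dumps(record) + "\n")
        count += 1
    return count


sink = io.StringIO()  # stands in for a file or database table
written = load(transform(extract(io.StringIO(RAW_FEED))), sink)
print(written, "records kept")
```

Because each stage is a generator, a record is read, filtered, and written before the next one is touched, which is one way to handle a feed that is too large to hold at once.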

Mixing Big Data with Traditional Data:

Mixing Big Data with Traditional Data
The biggest value in big data can be driven by combining big data with other corporate data.

Mixing Big Data with Traditional Data:

Mixing Big Data with Traditional Data
Browsing history: knowing how valuable a customer is; what they have bought in the past
Smart-grid data (for a utility company): knowing the historical billing patterns; dwelling type
Text (online chat and e-mails): knowing the detailed product specification being discussed; the sales data related to those products
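
The browsing-history case might look like the following Python sketch, which joins raw browsing events to an existing customer table. All of the data, the field names, and the `enrich` helper are made up for illustration.

```python
from collections import Counter

# Traditional corporate data: one profile per known customer (made-up values).
customers = {
    "c1": {"lifetime_value": 5200.0, "past_purchases": ["laptop"]},
    "c2": {"lifetime_value": 80.0, "past_purchases": []},
}

# Big data source: raw browsing events as (customer_id, product_viewed).
browsing = [
    ("c1", "laptop-bag"), ("c1", "laptop-bag"), ("c1", "mouse"),
    ("c2", "laptop-bag"), ("c3", "tv"),
]


def enrich(browsing, customers):
    """Count views per customer and product, then attach the customer profile."""
    rows = []
    for (customer_id, product), views in Counter(browsing).items():
        profile = customers.get(customer_id)
        if profile is None:
            continue  # visitors missing from the corporate data cannot be joined
        rows.append({
            "customer": customer_id,
            "product": product,
            "views": views,
            "lifetime_value": profile["lifetime_value"],
            "past_purchases": profile["past_purchases"],
        })
    # Most valuable customers first, then their most-viewed products.
    rows.sort(key=lambda r: (-r["lifetime_value"], -r["views"]))
    return rows


for row in enrich(browsing, customers):
    print(row)
```

On its own, the browsing feed only says that a laptop bag was viewed three times; joined to the customer table, it shows that a high-value customer who already bought a laptop viewed it twice.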

The Need for Standards:

The Need for Standards
Big data will become more structured over time.
Data sources will be fine-tuned to be friendlier for analysis.
Standardizing enough will make life much easier.

Today’s Big Data Is Not Tomorrow’s Big Data:

Today’s Big Data Is Not Tomorrow’s Big Data
Banking industry data was very hard to handle even a decade ago.
What counts as “BIG” will change.
Big data will continue to evolve.

Where are we going?:

Where are we going?

Summary:

Summary
• Big Data Analytics is about the convergence of multiple data streams in real or near-real time, yielding new information sources.
• Storage, and particularly shared storage, in the context of big data analytics is often seen as an avoidable source of latency.
• Big Data Analytics is the next storage frontier.

What Do You Think?:

What Do You Think?
Have you ever heard about “big data solutions and the cloud”? What is that?
Do you know any other examples of big data? What are they?
Is there any other field that can improve by using big data analysis?
What is Hadoop?
Have you ever heard about MapReduce? Can you explain it?

What will happen?:

What will happen?
