Great Data By Design

Transcript
Page 1: Great Data By Design

Great Data. By Design.

Page 2: Great Data By Design


Great data isn’t an accident. It happens by design. Ensuring that you have the clean, safe, connected data you need to power confident decisions and effective business processes isn’t an easy task.

You have to work at it…

Page 3: Great Data By Design


The challenge is that the market trends are working against you as data professionals.

Your jobs are getting harder.

Page 4: Great Data By Design

More Data. In More Places. Moving Faster Than Ever Before.

Market Trend #1

Page 5: Great Data By Design

The volume, velocity, and variety of data are increasing at an unprecedented pace. The amount of data generated in the world today is doubling every two years.

It’s the new Moore’s law.

2009: 0.8 zettabytes

2020: 35.2 zettabytes
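As a rough check on the doubling claim, the short sketch below compounds the 2009 figure forward and also derives the doubling time implied by the two figures on this slide. It assumes the slide's numbers read as 0.8 zettabytes in 2009 and 35.2 zettabytes in 2020; the function names are illustrative only.

```python
import math

def project_volume(start_zb: float, start_year: int, end_year: int,
                   doubling_years: float = 2.0) -> float:
    """Compound a starting volume forward, doubling every `doubling_years`."""
    periods = (end_year - start_year) / doubling_years
    return start_zb * 2 ** periods

def implied_doubling_time(start_zb: float, end_zb: float,
                          start_year: int, end_year: int) -> float:
    """Doubling time (in years) implied by two observed volumes."""
    doublings = math.log2(end_zb / start_zb)
    return (end_year - start_year) / doublings

# Figures as read from the slide (assumed: 0.8 ZB in 2009, 35.2 ZB in 2020).
projected_2020 = project_volume(0.8, 2009, 2020)
doubling_time = implied_doubling_time(0.8, 35.2, 2009, 2020)

print(f"Projected 2020 volume: {projected_2020:.1f} ZB")
print(f"Implied doubling time: {doubling_time:.2f} years")
```

Under those assumptions the projection lands at roughly 36 zettabytes, close to the slide's 35.2, and the implied doubling time is about two years.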

Page 6: Great Data By Design

And, to top it off, we are attaching RFID devices and sensors to everything.

Technologies like Hadoop allow us to affordably store vast amounts of data.

The power of mainframe computing now fits in the palm of our hands.

Page 7: Great Data By Design

Take jet airplanes, for example. A jet aircraft engine has up to 3000 sensors on it, and they are constantly throwing off data. The amount of data that comes off an engine during flight ranges from 0.5 TB to 4 TB.

And we are only just beginning.

Page 8: Great Data By Design

The volume, variety, and velocity of data will only continue to increase.

Page 9: Great Data By Design

Data Is Everywhere and Its Quality Is Questionable

Market Trend #2

Page 10: Great Data By Design

It’s in all the old places, and all the new ones.

Both on-premises and in the cloud.

Data is scattered everywhere:

Mobile Devices

Social Media

CRM Applications

ERP Applications

Message Queues

Flat Files

Sensors

Obscure Legacy Systems

Databases

Unstructured Docs

Cloud

Hadoop Clusters

Mainframes

Page 11: Great Data By Design

It used to be that data integration projects were limited or put at risk by the cost and performance of CPU, memory, network, or disk. Today, that’s no longer the case.

Now we’re limited by our ability to deal with data that is fragmented and of poor or questionable quality.

Page 12: Great Data By Design


To realize the full value of their data, organizations need to be able to integrate it across the entire enterprise.

And data quality needs to be built into the process. Manufacturing went through a transition in the 1980s, when the quality steps for building products were baked into the manufacturing process. The same needs to be done with data.

Page 13: Great Data By Design

The Business Wants Self-Service

Market Trend #3

Page 14: Great Data By Design

Over the last five years, business users have become more technically savvy. Easy-to-use technology now plays a large role in their personal lives, helping them do things faster, easier, and better. It has empowered them. And they expect the same experience at work.

The business doesn’t want to wait for IT to deliver great data. They want to do it on their own.

The Empowered Consumer

Search

Social Networking

Apps

Mobility

Page 15: Great Data By Design

There are (some pretty cool) self-service tools that allow them to visualize their data.

The trouble is, they only work for a single data set at a time.

When the business needs data that crosses business boundaries, or data set boundaries, they still have to come back to IT.

Page 16: Great Data By Design

Or worse yet, they come back to IT because they have done all they can with their self-service tools and then realize that the data they are using is mission-critical and requires mission-critical processes…

…that they can’t run on their laptop.

Page 17: Great Data By Design

Self-service can only take the business so far.

Page 18: Great Data By Design

A new way of thinking is needed

Page 19: Great Data By Design

A lot of companies believe that the way to achieve competitive advantage is to focus on their core business processes.

If we’re the best at what we do, we can beat the competition.

And they aren’t entirely wrong.

Page 20: Great Data By Design

They believe that by investing in applications to support those core business processes they can use the new efficiencies – or the improved service that comes from those efficiencies – for competitive advantage.

We need an application that will automate and improve our core processes, so we can beat the competition.

And they aren’t entirely wrong.

Page 21: Great Data By Design

The trouble is: people still think about their business application as a single, monolithic thing.

Business-Critical Application

Page 22: Great Data By Design

That’s where they’re wrong.

Page 23: Great Data By Design

The reality is that these processes and the core applications supporting them aren’t a single monolithic thing. Any business process today is highly distributed across multiple systems…

Business-Critical Applications

Page 24: Great Data By Design

…and the number of systems and data points that data must flow into or out of is only increasing.

Business-Critical Applications

Page 25: Great Data By Design

It is generally true that innovation exists at the edges of boundaries, or the intersection of different disciplines.

Innovation happens here

Page 26: Great Data By Design

As more data gets created across more systems, the ability to integrate and intersect data across those boundaries becomes a critical success factor for the next generation of innovation.

Do we have all the data we need to support our compliance constraints?

Who are our most profitable customers?

How can I improve collaboration between suppliers and contractors?

How do I accelerate my supply chain?

Can we drive efficiencies in our procurement processes?

Can we create new information-based services to offer our customers?

Business-Critical Applications

Page 27: Great Data By Design

But integrating data is harder than most people think.

Page 28: Great Data By Design

Take the jet aircraft, for example. While the engines may be the same from plane to plane, the data coming off of them, via their 3000 sensors, is not controlled by the engine manufacturer. It’s controlled by the airlines.

And each airline stores those same 3000 attributes in its own format.

Page 29: Great Data By Design

Which means that when the data for the same kind of engine is sent back to the manufacturer for analysis, the manufacturer first has to normalize it. What would seem like an easy exercise, analyzing data from the same kind of engine, is much harder than it looks.

The additional challenge is that the legacy data never dies and has to be pulled in as well.
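To make the normalization step concrete, here is a minimal sketch of mapping each airline's format onto one shared schema. Everything in it is an assumption for illustration: the airline names, field names, units, and the `normalize` helper are hypothetical and do not come from any real airline or manufacturer feed.

```python
from typing import Callable, Dict, Tuple

def identity(value: float) -> float:
    return value

def fahrenheit_to_celsius(value: float) -> float:
    return (value - 32.0) * 5.0 / 9.0

# Hypothetical mappings: source field -> (canonical field, unit converter).
FieldMap = Dict[str, Tuple[str, Callable[[float], float]]]

AIRLINE_MAPPINGS: Dict[str, FieldMap] = {
    "airline_a": {
        "egt_c": ("exhaust_gas_temp_c", identity),
        "n1_pct": ("fan_speed_pct", identity),
    },
    "airline_b": {
        "ExhaustTempF": ("exhaust_gas_temp_c", fahrenheit_to_celsius),
        "FanSpeed": ("fan_speed_pct", identity),
    },
}

def normalize(airline: str, record: Dict[str, object]) -> Dict[str, float]:
    """Rename fields and convert units so every airline's record looks the same."""
    mapping = AIRLINE_MAPPINGS[airline]
    normalized: Dict[str, float] = {}
    for field, value in record.items():
        if field not in mapping:
            continue  # Unmapped fields are skipped in this sketch.
        canonical_name, convert = mapping[field]
        normalized[canonical_name] = convert(float(value))
    return normalized

record_a = normalize("airline_a", {"egt_c": 600.0, "n1_pct": 92.5})
record_b = normalize("airline_b", {"ExhaustTempF": 1112.0, "FanSpeed": "92.5"})
print(record_a)
print(record_b)
```

Even in this toy version, each new airline needs its own mapping, and a real engine would have thousands of attributes rather than two, which is where the effort goes.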

Page 30: Great Data By Design

Every data project is like this. It is always harder than anyone thinks and the number of moving parts is only increasing.

Page 31: Great Data By Design

To overcome this challenge, you have to design great data into your business processes.

Page 32: Great Data By Design

Just like you invest in people, process, and technology for your core business processes, you have to invest in people, process, and technology to integrate the distributed data that supports those processes.

Page 33: Great Data By Design

That is because business agility now depends on data integration agility. And data integration agility depends on getting everyone involved, and on ensuring that the business and IT have the right tools to enable collaboration. In fact, we’ve seen that in companies where the business and IT collaborate, data integration (DI) projects are executed 5x faster than in companies where they don’t.

Page 34: Great Data By Design


5 Considerations for Designing Great Data

Page 35: Great Data By Design

#1 Connect to All Your Data

RDBMS, Flat Files, XML, Hadoop, NoSQL, Social Media, Mainframe, Machine Data, and More…

Data integration enables you to combine data from many different and rich sources to produce new business information you couldn’t get from a single source. Make sure your data integration tools are able to connect to any data source (both current and legacy), including RDBMS, NoSQL, mainframe, text, applications, and so on, and not just the data sources you consume today. It’s this universal set of connections that makes it possible to bring all that data together.

Page 36: Great Data By Design

#2 Support the Right Format and Latency

Batch, Real-Time, Near Real-Time. Structured, Unstructured, Semi-Structured.

In the same way data integration draws data from many different sources, it also must be able to consume various and multiple data types, including structured, semi-structured, and unstructured data sources in batch and real-time modes. You need a tool that is flexible enough to work with any type of data you encounter.

Page 37: Great Data By Design

#3 Understand Data Structure and Content

Include Data Profiling in Your Methodology

With so many different sources of data involved, you need to have a means to make sure that your data is what you expect. It’s important that your tools allow a level of data profiling so that you can verify the data going into and out of your system, and ensure that you’ll end up with the desired results.
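As an illustration of what a basic level of data profiling can look like, the sketch below counts values, missing values, and distinct values per column. It is a minimal example under stated assumptions, not a description of any particular tool: the `profile` function and the sample rows are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Mapping, Optional

def profile(rows: Iterable[Mapping[str, Optional[Hashable]]]) -> Dict[str, Dict[str, int]]:
    """Return per-column counts of values, missing values, and distinct values."""
    stats = defaultdict(lambda: {"count": 0, "missing": 0, "values": set()})
    for row in rows:
        for column, value in row.items():
            column_stats = stats[column]
            column_stats["count"] += 1
            if value is None or value == "":
                column_stats["missing"] += 1  # Treat None and empty string as missing.
            else:
                column_stats["values"].add(value)
    return {
        column: {
            "count": s["count"],
            "missing": s["missing"],
            "distinct": len(s["values"]),
        }
        for column, s in stats.items()
    }

# Hypothetical sample rows with a missing ID and a blank country.
sample_rows = [
    {"customer_id": 1, "country": "US"},
    {"customer_id": 2, "country": ""},
    {"customer_id": None, "country": "US"},
]
report = profile(sample_rows)
for column, summary in report.items():
    print(column, summary)
```

Running a check like this on data entering and leaving a pipeline gives a quick signal when a source stops matching expectations, for example when a column that should always be populated starts arriving empty.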

Page 38: Great Data By Design

#4 Enable Effective Business and IT Collaboration

Be Agile and Lean

You can’t afford to create and execute projects using traditional, isolated development methods anymore. Your data integration tools need to support lean and agile integration processes that enable business and IT collaboration so that development happens quickly and interactively.

Page 39: Great Data By Design

#5 Support Business Growth and Expansion

Be Able to Scale Up and Scale Down

Companies grow, and so do the sizes of their projects. You don’t want to be locked into tools that are only appropriate for today’s projects. Rather, you want tools that have the ability to scale, grow, and move projects from small departmental innovative exercises to large enterprise mission-critical environments, or vice versa.

Page 40: Great Data By Design

Learn how you can build great data, by design.
