  • THE BIG DATA REVOLUTION

  • "A business running without accurate data is running blind." - Ash Mahmud

    In the third century B.C., Egypt's ruling Ptolemaic dynasty erected the Library of Alexandria. A testimony to civilization and learning - and the dynasty's own wealth and influence - the library was reputed to hold the sum of humankind's knowledge. For the next twenty-three centuries, this body of knowledge grew at a steady pace and libraries were constructed all over the globe to contain it. Then in the late twentieth century the Internet came of age. This historic advance revolutionized information gathering and storage capabilities. Today, there is approximately 2.5 trillion times as much information in the world as was held in the Library of Alexandria - and the amount is doubling every three years. The vast majority of it is stored not in libraries but on servers. In 2000, three-quarters of the world's stored information was in analog form; now, that figure is less than 2 percent. In other words, to explore the world's knowledge a person no longer needs a library - a laptop or tablet or smartphone will do just fine. This vast digital storehouse is, in essence, big data.

    Big data means big profits. Research from the Harvard Business School, the Massachusetts Institute of Technology's Sloan School of Management, McKinsey & Company's business technology office, and the University of Pennsylvania's Wharton School shows that businesses that make the most of big data are fully 6 percent more profitable than those that don't. In times like these, that's an uncommonly large number.

    Here's what else you need to know about big data.

    The Business Case

    The Harvard Business Review concluded that the companies that significantly outperform their peers are more likely to collect multiple types of data - from that generated by RFID tags, for instance, to data from Web-tracking technologies. Furthermore, high-performing companies are more in touch with data than their less-successful rivals. When asked about marketing and communications, 59 percent of executives from top-rated companies called data extremely important to their businesses, compared with 39 percent of executives from lower-rated concerns.

    As for the technology itself, executives of companies that are ahead of their peers in analyzing data gave it high marks. Forty percent of high-performing companies said the speed at which their organization processes data has increased significantly over the past twelve months, and that speed is producing dividends.

  • A joint study conducted by IBM and MIT and published under the title The New Intelligent Enterprise concluded that the number of businesses using analytics to build a competitive advantage had jumped by almost 60 percent from a year earlier. The study said that nearly six out of ten organizations are using analytics to differentiate themselves from competitors.

    Case studies back up the results.

    Picture a customer call center that has the technology to detect a change in tone as a frustrated customer raises his voice to say: "This is the third power outage I've had in one week!" A big-data solution would both identify the words "third" and "outage" as negative terms affecting the consumer, and the tonal change would be another indicator that he or she might choose another provider. These insights can be gleaned from unstructured data. But what if that unstructured data could be combined with the customer's record data and transaction history? Now the company has a personalized model of the consumer's value and a sense of how tenuous the relationship has become.

    The demand for big-data insights has risen so quickly that there are now numerous companies whose product itself is big data. Cases in point:

    Anyone who flies regularly knows it isn't always a pleasant experience. And one of the most frustrating parts is the all-too-common wait after landing for a gate and ground crew to be ready for deplaning. This happens when a plane lands early. If a plane lands late, it's less annoying for the passengers, but the ground crew has been waiting, costing the airline money. It's not only the carriers that suffer; airports with lots of late and early arrivals develop bad reputations with passengers.

    Enter PASSUR Aerospace, a big-data company that tracks plane arrivals with unprecedented accuracy. Its rightETA program relies on patented algorithms backed by over 150 passive radar sensors and integrated with real-time mining of multiple databases. Every 4.5 seconds, PASSUR collects data on every plane it tracks. It also uses its vast storehouse of past arrivals to compare previous landings under the same conditions at the same airport. PASSUR currently has contracts with the top eight North American carriers, sixty U.S. airports, 200 corporate aviation departments, and the U.S. government.

    IBM has a patent on technology for securing premises using surface-based computing technology. The surface in question is the floor, and the technology identifies what or who is on the floor (i.e., furniture or people), what the object or person weighs, and when and where they move. In other words, IBM has developed smart floors. Walk into a room and the lights come on, and the latest episode of Mad Men appears on your flat-screen. If an elderly person falls, Lifeline will receive a signal. Stores, casinos, hotels, and government agencies could use the floors to gather data on foot-traffic patterns and use it to redesign their spaces. And it could be a potent antitheft device, alerting homeowners and businesses to an invader with unprecedented precision.

    New York City is using big data in innovative and effective ways. It was a pioneer in the use of proactive policing, which uses computers to track crime as it takes place in the city's neighborhoods. From experience, police know that a spike in minor crimes in a neighborhood is the precursor to a wave of more serious offenses. When the police department's big-data modeling shows this happening, officers flood the area. The city's crime rate has fallen to historic lows since this system was instituted. It works.

    So does the city's use of big data to cut down on both fires and violations of the housing code.

    With affordable housing at a premium in New York, many buildings and apartments are illegally subdivided into small rooms. Often these rooms are shared by groups of immigrants from Asia or Latin America who work at low-paying jobs. Fire-department data showed that fires were far more likely to occur in buildings that had been illegally subdivided. But with approximately 25,000 complaints a year about overcrowded buildings and only 200 inspectors, the city was overwhelmed. Big-data analysts in then-Mayor Michael Bloomberg's office searched for ways to identify which of the 25,000 complaints were most egregious. They created a database of all 900,000 buildings in New York, along with information from nineteen city agencies that included ambulance visits, rodent infestations, calls to 911, unpaid taxes, unexplained spikes in utility usage, building age, neighborhood crime rates, and more. They then cross-referenced this with five years of fire-department information that ranked fires by their size and cause. The data quickly demonstrated that certain combinations of complaints, fines, and other violations were a reliable predictor of where fires would occur. Inspectors could now triage their visits. The results speak for themselves: Before the big-data initiative, inspectors issued emergency vacate orders to 13 percent of the buildings they visited; today that figure is 70 percent. And the number of fires in the city has decreased dramatically.
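
    The triage idea can be sketched in a few lines of code. This is only an illustration, not the city's actual model: the signal names, the weights, and the Complaint structure are all hypothetical, and a real system would fit its weights to historical fire records rather than set them by hand.

```python
from dataclasses import dataclass, field

# Hypothetical signal weights, for illustration only. A real model would be
# fitted against historical fire records rather than set by hand.
RISK_WEIGHTS = {
    "unpaid_taxes": 2.0,
    "utility_spike": 1.5,
    "rodent_infestation": 1.0,
    "frequent_911_calls": 1.0,
}


@dataclass
class Complaint:
    building_id: str
    signals: set = field(default_factory=set)  # names of the signals present


def risk_score(complaint):
    """Sum the weights of every known risk signal attached to the complaint."""
    return sum(RISK_WEIGHTS.get(name, 0.0) for name in complaint.signals)


def triage(complaints, inspections_available):
    """Return the highest-risk complaints, up to the number of visits possible."""
    ranked = sorted(complaints, key=risk_score, reverse=True)
    return ranked[:inspections_available]


complaints = [
    Complaint("A", {"rodent_infestation"}),
    Complaint("B", {"unpaid_taxes", "utility_spike"}),
    Complaint("C", {"unpaid_taxes", "utility_spike", "frequent_911_calls"}),
]

for c in triage(complaints, inspections_available=2):
    print(c.building_id, risk_score(c))
```

    With only two inspections available, buildings C and B are visited first because they combine several warning signs, while A waits.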

    Big data can not only prevent fires, it can save lives. Canadian medical researchers have developed a series of big-data measures to improve the survival rate of premature babies. The goal is to determine which babies are most likely to develop life-threatening infections. The process continuously measures sixteen vital signs, including breathing, blood-oxygen levels, organ function, blood pressure, and temperature. This information is aggregated into a 1,000-data-points-per-second flow that alerts doctors to subtle changes that are often the precursor of infection. Prophylactic antibiotics can then be administered. The system has increased the survival rate of premature infants at the hospitals where it is used.

  • In Baltimore, Johns Hopkins Medical School researchers have found that data from Google Flu Trends - a free aggregator of flu-related search terms - predicts increases in flu-related emergency room visits a full week before the Centers for Disease Control does. Flu outbreaks are tracked nationwide, and local hospitals are able to prepare for the onslaught. Twitter and Google as diagnostic tools - this is what the new digital democracy looks like.

    What Big Data Isn't

    Big data is not the same as analytics. It is true that both seek to glean intelligence from data and translate the findings into competitive advantage, but it is the volume, speed, and advanced technology involved that puts big data on a higher plane.

    For example, in 2012, about 2.5 exabytes of data were created every day. For perspective, one exabyte is equal to one quintillion bytes. Put another way, the data that now floods the Internet every second is equivalent to the data stored on the entire Internet twenty years ago. And now, in 2014, companies are working with petabytes of data (a petabyte is one quadrillion bytes) in a single data set. Wal-Mart alone collects more than 2.5 petabytes of data every hour from its customer transactions - or about 20 million four-drawer filing cabinets' worth of text.
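
    To get a feel for these units, the figures above can be converted with a little arithmetic. The sketch below uses only the numbers quoted in the text and assumes decimal units (one terabyte is a trillion bytes); the variable names are illustrative.

```python
EXABYTE = 10**18   # one quintillion bytes
PETABYTE = 10**15  # one quadrillion bytes
TERABYTE = 10**12
SECONDS_PER_DAY = 24 * 60 * 60

# 2.5 exabytes per day (the 2012 figure from the text), kept as an integer.
daily_bytes = 25 * EXABYTE // 10
terabytes_per_second = daily_bytes / SECONDS_PER_DAY / TERABYTE

# 2.5 petabytes per hour spread over about 20 million filing cabinets.
walmart_hourly_bytes = 25 * PETABYTE // 10
bytes_per_cabinet = walmart_hourly_bytes // 20_000_000

print(f"{terabytes_per_second:.1f} TB created per second")
print(f"{bytes_per_cabinet / 10**6:.0f} MB of text per filing cabinet")
```

    On these figures, the world was creating roughly 29 terabytes every second in 2012, and the filing-cabinet comparison works out to about 125 megabytes of text per cabinet.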

    But speed of data creation can trump volume in certain cases, as an MIT group discovered one recent Black Friday, the kick-off of the holiday shopping season in the United States. Using location data from mobile phones to deduce the number of people in a Macy's parking lot, the researchers were able to estimate the retailer's sales on that day before the retailer itself knew its numbers. Accessing real or nearly real-time information can make it possible for a company to be much more agile than its competitors.

    Even a bull named Badger-Bluff Fannie Freddie figures in the big-data revolution. There are 8 million-plus Holstein cows in the United States, but only one bull merits mention based on scientific data. Dairy cattle sired by him - 346 daughters and counting - produce more milk than cows fathered by more ordinary bulls.

    The dairy industry, long cursed by skimpy margins and bulging costs, is profiting from knowledge acquired from big data. Dairy breeding, it turns out, is ideal for quantitative analysis, because breeders keep exacting pedigree records and artificial insemination makes it easier to trace genetic information about the handful of top-rated animals. The producers can also track overall milk output, fat and protein content, and udder quality. Since farmers pay thousands of dollars for their breeding stock, they don't mind paying top dollar for the best genetic material.

    So whether we are talking about airlines or fires or animal husbandry, it all adds up to an almost unfathomable amount of information. Learning to master and manipulate it is the defining managerial competence of the twenty-first century.

    Where Big Data Comes From

    Technological advances, coupled with changes in consumer behavior, now allow managers, marketers, and leaders to see every online step a customer makes. Add up the oceans of data from digital video recorders, retail checkouts, credit-card transactions, and countless other sources, and businesses everywhere are privy to a heretofore unimaginable trove of information about what consumers see and do.

    It is difficult to find a product or piece of equipment these days that doesn't contain coding. Airplanes are equipped with more than a billion lines of code that generate about ten terabytes of data per engine during every thirty minutes of operation. To put those numbers in perspective, a flight from Heathrow Airport in London to John F. Kennedy in New York City would generate about 650 terabytes of data.
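
    The flight figure can be checked with simple arithmetic. The text gives only the rate (ten terabytes per engine per thirty minutes) and the total; the engine count and flight time in this sketch are assumptions chosen for illustration, and the function name is hypothetical.

```python
TB_PER_ENGINE_PER_HALF_HOUR = 10  # rate quoted in the text


def flight_data_tb(engines, flight_hours):
    """Terabytes generated by all engines over a flight of the given length."""
    half_hours = flight_hours * 2
    return TB_PER_ENGINE_PER_HALF_HOUR * engines * half_hours


# Assumption: a four-engine aircraft and an eight-hour crossing.
print(flight_data_tb(engines=4, flight_hours=8))  # 640
```

    Under those assumptions the result is 640 terabytes, in line with the "about 650" quoted above. A twin-engine aircraft on the same route would produce half as much.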

    Time stamps are ubiquitous. Think of the auto-dating metadata that gets captured every time someone takes a picture with their camera or smartphone, posts to Facebook, or uses a tablet to watch a television show. In short, it is not hard to assemble a timeline of people's lives. The average commuter in London will have his or her picture taken more than 150 times a day while traveling from home to downtown London and back. Add that surveillance to the variety of sentiment, temporal, and spatial information generated during that time frame, and a good deal of big data is there for businesses to access.

    There also is a common misconception that big data is generated only online. Yes, there are over a billion Google searches and Twitter messages each day, and Facebook, LinkedIn, MySpace, and Pinterest alone have over a billion combined users who post text, images, and links. In addition, Web sites track visitor preferences and habits, as well as internal processes and results. But big data doesn't stop there. It also includes scientific, medical, and sociological studies and research; organizational, government, and political documents; entertainment and arts content in all media; and news of every type.

    While big data has the potential to radically improve the way organizations function, it also has the potential to overwhelm them. Many leaders don't know where to begin. The key is to develop processes that recognize, mine, and exploit the information that is relevant to an organization's needs. Achieving this requires changes in management, processes, and culture. The choice is stark: Master the data or drown in it.

    To make the most of big data requires us to rethink how we think. When information was stored in analog form, data collection was expensive and time-consuming. Taking an opinion survey, for example, required face-to-face, telephone, or mail contact with interviewees. Today, people can respond by touching a single key, and their responses are recorded and filtered instantly. With the big-data deluge, there will be some inaccuracy, but it is worth it. A slightly disorganized sample of 100,000 consumers' preferences, for example, yields better insights than a 100-percent accurate survey of 200 people.
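
    The trade-off between a large, messy sample and a small, clean one can be made concrete. The sketch below is a simplified model, not a claim about any real survey: it assumes a yes/no question, a true "yes" share of 60 percent, and a messy sample in which 5 percent of answers are recorded wrongly at random. The function name and all three numbers are assumptions.

```python
import math


def survey_rmse(true_share, sample_size, misreport_rate=0.0):
    """Root-mean-square error of the estimated 'yes' share.

    Each recorded answer is assumed to be flipped with probability
    misreport_rate, which biases the estimate; a finite sample adds
    random sampling error on top of that bias.
    """
    observed = true_share * (1 - misreport_rate) + (1 - true_share) * misreport_rate
    bias = observed - true_share
    variance = observed * (1 - observed) / sample_size
    return math.sqrt(bias**2 + variance)


messy_large = survey_rmse(0.60, 100_000, misreport_rate=0.05)
clean_small = survey_rmse(0.60, 200)

print(f"messy sample of 100,000: typical error {messy_large:.3f}")
print(f"clean sample of 200:     typical error {clean_small:.3f}")
```

    Under these assumptions the messy sample is off by about one percentage point, while the clean survey is off by about three and a half. The comparison is sensitive to the assumed error rate: if a fifth of the answers were recorded wrongly, the small clean survey would come out ahead.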

    Language translation provides a vivid example. Because language is complex and idiosyncratic, early online translations were literal dictionary downloads and overly formal and often contained errors in grammar and tense. Then IBM found a way to improve the process, using French as its model. Because the Canadian government conducts its business in both French and English, IBM used transcripts of that country's parliamentary proceedings to create software that translated sentences with far more sophistication and accuracy than a simple dictionary match. This technique was then applied to other languages with equally improved results. But there were limits. The syntax was too often stilted, and there were few colloquialisms. Enter Google. It gathered translations from across the Internet: business documents, Web sites, song lyrics, books, even emails that had been translated for one reason or another. The result was translations that were not only more accurate, but more colloquial and fluid. The lesson: Vast quantities of messy data yielded a result superior to the one gleaned from small amounts of orderly data.

    What Over Why

    There is another key difference with the way information was treated in the past. Traditionally, the goal was to work backwards from the data to figure out the "why" of what the information showed. If, for example, data revealed that there was more employee absenteeism on Monday than on any other day of the week, the goal would be to find the reasons behind the behavior. With big data, the "what" becomes more important than the "why." The primary goal is to allow organizations to prepare for events, minimize any negative impact, or seize any opportunity they may present - in this case, change staffing patterns on Mondays. Oxford scholar Viktor Mayer-Schonberger and Economist data editor Kenneth Cukier put it this way: "Society will need to shed some of its obsession for causality in exchange for simple correlations: not knowing why but only what. This overturns centuries of established practices and challenges our most basic understanding of how to make decisions and comprehend reality."
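
    The absenteeism example shows how little is needed to find the "what." The sketch below simply counts absences by day of the week; the dates are made up for illustration, and the function name is hypothetical.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def absences_by_weekday(absence_dates):
    """Count how many recorded absences fall on each day of the week."""
    return Counter(WEEKDAYS[d.weekday()] for d in absence_dates)


# Made-up absence records, for illustration only.
records = [
    date(2014, 1, 6),   # Monday
    date(2014, 1, 8),   # Wednesday
    date(2014, 1, 13),  # Monday
    date(2014, 1, 17),  # Friday
    date(2014, 1, 20),  # Monday
]

counts = absences_by_weekday(records)
busiest_day, total = counts.most_common(1)[0]
print(busiest_day, total)
```

    The count says nothing about why Mondays stand out, but it is enough to act on when planning staffing.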

    The change has profound implications for all organizations. It creates a practical, proactive, forward-thinking mindset.

    Not surprisingly, it is business that is leading the charge. UPS is an example. The delivery giant wanted to lower the number of truck breakdowns. The company knew which parts were most likely to give out, so it has placed heat or vibration sensors on those parts. When the sensors detect a certain measure, the part is replaced in the shop before it fails, not on the road. The data doesn't tell UPS why the parts are failing, just that they are. But it does achieve the goal of minimizing delayed deliveries and idled drivers. With the immediate mission accomplished, the company can explore the reasons for the breakdowns.
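
    The logic of replacing a part once a sensor crosses a set level can be sketched as follows. This is a minimal illustration and not UPS's system: the part names, readings, and limits are invented, and a real fleet would rely on far richer models than a single threshold.

```python
def parts_due_for_replacement(readings, limits):
    """Return the parts whose latest sensor reading has reached its limit.

    readings: part name -> list of sensor values, oldest first
    limits:   part name -> value at which the part should be swapped out
    """
    due = []
    for part, values in readings.items():
        if values and values[-1] >= limits[part]:
            due.append(part)
    return sorted(due)


# Invented vibration readings and limits, for illustration only.
readings = {
    "alternator": [0.8, 0.9, 1.4],
    "water_pump": [0.5, 0.5, 0.6],
    "starter": [],
}
limits = {"alternator": 1.2, "water_pump": 1.0, "starter": 1.1}

print(parts_due_for_replacement(readings, limits))
```

    Here only the alternator is flagged. The check does not explain why its vibration is rising; it only signals that the part should be changed at the next shop visit.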

    Vehicle parts are just the beginning. Big data allows businesses to measure just about everything: sales, product performance and reviews, production schedules and snafus, customer preferences and habits, stakeholders' profiles, trends, and employee work patterns. Leaders and managers can now understand their companies in real time with an acuteness that was impossible even two years ago.

    Many startups, of course, embrace big data from day one. For this reason, they hold important lessons for all organizations.

    Consider shoes. Everyone wears them; it is a multi-billion-dollar market. Shoe stores have been a staple of the retail landscape for hundreds of years. Their leaders know their inventory, what sells, and what doesn't. They may run sales and promotions and have a customer loyalty program, but that's about as far as it goes with knowing and understanding their customers. And for hundreds of years, it was enough. Then along came Zappos.com, which proved that consumers would buy shoes online. Zappos tracks not only what customers buy, but what else they browse, how they respond to promotions, whether or not they read reviews, and how they navigate the site; it also establishes a dialogue with customers by encouraging feedback and suggestions. Using this information, the company categorizes its customers into cohorts, demographic groups that may share nothing more than a love of red shoes - and that can be targeted with crimson precision.

    The algorithms designed by Zappos' data technologists predict exactly how to target these groups. If a man buys only athletic shoes, the company won't try to sell him dress shoes - but it will certainly alert him to the newest Nikes. If a woman purchases only flats, it keeps her up to date on the latest styles, but it leaves out the stilettos.

    The beauty of these algorithms is that they self-improve: Every time a customer makes a purchase (or ignores a promotion) the amount of predictive data Zappos harvests grows. And why stop at shoes? Zappos has moved into clothing and accessories. After all, it has a detailed profile of its customers; why not sell them skirts and shirts and hats and accessories that match their shoes and lifestyles? The algorithms help Zappos create, in effect, a personal shopper for each customer.

    Data Overrules Instincts

    As acceptance of big data's efficacy spreads, leaders have to adjust, and some of the adjustments may be painful. Traditional thinking about decision-making and the value of experience must be recalibrated. Leaders have to be open to having their instincts overruled by data. This can be difficult for leaders prized for their ability to make decisions that ultimately rely on gut instinct, because it means letting go of some of their authority. But to change an organization's decision-making culture, nothing is more important than starting at the top.

    That said, big data is not a substitute for leadership - it is a tool for leaders. Companies still need vision and inspiration and motivation. What big data supplies are relevant facts; it is still a leader's job to turn those facts into actions.

    In order to exploit big data effectively, leaders must receive the data most pertinent to their challenges. The leaders of a large hardware or home-improvement chain, for example, may ask for a predictive analysis of weather patterns for the next month - specifically, what are the odds of extreme storms, such as hurricanes or tornadoes, hitting various regions of the country? The big-data analysts go to work, deliver the science-based prediction, and the company then ships extra quantities of emergency supplies - plywood, flashlights, generators, tape, and so on - to the regions most likely to need them. While this crucial supply-chain information was delivered by the data scientists, the process was initiated by the leader.

    Here's another process where big data is having a profound impact: hiring and recruiting. In the past, the resume and personal interview were the most important factors in a hiring decision. Big data is changing that. In the words of Dan Shapero, LinkedIn's vice president of talent solutions and insights, "Recruiting has always been an art, but it's becoming a science."

    Big data opens up whole new streams of verifiable information about potential hires that dwarf those provided by resumes and interviews. But perhaps big data's greatest human-resources value comes as a predictive tool. By analyzing information on established high performers, it allows organizations to determine the attributes that have proven themselves valuable in the workplace. It can then evaluate potential hires for those characteristics. In fact, Google has an entire division devoted to "people analytics." Thus, an impressive resume or winning interview is trumped by qualities such as flexibility, perseverance, social skills, a positive attitude, and emotional intelligence.

    In the days before big data, recruiting was often contracted out to head-hunting firms, which would seek out candidates who weren't actively looking to change jobs. With big-data tools, many human-resources departments can dispense with head hunters and turn to networking and data-aggregating sites such as LinkedIn and TalentBin. As Jennifer Hasche, an Intuit recruiter, puts it: "With TalentBin's search engine, it seems nobody is out of reach. We found it to be a massive timesaver and critical tool in our discovery of top talent." The new approach has also benefitted the networking sites: LinkedIn's paid recruiting services account for some 60 percent of its annual revenues.

    In another sign of the democracy of data, a degree from a prestigious college carries far less weight these days. Big data promotes meritocracy. Guy Halfteck, who founded Knack, a Silicon Valley company that uses big data and games to uncover the qualities of a stellar hire, puts it this way: "You might get into a school because your father got into the school. That is not indicative and insightful about who you are as a person and your potential."

    Many companies these days expect employees to help define their own jobs, to be self-starters and take initiative. Sometimes straight-A students lack this trait. They have done what they were asked to do and done it well, but this is no guarantee they will be innovative thinkers who seek out new challenges. Google's vice president for people analytics, Prasad Setty, has stated that high SAT scores and GPAs are unreliable predictors of success at the company and are no longer used as important hiring criteria. Josh Bersin, founder of Bersin by Deloitte, a data-driven human-resources consultancy, has this to say about a conservative insurance company that had a policy of only hiring MBAs from top schools: "They looked at the performance of their best salespeople, and they found it had nothing to do with where they went to school and nothing to do with their grades."

    In a managerial job or one that requires lots of collaboration, big data has shown that the key quality is emotional intelligence, the ability to read people's unspoken signals and respond appropriately. Knack's Halfteck says, "Whether you're an innovator, a physician, a teacher, a retailer, or a salesperson, your social abilities, being able to intelligently manage the social landscape, intelligently respond to other people, read the social situation, and reason with social savviness - this turns out to differentiate between people who do better and people who don't do as well."

    Interviews simply don't yield a lot of actionable data. A lot of stellar talents lack social skills and may not make a particularly good impression in an interview. Steve Jobs is Exhibit A: He would have failed many job interviews. Data doesn't have those biases. It uncovers the accomplishments and potential of the Steve Jobses of the world and relays the message to hire this talent.

    Sophisticated algorithms can spot hidden potential. By harvesting Twitter posts, social connections, blog comments, and other digital footprints, they assemble a full professional picture.

    Valuable human-resources data also lies in game playing. The way people play a game reveals a great deal about how they engage, react to stimuli, collaborate, and respond to stress. Knack has designed games that uncover behaviors and traits that are markers for high performers. Halfteck labels it "behavioral big data," saying, "We measure everything from creative abilities to emotional and social intelligence, to how you think and make decisions, how you learn new information, how curious you are about the world." Knack works with client companies to design customized games. When the company's top performers play the games, Knack uses their skills to create a profile for potential hires.

    The human-resources power of big data is being recognized by many of the largest companies in the world. IBM paid $1.3 billion to acquire Kenexa, a company that gathers data on and assesses approximately 40 million people a year. Kenexa's chief marketing officer Tim Geisert explains the company's value: "What we're bringing to the table is what we call . . . human-insight analytics. What are the key data points that make people good at what they do in their jobs? What's supercharged is the technology that can gather that data and the platforms which we run that data on, using it for analytics, insights, and predictability."

    According to Meghan Biro, a human-resources consultant and founder of TalentCulture, "Probably the single most important quality in an applicant is learning agility. This is the ability to dive into a new situation or project and quickly grasp what's needed, and then . . . get to work." Big data makes finding these kinds of people infinitely easier. The right filters can help human-resources managers measure skills, talent, and propensity for learning. It's making the job not only easier but far more productive. What used to take hours or even days can sometimes be accomplished in minutes. Equally valuable is big data's ability to winnow the pool, to look for unfavorable markers that disqualify a candidate, once again saving time, effort, and money.

    It is commonly accepted that good salespeople have outgoing, engaging personalities. Many do, but big data has shown that charm alone is not enough. Really great salespeople have something called emotional courage, which is the ability to persevere in the face of repeated rejection. Big data allows companies to screen for this trait and not be blinded by an applicant's charisma.

    This kind of revelation is exactly where big data proves its worth. Josh Bersin's work with an oil company uncovered a fascinating insight. The company was having trouble finding engineers and production workers able to thrive in difficult living conditions, often out in the desert, away from home and family for long stretches of time. Turnover was astronomical. A data-driven analysis of those who performed well showed that a multicultural background was a clear marker for success. These people were comfortable working with a diverse group of colleagues, and were more willing to adjust to adverse living and working conditions.

    Meghan Biro sums up big data's effect on hiring: "We have a whole new toolkit. It's intriguing to see what big data reveals. I do advise anyone interested in being hired, even if they're not actively looking, to keep up an online presence. That's how companies find you. All in all, it's an exciting time to be in HR because big data enables great hiring and great hiring enriches everyone's lives and, of course, leads to new levels of performance."

    Big Data, Big Possibilities

    Big data is changing society and business in profound ways. It is here to stay and its power will only increase. We have seen that organizations are using it to increase performance, boost profits, stop crime, and even save lives. Big data is changing our relationship to knowledge, taking it from a tool of understanding to a tool of action.

    Create, Add, Revamp, Eliminate

    To harness the power of big data, follow this framework:

    First, create relationships with exponentially expanding networks and the data their members generate.
    Second, add big data's insights to your internal and external reporting system.
    Third, revamp your processes, thinking more about what big data can do for you (Apple has done it with its developer network).
    Finally, eliminate your fear of big data, because it's here to stay.

    Let's take each in turn:

    1. Create: Identify the sources of big data that are most likely to generate value. This includes social interactions that enable your organization to see and hear from customers, employees, partners, and investors about their experiences; provide feedback on products, services, and strategy; and set the foundation for co-creating new products or services.

    2. Add: Include big data in your existing financial and operating reports so your entire organization - from board to front-line employees - can better understand what customers, employees, and partners are saying, whether to each other or to the organization. This will support improvements to existing business practices across the enterprise, including hiring, training, customer service, up- and cross-selling, marketing, the development of new sales channels, and new product development.

    3. Revamp: Reengineer existing business processes and technology investments so that you can capture the big data generated from today's new sources - including the interaction between things, not just people.

    4. Eliminate: Work to remove existing biases about the importance of and value created by unstructured data. Encourage all members of the organization to accept and gain expertise in big data so that everyone is working from the complete and current set of information. It will support good decision-making.

  • Published by New Word City LLC, 2014
    www.NewWordCity.com

    Barry Libert

    All rights reserved. No part of this book may be reproduced, in any form or by any means, without permission in writing from the publisher.

    ISBN 978-1-61230-738-1
