Video Summary


Transcript of Video Summary

Page 1: Video Summary

Name: Sumeet Patnaik

Roll number: uemf15029

About the Speaker

Professor Richard C. Larson

“Move Over, Big Data! How Small Simple Models Can Yield Big Insights” by Professor Richard C Larson

Brief Summary

In his presentation, Dr. Larson discussed how simple mathematical models and small relationships within systems can sharpen one's perception and lead to better decision making. He showed how such models can guide the analysis of big data sets, how simple models supply the strategy, and how big data and small models go hand in hand. He cited examples from his professional research and consulting engagements and discussed general applications to industry.

Dr. Larson received his Ph.D. from MIT, where he is Mitsui Professor in the Engineering Systems Division (ESD). He is currently founding director of MIT's new Center for Engineering Systems Fundamentals. He is author, co-author or editor of six books and author of over 85 scientific articles, primarily in the fields of urban service systems (especially emergency response systems), queueing, logistics, disaster management, disease dynamics, dynamic pricing of critical infrastructures, education and workforce planning. He has worked closely with a wide variety of organizations. In the private sector these include banks, airlines, retailers, industrial gas distributors, and amusement parks. In the public sector they include the City of New York, many public school systems, the U.S. Postal Service, the World Bank, the Centers for Disease Control, the National Institutes of Health, and numerous police departments. At MIT he has founded several initiatives, including MIT Learning International Networks Consortium and MIT Blended Learning Open Source Science or Math Studies.

Dr. Larson is a member of the National Academy of Engineering (NAE) and co-chairs a major panel on the application of systems engineering to health, cosponsored by the NAE and the Institute of Medicine.


With big data there is a continuous stream of data coming in, so we need some prior knowledge of it: a prior analysis and a focus on where to look and what we need, so that we can achieve our goals. We need a strategy to extract decision-relevant information from a dataset. The speaker likens fetching data to fishing in a big ocean, which calls for the right strategy and simple data models. He also raises the idea that current big data trends, taken alone, carry little weight.

He covered the following topics:

Flaws of averages: what they are and how to avoid them

Square root laws: how to apply them to locating facilities and more

Nonlinearities in queueing

Lateral thinking: how it can sometimes make a problem go away

Case study: marrying small models with big data analysis

FLAWS OF AVERAGES

This video presents an introduction to the Flaws of Averages using examples such as the “crossing of the river” example, the “cookie” example, and the “friends on Facebook” example.

If we are about to deal with lots of data, averages will be important, and we need to be savvy consumers of averages. An average is often a worthwhile representation of a set of data by a single descriptive number. The objective of this module, however, is to point out a few drawbacks that can arise if one is not attentive to details when calculating and interpreting averages. The video covers three flaws of averages: (1) the average is not always a good description of the actual situation, (2) the function of the average is not always the same as the average of the function, and (3) the average depends on your perspective.

The idea is to get the students to think about averages in their everyday lives: how averages assist our understanding and how they can mislead. This takes STEM thinking outside the classroom and into everyday experience. It should engage the students for many hours.

First flaw of averages: the average may not be a good description of the real situation. Depending on the situation, the average may be exactly the same while the distribution is different. For example, think of the crossing of the river. We had the flat line, the sloped line, and the double table-top line. These were three different distributions that all have the same average value.

Second flaw of averages: the function of the average is not always the same as the average of the function. For example, we had plate A with two cookies and plate B with two cookies, where the average diameter of the two cookies on plate A was 7 cm and the average diameter of the two cookies on plate B was 8 cm. As Rhonda helped us illustrate, intuitively you would think that the cookies on plate B would be bigger in terms of area, since the average diameter is bigger. But the cookies on plate A were actually bigger when we uncovered them. This was a nice way to show that the function of the average is not always the same as the average of the function.

Third flaw of averages: the average depends on your perspective. For example, Lake Wobegon is a fictional town from a radio program in the US, and one of the key phrases about Lake Wobegon is that “all of the children are above average.” This may sound like an impossible statement, but it all depends on your point of view. If the average being discussed is a national average, it is possible that all of Lake Wobegon's children are above it.
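The second flaw (the cookie plates) can be checked with a few lines of arithmetic. The sketch below is only an illustration: the summary gives the plate averages (7 cm and 8 cm) but not the individual cookie sizes, so the diameters in `plate_a` and `plate_b` are hypothetical values chosen to match those averages.

```python
import math


def cookie_area(diameter_cm):
    """Area of a round cookie in square centimetres."""
    return math.pi * (diameter_cm / 2) ** 2


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Hypothetical diameters: only the averages (7 cm and 8 cm) come from the talk.
plate_a = [2.0, 12.0]  # average diameter 7 cm
plate_b = [8.0, 8.0]   # average diameter 8 cm

for name, plate in (("A", plate_a), ("B", plate_b)):
    area_of_average = cookie_area(mean(plate))               # f(average)
    average_of_area = mean([cookie_area(d) for d in plate])  # average of f
    total_area = sum(cookie_area(d) for d in plate)
    print(f"plate {name}: mean diameter {mean(plate):.1f} cm, "
          f"area of mean {area_of_average:.1f}, "
          f"mean of areas {average_of_area:.1f}, "
          f"total area {total_area:.1f}")
```

With these assumed sizes, plate A has the smaller average diameter but the larger total area, because area grows with the square of the diameter and one large cookie dominates.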

THE SQUARE ROOT LAW

The distance a police car actually travels to reach an incident can be greater than a simple graphical representation suggests, and the square root law helps to estimate it correctly. In this example, if A is the service area and N is the number of police cars, then the average travel distance D is directly proportional to the square root of the area per car, that is, D is proportional to sqrt(A/N). With big data we may need to look for this kind of structural relationship.

NONLINEARITIES IN QUEUEING

In his presentation, Richard C. Larson described queueing as follows.

A queue is a waiting line.

Uncertainties cause delays. Usually there is uncertainty in:

The arrival times of customers

The service requirements of customers

The urgency with which a customer must be served

Some familiar queues: fast food restaurants, toll booths, retail shopping, airports, automatic teller machines, waiting lists for college acceptance, and being on hold on a toll-free number.

Basic engineering and mathematics help to manage a queueing system effectively. Today there are large volumes of records, and big data helps us analyse queues.

Little's law helps us analyse a queueing system through the equation L = λW, where:

L = time-average number of customers in the system

λ = average rate of arrivals

W = mean time spent in the system

M/M/k queueing: as the fraction of time the servers are busy approaches 100%, the queueing delay explodes. Managers must therefore plan for some server idle time, or there can be a queueing explosion. If many servers are busy 95% of the time, managing the queueing delay depends critically on a good estimate of λ, which we can get from the analysis of big data. An error of 5% in that estimate can push the queue much higher than expected and distort the result. If everything were made deterministic, there could be no queueing delay at all.

Performance degrades as the arrival rate increases or the mean service time increases. Performance also degrades as the variance of the time between arrivals increases or the variance of the service time increases.
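The delay explosion near full utilization can be sketched numerically. The example below assumes a standard M/M/k queue (Poisson arrivals, exponential service times, k identical servers) and uses the textbook Erlang C formula for the mean wait in queue, then applies Little's law to get the mean number waiting. The function names and the rates in the loop are illustrative choices, not values from the talk.

```python
def erlang_c(k, offered_load):
    """Probability that an arriving customer must wait in an M/M/k queue.

    offered_load is arrival_rate / service_rate and must be below k.
    """
    if offered_load >= k:
        raise ValueError("unstable queue: offered load must be below k")
    # Erlang B by the standard recursion, then convert to Erlang C.
    blocking = 1.0
    for n in range(1, k + 1):
        blocking = offered_load * blocking / (n + offered_load * blocking)
    utilization = offered_load / k
    return blocking / (1.0 - utilization * (1.0 - blocking))


def mean_wait_in_queue(arrival_rate, service_rate, k):
    """Mean time a customer spends waiting before service starts."""
    offered_load = arrival_rate / service_rate
    return erlang_c(k, offered_load) / (k * service_rate - arrival_rate)


def mean_number_in_queue(arrival_rate, service_rate, k):
    """Little's law applied to the waiting line: Lq = lambda * Wq."""
    return arrival_rate * mean_wait_in_queue(arrival_rate, service_rate, k)


# Illustrative single-server case with service rate 1 customer per minute.
for arrival_rate in (0.50, 0.90, 0.95, 0.99):
    wait = mean_wait_in_queue(arrival_rate, 1.0, 1)
    queue = mean_number_in_queue(arrival_rate, 1.0, 1)
    print(f"utilization {arrival_rate:.2f}: "
          f"mean wait {wait:.1f} min, mean number waiting {queue:.1f}")
```

Under these assumptions, raising the arrival rate from 0.95 to 0.99 (about a 4% change) multiplies the mean wait roughly fivefold, which is why a small error in the estimate of λ matters so much at high utilization.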

There are queues everywhere, and it is important to manage them in order to maintain a good, optimum service level.

Case Study: A Personal Big Data, Small Model Experience

How can we tell whether people were in the queue or not? The presence of a queue had to be inferred, so the signature of a queue in the data was important. That signature lies in the timing between sessions: the time from the end of one session to the moment the next card is inserted. There is a gap between consecutive sessions, and its length is the clue.

The Queue Inference Engine is a mathematical algorithm used to derive queue statistics such as the mean wait in queue, the time-dependent mean number in queue, and the probability distribution of the number in queue observed by a randomly arriving customer. It deduces the queueing behaviour of Poisson-arrival queueing systems from only the transactional data and the Poisson assumption. These quantities are estimated for each congestion period in which queues may form (in front of a single server or multiple servers).

Richard C. Larson provides a real-life case study from when he was approached by BayBank, a US-headquartered bank with commercial banking operations. The bank was struggling to decide how many ATMs to install and how to balance fully functional and limited-function ATMs. Fully functional ATMs also accept cash deposits, while limited-function ATMs only allow cash withdrawals. The bank provided Dr. Larson with reams and reams of data, which he describes as almost "3 feet high pages of data" on his desk. This was in 1989, when big data was not widely known.

Each line of data recorded, to the second, when the customer inserted the card and service commenced, and when the transaction was completed and service ended. If the card was inserted 2-5 minutes after the previous service was over, then there was a gap. It was therefore important to partition customers according to who, and how many, were in the queue during a congestion period and how many were not in a queue. This would help Dr. Larson estimate how many ATMs BayBank should install and the balance between the two types.

The starting assumption is that people arrive at an ATM according to a random process, namely a Poisson process. Dr. Larson then imposed small mathematical models on the big data (the loads of data provided by BayBank). The result was a new, mathematically correct algorithm to determine customer queue statistics without any monitoring devices such as cameras or sensors. He developed a novel O(n^3) algorithm that uses the transactional data to deduce transient queue lengths as well as the waiting time of each customer in the busy period. For example, if there are 17 customers in a queue, the amount of work to be done grows as 17^3. Further mathematical deductions are given in the paper Larson, R.C., "The Queue Inference Engine: Deducing Queue Statistics from Transactional Data."

In this case, small models were essential to the algorithm that was applied to the big data. As the author states, by marrying small models and big data it was possible to deduce queue statistics. This paper of Larson's was also appreciated by Simchi-Levi.

This case study can also be extended to other fields, such as invisible queues in communication systems (many communication systems with finite queues have, during congestion periods, long invisible queues of customers outside the system waiting to gain access), traffic queued at intersections, or other queueing networks.
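The partitioning step described in the case study, splitting the transaction log into congestion periods by looking at the gap between one session ending and the next card being inserted, can be sketched as follows. This is not Larson's O(n^3) Queue Inference Engine, only a minimal illustration of the grouping idea for a single ATM. The function name, the `(start, end)` tuple format, and the `max_gap_seconds` threshold are all assumptions made for this example.

```python
def split_into_congestion_periods(transactions, max_gap_seconds=10.0):
    """Group (start, end) service records into candidate congestion periods.

    A record joins the current period when its service starts within
    max_gap_seconds of the previous service ending, which suggests the
    customer was already waiting. A longer gap starts a new period.
    Times are in seconds; the threshold is a hypothetical tuning value.
    """
    periods = []
    for start, end in sorted(transactions):
        previous_end = periods[-1][-1][1] if periods else None
        if previous_end is not None and start - previous_end <= max_gap_seconds:
            periods[-1].append((start, end))
        else:
            periods.append([(start, end)])
    return periods


# Made-up log for one ATM: service start and end times in seconds.
sample_log = [(0, 60), (62, 120), (125, 200), (500, 560), (563, 600), (1200, 1250)]

for number, period in enumerate(split_into_congestion_periods(sample_log), start=1):
    print(f"period {number}: {len(period)} customer(s), "
          f"from t={period[0][0]} to t={period[-1][1]}")
```

Only periods with more than one customer can contain queueing, so they are the ones a queue inference method would go on to analyse. The threshold would need to be calibrated against real data.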