Transportation and logistics modeling 2

Basics of statistics and Data collection and analysis: Weeks 2, 3, and 4

Transcript of Transportation and logistics modeling 2

Page 1: Transportation and logistics modeling 2

Basics of statistics and

Data collection and analysis

Weeks 2, 3, and 4

Page 2: Transportation and logistics modeling 2

Variability

• Statistical techniques are useful for describing and understanding variability.

• Variability: Successive observations of a system or phenomenon do not produce exactly the same result.

• Statistics gives us a framework for describing this variability and for learning about potential sources of variability.

Page 3: Transportation and logistics modeling 2

• Nylon connector to be used in an automotive engine application.

• Design specification on wall thickness: 3/32 inch.

• Of interest: the effect of this decision on the connector pull-off force. If the pull-off force is too low, the connector may fail when it is installed in an engine.

• Eight prototype units are produced and their pull-off forces measured (in pounds): 12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1.

Random Experiment

Page 4: Transportation and logistics modeling 2

• This plot makes two features of the data easy to see: the location (the middle) and the scatter (the variability).

Random Experiment

Page 5: Transportation and logistics modeling 2

Definition

Random Experiment

Page 6: Transportation and logistics modeling 2

A closer examination of the system identifies deviations from the model.

Random Experiment

Page 7: Transportation and logistics modeling 2

Discrete Random Variables

• They are variables with a finite (or countable) range.

• They cannot take any value outside that range.

• Examples:

– Let x be your grade: S = {A, A-, B+, B, B-, C, D}

– Let y be a working day in a week: S = {M, T, W, R, F}

– Let z be your weight: S = {40k < z < 300k}. This is not discrete; it is continuous, since z can be, for example, 120.63k.

– 48 digital lines are observed and x indicates how many are in use: x can be any integer from 0 to 48. Note that x cannot be 1.5.

Page 8: Transportation and logistics modeling 2

Descriptive statistics

Definition: Sample Mean

Page 9: Transportation and logistics modeling 2

Definition: Sample Variance

Descriptive statistics

Page 10: Transportation and logistics modeling 2

Descriptive statistics

How Does the Sample Variance Measure Variability?

How the sample variance measures variability through the deviations $x_i - \bar{x}$.
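The formulas on the definition slides did not survive in this transcript. As a minimal sketch, assuming the usual definitions (sample mean = sum of the observations divided by n; sample variance = sum of squared deviations from the mean divided by n - 1), the pull-off force data from the connector example can be worked through as follows. The variable names are illustrative only.

```python
import statistics

# Pull-off forces (pounds) for the eight prototype connectors
forces = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]

n = len(forces)
x_bar = sum(forces) / n                          # sample mean
deviations = [x - x_bar for x in forces]         # x_i - x_bar, these sum to (about) zero
s2 = sum(d ** 2 for d in deviations) / (n - 1)   # sample variance, n - 1 divisor
s = s2 ** 0.5                                    # sample standard deviation

print(f"mean = {x_bar:.2f}, variance = {s2:.4f}, std dev = {s:.3f}")

# Cross-check the hand computation against the standard library
assert abs(s2 - statistics.variance(forces)) < 1e-12
```

Large deviations, in either direction, are squared and so always increase the variance; that is how the statistic measures scatter around the mean.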

Page 11: Transportation and logistics modeling 2

Descriptive statistics

Definition

Page 12: Transportation and logistics modeling 2

Stem-and-Leaf Diagrams

Page 13: Transportation and logistics modeling 2

Frequency Distributions and Histograms

• A frequency distribution is a more compact summary of data than a stem-and-leaf diagram.

• To construct a frequency distribution, we must divide the range of the data into intervals, which are usually called class intervals, cells, or bins.

Constructing a Histogram (Equal Bin Widths):
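The construction steps from this slide are missing in the transcript. One common way to build an equal-width frequency distribution is sketched below, under the assumption that the bins span the minimum to the maximum of the data. The function name and the choice of four bins are illustrative, not taken from the course.

```python
def equal_width_histogram(data, num_bins):
    """Count observations in num_bins equal-width bins spanning min..max."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / num_bins   # assumes the data are not all identical
    counts = [0] * num_bins
    for x in data:
        # The maximum sits on the last bin edge, so clamp it into the final bin
        idx = min(int((x - lo) / width), num_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(num_bins + 1)]
    return edges, counts

# Pull-off forces (pounds) from the connector example
forces = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]

edges, counts = equal_width_histogram(forces, num_bins=4)  # 4 bins: arbitrary choice
for left, right, c in zip(edges, edges[1:], counts):
    print(f"{left:.3f} to {right:.3f}: {'*' * c} ({c})")
```

With only eight observations the picture is crude; histograms are more informative for larger samples such as the 80 compressive-strength specimens.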

Page 14: Transportation and logistics modeling 2

Frequency Distributions and Histograms

Histogram of compressive strength for 80 aluminum-lithium alloy specimens.

Page 15: Transportation and logistics modeling 2

Stem-and-Leaf Diagrams

Page 16: Transportation and logistics modeling 2

Data Features

• When an ordered set of data is divided into four equal parts, the division points are called quartiles.

The first or lower quartile, q1, is a value that has approximately one-fourth (25%) of the observations below it and approximately 75% of the observations above.

The second quartile, q2, has approximately one-half (50%) of the observations below its value. The second quartile is exactly equal to the median.

The third or upper quartile, q3, has approximately three-fourths (75%) of the observations below its value. As in the case of the median, the quartiles may not be unique.

Stem-and-Leaf Diagrams

Page 17: Transportation and logistics modeling 2

Stem-and-Leaf Diagrams

Data Features

• The interquartile range is the difference between the upper and lower quartiles, and it is sometimes used as a measure of variability.

• In general, the 100kth percentile is a data value such that approximately 100k% of the observations are at or below this value and approximately 100(1 - k)% of them are above it.
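As a rough illustration of quartiles and the interquartile range, the sketch below uses `statistics.quantiles` from the Python standard library on the pull-off force data. Software packages interpolate quartiles in slightly different ways, so the values may differ a little from a hand calculation; the interpolation method is whatever the library defaults to, not something specified by the course.

```python
import statistics

# Pull-off forces (pounds) from the connector example
forces = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]

# n=4 splits the ordered data into four parts and returns the three cut points
q1, q2, q3 = statistics.quantiles(forces, n=4)
iqr = q3 - q1   # interquartile range

print(f"q1 = {q1:.3f}, q2 (median) = {q2:.3f}, q3 = {q3:.3f}, IQR = {iqr:.3f}")
```

Other percentiles follow the same pattern, for example `statistics.quantiles(forces, n=100)` returns the 1st through 99th percentile cut points.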

Page 18: Transportation and logistics modeling 2

Stem-and-Leaf Diagrams

Stem-and-leaf diagram for the compressive strength data

Page 19: Transportation and logistics modeling 2

2-2 Interpretations of Probability

Relative frequency of corrupted pulses sent over a communications channel.

Page 20: Transportation and logistics modeling 2

2-2 Interpretations of Probability

2-2.2 Axioms of Probability

Page 21: Transportation and logistics modeling 2

Distribution

• The probability distribution of a variable x is a description of the probability of each outcome of x.

• Examples:

– Tossing a coin, where x = 1 if the result is a head and x = 0 otherwise: S = {0, 1}, P(x=0) = 0.5, P(x=1) = 0.5

– Two independent items from a product line are examined, and the probability that an item passes the test is 0.8. If x is the number of items passing the inspection, what is the distribution of x?
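The two-item inspection question can be answered by listing every pass/fail outcome and multiplying probabilities, which is valid because the items are independent. A minimal sketch (variable names are illustrative):

```python
from itertools import product

p_pass = 0.8
pmf = {0: 0.0, 1: 0.0, 2: 0.0}   # P(x = number of items that pass)

# Enumerate all outcomes for two items: True = pass, False = fail
for outcome in product([True, False], repeat=2):
    prob = 1.0
    for passed in outcome:
        prob *= p_pass if passed else (1 - p_pass)
    pmf[sum(outcome)] += prob      # sum(outcome) counts the passes

for x, prob in pmf.items():
    print(f"P(x = {x}) = {prob:.2f}")
```

The three probabilities add up to 1, as any probability distribution must.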

Page 22: Transportation and logistics modeling 2

Mean and Variance of a Discrete Random Variable

A probability distribution can be viewed as a loading with the mean equal to the balance point. Parts (a) and (b) illustrate equal means, but Part (a) illustrates a larger variance.
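The defining formulas are not in the transcript. Assuming the standard definitions, the mean is the probability-weighted average of the values, and the variance is the probability-weighted average of squared distances from the mean. The sketch below applies them to the two-item inspection example (pass probability 0.8), whose probabilities work out to 0.04, 0.32, and 0.64.

```python
# pmf of x = number of items (out of 2) passing inspection, with p = 0.8
pmf = {0: 0.04, 1: 0.32, 2: 0.64}

mean = sum(x * p for x, p in pmf.items())                    # balance point
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())  # spread around it

print(f"mean = {mean:.2f}, variance = {variance:.2f}")
```

Moving probability away from the balance point, while keeping the mean fixed, increases the variance; that is the difference between the two loadings described on the slide.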

Page 23: Transportation and logistics modeling 2

Discrete Uniform Distribution

Definition

Page 24: Transportation and logistics modeling 2

Binomial Distribution

•You know the probability of the outcome of interest in a single try.

• Flip a coin P(H)=1/2

• Draw a card P(K)=4/52

• In a production line P(defective)=0.10

•You repeat the experiment (trial) n times WITH REPLACEMENT

• Flip the coin 7 times

• Draw 5 cards

• (Take an item, test it, return it back) 25 times

•You want to know the probability that you will get a certain outcome x times.

• Probability of getting 3 heads in 7 flips knowing that P(H)=0.5

• Probability of getting 5 Kings in 5 drawn cards knowing that P(K)=4/52

• Probability of getting 2 defectives in a sample of 25 items knowing that P(d)=0.1

Page 25: Transportation and logistics modeling 2

Binomial Distribution
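The formula on this slide is missing from the transcript. A sketch using the standard binomial probability, P(X = x) = C(n, x) p^x (1 - p)^(n - x), applied to the three questions from the previous slide; the function name `binom_pmf` is illustrative.

```python
from math import comb

def binom_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

p_heads = binom_pmf(3, 7, 0.5)        # 3 heads in 7 flips
p_kings = binom_pmf(5, 5, 4 / 52)     # 5 kings in 5 draws, each card returned
p_defect = binom_pmf(2, 25, 0.10)     # 2 defectives in 25 tested items

print(f"3 heads in 7 flips:     {p_heads:.4f}")
print(f"5 kings in 5 draws:     {p_kings:.2e}")
print(f"2 defectives out of 25: {p_defect:.4f}")
```

The "with replacement" condition matters: it keeps the success probability the same on every trial. Without replacement, five kings could never be drawn from a deck that holds only four.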

Page 26: Transportation and logistics modeling 2

Poisson Distribution

• This distribution deals with the case where you only know THE AVERAGE NUMBER of occurrences of the outcome of interest

• Examples

– Flaws in a roll of textile

– Calls to a telephone exchange

Page 27: Transportation and logistics modeling 2

Poisson Distribution
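The formula on this slide did not survive extraction. Assuming the standard Poisson probability, P(X = x) = e^(-m) m^x / x!, where m is the average number of occurrences, a small sketch follows. The average of 2 flaws per roll is a made-up value for illustration only.

```python
from math import exp, factorial

def poisson_pmf(x, mean_count):
    """Probability of exactly x occurrences when the average is mean_count."""
    return exp(-mean_count) * mean_count ** x / factorial(x)

mean_flaws = 2.0   # hypothetical: 2 flaws per roll of textile on average

for x in range(5):
    print(f"P({x} flaws) = {poisson_pmf(x, mean_flaws):.4f}")

# Probability of at most one flaw in a roll
p_at_most_one = poisson_pmf(0, mean_flaws) + poisson_pmf(1, mean_flaws)
print(f"P(at most 1 flaw) = {p_at_most_one:.4f}")
```

Unlike the binomial case, no number of trials n is needed; the average count is the only parameter.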

Page 28: Transportation and logistics modeling 2

Continuous vs. Discrete

Page 29: Transportation and logistics modeling 2

Continuous Uniform Distribution

Page 30: Transportation and logistics modeling 2

Normal Distribution

• The most widely used distribution

• Also called the Gaussian distribution

• Can be used to approximate the results of a wide range of experiments, as we will see in later chapters.

• Characterized by its mean and variance.

• Can you see that it describes quantities such as weight, height, or your strength?

Page 31: Transportation and logistics modeling 2

Effect of mean and variance

Page 32: Transportation and logistics modeling 2

Interesting Fact

Page 33: Transportation and logistics modeling 2

Standard Normal

Page 34: Transportation and logistics modeling 2

Example

• Find

1. P(Z > 1.26)

2. P(Z < -0.86)

3. P(Z > -1.37)

4. P(-1.25 < Z < 0.37)

5. P(Z < -4.6)

6. z such that P(Z > z) = 0.05

7. z such that P(-z < Z < z) = 0.99

Page 35: Transportation and logistics modeling 2

Solution
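The worked solution is not in the transcript. The seven quantities can be checked numerically instead of with a printed table; the sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later), which is a tooling choice made here, not the method used in the course.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal: mean 0, standard deviation 1

p1 = 1 - Z.cdf(1.26)               # P(Z > 1.26)
p2 = Z.cdf(-0.86)                  # P(Z < -0.86)
p3 = 1 - Z.cdf(-1.37)              # P(Z > -1.37)
p4 = Z.cdf(0.37) - Z.cdf(-1.25)    # P(-1.25 < Z < 0.37)
p5 = Z.cdf(-4.6)                   # P(Z < -4.6), practically zero
z6 = Z.inv_cdf(0.95)               # P(Z > z) = 0.05 means P(Z < z) = 0.95
z7 = Z.inv_cdf(0.995)              # 0.99 in the middle leaves 0.005 in each tail

for label, value in [("1", p1), ("2", p2), ("3", p3), ("4", p4),
                     ("5", p5), ("6", z6), ("7", z7)]:
    print(f"{label}. {value:.5f}")
```

Items 6 and 7 run the table "backwards": a probability is given and the matching z value is looked up.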

Page 36: Transportation and logistics modeling 2

Example

• If the current measurements in a strip of wire are assumed to follow N(10,4), find

– Probability that the current will exceed 13 milliamperes

– Probability that the current will be within 9 to 11 milliamperes

– The value for which the probability that the current measurement is below this value is 98%
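A sketch of the three calculations, reading N(10,4) as mean 10 and variance 4 (so standard deviation 2); that reading is an assumption based on the earlier slide stating that the normal distribution is characterized by its mean and variance. Each answer is obtained by standardizing, z = (x - mean) / standard deviation, and then using the standard normal distribution.

```python
from statistics import NormalDist

mean, variance = 10.0, 4.0
sigma = variance ** 0.5            # assumed: second parameter is the variance
Z = NormalDist()                   # standard normal

def standardize(x):
    """Convert a current value x to its z-score."""
    return (x - mean) / sigma

p_exceeds_13 = 1 - Z.cdf(standardize(13))                       # P(X > 13)
p_between = Z.cdf(standardize(11)) - Z.cdf(standardize(9))      # P(9 < X < 11)
x_98 = mean + sigma * Z.inv_cdf(0.98)                           # P(X < x) = 0.98

print(f"P(X > 13)     = {p_exceeds_13:.4f}")
print(f"P(9 < X < 11) = {p_between:.4f}")
print(f"98% value     = {x_98:.2f} milliamperes")
```

The last line reverses the standardization: find the z value first, then convert back with x = mean + sigma * z.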

Page 37: Transportation and logistics modeling 2

Transformation

Page 38: Transportation and logistics modeling 2

Exponential Distribution

• Used to model inter-arrival time (What is the probability that the next customer will arrive within 5 minutes? Within 8 minutes?)

• Defined by the average number of customers in a certain period (e.g., 5 customers per hour)
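The two questions on this slide can be answered with the exponential cumulative distribution, P(T <= t) = 1 - e^(-rate * t), a standard result that is assumed here because the formula itself is not in the transcript. The rate of 5 customers per hour is the slide's own figure; the time unit must match the rate, so minutes are converted to hours.

```python
from math import exp

rate_per_hour = 5.0   # average number of customers per hour, from the slide

def prob_arrival_within(minutes):
    """P(next customer arrives within the given number of minutes)."""
    hours = minutes / 60.0
    return 1 - exp(-rate_per_hour * hours)

p5 = prob_arrival_within(5)
p8 = prob_arrival_within(8)

print(f"within 5 minutes: {p5:.4f}")
print(f"within 8 minutes: {p8:.4f}")
print(f"mean time between arrivals: {60 / rate_per_hour:.0f} minutes")
```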

Page 39: Transportation and logistics modeling 2

Weibull Distribution

The Weibull distribution is used

• In reliability engineering and failure analysis

• In industrial engineering to represent manufacturing and delivery times

• In communications systems engineering

• In radar systems to model the dispersion of the received signal level produced by some types of clutter

• In general insurance to model the size of reinsurance claims

• In forecasting technological change

• In hydrology, where the Weibull distribution is applied to extreme events such as annual maximum one-day rainfalls and river discharges

• In weather forecasting

Page 40: Transportation and logistics modeling 2

Weibull Distribution

• k > 0 is the shape parameter and λ >0 is the scale parameter of the distribution
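The density formula is missing from the transcript. Assuming the usual two-parameter form with shape k and scale λ, f(x) = (k/λ)(x/λ)^(k-1) e^(-(x/λ)^k) for x >= 0, a small sketch follows; the parameter values are made up for illustration, and `lam` stands in for λ.

```python
from math import exp

def weibull_pdf(x, k, lam):
    """Density of the Weibull distribution with shape k and scale lam."""
    if x < 0:
        return 0.0
    return (k / lam) * (x / lam) ** (k - 1) * exp(-((x / lam) ** k))

def weibull_cdf(x, k, lam):
    """P(X <= x): probability that the failure occurs by time x."""
    if x < 0:
        return 0.0
    return 1 - exp(-((x / lam) ** k))

# Hypothetical component: shape 1.5, scale 1000 hours
k, lam = 1.5, 1000.0
print(f"P(failure within 500 h)  = {weibull_cdf(500, k, lam):.4f}")
print(f"P(failure within 1000 h) = {weibull_cdf(1000, k, lam):.4f}")
```

With k = 1 the Weibull reduces to the exponential distribution with mean λ, which is one reason it is popular for lifetime data: the shape parameter lets the failure rate fall (k < 1), stay constant (k = 1), or rise (k > 1) over time.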

Page 41: Transportation and logistics modeling 2

Statistical data analysis

• Studying a problem through the use of statistical data analysis usually involves four basic steps.

1. Defining the problem

2. Collecting the data

3. Analyzing the data

4. Reporting the results

Page 42: Transportation and logistics modeling 2

Collecting the data

• Two important aspects of a statistical study are:

– Population: the set of all the elements of interest in a study

– Sample: a subset of the population

• Statistical inference obtains information about a population from information contained in a sample.

• Cross-sectional data are data collected at the same or approximately the same point in time.

• Time series data are data collected over several time periods.

Page 43: Transportation and logistics modeling 2

Analyzing the data

• Hypothesis test:

– Claim that the collected data can be approximated by a certain distribution

– If we find significant evidence that this is not true, we reject the claim
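The slides do not say which test is used. One common choice for this kind of claim is the Kolmogorov-Smirnov statistic, the largest gap between the empirical cumulative distribution of the data and the claimed one; it is used here purely as an illustration and may not be the test the course or the fitting software applies. The data, the claimed mean of 12 minutes, and the cutoff 1.36/sqrt(n) (a rough large-sample 5% value) are all assumptions.

```python
import math

def ks_statistic(data, cdf):
    """Largest gap between the empirical CDF of data and a claimed cdf."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i-1)/n to i/n at x; check both sides
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# Hypothetical inter-arrival times in minutes (illustration only)
times = [2.1, 14.8, 7.3, 0.9, 22.5, 11.0, 4.6, 9.7, 31.2, 5.4, 17.9, 1.8]
rate = 1 / 12   # claim: exponential with a mean of 12 minutes (assumed)

d = ks_statistic(times, lambda t: 1 - math.exp(-rate * t))
critical = 1.36 / math.sqrt(len(times))   # approximate 5% cutoff, better for large n

print(f"D = {d:.3f}, approximate 5% critical value = {critical:.3f}")
print("reject the claim" if d > critical
      else "no significant evidence against the claim")
```

Failing to reject does not prove the claimed distribution is correct; it only means the sample gives no significant evidence against it.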

Page 44: Transportation and logistics modeling 2

Stat::fit

Choose file-open, then choose files of type data files (*.txt)

Or insert the data here

Page 45: Transportation and logistics modeling 2

Stat::fit

• Choose continuous or discrete

• Choose whether the distribution is

– Unbounded (no upper or lower limit)

– Lower bound

– Assigned bound (you specify the lower bound)

Page 46: Transportation and logistics modeling 2

Stat::fit

• Choose assigned bound and specify the lower bound as 0.

Page 47: Transportation and logistics modeling 2

Stat::fit