Contributing Factors of Annual Average Daily Traffic in a Florida County: Exploration with...

10
Accurate annual average daily traffic (AADT) information is important to many applications, including roadway design, air quality compliance, travel model validation, and others. Yet complete or even extensive cov- erage of a network to collect traffic count data is impractical because of the cost involved. Studies have been conducted to estimate AADT for interstate highways, expressways, urban roads, and rural roads. Such efforts were often hampered by the lack of relevant detailed information that adequately explained variations in the traffic counts. A study in which geographic information system technology was extensively used to investigate various factors that may contribute to AADT on a road is pre- sented. A variety of land-use and accessibility measurements were devel- oped and tested. Multiple linear regression models were developed. Although most variables were statistically significant, few added enough explanatory power to be practical and useful. Four models that achieved R 2 of 0.66 to 0.82 are presented. The predictive power of the models are also examined and compared. Estimation of annual average daily traffic (AADT) is extremely important in traffic planning and operations for state departments of transportation, because AADT provides information for plan- ning new road construction, determining roadway geometry, man- aging congestion, developing pavement design, ensuring safety, and so forth. Accurate AADT data are crucial to the calibration and validation of travel demand models. AADT is also used to estimate statewide vehicle miles traveled on all roads and is used by local governments and the environmental protection agencies to deter- mine compliance with the 1990 Clean Air Act Amendments. Addi- tionally, AADT is reported annually by the state departments of transportation to the FHWA. Although all local and state governments collect traffic counts, because of budgetary constraints, the data coverage is often limited, especially for local roads or in rural areas. Typically, when count data are unavailable, estimates are made based on comparisons with roads considered to be similar. Such comparisons are inherently subject to large errors and may not be repeated often enough to remain current. Literature on estimating AADT for roads that do not have traf- fic counts is limited. Most of the relevant literature is on either estimating traffic volume for freeways or extrapolating 24- or 48-h traffic count data to obtain AADT. One attempt to estimate AADT for county roads was made by a research group at Purdue University, Indiana. The research group developed a multiple regression method using aggregated data at the county level (1). Data from 89 count stations were used for predicting AADT in 40 counties in Indiana. Four predictors were chosen to estimate AADT. They included county population, location type (urban/rural), access to other roads, and total arterial mileage in a county. An R 2 of 0.75 was achieved. Several other studies have been reported on estimating AADT for state roads using statistical methods (2–6 ). The predictors include county population size, total number of through lanes, state and non- state code indicating jurisdictions, location type (rural/urban), per- sonal income level, vehicle registrations, and so on. A common problem reported was the lack of data for some potentially impor- tant predictors. For example, in a Minnesota study, population was proposed for use within a certain distance of the roadway as a pre- dictor, but the attempt was abandoned because of unavailability of the data (2). Xia et al. (7 ) developed a regression model for estimating AADT for non–state-owned roads in Broward County, Florida. The pre- dictors used in the model included function classification, number of lanes, area type, auto ownership, presence of nonstate roads nearby, and service employment. The model has an adjusted R 2 of 0.5961. The prediction errors ranged from 1.31 to 57 percent. Fifty percent of the test points had an error smaller than 20 percent, whereas 85 percent of the test points had an error smaller than 40 per- cent. The model also underestimated the AADT for about 5 percent for the entire test data set. The AADT model for Broward County was later modified (8). The model did not include the service employment. The adjusted R 2 was slightly higher at 0.6069. The prediction errors ranged from 0.57 to 61.99 percent; 47.2 percent of the predictions had errors smaller than 20 percent, and 83.3 percent had errors smaller than 40 percent. The study presented in this paper is a continued effort of the afore- mentioned two studies. The main differences between this study and the previous studies are the use of a larger data set that also includes all the AADTs for state roads, the replacement of the old state road- way function classification system with the new federal function classification system, and the addition of a more extensive analysis of land-use and accessibility variables. Several multiple regression models were developed and compared. In the remainder of this paper, an overview of the characteristics of the study area is provided first. The data used in the model, spatial analyses performed to derive land-use and accessibility measurements, and model development are then described. Finally, the results are discussed, and conclusions are provided. Contributing Factors of Annual Average Daily Traffic in a Florida County Exploration with Geographic Information System and Regression Models Fang Zhao and Soon Chung Lehman Center for Transportation Research, Department of Civil and Environ- mental Engineering, Florida International University, 10555 West Flagler Street, Miami, FL 33174. Transportation Research Record 1769 113 Paper No. 01-3440

Transcript of Contributing Factors of Annual Average Daily Traffic in a Florida County: Exploration with...

Page 1: Contributing Factors of Annual Average Daily Traffic in a Florida County: Exploration with Geographic Information System and Regression Models

Accurate annual average daily traffic (AADT) information is importantto many applications, including roadway design, air quality compliance,travel model validation, and others. Yet complete or even extensive cov-erage of a network to collect traffic count data is impractical because ofthe cost involved. Studies have been conducted to estimate AADT forinterstate highways, expressways, urban roads, and rural roads. Suchefforts were often hampered by the lack of relevant detailed informationthat adequately explained variations in the traffic counts. A study inwhich geographic information system technology was extensively used toinvestigate various factors that may contribute to AADT on a road is pre-sented. A variety of land-use and accessibility measurements were devel-oped and tested. Multiple linear regression models were developed.Although most variables were statistically significant, few added enoughexplanatory power to be practical and useful. Four models that achievedR2 of 0.66 to 0.82 are presented. The predictive power of the models arealso examined and compared.

Estimation of annual average daily traffic (AADT) is extremelyimportant in traffic planning and operations for state departmentsof transportation, because AADT provides information for plan-ning new road construction, determining roadway geometry, man-aging congestion, developing pavement design, ensuring safety,and so forth. Accurate AADT data are crucial to the calibration andvalidation of travel demand models. AADT is also used to estimatestatewide vehicle miles traveled on all roads and is used by localgovernments and the environmental protection agencies to deter-mine compliance with the 1990 Clean Air Act Amendments. Addi-tionally, AADT is reported annually by the state departments oftransportation to the FHWA.

Although all local and state governments collect traffic counts,because of budgetary constraints, the data coverage is often limited,especially for local roads or in rural areas. Typically, when count dataare unavailable, estimates are made based on comparisons with roadsconsidered to be similar. Such comparisons are inherently subject tolarge errors and may not be repeated often enough to remain current.

Literature on estimating AADT for roads that do not have traf-fic counts is limited. Most of the relevant literature is on eitherestimating traffic volume for freeways or extrapolating 24- or 48-h traffic count data to obtain AADT. One attempt to estimateAADT for county roads was made by a research group at Purdue

University, Indiana. The research group developed a multipleregression method using aggregated data at the county level (1).Data from 89 count stations were used for predicting AADT in 40 counties in Indiana. Four predictors were chosen to estimateAADT. They included county population, location type (urban/rural),access to other roads, and total arterial mileage in a county. An R2 of 0.75 was achieved.

Several other studies have been reported on estimating AADT forstate roads using statistical methods (2–6). The predictors includecounty population size, total number of through lanes, state and non-state code indicating jurisdictions, location type (rural/urban), per-sonal income level, vehicle registrations, and so on. A commonproblem reported was the lack of data for some potentially impor-tant predictors. For example, in a Minnesota study, population wasproposed for use within a certain distance of the roadway as a pre-dictor, but the attempt was abandoned because of unavailability ofthe data (2).

Xia et al. (7) developed a regression model for estimating AADTfor non–state-owned roads in Broward County, Florida. The pre-dictors used in the model included function classification, numberof lanes, area type, auto ownership, presence of nonstate roadsnearby, and service employment. The model has an adjusted R2 of0.5961. The prediction errors ranged from 1.31 to 57 percent. Fiftypercent of the test points had an error smaller than 20 percent,whereas 85 percent of the test points had an error smaller than 40 per-cent. The model also underestimated the AADT for about 5 percentfor the entire test data set.

The AADT model for Broward County was later modified (8). Themodel did not include the service employment. The adjusted R2 wasslightly higher at 0.6069. The prediction errors ranged from 0.57 to61.99 percent; 47.2 percent of the predictions had errors smaller than20 percent, and 83.3 percent had errors smaller than 40 percent.

The study presented in this paper is a continued effort of the afore-mentioned two studies. The main differences between this study andthe previous studies are the use of a larger data set that also includesall the AADTs for state roads, the replacement of the old state road-way function classification system with the new federal functionclassification system, and the addition of a more extensive analysisof land-use and accessibility variables. Several multiple regressionmodels were developed and compared. In the remainder of thispaper, an overview of the characteristics of the study area is providedfirst. The data used in the model, spatial analyses performed to deriveland-use and accessibility measurements, and model developmentare then described. Finally, the results are discussed, and conclusionsare provided.

Contributing Factors of Annual AverageDaily Traffic in a Florida CountyExploration with Geographic Information System andRegression Models

Fang Zhao and Soon Chung

Lehman Center for Transportation Research, Department of Civil and Environ-mental Engineering, Florida International University, 10555 West Flagler Street,Miami, FL 33174.

Transportation Research Record 1769 ■ 113Paper No. 01-3440

Page 2: Contributing Factors of Annual Average Daily Traffic in a Florida County: Exploration with Geographic Information System and Regression Models

CHARACTERISTICS OF THE STUDY AREA

The study area is the entire Broward County in South Florida (Fig-ure 1). The county is situated between Miami-Dade County andPalm Beach County. The population of the county is approximately1.4 million, and the population of the tricounty area is about 4.5 mil-lion. The development pattern in the county is similar to those inmany coastal areas: the central business district (CBD) is locatednear the coast; intense economic developments occur in the I-95 cor-ridor (which also includes the CBD) near the coastal line across theentire county length; and a gradual decrease in economic activityintensity and an increase in typical suburban residential develop-ments occur west of the I-95 corridor. The county borders the Ever-glades National Park, its development limit, in the west. The areasnear the Everglades have not been fully developed. Maps showingemployment and population distribution are given in Figure 2 (areasuninhabited are not shown).

DATA COLLECTION AND SPATIAL ANALYSIS

AADT Data

The response variable in the regression models is AADT in thou-sands. AADT data were obtained from average quarterly trafficcounts in 1998 from the Broward County Metropolitan PlanningOrganization (MPO). The counts were adjusted by seasonal fac-tors developed from traffic data obtained from a number of per-manent count stations on state roads. Four seasonal factors wereapplied uniformly to one of the four areas (east, I-95, central, andwest). After excluding expressway and zero AADTs, there are 898data points, 816 (90.9 percent of the database) of which were usedfor model development, and 82 (9.1 percent of the database) ofwhich were used for model validation and testing. Of the 816 countstations, 184 were on principal arterials, about 278 were on minorarterials, 305 were on collectors, and approximately 49 were onlocal roads. These two sets of data were selected randomly fromthe entire AADT database. The roadway function classes andAADT at various count stations are shown in Figure 3.

114 Paper No. 01-3440 Transportation Research Record 1769

Initial Predictors

Four groups of predictors, or independent variables, were examinedfor potential inclusion in the models: roadway characteristics, socio-economic characteristics, expressway accessibility, and accessibilityto regional employment centers. These variables are described below.

Roadway Characteristics Data

These data are related to the characteristics of the roadway sectionsand include the following:

1. Number of lanes (LANE), the number of lanes on a roadwayin 1998.

2. Area type (AREA), land-use type that is one of the following:CBD = 5.17,CBD Fringe = 3.16,Residential = 3.24,Outlying Business District = 5.63,Rural Area = 1.65, orUndefined = 1.0.

The original nominal values were replaced by the above numericvalues determined from the mean values of AADT in each area group.

3. Function classification (FCLASS ): four function classes, whichwere based on the federal function classification system, were used.Their numeric values were determined from the mean values ofAADT in each function class group:

Urban principal arterial = 3.4,Urban minor arterial = 2.2,Urban collector = 1.0, orUnclassified = 0.6

The data for the above predictors were provided by Broward CountyMPO as a geographic information system (GIS) layer.

FCLASS had the highest correlation with AADT (0.84), followedby LANE (0.77). Area type also had a reasonably high correlationwith AADT (0.49). Preliminary regression analysis showed that itspartial R2 is only 0.002. Also considering that its definition is rather

FIGURE 1 (a) Florida and (b) Broward County (1 mi � 1.61 km).

Page 3: Contributing Factors of Annual Average Daily Traffic in a Florida County: Exploration with Geographic Information System and Regression Models

subjective, this variable was not selected for inclusion in the finalregression models.

Socioeconomic Characteristics

These variables reflect the socioeconomic characteristics surroundinga count station. Previous studies determined that population andemployment (total and by categories) aggregated within a buffer of afixed size around a count station were not significant variables (7, 8).For this study, several new socioeconomic indicators were developedand tested, which are described as follows.

Zhao and Chung Paper No. 01-3440 115

Employment size in a corridor near a group of count stations: Tocapture the effect of commercial developments in a corridor, countstations that were near each other were grouped together, and a com-mon point on the network (usually an intersection) was created to rep-resent their location. Next, network paths 1.61 km (1 mi) long werecreated originating from such common points. A 30.5-m (100-ft)buffer was then generated around the network paths and used toaggregate the employment in the corridor. The employment data usedwere for 1999 and were purchased from InfoUSA for a separate proj-ect. The database contained the number of employees at each businesslocation and the standard industrial classification code. Employerswere geocoded when possible. The data provided more accurate loca-tional information on the employment sites than aggregate data basedon by traffic analysis zones (TAZs). This indicator, however, turnedout to have a low correlation with AADT and was not used.

Employment, population, and total dwelling units around a countstation: In another effort to account for the effect of local land-usedevelopments on AADT, employment, population, and dwelling unitswere aggregated using a variable-sized buffer around each count sta-tion. The employment data were again the 1999 InfoUSA data. Thedata on population and total dwelling units were aggregated at theTAZ level. The 1998 data were obtained by interpolating 1995 and2000 population and dwelling units forecast by the Broward CountyDepartment of Strategic Planning and Growth Management.

The size of the buffer was based on the count station locationand the function class of the road. The study area was divided intofour subareas based on the original zones used for seasonal factorsand roadway density. The buffer sizes reflected the average spac-ing of roads in a given class of roadways in a particular subarea.They were considered to be the “service” area of a road. The buffersizes based on subareas and roadway function classes are given inTable 1.

The three indicators (employment, population, and dwellingunits) have correlation factors of 0.325, 0.240, and 0.252 withAADT, respectively. However, they are highly correlated to eachother. Regression results showed that the employment indicatorcontributed to the increase of R2 the most. Therefore, the employ-ment indicator (BUFFEMP) was used, and the other two indicatorswere not included in the final models.

Employment and population around a count station were aggregatedusing buffer sizes determined based on functional classification.

(b)(a)

FIGURE 2 Broward County (a) employment and (b) population distribution.

FIGURE 3 Roadway function classes and AADT (1 mi � 1.61 km).

Page 4: Contributing Factors of Annual Average Daily Traffic in a Florida County: Exploration with Geographic Information System and Regression Models

Accessibility to Expressways

This group of variables measured whether a road has easy access toexpressways. The measurements developed included the following:

1. Minimum distance to expressway from a count station to anexpressway access point, which is measured as network distance inmiles;

2. Minimum travel time in minutes from a count station to anexpressway access point [the travel speed is assumed to be 88.5 km/h(55 mph) for expressways, 56.3 km/h (35 mph) for principal arteri-als, 48.3 km/h (30 mph) for minor arterials, 40.2 km/h (25 mph) forcollectors, and 32.2 km/h (20 mph) for unclassified roads];

3. Number of expressway access points within a 6.44 km (4-mi)radius from a count station; and

4. Direct Access (DIRECTAC), which is a binary variable thatassumes a value of 1 and 0 (1 means a count station is on a road thatconnects to an expressway, and 0 otherwise.).

Correlation and regression analyses showed that the number ofexpressway access points close to a count station was not an impor-tant factor. The minimum distance between a count station and anexpressway access point had a correlation coefficient of 0.20 butagain was not an important factor. DIRECTAC had a high correlation(0.48) with AADT and was considered further.

Regional Accessibility to Employment

Several measurements were developed in an attempt to model theinfluence of regional economic activities on the traffic of a road.There are five general kinds of measurements:

1. Network distance to the regional mean centers of employment(DECNTR): The regional mean center of employment was determinedby finding the spatial mean center of all the TAZs weighted by the TAZtotal employment. This measure was developed to reflect the gradualdecrease in development intensity moving away from the urban core.

2. Network distance to the regional mean centers of population(DPCNTR): The regional mean center of population was determinedby finding the spatial mean center of all the TAZs weighted by theTAZ total population. The purpose for this measure was the same asfor DECNTR.

3. Regional accessibility to population centers defined as

where

RPAccessk = the accessibility of count station k to regional population,

Pi = the population of the ith population center,

RPAccess P ek it

i

N

kiP

= −

=∑ 0 0954

1

.

116 Paper No. 01-3440 Transportation Research Record 1769

tki = the network travel time from count station k to theith population center, and

NP = the number of population centers.

To calculate RPAccessk, 17 population centers were created inBroward County by examining the population data with the aid ofGIS visualization tools such as population distribution and densitymaps, three-dimensional surface maps created from populationdata by TAZs with population as the z coordinate, and contourmaps generated from the surface map.

4. Regional accessibility to employment centers defined as

where

REAccessk = the accessibility of count station k to regionalemployment,

Ej = the employment of the jth employment center,tkj = the network travel time from count station k to the jth

employment center, andNE = the number of employment centers.

Using the same approach for generating population centers, 16employment centers in Broward County were created.

5. Regional accessibility to population and employment wasdefined as the product of the previous two accessibility measures:

The correlation between AADT and RPAccess, REAccess, andRPEAccess was 0.03, 0.27, and 0.20, respectively. Regressionanalysis further showed that REAccess was a better indicator than RPEAccess. Therefore, only REAccess was chosen for inclusionin the models.

Because Broward County is part of the urbanized tricounty area,there was a significant number of trip interchanges among the coun-ties. To account for the internal-external (I-E) trips, a separate set ofregional accessibility was created that included the population andemployment centers within 16.1 km (10 mi) from the county borderin Palm Beach County and those within 19.3 km (12 mi) in Miami-Dade County, respectively. The 16.1-km and 19.3-km limits werechosen because the majority (more than 75 percent) of the I-E tripswere either originated or destined within these limits.

Correlation and regression analyses showed, however, that thebenefit of including the two adjacent counties in the three accessi-bility measurements was negligible. The correlation coefficientswere also low, typically below 0.11. The data processing cost wasnot offset by the improvement to the model with the other two coun-ties included. Therefore, a set of REAccess was created only usingBroward County data and included in the models.

REGRESSION MODELS

From the preliminary analyses, the following variables were includedin the final regression models:

1. FCLASS,2. LANE,3. DIRECTAC,

RPEAccess Pe E ek it

i

N

jt

j

N

ki kjP E

=

=

=∑ ∑0 0954

1

0 0954

1

. .

REAccess E ek jt

j

Nkj

E

= −

=∑ 0 0954

1

.

TABLE 1 Buffer Sizes Based on Count Station Location andFunction Class

Page 5: Contributing Factors of Annual Average Daily Traffic in a Florida County: Exploration with Geographic Information System and Regression Models

4. REAccess,5. BUFFEMP,6. EMPBUF,7. POPBUF, and8. DPCNTR.

Using these six variables, four regression models were generated.Variables were selected using the stepwise procedure. The modelswere checked for multicollinearity, and outliers were diagnosed.Consequently outliers (about 5.0 percent) were detected from thedata set for model development and the regressions were redone. Allthe statistical analyses were conducted using the Statistical AnalysisSystem (SAS) software. The four models had the following forms:

Model 1: AADT = −9.520386 +8.480001 FCLASS +3.428939 LANE+0.596752 REAccess +2.991573 DIRECTAC +0.069086 BUFFEMP

Model 2: ADDT = −6.15742 + 6.55471 LANE + 0.61433 REAccess + 7.88344 DIRECTAC − 0.34494 DPCNTR

Model 3: ADDT = −4.66034 + 4.95341 LANE + 0.51119 REAccess + 4.52713 DIRECTAC − 0.10689 DPCNTR + 0.00112 POPBUF

Model 4: AADT = −4.26565 + 4.86271 LANE + 0.47286 REAccess + 4.34780 DIRECTAC − 0.10197 DPCNTR + 0.00104 POPBUF + 0.00022820 EMPBUF

For all four models, there was a strong relationship between AADTand the independent variables. There was no multicollinearityamong them. The coefficient of determination, R2, ranged from 0.66to 0.82 for the four models. In other words, at least 66 percent of thesum of squares in AADT could be associated with the variation inthese independent variables. The overall F-test was very significant(Prob. > F = .0001 or smaller).

Among the four models, Model 1 included FCLASS as a variable,which is the most significant predictor with a partial R2 of 0.7103.LANE has a partial R2 of 0.0836. The partial R2 s for REAccess,DIRECTAC, and BUFFEMP (listed in order of their importancebased on their partial R2) are all small at less than 0.02.

Zhao and Chung Paper No. 01-3440 117

A concern about the use of function class was that functionclasses are, to a large degree, determined based on the traffic vol-ume. Therefore, while a strong correlation may exist betweenAADT and function classes, the use of function classes in a modelfails to capture the underlying causes of varying traffic volumes.Additionally, because other criteria were also used in determiningthe function classes, such as anticipated future land developmentsor linkage of facilities of significance, this correlation may be weakfor certain roads.

In an attempt to capture directly the underlying causes of trafficvolume variations, three additional models were tested. Model 2used no information of function classification, but it also had thelowest R2 (0.66). Models 3 and 4 included two more variables—POPBUF and EMPBUF—that depicted the land-use intensity in the“service” areas surrounding different types of facilities. The size ofthese service areas is determined from the function class of the road.The function classes here are used as a way to define the hierarchy ofroads in a transportation network, thus the general service areas ofroads. The inclusion of variables POPBUF and EMPBUF improvedthe models, with a R2 of 0.76 for both.

In all the models, the signs of the coefficients were as expected.For instance, the positive sign for FCLASS means that a road of afunction class higher in the network hierarchy will carry more dailytraffic. The positive sign for LANE indicates that more lanes usuallymean more traffic. The positive sign for REAccess indicates that aroad easily accessible to regional employment centers as measuredby employment weighted by network travel time tends to have ahigher traffic volume. The positive sign for DIRECTAC is consistentwith the expectation that a road with direct access to expresswaystends to have more traffic. The positive sign for POPBUF and EMP-BUF explains the positive relationship between AADT with popula-tion and employment concentrations in the “service” area of a road.The negative sign of DPCNTR means the farther away a road is fromthe center of population of the urbanized area, the less traffic it tendsto carry. The SAS output for the models are given in Figures 4, 5, 6,and 7, respectively.

FIGURE 4 SAS output for Model 1.

Page 6: Contributing Factors of Annual Average Daily Traffic in a Florida County: Exploration with Geographic Information System and Regression Models

MODEL VALIDATION

A rigorous test of the models’ predictive capabilities involved usingthe models to estimate AADTs for the roads other than those usedfor the calibration and then comparing the estimation with the actualAADTs observed. At the beginning of the analyses, the data set wasdivided into two groups, one used for model calibration and theother for model validation. The 82 testing data were used to exam-ine the models’ predictive capability. Table 2 summarizes the test-ing results. MSE is the mean square of errors, and total error is theerror in percentage for the entire testing data set, which is less than3 percent for all models.

118 Paper No. 01-3440 Transportation Research Record 1769

Table 3 gives the cumulative percent errors for the four models.For instance, for Model 1, 58.54 percent of the testing points had anerror less than 20 percent. The maximum error for each model is alsogiven in the last row. Figure 8 illustrates the error distributions, andFigure 9 illustrates the cumulative percentages of testing pointswithin certain error ranges for the four models.

Comparison of the models shows that Model 1 had the best perfor-mance in terms of prediction errors; about 37 percent of the testingpoints had an error smaller than 10 percent and about 73 percent ofthe testing points had an error smaller than 30 percent, respectively.Model 2 had the worst performance. Models 3 and 4 had better per-formance than that of Model 2. They also had close performances.

FIGURE 5 SAS output for Model 2.

FIGURE 6 SAS output for Model 3.

Page 7: Contributing Factors of Annual Average Daily Traffic in a Florida County: Exploration with Geographic Information System and Regression Models

This indicated that Model 3 might be adequate because it had onefewer variable, thus reducing the data collection and processing costs.

Data for all the variables are relatively easy to obtain. FCLASS andLANE are always readily available. Determination of accessibility toregional employment centers (REAccess) requires moderate effort,which involves identifying and creating employment centers, build-ing an appropriate network including a turntable, and using networkanalysis tools such as finding the shortest paths. DIRECTAC toexpressways and employment in a buffer zone (BUFFEMP) are botheasy to determine.

The percent difference between the predicted and actual AADT val-ues ranged from 0.3 percent to 155.6 percent for Model 1 and 0 per-cent to 288.1 percent for Model 3. Figure 10 shows the locations oftesting points and the percent errors for Model 1. The largest errors(>100 percent) all occurred on low-volume roads. Three of the fourlargest errors occurred on US-1. Even though US-1 is a minor arte-

Zhao and Chung Paper No. 01-3440 119

rial, the traffic volumes at these locations are only 9,400, 7,100, and4,200, respectively. Most of the north-south traffic is carried by I-95.Additionally, near the coast, employment concentration is relativelylow, also contributing to the low volumes on these roads. In fact, allthese three test points are outliers.

The fourth large error occurs west and south of I-75. It is a minorarterial, but its AADT is only 9,200, and the surrounding area ismostly rural. This examination of the conditions at locations of thelargest errors reveals special cases where function class is high buthigh volume is not likely to occur because of land-use patterns andnetwork configurations. This also suggests the need for caution whenapplying a model to determine if a particular location might representan outlier.

Figure 11 shows the error distribution for Model 3. The largesterrors occur east of I-95. The error that was large in Model 1 andoccurs southwest of I-75 now is small in Model 3, which may indi-

FIGURE 7 SAS output for Model 4.

TABLE 2 Model Testing Result Summary and Comparison

Page 8: Contributing Factors of Annual Average Daily Traffic in a Florida County: Exploration with Geographic Information System and Regression Models

cate that local land-use variables POPBUF and the distance to popu-lation center have helped to better reflect the actual land-use intensityand the through traffic carried. There is also an interesting pattern oflocations where underestimation and overestimation occur, whichsuggests that certain land-use factors are not being reflected.

CONCLUSIONS

Four multiple regression models for estimating AADT on non-expressway roads were developed for Broward County. They pro-vide a systematic approach to quantifying the factors that contributeto AADT. The models are not expected to be directly applicable toother urban areas because details of the urban form are involved.The methodology, however, may be applied widely.

A large set of predictors was developed and investigated. Up to sixpredictors have been used in the final models. They include FCLASS,LANE, accessibility to regional employment centers, directness ofexpressway access, and population and employment around a count

120 Paper No. 01-3440 Transportation Research Record 1769

station. The models were able to explain 66 percent to 83 percent ofthe total variability depending on the variables used. Generally, themore variables used, the better a model performance was. The choiceof a model, therefore, is likely to be based on data processing cost.However, the model data are generally available and relatively easyto process.

FCLASS and LANE proved to be the most significant predictors.Although function classes as a predictor outperformed other variables,they did not explain the causes that determine AADT because theythemselves were determined by other factors including traffic volume.Model testing had already shown that function classes were not con-sistently correlated with AADT. However, the model that did not useany information related to function classes had the worst performance.Additional land-use variables determined based on function classessignificantly improved the models although they were not as strongpredictors as function class. Such indirect use of function classes maybe acceptable because they reflected the hierarchy of the networksystem, which may be difficult to capture otherwise.

The models in their current forms may not be adequate to meetthe need of engineering design or the calibration of travel demandmodels, but their performances have improved. They may be used

TABLE 3 Cumulative Percentage of Testing Points in DifferentError Ranges

FIGURE 8 Error distribution of four models.

FIGURE 9 Cumulative percentage of testing points in different error ranges.

Page 9: Contributing Factors of Annual Average Daily Traffic in a Florida County: Exploration with Geographic Information System and Regression Models

FIGURE 10 Locations of testing points and associated error in percentage forModel 1.

FIGURE 11 Locations of testing points and associated error in percentage for Model 3.

Page 10: Contributing Factors of Annual Average Daily Traffic in a Florida County: Exploration with Geographic Information System and Regression Models

for tasks that do not require a high level of accuracy at individualsites such as estimating systemwide vehicle miles traveled.

Future work includes careful examination of the spatial patternsof errors that resulted from different models and the causes forerrors. More effective land-use indicators are still needed, as well asthe study of temporal stability of the regression models.

ACKNOWLEDGMENTS

The authors would like to thank Lina Koulekowski in the BrowardCounty MPO for providing GIS data on AADT and function clas-sification system and Lee-Fang Chow for assistance in GIS analy-sis. Reviewers are also thanked for their constructive commentsand suggestions.

REFERENCES

1. Mohamad, D., K. C. Sinha, T. Kuczek, and C. F. Scholer. Annual Aver-age Daily Traffic Prediction Model for County Roads. Presented at the77th Annual Meeting of the Transportation Research Board, WashingtonD.C., 1998.

122 Paper No. 01-3440 Transportation Research Record 1769

2. Cheng, C. Optimum Sampling for Traffic Volume Estimation. Ph.D. dis-sertation. University of Minnesota, Carlson School of Management,Minneapolis, 1992.

3. Deacon, J. A., J. G. Pigman, and A. Mohenzadeh. Traffic Volume Estimatesand Growth Trends. Report UKTRP-87-32. Kentucky TransportationResearch Program, University of Kentucky, Lexington, 1987.

4. Fricker, J. D., and S. K. Saha. Traffic Volume Forecasting Methods forRural State Highways. Report FHWA/IN/JHRP-86/20. Joint HighwayResearch Project, School of Civil Engineering, Purdue University, WestLafayette, Ind., 1987.

5. Aldrin, M. A Statistical Approach to the Modeling of Daily Car Traffic.Traffic Engineering and Control, Vol. 36, No. 9, Sept. 1995, pp. 489–493.

6. Saha, S. K. The Development of a Procedure to Forecast Traffic Volumeson Urban Segments of the State and Interstate Highway Systems. Ph.D. dis-sertation. School of Civil Engineering, Purdue University, West Lafayette,Ind., 1990.

7. Xia, Q., F. Zhao, Z. Chen, L. D. Shen, and D. Ospina. Estimation ofAnnual Average Daily Traffic for Nonstate Roads in a Florida County. InTransportation Research Record: Journal of the TransportationResearch Board, No. 1660, TRB, National Research Council, Washing-ton, D.C., 1999, pp. 32–40.

8. Shen, L. D., F. Zhao, and D. Ospina. Estimation of Annual Average DailyTraffic for Off-System Roads in Florida. Final Report. Florida Departmentof Transportation, Tallahassee, Fla., 1999.

Publication of this paper sponsored by Committee on Highway Traffic Monitoring.