On Fractile Transformation of Covariates in Regression
Bodhisattva Sen
Department of Statistics
Columbia University, New York
ERCIM’10, 11 December 2010
Joint work with Probal Chaudhuri, Indian Statistical Institute, Calcutta
Example 1: Household Expenditure and Income Data
Investigate the inequality in income and compare the economic condition of Poland (blue) and Bulgaria (red)
X = total expenditure; Y = proportion of expenditure on food (as a fraction of X), per capita per household
[Figure: two panels for Example 1. Left: "Regression Curves" (usual regression functions). Right: "Standardized Regression Curves" (standardized regression functions). Horizontal axis: Total Expenditure (in USD); vertical axis: Prop. of Expenditure on Food.]
Example 2: Data on the sales (in Indian rupees) and profit (as a fraction of sales) for companies over different years
Compare Y = profitability of the companies against X = sales for the years 1997 (red) and 2003 (blue)
[Figure: two panels for Example 2. Left: "Regression Curves" (usual regression functions). Right: "Standardized Regression Curves" (standardized regression functions). Horizontal axis: Sales; vertical axis: Profit to Sales.]
Problem: Comparison of two regression functions
Two bivariate populations $(X_1, Y_1)$ and $(X_2, Y_2)$
We usually look at $\mu_i(x) = E(Y_i \mid X_i = x)$, $i = 1, 2$
Instead, compare the fractile regression functions
$m_i(t) = E\{Y_i \mid F_i(X_i) = t\}$, $t \in (0, 1)$,
where $F_i$ is the c.d.f. of $X_i$
[Figure: fractile regression functions in Examples 1 and 2. Left panel, "Fractile Graphs": Prop. of Expenditure on Food against Fractiles of Total Expenditure (in USD). Right panel, "Fractile Graphs": Profit to Sales against Fractiles of Sales.]
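One simple way to estimate a fractile regression function is to replace the c.d.f. of the covariate by its empirical version and average the response within fractile groups. The following is a minimal sketch of that idea only; it is not claimed to be the estimator behind the plots in these slides. The function names `empirical_fractiles` and `fractile_graph`, the mid-rank convention (rank - 0.5)/n, the number of groups, and the toy data are all assumptions made for illustration.

```python
import random

def empirical_fractiles(xs):
    """Return (rank - 0.5) / n for each x, an empirical stand-in for F(x) in (0, 1).

    Ties are broken by input order; the covariate is assumed continuous.
    """
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])
    fractiles = [0.0] * n
    for rank, i in enumerate(order, start=1):
        fractiles[i] = (rank - 0.5) / n
    return fractiles

def fractile_graph(xs, ys, groups=10):
    """Average y within each of `groups` equal-width fractile intervals of x."""
    sums = [0.0] * groups
    counts = [0] * groups
    for t, y in zip(empirical_fractiles(xs), ys):
        g = int(t * groups)  # t < 1, so g lies in 0..groups-1
        sums[g] += y
        counts[g] += 1
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]

# Toy data (hypothetical): x is a shuffled 1..100 and y = x, so the mean of y
# in each quarter of the x-distribution is easy to check by hand.
random.seed(0)
xs = list(range(1, 101))
random.shuffle(xs)
ys = list(xs)
quartile_means = fractile_graph(xs, ys, groups=4)
print(quartile_means)
```

Plotting the group means against the group midpoints gives a crude fractile graph on the common scale (0, 1), which is what makes curves from two populations directly comparable.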
Other applications of fractile regression
Hertz-Picciotto and Din-Dzietham (Epidemiology, 1998) compare the infant mortality of African and European Americans against gestational age
Nordhaus (PNAS, 2006) compares the dependence of the log of “output density” on key geographic variables
Fractile regression enables us to simultaneously compare the effect of different covariates on one response variable
Why the fractile transformation $X_1 \mapsto F_1(X_1)$?
Transformed covariates $F_1(X_1)$ and $F_2(X_2)$ both have a Unif(0, 1) distribution, thereby adjusting for covariate skewness/data sparsity
Distribution-free nonparametric standardization
Compare $m_1(t)$ and $m_2(t)$, the means of $Y_1$ and $Y_2$ at the $t$-th quantile of the covariates
Makes the fractile regression functions invariant under all strictly increasing transformations of the covariate, e.g., if $X_2 = \phi(X_1)$ with $\phi$ strictly increasing and $Y_1 = Y_2$, then
$E\{Y_1 \mid F_1(X_1)\} = E\{Y_2 \mid F_2(X_2)\}$
Mahalanobis (Econometrica, 1960), Sen and Chaudhuri (JASA, 2010), ...
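The invariance claim has a direct empirical counterpart: a strictly increasing map preserves the ordering of the observations, so the empirical fractiles of the transformed covariate equal those of the original one. The sketch below checks this on simulated data; the choice phi(x) = exp(3x), the sample size, and the helper name `empirical_fractiles` are assumptions made for illustration.

```python
import math
import random

def empirical_fractiles(xs):
    """Return (rank - 0.5) / n for each x (empirical version of F(x))."""
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])
    fractiles = [0.0] * n
    for rank, i in enumerate(order, start=1):
        fractiles[i] = (rank - 0.5) / n
    return fractiles

random.seed(1)
x1 = [random.gauss(0.0, 1.0) for _ in range(500)]
# phi(x) = exp(3x) is strictly increasing and makes the covariate highly skewed.
x2 = [math.exp(3.0 * x) for x in x1]

# The two covariates look very different on the raw scale,
# but their fractile-transformed versions coincide observation by observation.
same_fractiles = empirical_fractiles(x1) == empirical_fractiles(x2)
print(same_fractiles)
```

With the same responses attached to both covariates, any estimate computed from the fractiles is therefore identical for the two populations, whereas the usual regression curves in x would differ.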
Extension to multi-dimension
Questions:
How do we standardize the distribution of the covariates in a way that enables a more meaningful comparison of the regression functions?
Suppose $(\mathbf{X}_1, Y)$ and $(\mathbf{X}_2, Y)$ in $\mathbb{R}^{d+1}$ for $d \ge 1$, $\mathbf{X}_2 = g(\mathbf{X}_1)$ and $g : \mathbb{R}^d \to \mathbb{R}^d$ is an (unknown) invertible function. How do we standardize the covariates and conclude that the two regression functions are essentially the same?
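Notation
$(\mathbf{X}, Y)$ is a random vector having a continuous distribution on $\mathbb{R}^{d+1}$, $d \ge 1$, where $\mathbf{X} = (X_1, X_2, \ldots, X_d) \in \mathbb{R}^d$
Standardization of the covariate: $T : \mathcal{P} \times \mathbb{R}^d \to E \subset \mathbb{R}^d$ such that $\mathbf{x} \mapsto T(P, \mathbf{x}) \equiv T(\mathbf{X}, \mathbf{x})$ is an invertible map from $\mathcal{X}_P$, the support of $P$, onto $E$, for every $\mathbf{X} \sim P \in \mathcal{P}$, a class of distributions on $\mathbb{R}^d$.
$T_{ls}(P, \mathbf{x}) = \Gamma(P)^{-1/2}\{\mathbf{x} - \mu(P)\}$, $\Gamma(P) = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_d^2)$
The standardized regression function is then defined as
$m_{\mathbf{X}}(\mathbf{t}) = E\{Y \mid T(P, \mathbf{X}) = \mathbf{t}\}$ for $\mathbf{t} \in E$.
$G$: group of one-one transformations acting on the space of all predictors $\mathbf{X}$ with distribution in $\mathcal{P}$. We say that $T$ is invariant under $G$ if $T(g(\mathbf{X}), g(\mathbf{x})) = T(\mathbf{X}, \mathbf{x})$ for all $\mathbf{x} \in \mathbb{R}^d$ and $g \in G$.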
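To make the invariance definition concrete, the sketch below applies a sample version of $T_{ls}$ (reading "ls" as location-scale, which is an assumption) and checks numerically that it is unchanged under one coordinate-wise increasing affine map $g$. The function name `standardize_ls`, the use of sample means and standard deviations in place of $\mu(P)$ and $\sigma_j$, and the particular map $g$ are illustrative choices, not taken from the slides.

```python
import random
import statistics

def standardize_ls(sample):
    """Sample version of T_ls: subtract the coordinate means and divide by the
    coordinate standard deviations (Gamma is diagonal, so no rotation)."""
    d = len(sample[0])
    cols = [[row[j] for row in sample] for j in range(d)]
    mu = [statistics.fmean(c) for c in cols]
    sigma = [statistics.pstdev(c) for c in cols]
    return [[(row[j] - mu[j]) / sigma[j] for j in range(d)] for row in sample]

random.seed(2)
X = [[random.gauss(0.0, 1.0), random.expovariate(1.0)] for _ in range(200)]

# g acts coordinate-wise by increasing affine maps (hypothetical example).
gX = [[2.0 * a + 1.0, 5.0 * b - 3.0] for a, b in X]

T_X = standardize_ls(X)
T_gX = standardize_ls(gX)

# T(g(X), g(x)) and T(X, x) should agree up to floating-point error.
max_gap = max(abs(u - v) for r, s in zip(T_X, T_gX) for u, v in zip(r, s))
print(max_gap)
```

The same check would fail for a nonlinear increasing map such as the exponential, which is the motivation for looking at a standardization with a larger invariance group.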
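Fractile Standardization
For $\mathbf{X} \sim P$, define $R_P : \mathbb{R}^d \to (0, 1)^d$ as
$R_P(\mathbf{x}) = \big(F_1(x_1), F_{2|1}(x_2 \mid x_1), \ldots, F_{d|1,\ldots,d-1}(x_d \mid x_1, \ldots, x_{d-1})\big)$,
where $F_1(x_1) = P(X_1 \le x_1)$, $F_{2|1}(x_2 \mid x_1) = P(X_2 \le x_2 \mid X_1 = x_1)$, ...
Fractile regression: $m_{\mathbf{X}}(\mathbf{t}) = E\{Y \mid R_P(\mathbf{X}) = \mathbf{t}\}$, $\mathbf{t} \in (0, 1)^d$
Distributional standardization: $R_P(\mathbf{X}) \sim \mathrm{Uniform}(0, 1)^d$
Multivariate analogue of $X_1 \mapsto F_1(X_1)$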
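When the conditional distributions are known in closed form, $R_P$ can be written down directly. The sketch below does this for $d = 2$ under an assumed model that is not in the slides: a standard bivariate normal with correlation `rho`, for which $F_1(x_1) = \Phi(x_1)$ and $F_{2|1}(x_2 \mid x_1) = \Phi\{(x_2 - \rho x_1)/\sqrt{1 - \rho^2}\}$. It then checks by simulation that the transformed covariates look uniform and uncorrelated. All function names are hypothetical.

```python
import math
import random

def normal_cdf(z):
    """Standard normal c.d.f. via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def fractile_transform(x1, x2, rho):
    """R_P(x) = (F_1(x1), F_{2|1}(x2 | x1)) for a standard bivariate normal."""
    u1 = normal_cdf(x1)
    u2 = normal_cdf((x2 - rho * x1) / math.sqrt(1.0 - rho * rho))
    return u1, u2

def correlation(a, b):
    """Plain sample correlation coefficient."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / math.sqrt(saa * sbb)

random.seed(3)
rho = 0.8
xs1, xs2 = [], []
for _ in range(5000):
    z1, z2 = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
    xs1.append(z1)
    xs2.append(rho * z1 + math.sqrt(1.0 - rho * rho) * z2)

transformed = [fractile_transform(a, b, rho) for a, b in zip(xs1, xs2)]
u1 = [u for u, _ in transformed]
u2 = [v for _, v in transformed]

corr_x = correlation(xs1, xs2)   # strong dependence before the transform
corr_u = correlation(u1, u2)     # close to zero after the transform
mean_u1 = sum(u1) / len(u1)      # close to 1/2 for a Uniform(0, 1) coordinate
mean_u2 = sum(u2) / len(u2)
print(corr_x, corr_u, mean_u1, mean_u2)
```

Note that the second coordinate of the transform depends on the order in which the covariates are conditioned, so swapping $X_1$ and $X_2$ gives a different standardization.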
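Invariance
Consider the group $\mathcal{F}$ of maps $\mathbf{x} \mapsto (g_1(x_1), g_2(x_1, x_2), \ldots, g_d(x_1, \ldots, x_d))$, where $g_i : \mathbb{R}^i \to \mathbb{R}$ is an increasing function of $x_i$ for every fixed $(x_1, \ldots, x_{i-1})$, and $(g_1, \ldots, g_i) : \mathbb{R}^i \to \mathbb{R}^i$ is invertible for every $i$
Invariance: for $g \in \mathcal{F}$, $R_{\mathbf{X}}(\mathbf{x}) = R_{g(\mathbf{X})}(g(\mathbf{x}))$ for all $\mathbf{x} \in \mathbb{R}^d$
{all coordinate-wise increasing transformations} $\subset \mathcal{F}$
If we want the standardized regression function to be invariant under the action of the group $\mathcal{F}$, then the standardization $T(\mathbf{X}, \cdot)$ has to be a function of $R_P$
Furthermore, if we assume that $T(\mathbf{X}, \mathbf{X}) \sim \mathrm{Unif}(0, 1)^d$ and $T(\mathbf{X}, \cdot) \in \mathcal{F}$, then $T(\mathbf{X}, \mathbf{x}) = R_P(\mathbf{x})$ for all $\mathbf{x}$, for all $\mathbf{X} \sim P \in \mathcal{P}$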
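The invariance of the fractile standardization can be checked one coordinate at a time. The derivation below is a sketch under two reading assumptions: that each $g_i$ is strictly increasing in its last argument and takes $(x_1, \ldots, x_i)$ as input, and that the superscript $g(\mathbf{X})$ (notation introduced here) marks conditional c.d.f.s of the transformed vector. The first line is the case $d = 1$; the remaining lines treat the $i$-th coordinate, using invertibility of $(g_1, \ldots, g_{i-1})$ to rewrite the conditioning event.

```latex
\begin{align*}
F_{g(X)}(g(x)) &= P\{g(X) \le g(x)\} = P\{X \le x\} = F_X(x), \\[4pt]
F^{g(\mathbf{X})}_{i|1,\ldots,i-1}\bigl(g_i(x_1,\ldots,x_i) \mid g_1(x_1),\ldots,g_{i-1}(x_1,\ldots,x_{i-1})\bigr)
  &= P\bigl\{g_i(x_1,\ldots,x_{i-1},X_i) \le g_i(x_1,\ldots,x_{i-1},x_i) \mid X_1 = x_1,\ldots,X_{i-1} = x_{i-1}\bigr\} \\
  &= P\{X_i \le x_i \mid X_1 = x_1,\ldots,X_{i-1} = x_{i-1}\} \\
  &= F_{i|1,\ldots,i-1}(x_i \mid x_1,\ldots,x_{i-1}).
\end{align*}
```

Stacking these identities for $i = 1, \ldots, d$ gives $R_{g(\mathbf{X})}(g(\mathbf{x})) = R_{\mathbf{X}}(\mathbf{x})$.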
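Computation of $R_P$
$\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n$ i.i.d. $P$
$R_P$ requires estimation of conditional distribution functions
May use a kernel estimate of the multivariate density of $\mathbf{X}_1$, and then use it to get the various conditional densities
$f_{n;1,2,\ldots,d}(\mathbf{x}) = \frac{1}{n (h_{1,n} h_{2,n} \cdots h_{d,n})} \sum_{i=1}^{n} K\left(\frac{\mathbf{x} - \mathbf{X}_i}{\mathbf{h}_n}\right)$
$f_{n;j|1,\ldots,j-1}(x_j \mid x_1, \ldots, x_{j-1}) = \frac{f_{n;1,\ldots,j}(x_1, \ldots, x_j)}{f_{n;1,\ldots,j-1}(x_1, \ldots, x_{j-1})}$
Standardized covariates: $R_n(\mathbf{X}_1), R_n(\mathbf{X}_2), \ldots, R_n(\mathbf{X}_n)$
Under appropriate conditions, $\sup_{\mathbf{x}} \|R_n(\mathbf{x}) - R_P(\mathbf{x})\| \xrightarrow{P} 0$.
Curse of dimensionality!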
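For $d = 2$ the kernel recipe above can be carried out in closed form if one assumes a product Gaussian kernel (an assumption; the slides do not fix $K$). Integrating the estimated conditional density $f_{n;2|1}$ over its first argument then gives a kernel-weighted average of normal c.d.f. terms. The sketch below implements that estimate of $R_n$; the function name `fractile_transform_hat`, the bandwidths, and the simulated data are illustrative choices.

```python
import math
import random

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def fractile_transform_hat(x, sample, h1, h2):
    """Kernel estimate R_n(x) = (F_{n;1}(x1), F_{n;2|1}(x2 | x1)) for d = 2.

    Assumes a product Gaussian kernel with bandwidths h1 and h2.
    """
    x1, x2 = x
    n = len(sample)
    # F_{n;1}: integral of the one-dimensional kernel density estimate.
    u1 = sum(normal_cdf((x1 - a) / h1) for a, _ in sample) / n
    # F_{n;2|1}: integral over the second coordinate of f_{n;1,2} / f_{n;1}.
    # The bandwidth factors cancel in the ratio, leaving kernel weights in x1.
    weights = [normal_pdf((x1 - a) / h1) for a, _ in sample]
    total = sum(weights)  # can underflow to 0 if x1 is far from all the data
    u2 = sum(w * normal_cdf((x2 - b) / h2)
             for w, (_, b) in zip(weights, sample)) / total
    return u1, u2

# Simulated data (hypothetical): standard bivariate normal, correlation 0.6.
random.seed(4)
rho = 0.6
sample = []
for _ in range(4000):
    z1, z2 = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
    sample.append((z1, rho * z1 + math.sqrt(1.0 - rho * rho) * z2))

# At the centre of symmetry the true value is R_P(0, 0) = (0.5, 0.5).
u1_hat, u2_hat = fractile_transform_hat((0.0, 0.0), sample, h1=0.3, h2=0.3)
print(u1_hat, u2_hat)
```

Each further coordinate conditions on one more variable, so the effective number of observations carrying weight shrinks quickly with $d$; this is the practical face of the curse of dimensionality noted above.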
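Computation of fractile regression
Smooth estimate of fractile regression:
$\hat{m}_n(\mathbf{t}) = \sum_{i=1}^{n} Y_i W_{n,i}(\mathbf{t})$, $\mathbf{t} \in (0, 1)^d$
Nadaraya-Watson type weight: $W_{n,i}(\mathbf{t}) = \frac{K\left(\frac{\mathbf{t} - R_n(\mathbf{X}_i)}{h_n}\right)}{\sum_{j=1}^{n} K\left(\frac{\mathbf{t} - R_n(\mathbf{X}_j)}{h_n}\right)}$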
[Figure: two panels, "Y on X1 and X2" and "Y on X2 and X1".]
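The smooth estimate above is a weighted average of the responses, with weights that depend on how close each standardized covariate $R_n(\mathbf{X}_i)$ is to the target point $\mathbf{t}$. The sketch below is one way to code it, assuming a Gaussian product kernel and a single bandwidth `h`; the function name `fractile_regression_hat` and the noise-free grid data are hypothetical and serve only to make the result checkable.

```python
import math

def fractile_regression_hat(t, r_values, ys, h):
    """Nadaraya-Watson estimate sum_i Y_i W_{n,i}(t) on standardized covariates.

    t        : target point in (0, 1)^d
    r_values : standardized covariates R_n(X_i), one sequence of length d each
    ys       : responses Y_i
    h        : bandwidth (Gaussian product kernel assumed)
    """
    weights = []
    for r in r_values:
        sq = sum(((tj - rj) / h) ** 2 for tj, rj in zip(t, r))
        # The kernel's normalising constant cancels in the ratio, so it is omitted.
        weights.append(math.exp(-0.5 * sq))
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, ys)) / total

# Toy check (hypothetical data): standardized covariates on a regular 20 x 20
# grid in (0, 1)^2 and a noise-free response y = r1 + r2.
grid = [(i + 0.5) / 20 for i in range(20)]
r_values = [(a, b) for a in grid for b in grid]
ys = [a + b for a, b in r_values]

# By symmetry of the grid about (0.5, 0.5), the estimate there is 1.
m_center = fractile_regression_hat((0.5, 0.5), r_values, ys, h=0.1)
print(m_center)
```

Because the standardized covariates are approximately uniform on $(0, 1)^d$, a single bandwidth is a more natural choice here than it would be on the raw covariate scale, although the estimate near the boundary of the cube is still subject to the usual kernel boundary bias.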
Summary
Usual comparison of regression functions is not always possible or meaningful
$R_P$ achieves distributional standardization
$R_P$ has nice invariance properties, but is computationally challenging for large $d$
Alternatives: marginal standardization, centered rank function (multivariate distribution transform), etc.
Thank You!
Questions?