一些常见的可识别性
Published:
符号解释:
$Y$ : 结局变量,或者dependent variable
$A$:处理变量,或者independent variable, 假设能取两个值,0(即control),1(即treatment)
$X$:协变量
$Y(0), Y(1)$:潜在结果
$\bot$:独立
一、随机对照实验
估计目标 (Estimand):
平均因果效应(Average Treatment Effect,ATE),即$E[Y(1)-Y(0)]$
可识别性假设:
Consistency:$Y(a)=Y$, 如果$A=a$,其中$a$取值0或1
随机化实验:$Y(a)\bot A$
可识别性:
$E[Y(a)]=E[Y(a)\mid A=a]=E[Y\mid A=a]$
因此$E[Y(1)-Y(0)]=E[Y \mid A=1]-E[Y \mid A=0]$
参考文献:
Imbens, Guido W., and Donald B. Rubin. Causal inference in statistics, social, and biomedical sciences. Cambridge University Press, 2015.
二、没有未观测混杂的观察性研究
Estimand:
ATE: $E[Y(1)-Y(0)]$
可识别性假设
Consistency: $Y=Y(a)$ 如果 $A=a$
可忽略性: $(Y(0), Y(1))\bot A \mid X=x$
Positivity: P(A=a\mid X=x)>0 如果$(A,X)$ 在 $(a,x)$ 处的density大于0
可识别性:
g-formula:
\[\begin{aligned} E[Y(a)]&=E[E[Y(a) \mid X]]\\ &=\int E[Y(a) \mid X=x]f(x) dx\\ &=\int E[Y(a)\mid A=a, X=x]f(x) dx \\ &=\int E[Y\mid A=a, X=x]f(x) dx \end{aligned}\]所以
\[E[Y(1)]-E[Y(1)]\int (E[Y\mid A=a, X=x]-E[Y\mid A=0,X=x])f(x) dx\]Inverse probability weighting:
$E[Y(a)]=E[\frac{I(A=a)Y}{P(A=a\mid X)}]$
一个简单的推导:
\[\begin{aligned} E\big[\frac{I(A=a)Y}{P(A=a|X)}\big]&=E\big[E[\frac{I(A=a)Y}{P(A=a|X)}|X]\big]\\ &=E\big[\frac{1}{P(A=a|X)}E[{I(A=a)Y}|X]\big] \\ &=E\big[\frac{1}{P(A=a|X)}E[{I(A=a)Y(a)}|X]\big] \\ &=E[E[Y(a)|X]]\\ &=E[Y(a)] \end{aligned}\]因此$E[Y(1)-Y(0)]=E\big[\frac{I(A=1)Y}{P(A=1\mid X)}\big]-E\big[\frac{I(A=0)Y}{P(A=0\mid X)}\big]$。 其中,$P(A=1\mid X)$经常被称作倾向得分(propensity score),记作e(X)。这种方法也可以被叫做倾向得分加权。
倾向得分匹配/分层:
可以证明,propensity score是一个balance score:
$(Y(1),Y(0))\bot A\mid e(X)$,即在倾向得分相同的人群中,处理的分配可以看成是随机的,因此可以通过对倾向得分进行匹配或者分层的方式来估计因果效应
参考文献
G-formula:
Robins, James. “A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect.” Mathematical modelling 7.9-12 (1986): 1393-1512.
Inverse Probability weighting:
Robins, James M., Miguel Angel Hernan, and Babette Brumback. “Marginal structural models and causal inference in epidemiology.” Epidemiology 11.5 (2000): 550-560.
Propensity score:
Rosenbaum, Paul R., and Donald B. Rubin. “The central role of the propensity score in observational studies for causal effects.” Biometrika 70.1 (1983): 41-55.
Abadie, Alberto, and Guido W. Imbens. “Matching on the estimated propensity score.” Econometrica 84.2 (2016): 781-807.
三、工具变量
在观察性研究中,或者不完美的随机对照实验中,会有一些无法观测的混杂因素。此时因果效应可以通过借助工具变量进行估计。工具变量被广泛应用于经济学的研究中,但是最初是被用于估计线性结构方程模型来消除遗漏变量偏误。Angrist, Imbens, Rubin三人在JASA上的文章使用因果的框架研究了工具变量,他们的结果表明,在一定假设下complier average treatment effect是可以识别的(经济学里似乎更喜欢叫local average treatment effect),他们的可识别性结果需要借助单调性假设。工具变量的另一个研究角度是通过施加类似t同质性的假设来识别ATE。
符号:
$Y$ : 结局变量,或者dependent variable
$Z$ :工具变量,假设能取两个值,0和1
$A$:处理变量,或者independent variable, 假设能取两个值,0(即control),1(即treatment)
$X$:协变量
$Y(z,a), A(z)$: 潜在结果
Estimand:
$E[Y(1)-Y(0)|A(1)=1, A(0)=0]$
根据$A(z)$的取值,人群中一共有四类人,分别是:
- Always-taker: $A(1)=1, A(0)=1$
- Nener-taker: $A(1)=0, A(0)=0$
- Complier: $A(1)=1, A(0)=0$
- Defier: $A(0)=1, A(1)=0$
这么起名字的原因是,在Angrist等人的研究中,$Z$代表随机化的分组,而$A$代表受试者实际采用的treatment。因此将$A(1)=1, A(0)=0$的人称为依从者(Complier)。
可识别性假设
相关性: $Z\bot A$不成立
排他性:$Y(z_1,a)=Y(z_2,a)=Y(a)$
随机性: $(Y(z,a), A(z’))\bot Z$
满足假设1、2、3的变量一般被成为工具变量。此外,我们还需要假设4单调性:
$A(1)\geq A(0)$。
可识别性:
$E[Y(1)-Y(0)\mid A(1)=1, A(0)=0]=\frac{E[Y\mid Z=1]-E[Y\mid Z=0]}{E[A\mid Z=1]-E[A\mid Z=0]}$
一个简单的推导:
首先,$Z$对Y的因果作用$E[Y(1,A(1))-Y(0,A(0))]$,可以写成人群中四部分人的加权平均
\[\begin{align} E[Y(1,A(1))-Y(0, A(0))]=& E[Y(1,A(1))-Y(0, A(0))|A(0)=0,A(1)=0]P(A(0)=0,A(1)=0)\\ &+E[Y(1,A(1))-Y(0, A(0))|A(0)=1,A(1)=1]P(A(0)=1,A(1)=1)\\ &+E[Y(1,A(1))-Y(0, A(0))|A(0)=0,A(1)=1]P(A(0)=0,A(1)=1)\\ &+E[Y(1,A(1))-Y(0, A(0))|A(0)=1,A(1)=0]P(A(0)=1,A(1)=0) \end{align}\]其次,由于不存在Defier,因此$P(A(0)=1,A(1)=0)=0$,又由排他性我们知道$E[Y(1,A(1))-Y(0, A(0))\mid A(0)=0,A(1)=0]=E[Y(0)-Y(0)\mid A(0)=0,A(1)=0]=0$, $E[Y(1,A(1))-Y(0, A(0))\mid A(0)=1,A(1)=1]=E[Y(1)-Y(1)\mid A(0)=1,A(1)=1]=0$.所以
\[E[Y(1,A(1))-Y(0, A(0))] = E[Y(1,A(1))-Y(0, A(0))\mid A(0)=0,A(1)=1]P(A(0)=0,A(1)=1)\]即
\[E[Y(1,A(1))-Y(0, A(0))\mid A(0)=0,A(1)=1]=\frac{E[Y(1,A(1))-Y(0, A(0))]}{P(A(0)=0,A(1)=1)}\]又由于$Z$是随机分配的,所以$E[Y(1,A(1))-Y(0, A(0))]=E[Y\mid Z=1]-E[Y\mid Z=0]$。由单调性假设和$Z$的随机性我们知道$P(A(0)=0,A(1)=1)=P[A(1)-A(0)=1]=E[A(1)-A(0)]=E[A\mid Z=1]-E[A\mid Z=0]$,因此
\[E[Y(1)-Y(0)\mid A(1)=1, A(0)=0]=\frac{E[Y\mid Z=1]-E[Y\mid Z=0]}{E[A\mid Z=1]-E[A\mid Z=0]}\]参考文献
Angrist, Joshua D., Guido W. Imbens, and Donald B. Rubin. “Identification of causal effects using instrumental variables.” Journal of the American statistical Association 91.434 (1996): 444-455.
关于施加同质性假设从而识别ATE的可见:
Hernán, Miguel A., and James M. Robins. “Instruments for causal inference: an epidemiologist’s dream?.” Epidemiology (2006): 360-372.
Wang, Linbo, and Eric Tchetgen Tchetgen. “Bounded, efficient and multiply robust estimation of average treatment effects using instrumental variables.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 80.3 (2018): 531-550.