一些常见的可识别性

Published:

符号解释:

$Y$ : 结局变量,或者dependent variable

$A$:处理变量,或者independent variable, 假设能取两个值,0(即control),1(即treatment)

$X$:协变量

$Y(0), Y(1)$:潜在结果

$\bot$:独立

一、随机对照实验

估计目标 (Estimand):

平均因果效应(Average Treatment Effect,ATE),即$E[Y(1)-Y(0)]$

可识别性假设:

  1. Consistency:$Y(a)=Y$, 如果$A=a$,其中$a$取值0或1

  2. 随机化实验:$Y(a)\bot A$

可识别性:

$E[Y(a)]=E[Y(a)\mid A=a]=E[Y\mid A=a]$

因此$E[Y(1)-Y(0)]=E[Y \mid A=1]-E[Y \mid A=0]$

参考文献:

Imbens, Guido W., and Donald B. Rubin. Causal inference in statistics, social, and biomedical sciences. Cambridge University Press, 2015.

二、没有未观测混杂的观察性研究

Estimand:

ATE: $E[Y(1)-Y(0)]$

可识别性假设

Consistency: $Y=Y(a)$ 如果 $A=a$

可忽略性: $(Y(0), Y(1))\bot A \mid X=x$

Positivity: P(A=a\mid X=x)>0 如果$(A,X)$ 在 $(a,x)$ 处的density大于0

可识别性:

g-formula:

\[\begin{aligned} E[Y(a)]&=E[E[Y(a) \mid X]]\\ &=\int E[Y(a) \mid X=x]f(x) dx\\ &=\int E[Y(a)\mid A=a, X=x]f(x) dx \\ &=\int E[Y\mid A=a, X=x]f(x) dx \end{aligned}\]

所以

\[E[Y(1)]-E[Y(1)]\int (E[Y\mid A=a, X=x]-E[Y\mid A=0,X=x])f(x) dx\]

Inverse probability weighting:

$E[Y(a)]=E[\frac{I(A=a)Y}{P(A=a\mid X)}]$

一个简单的推导:

\[\begin{aligned} E\big[\frac{I(A=a)Y}{P(A=a|X)}\big]&=E\big[E[\frac{I(A=a)Y}{P(A=a|X)}|X]\big]\\ &=E\big[\frac{1}{P(A=a|X)}E[{I(A=a)Y}|X]\big] \\ &=E\big[\frac{1}{P(A=a|X)}E[{I(A=a)Y(a)}|X]\big] \\ &=E[E[Y(a)|X]]\\ &=E[Y(a)] \end{aligned}\]

因此$E[Y(1)-Y(0)]=E\big[\frac{I(A=1)Y}{P(A=1\mid X)}\big]-E\big[\frac{I(A=0)Y}{P(A=0\mid X)}\big]$。 其中,$P(A=1\mid X)$经常被称作倾向得分(propensity score),记作e(X)。这种方法也可以被叫做倾向得分加权。

倾向得分匹配/分层:

可以证明,propensity score是一个balance score:

$(Y(1),Y(0))\bot A\mid e(X)$,即在倾向得分相同的人群中,处理的分配可以看成是随机的,因此可以通过对倾向得分进行匹配或者分层的方式来估计因果效应

参考文献

G-formula:

Robins, James. “A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect.” Mathematical modelling 7.9-12 (1986): 1393-1512.

Inverse Probability weighting:

Robins, James M., Miguel Angel Hernan, and Babette Brumback. “Marginal structural models and causal inference in epidemiology.” Epidemiology 11.5 (2000): 550-560.

Propensity score:

Rosenbaum, Paul R., and Donald B. Rubin. “The central role of the propensity score in observational studies for causal effects.” Biometrika 70.1 (1983): 41-55.

Abadie, Alberto, and Guido W. Imbens. “Matching on the estimated propensity score.” Econometrica 84.2 (2016): 781-807.

三、工具变量

在观察性研究中,或者不完美的随机对照实验中,会有一些无法观测的混杂因素。此时因果效应可以通过借助工具变量进行估计。工具变量被广泛应用于经济学的研究中,但是最初是被用于估计线性结构方程模型来消除遗漏变量偏误。Angrist, Imbens, Rubin三人在JASA上的文章使用因果的框架研究了工具变量,他们的结果表明,在一定假设下complier average treatment effect是可以识别的(经济学里似乎更喜欢叫local average treatment effect),他们的可识别性结果需要借助单调性假设。工具变量的另一个研究角度是通过施加类似t同质性的假设来识别ATE。

符号:

$Y$ : 结局变量,或者dependent variable

$Z$ :工具变量,假设能取两个值,0和1

$A$:处理变量,或者independent variable, 假设能取两个值,0(即control),1(即treatment)

$X$:协变量

$Y(z,a), A(z)$: 潜在结果

Estimand:

$E[Y(1)-Y(0)|A(1)=1, A(0)=0]$

根据$A(z)$的取值,人群中一共有四类人,分别是:

  • Always-taker: $A(1)=1, A(0)=1$
  • Nener-taker: $A(1)=0, A(0)=0$
  • Complier: $A(1)=1, A(0)=0$
  • Defier: $A(0)=1, A(1)=0$

这么起名字的原因是,在Angrist等人的研究中,$Z$代表随机化的分组,而$A$代表受试者实际采用的treatment。因此将$A(1)=1, A(0)=0$的人称为依从者(Complier)。

可识别性假设

  1. 相关性: $Z\bot A$不成立

  2. 排他性:$Y(z_1,a)=Y(z_2,a)=Y(a)$

  3. 随机性: $(Y(z,a), A(z’))\bot Z$

满足假设1、2、3的变量一般被成为工具变量。此外,我们还需要假设4单调性:

$A(1)\geq A(0)$。

可识别性:

$E[Y(1)-Y(0)\mid A(1)=1, A(0)=0]=\frac{E[Y\mid Z=1]-E[Y\mid Z=0]}{E[A\mid Z=1]-E[A\mid Z=0]}$

一个简单的推导:

首先,$Z$对Y的因果作用$E[Y(1,A(1))-Y(0,A(0))]$,可以写成人群中四部分人的加权平均

\[\begin{align} E[Y(1,A(1))-Y(0, A(0))]=& E[Y(1,A(1))-Y(0, A(0))|A(0)=0,A(1)=0]P(A(0)=0,A(1)=0)\\ &+E[Y(1,A(1))-Y(0, A(0))|A(0)=1,A(1)=1]P(A(0)=1,A(1)=1)\\ &+E[Y(1,A(1))-Y(0, A(0))|A(0)=0,A(1)=1]P(A(0)=0,A(1)=1)\\ &+E[Y(1,A(1))-Y(0, A(0))|A(0)=1,A(1)=0]P(A(0)=1,A(1)=0) \end{align}\]

其次,由于不存在Defier,因此$P(A(0)=1,A(1)=0)=0$,又由排他性我们知道$E[Y(1,A(1))-Y(0, A(0))\mid A(0)=0,A(1)=0]=E[Y(0)-Y(0)\mid A(0)=0,A(1)=0]=0$, $E[Y(1,A(1))-Y(0, A(0))\mid A(0)=1,A(1)=1]=E[Y(1)-Y(1)\mid A(0)=1,A(1)=1]=0$.所以

\[E[Y(1,A(1))-Y(0, A(0))] = E[Y(1,A(1))-Y(0, A(0))\mid A(0)=0,A(1)=1]P(A(0)=0,A(1)=1)\]

\[E[Y(1,A(1))-Y(0, A(0))\mid A(0)=0,A(1)=1]=\frac{E[Y(1,A(1))-Y(0, A(0))]}{P(A(0)=0,A(1)=1)}\]

又由于$Z$是随机分配的,所以$E[Y(1,A(1))-Y(0, A(0))]=E[Y\mid Z=1]-E[Y\mid Z=0]$。由单调性假设和$Z$的随机性我们知道$P(A(0)=0,A(1)=1)=P[A(1)-A(0)=1]=E[A(1)-A(0)]=E[A\mid Z=1]-E[A\mid Z=0]$,因此

\[E[Y(1)-Y(0)\mid A(1)=1, A(0)=0]=\frac{E[Y\mid Z=1]-E[Y\mid Z=0]}{E[A\mid Z=1]-E[A\mid Z=0]}\]

参考文献

Angrist, Joshua D., Guido W. Imbens, and Donald B. Rubin. “Identification of causal effects using instrumental variables.” Journal of the American statistical Association 91.434 (1996): 444-455.

关于施加同质性假设从而识别ATE的可见:

Hernán, Miguel A., and James M. Robins. “Instruments for causal inference: an epidemiologist’s dream?.” Epidemiology (2006): 360-372.

Wang, Linbo, and Eric Tchetgen Tchetgen. “Bounded, efficient and multiply robust estimation of average treatment effects using instrumental variables.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 80.3 (2018): 531-550.

四、Proximal Causal Inference

五、Difference in Difference

六、Regression Discontinuity

七、Causal Inference in complex longitudinal settings