Stata/Linear Models
We generate a simple fake dataset.
clear
set obs 1000
gen u = invnorm(uniform())
gen x = invnorm(uniform())
gen y = 1 + x + u
reg y x
eret list /* gives the list of all stored results */
predict yhat /* gives the predicted value of y */
predict res, res /* gives the residuals */
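leanout is a prefix command that simplifies estimation output[1]. It omits auxiliary statistics of little use and focuses on confidence intervals rather than on null-hypothesis tests.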
ssc install leanout
leanout : reg y x
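Sometimes you want to run several regressions on the same subsample. This does not happen automatically, because an observation is dropped whenever one of the variables in the model is missing, so two models with different variables can end up using different observations. One way to guarantee a common subsample is to use e(sample), which equals 1 for every observation used in the most recent estimation and 0 otherwise. In the example below, each model is first run quietly (qui) and e(sample) is stored in the variables samp1 and samp2; the models are then re-estimated conditional on samp1 & samp2 (that is, samp1==1 & samp2==1). This ensures that both estimates are computed on exactly the same observations.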
clear
set obs 1000
gen u = invnorm(uniform())
gen x = invnorm(uniform())
gen y1 = 1 + x + u if uniform() < .8
gen y2 = 1 + x + u if uniform() < .9
qui reg y1 x
gen samp1 = e(sample)
ta samp1
qui reg y2 x
gen samp2 = e(sample)
ta samp2
eststo clear
eststo : qui : reg y1 x if samp1 & samp2
eststo : qui : reg y2 x if samp1 & samp2
esttab , star(* 0.1 ** 0.05 *** 0.01) se
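As an alternative to storing e(sample) after each model, the common sample can be flagged directly from the missing-value pattern. The following is only a sketch: it assumes the variables y1, y2 and x created above, and the variable name common is hypothetical. It matches the e(sample) approach here because neither regression uses weights or an extra if condition.

```stata
* Sketch: flag observations where none of the model variables is missing.
* "common" is a hypothetical name; y1, y2 and x come from the example above.
gen byte common = !missing(y1, y2, x)
tab common

* Re-estimate both models on the flagged observations only
eststo clear
eststo : qui reg y1 x if common
eststo : qui reg y2 x if common
esttab , se
```

The e(sample) approach is the safer choice when the models also differ in their if conditions or weights, since it records what the estimator actually used.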
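The following is a data-generating process for an instrumental-variables setting. u is correlated with x, which makes x endogenous. z is independent of u and correlated with x, which qualifies it as a valid instrument for x.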
clear
set obs 1000
gen u = invnorm(uniform())
gen z = invnorm(uniform())
gen x = invnorm(uniform()) + z + u
gen y = 1 + 2*x + u
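It is easy to see that the ordinary least squares estimate is biased, whereas the IV estimate is close to the true coefficient of 2 (strictly speaking, the IV estimator is consistent rather than exactly unbiased in finite samples).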
eststo clear
eststo : reg y x
eststo : ivreg y (x=z)
esttab , se
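The size of the OLS bias can be worked out from the data-generating process above. In this sketch, e is a label introduced here for the unnamed standard normal term in x, so that x = e + z + u with e, z and u independent and each of unit variance.

```latex
\begin{align*}
\operatorname{Var}(x) &= \operatorname{Var}(e) + \operatorname{Var}(z) + \operatorname{Var}(u) = 3,
\qquad \operatorname{Cov}(x,u) = 1, \qquad \operatorname{Cov}(z,x) = 1, \qquad \operatorname{Cov}(z,u) = 0 \\[4pt]
\operatorname{plim}\hat\beta_{OLS} &= \frac{\operatorname{Cov}(x,y)}{\operatorname{Var}(x)}
  = 2 + \frac{\operatorname{Cov}(x,u)}{\operatorname{Var}(x)} = 2 + \frac{1}{3} \approx 2.33 \\[4pt]
\operatorname{plim}\hat\beta_{IV} &= \frac{\operatorname{Cov}(z,y)}{\operatorname{Cov}(z,x)}
  = \frac{2\operatorname{Cov}(z,x) + \operatorname{Cov}(z,u)}{\operatorname{Cov}(z,x)} = 2
\end{align*}
```

With 1000 observations, the OLS coefficient on x should therefore land near 2.33 and the IV coefficient near 2, up to sampling noise.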
You can run an overidentification test with overid or ivreg2.
clear
set obs 1000
gen u = invnorm(uniform())
gen z1 = invnorm(uniform())
gen z2 = invnorm(uniform())
gen x = invnorm(uniform()) + z1 - 2*z2 + u
gen y = 2*x + u
ivreg y (x=z1 z2)
overid
ivreg2 y (x=z1 z2)
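To see what such a test computes, the Sargan statistic can be reproduced by hand: regress the IV residuals on all instruments and multiply the R-squared by the number of observations. This is a minimal sketch under the assumption of homoskedastic errors; the names uhat and sargan are hypothetical, and the single degree of freedom comes from having two instruments for one endogenous regressor.

```stata
* Sketch: Sargan overidentification statistic computed manually.
* Assumes y, x, z1 and z2 from the example above; uhat and sargan are
* hypothetical names introduced for illustration.
qui ivreg y (x = z1 z2)
predict double uhat, resid          // IV residuals y - xb

qui reg uhat z1 z2                  // auxiliary regression on the instruments
scalar sargan = e(N) * e(r2)        // N times R-squared

* degrees of freedom = number of instruments - number of endogenous regressors = 1
display "Sargan statistic = " sargan ", p-value = " chi2tail(1, sargan)
```

Because both instruments are valid by construction, the p-value should usually be large, meaning the overidentifying restriction is not rejected.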
clear
set obs 1000
local s11 = 1
local s12 = .5
local s22 = 1
local s13 = .5
local s23 = .5
local s33 = 1
forvalues k = 1/3 {
    tempvar u`k'
    gen `u`k'' = invnorm(uniform())
}
gen eta1 = `s11' * `u1'
gen eta2 = `s12' * `u1' + `s22' * `u2'
gen eta3 = `s13' * `u1' + `s23' * `u2' + `s33' * `u3'
gen x = invnorm(uniform())
forvalues k = 1/3 {
    gen z`k' = invnorm(uniform())
}
gen y1 = 1 + 2*x + z1 + eta1
gen y2 = - 1 + x + z2 + eta2
gen y3 = 4 + z3 + eta3
global eq1 = "y1 x z1"
global eq2 = "y2 x z2"
global eq3 = "y3 x z3"
reg $eq1
reg $eq2
reg $eq3
sureg (toto1 : $eq1) (toto2 : $eq2) (toto3 : $eq3)
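The block above ends with sureg, Stata's seemingly unrelated regressions estimator, whose gain over equation-by-equation reg comes from correlation between the error terms of the equations. As a worked check of the simulated error structure, with u1, u2 and u3 independent standard normal draws, the constants in the code imply the following covariance matrix (the symbol \Sigma is notation introduced here):

```latex
\begin{align*}
\eta_1 &= u_1, \qquad
\eta_2 = 0.5\,u_1 + u_2, \qquad
\eta_3 = 0.5\,u_1 + 0.5\,u_2 + u_3 \\[4pt]
\Sigma = \operatorname{Var}\begin{pmatrix}\eta_1\\ \eta_2\\ \eta_3\end{pmatrix}
&= \begin{pmatrix}
1 & 0.5 & 0.5 \\
0.5 & 1.25 & 0.75 \\
0.5 & 0.75 & 1.5
\end{pmatrix}
\end{align*}
```

The off-diagonal entries are all nonzero, so the three equations are linked through their errors even though each has its own dependent variable.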
- xtset
- xtreg
- xtabond
- xtabond2
- ivreg2
- xtivreg2
- ivendog
- ivhettest
- overid : overidentification test
- xtoverid : overidentification test
- xttest2
- ivgmm0
- xtarsim
- xtdpd
- xtdpdsys
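We assume the model y_it = 1 + x_it + z_i + f_i + u_it, as generated in the code below, where f is independent of x and z, and u is independent of x and z.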
clear
set obs 1000
gen id = _n
gen f = invnorm(uniform())
gen z = uniform()
expand 10
gen u = invnorm(uniform())
gen x = uniform()
gen y = 1 + x + z + f + u
eststo clear
eststo : qui : reg y x z
eststo : qui : reg y x z, robust
eststo : qui : reg y x z, cluster(id)
eststo : qui : xtreg y x z, i(id) re
eststo : qui : xtreg y x z, i(id) mle
eststo : qui : xtmixed y x z || id : , mle
eststo clear
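In the example above, the point estimates of all six models should be similar, because f is independent of the regressors; what differs is the standard errors. A short calculation shows why, writing v for the composite error (notation introduced here) and using the unit variances of f and u from the code:

```latex
\begin{align*}
v_{it} &= f_i + u_{it}, \qquad \operatorname{Var}(v_{it}) = \operatorname{Var}(f_i) + \operatorname{Var}(u_{it}) = 2 \\[4pt]
\operatorname{Corr}(v_{it}, v_{is}) &= \frac{\operatorname{Var}(f_i)}{\operatorname{Var}(f_i) + \operatorname{Var}(u_{it})} = \frac{1}{2}, \qquad t \neq s
\end{align*}
```

Errors within the same id are therefore correlated at 0.5, which the default and robust standard errors of reg ignore, while cluster(id) and the random-effects estimators account for it.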
The Layard and Nickell unemployment dataset.
use http://fmwww.bc.edu/ec-p/data/macro/abdata.dta, clear
* (Layard & Nickell, Unemployment in Britain, Economica 53, 1986 from Ox dist)
You can also generate fake data.
clear
set obs 10000
set seed 123456
gen id = _n
gen f= invnorm(uniform())
forvalues t=1/5{
gen u`t' = invnorm(uniform())
}
gen y1 = f/.3 + u1
forvalues t=2/5{
local z=`t'-1
gen y`t' = .7 * y`z' + f + u`t'
}
save wide, replace
reshape long y, i(id) j(year)
drop u* f
tsset id year
save long, replace
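It is easy to see that the standard random-effects and fixed-effects estimators are biased, whereas their instrumental-variable counterparts are not (more precisely, the latter are consistent).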
eststo clear
eststo : qui : xtreg y l.y, re
eststo : qui : xtreg y l.y, fe
eststo : qui : xtivreg y (l.y= l2.d.y) , re
eststo : qui : xtivreg y (l.y= l2.y) , fd
esttab ,se
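The reason the lagged dependent variable needs an instrument can be read off the simulated process, in which the autoregressive coefficient is 0.7 and u has unit variance. The following sketch spells out the argument for the first-difference case:

```latex
\begin{align*}
y_{it} &= 0.7\,y_{i,t-1} + f_i + u_{it}
&&\text{($y_{i,t-1}$ contains $f_i$, so it is correlated with the individual effect)} \\[4pt]
\Delta y_{it} &= 0.7\,\Delta y_{i,t-1} + \Delta u_{it}
&&\text{(differencing removes $f_i$)} \\[4pt]
\operatorname{Cov}(\Delta y_{i,t-1}, \Delta u_{it})
&= \operatorname{Cov}(y_{i,t-1} - y_{i,t-2},\; u_{it} - u_{i,t-1}) = -\operatorname{Var}(u_{i,t-1}) = -1
&&\text{(still endogenous)} \\[4pt]
\operatorname{Cov}(y_{i,t-2}, \Delta u_{it}) &= 0
&&\text{($y_{i,t-2}$ is a valid instrument)}
\end{align*}
```

The fixed-effects (within) estimator has a similar problem, since demeaning makes the transformed lag depend on the individual mean of u. With only five periods per id this bias does not vanish as the number of individuals grows.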
eststo clear
eststo : qui : xi : xtabond2 y l.y, gmmstyle(l.y, lag(2 .) equation(level)) nomata robust
eststo : qui : xi : xtabond2 y l.y, gmmstyle(l.y, lag(2 .) equation(level)) ivstyle( , e(diff)) nomata robust
eststo : qui : xi : xtabond2 y l.y, iv(l.y l2.y l3.y, equation(diff)) nomata robust
esttab , se
[1] Nathaniel Beck, "leanout: A prefix to regress (and similar commands) to produce less output that is more useful", Stata Journal, forthcoming. http://politics.as.nyu.edu/docs/IO/2576/sj_driver.pdf