Chapter 2: ARMA Processes
In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import acf, pacf
import statsmodels.api as sm
from statsmodels.tsa.arima.model import ARIMA
Properties of ARMA Processes¶
Let $\theta_{0}=1$ and $\varepsilon_{t}\sim\mathrm{W.N.}(\sigma^{2})$.
MA(q) process $$ y_{t}=\mu+\sum_{i=0}^{q}\theta_{i}\varepsilon_{t-i}\sim\mathrm{MA}(q) $$
AR(p) process $$ y_{t}=c+\sum_{i=1}^{p}\phi_{i}y_{t-i}+\varepsilon_{t}\sim\mathrm{AR}(p) $$
ARMA(p,q) process $$ y_{t}=c+\sum_{i=1}^{p}\phi_{i}y_{t-i}+\sum_{i=0}^{q}\theta_{i}\varepsilon_{t-i}\sim\mathrm{ARMA}(p,q) $$
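These three processes can be simulated directly, which is handy for checking the moment formulas in the table below. A minimal sketch using statsmodels' ArmaProcess; note its lag-polynomial convention (AR coefficients enter with flipped signs), and the parameter values here are illustrative, not from the text:

```python
from statsmodels.tsa.arima_process import ArmaProcess

np.random.seed(0)

# Illustrative parameters (not from the text):
# AR(1): y_t = 0.5 y_{t-1} + e_t,  MA(1): y_t = e_t + 0.8 e_{t-1}
ar1 = ArmaProcess(ar=[1, -0.5], ma=[1])          # (1 - 0.5L) y_t = e_t
ma1 = ArmaProcess(ar=[1], ma=[1, 0.8])           # y_t = (1 + 0.8L) e_t
arma11 = ArmaProcess(ar=[1, -0.5], ma=[1, 0.8])  # ARMA(1,1)

y = arma11.generate_sample(nsample=500)          # one simulated path
print(arma11.isstationary, arma11.isinvertible)  # True, True for these values
```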
| Model | Mean | Variance / autocovariance | Autocorrelation | Stationarity |
|---|---|---|---|---|
| MA(q) process | $\mathbb{E}[y_{t}]=\mu$ | $\gamma_{0}=\sigma^{2}\sum_{i=0}^{q}\theta_{i}^{2}$, $\gamma_{k}=\sigma^{2}\sum_{i=0}^{q-k}\theta_{i}\theta_{i+k}\ (1\leq k\leq q)$, $\gamma_{k}=0\ (k\geq q+1)$ | $\rho_{k}=\dfrac{\sum_{i=0}^{q-k}\theta_{i}\theta_{i+k}}{\sum_{i=0}^{q}\theta_{i}^{2}}\ (1\leq k\leq q)$, $\rho_{k}=0\ (k\geq q+1)$ | Always stationary |
| Stationary AR(p) process | $\mathbb{E}[y_{t}]=\dfrac{c}{1-\sum_{i=1}^{p}\phi_{i}}$ | $\gamma_{0}=\dfrac{\sigma^{2}}{1-\sum_{i=1}^{p}\phi_{i}\rho_{i}}$, $\gamma_{k}=\sum_{i=1}^{p}\phi_{i}\gamma_{k-i}\ (k\geq 1)$ | $\rho_{k}=\sum_{i=1}^{p}\phi_{i}\rho_{k-i}$ | Stationary (by assumption) |
| Stationary ARMA(p,q) process | $\mathbb{E}[y_{t}]=\dfrac{c}{1-\sum_{i=1}^{p}\phi_{i}}$ | $\gamma_{k}=\sum_{i=1}^{p}\phi_{i}\gamma_{k-i}\ (k\geq q+1)$ | $\rho_{k}=\sum_{i=1}^{p}\phi_{i}\rho_{k-i}\ (k\geq q+1)$ | Stationary (by assumption) |
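The MA autocorrelation formula in the table is easy to verify numerically. A minimal sketch for an MA(1) with an illustrative $\theta_{1}=0.8$ (not from the text), compared against the theoretical ACF from statsmodels:

```python
from statsmodels.tsa.arima_process import arma_acf

theta1 = 0.8                                     # illustrative value
rho1 = theta1 / (1 + theta1 ** 2)                # rho_1 from the table (theta_0 = 1)
print(rho1)
print(arma_acf(ar=[1], ma=[1, theta1], lags=3))  # [1, rho_1, 0] up to rounding
```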
Stationarity and Invertibility of ARMA Processes¶
- Stationarity of AR processes
  - For the AR characteristic equation $$ 1-\sum_{i=1}^{p}\phi_{i}z^{i}=0, $$ the AR process is stationary when every root satisfies $|z|>1$ (a numerical root check is sketched after this list).
  - Equivalently, an AR process is stationary when it admits an MA($\infty$) representation.
- Invertibility of MA processes
  - For the MA characteristic equation $$ \sum_{i=0}^{q}\theta_{i}z^{i}=0, $$ the MA process is invertible when every root satisfies $|z|>1$.
  - Equivalently, an MA process is invertible when it admits an AR($\infty$) representation.
- Stationarity and invertibility of ARMA processes
  - An ARMA process is stationary if its AR part is stationary.
  - An ARMA process is invertible if its MA part is invertible.
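As referenced above, the root conditions can be checked mechanically. A minimal sketch with illustrative AR(2) and MA(1) coefficients (not from the text), using NumPy's polynomial roots (coefficients are passed from the constant term upward):

```python
from numpy.polynomial import polynomial as P

# AR(2): 1 - phi1*z - phi2*z^2 = 0  (illustrative coefficients)
phi1, phi2 = 0.5, 0.3
ar_roots = P.polyroots([1, -phi1, -phi2])
print(np.all(np.abs(ar_roots) > 1))   # True -> stationary

# MA(1): 1 + theta1*z = 0  (illustrative coefficient)
theta1 = 0.8
ma_roots = P.polyroots([1, theta1])
print(np.all(np.abs(ma_roots) > 1))   # True -> invertible
```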
Problems¶
2.1
We study the stationarity condition of the AR(2) process $$ y_{t}=c+\phi_{1}y_{t-1}+\phi_{2}y_{t-2}+\varepsilon_{t}. $$ Its characteristic equation is $$ 1-\phi_{1}z-\phi_{2}z^{2}=0, $$ so it suffices to find the conditions under which this equation (a quadratic when $\phi_{2}\neq0$) has all of its solutions in $|z|>1$.
- When $\phi_{2}=0$, the equation reduces to $1-\phi_{1}z=0$. If $\phi_{1}=0$ there is no root and the process is white noise, hence stationary; if $\phi_{1}\neq0$ the single root is $z=1/\phi_{1}$, and $|z|=|1/\phi_{1}|>1$ holds exactly when $0<|\phi_{1}|<1$.
- When $\phi_{2}>0$, the roots are $z=\bigl(-\phi_{1}\pm\sqrt{\phi_{1}^{2}+4\phi_{2}}\bigr)/(2\phi_{2})$, one positive and one negative. We therefore need $\bigl(-\phi_{1}+\sqrt{\phi_{1}^{2}+4\phi_{2}}\bigr)/(2\phi_{2})>1$ and $\bigl(-\phi_{1}-\sqrt{\phi_{1}^{2}+4\phi_{2}}\bigr)/(2\phi_{2})<-1$; rearranging, these are equivalent to $\phi_{1}+\phi_{2}<1$ and $\phi_{2}-\phi_{1}<1$. A numerical check of these inequalities is sketched below.
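A quick sanity check of the inequalities just derived, with illustrative coefficient values (not from the text):

```python
def ar2_roots(phi1, phi2):
    """Roots of 1 - phi1*z - phi2*z**2 = 0 (phi2 != 0 assumed)."""
    return np.polynomial.polynomial.polyroots([1, -phi1, -phi2])

# satisfies phi1 + phi2 < 1 and phi2 - phi1 < 1 with phi2 > 0
print(np.abs(ar2_roots(0.3, 0.4)))   # both moduli > 1 -> stationary

# violates phi1 + phi2 < 1
print(np.abs(ar2_roots(0.8, 0.4)))   # one modulus < 1 -> not stationary
```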
In [2]:
df = pd.read_excel('data/economicdata.xls').rename(columns={'Unnamed: 0': 'date'}).set_index('date'); df
Out[2]:
| date | nikkei225 | topix | indprod | exrate | cpi | saunemp | intrate |
|---|---|---|---|---|---|---|---|
| 1975-01-01 | 3767.09 | 276.09 | 47.33 | 29.13 | 52.625 | 1.7 | 12.67 |
| 1975-02-01 | 4100.97 | 299.81 | 46.86 | 29.70 | 52.723 | 1.8 | 13.00 |
| 1975-03-01 | 4300.08 | 313.50 | 46.24 | 29.98 | 53.114 | 1.8 | 12.92 |
| 1975-04-01 | 4435.26 | 320.57 | 47.33 | 29.80 | 54.092 | 1.8 | 12.02 |
| 1975-05-01 | 4506.24 | 329.65 | 47.33 | 29.79 | 54.385 | 1.8 | 11.06 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 2004-12-01 | 11061.32 | 1110.39 | 99.91 | 87.46 | 98.109 | 4.5 | 0.00 |
| 2005-01-01 | 11394.84 | 1144.09 | 103.41 | 88.88 | 97.913 | 4.5 | 0.00 |
| 2005-02-01 | 11545.30 | 1159.71 | 101.41 | 87.72 | 97.620 | 4.6 | 0.00 |
| 2005-03-01 | 11809.38 | 1191.82 | 101.01 | 86.78 | 97.913 | 4.5 | 0.00 |
| 2005-04-01 | 11395.64 | 1158.22 | 102.71 | 85.88 | 98.011 | 4.5 | 0.00 |

364 rows × 7 columns
In [21]:
n_lag = 20
# monthly log-difference of industrial production
s = df.indprod.apply(np.log).diff().dropna()

fig, axes = plt.subplots(1, 2, figsize=[12, 6])

# sample autocorrelation function with approximate 95% band (+-1.96/sqrt(T))
axes[0].bar(np.arange(1, n_lag + 1), acf(s, nlags=n_lag)[1:])
axes[0].axhline(1.96 / np.sqrt(s.shape[0]), color='gray', ls='dashed', lw=0.5)
axes[0].axhline(-1.96 / np.sqrt(s.shape[0]), color='gray', ls='dashed', lw=0.5)
axes[0].axhline(0.0, color='gray', lw=0.5)
axes[0].set_xlabel('lag')
axes[0].set_ylabel('auto corr')

# sample partial autocorrelation function with the same band
axes[1].bar(np.arange(1, n_lag + 1), pacf(s, nlags=n_lag)[1:])
axes[1].axhline(1.96 / np.sqrt(s.shape[0]), color='gray', ls='dashed', lw=0.5)
axes[1].axhline(-1.96 / np.sqrt(s.shape[0]), color='gray', ls='dashed', lw=0.5)
axes[1].axhline(0.0, color='gray', lw=0.5)
axes[1].set_xlabel('lag')
axes[1].set_ylabel('partial auto corr')

plt.show()
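statsmodels also provides ready-made correlogram plots that draw the confidence bands automatically; a minimal sketch that reproduces roughly the same figure:

```python
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

s = df.indprod.apply(np.log).diff().dropna()
fig, axes = plt.subplots(1, 2, figsize=[12, 6])
plot_acf(s, lags=20, ax=axes[0], zero=False)    # sample ACF with confidence band
plot_pacf(s, lags=20, ax=axes[1], zero=False)   # sample PACF with confidence band
plt.show()
```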
In [48]:
# AR(4) with a constant, fitted to the log-differenced indprod series (in percent)
res = sm.tsa.AutoReg(
    df.indprod.apply(np.log).diff().dropna().values * 100,
    lags=4, trend='c'
).fit()
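The lag length need not be fixed at 4 by hand; it can also be chosen by information criterion. A minimal sketch using statsmodels' ar_select_order, with BIC and maxlag=12 as illustrative choices:

```python
from statsmodels.tsa.ar_model import ar_select_order

sel = ar_select_order(
    df.indprod.apply(np.log).diff().dropna().values * 100,
    maxlag=12, ic='bic', trend='c'
)
print(sel.ar_lags)          # lags retained by the BIC search
res_auto = sel.model.fit()  # fit the selected specification
```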
In [51]:
# information criteria per observation (divided by the number of differenced observations)
res.aic / (df.shape[0] - 1), res.bic / (df.shape[0] - 1)
Out[51]:
(3.2003689191951596, 3.264556066112321)
In [58]:
# candidate (p, d, q) specifications to compare
orders = [(4,0,0), (0,0,3), (1,0,1), (2,0,1), (1,0,2), (2,0,2)]
res_lst = []
for order in orders:
    res = ARIMA(
        df.indprod.apply(np.log).diff().dropna().values * 100,
        order=order
    ).fit()
    res_lst.append(res)
/home/yoneda/.local/lib/python3.10/site-packages/statsmodels/tsa/statespace/sarimax.py:966: UserWarning: Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.
  warn('Non-stationary starting autoregressive parameters'
/home/yoneda/.local/lib/python3.10/site-packages/statsmodels/tsa/statespace/sarimax.py:978: UserWarning: Non-invertible starting MA parameters found. Using zeros as starting parameters.
  warn('Non-invertible starting MA parameters found.'
In [64]:
for i in range(len(orders)):
    # print(f'AIC: {res_lst[i].aic / (df.shape[0] - 1):.4f}')
    print(f'BIC: {res_lst[i].bic / (df.shape[0] - 1):.4f}')
BIC: 3.3053
BIC: 3.3022
BIC: 3.3770
BIC: 3.3404
BIC: 3.2958
BIC: 3.3110
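Rather than reading off the six values by eye, the smallest-BIC specification can be picked programmatically; a minimal sketch:

```python
# index of the specification with the smallest BIC
best = int(np.argmin([res.bic for res in res_lst]))
print(orders[best], res_lst[best].bic / (df.shape[0] - 1))
```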
In [69]:
res_lst[0].summary()
Out[69]:
| Dep. Variable: | y | No. Observations: | 363 |
|---|---|---|---|
| Model: | ARIMA(4, 0, 0) | Log Likelihood | -582.230 |
| Date: | Fri, 14 Jul 2023 | AIC | 1176.460 |
| Time: | 11:19:00 | BIC | 1199.826 |
| Sample: | 0 | HQIC | 1185.748 |
| | - 363 | | |
| Covariance Type: | opg | | |

| | coef | std err | z | P>\|z\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| const | 0.2112 | 0.083 | 2.555 | 0.011 | 0.049 | 0.373 |
| ar.L1 | -0.3567 | 0.046 | -7.837 | 0.000 | -0.446 | -0.267 |
| ar.L2 | 0.0799 | 0.051 | 1.562 | 0.118 | -0.020 | 0.180 |
| ar.L3 | 0.3074 | 0.055 | 5.620 | 0.000 | 0.200 | 0.415 |
| ar.L4 | 0.2057 | 0.049 | 4.165 | 0.000 | 0.109 | 0.303 |
| sigma2 | 1.4459 | 0.100 | 14.426 | 0.000 | 1.249 | 1.642 |

| Ljung-Box (L1) (Q): | 0.04 | Jarque-Bera (JB): | 2.45 |
|---|---|---|---|
| Prob(Q): | 0.85 | Prob(JB): | 0.29 |
| Heteroskedasticity (H): | 1.26 | Skew: | -0.12 |
| Prob(H) (two-sided): | 0.20 | Kurtosis: | 3.33 |
Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
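Tying back to the stationarity condition above, one can check whether the estimated AR(4) coefficients imply a stationary process by computing the roots of the fitted characteristic equation. A minimal sketch, assuming the parameter vector is ordered (const, ar.L1–ar.L4, sigma2) as in the summary table:

```python
phi_hat = res_lst[0].params[1:5]   # ar.L1 ... ar.L4 (assumed ordering)
# roots of 1 - phi1*z - phi2*z^2 - phi3*z^3 - phi4*z^4 = 0
roots = np.polynomial.polynomial.polyroots(np.r_[1, -phi_hat])
print(np.abs(roots))               # all moduli > 1 -> estimated process is stationary
```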