lagsarlm                package:spdep                R Documentation

_S_p_a_t_i_a_l _s_i_m_u_l_t_a_n_e_o_u_s _a_u_t_o_r_e_g_r_e_s_s_i_v_e _l_a_g _m_o_d_e_l _e_s_t_i_m_a_t_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     Maximum likelihood estimation of spatial simultaneous
     autoregressive lag and mixed models of the form:


                       y = rho W y + X beta + e


     where rho is found first by 'optimize()', and beta and the other
     parameters are then found by generalized least squares
     ('optimize()' is used because one-dimensional search using
     'optim()' performs badly on some platforms). In the mixed model,
     the spatially lagged independent variables are added to X. Note
     that interpretation of the fitted coefficients should use impact
     measures, because of the feedback loops induced by the data
     generation process for this model.
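
     The two-step search described above can be sketched in base R.
     This is a minimal illustration under stated assumptions, not the
     code used inside 'lagsarlm()': the ring-shaped weights matrix, the
     simulated data and all object names ('conc_ll', 'rho_hat',
     'beta_hat') are hypothetical, and the Jacobian term is taken from
     the eigenvalues of W as for method="eigen".

```r
# Illustrative sketch only: simulated data, hypothetical names.
set.seed(1)
n <- 25
# Hypothetical weights: each zone has two neighbours on a ring,
# row-standardised so that each row sums to 1
W <- matrix(0, n, n)
for (i in 1:n) {
  W[i, ifelse(i == 1, n, i - 1)] <- 0.5
  W[i, ifelse(i == n, 1, i + 1)] <- 0.5
}
X <- cbind(1, rnorm(n))
beta_true <- c(2, 1)
rho_true <- 0.5
# Simulate y = rho W y + X beta + e via the reduced form
y <- drop(solve(diag(n) - rho_true * W,
                X %*% beta_true + rnorm(n, sd = 0.5)))
Wy <- drop(W %*% y)
ev <- eigen(W, only.values = TRUE)$values  # real, since this W is symmetric

# Log likelihood concentrated in rho: beta and s2 are profiled out
# by least squares on y - rho * Wy
conc_ll <- function(rho) {
  e <- lm.fit(X, y - rho * Wy)$residuals
  s2 <- sum(e^2) / n
  sum(log(1 - rho * ev)) - (n / 2) * log(2 * pi * s2) - n / 2
}

# Step 1: one-dimensional line search for rho
opt <- optimize(conc_ll, interval = c(-1, 0.999), maximum = TRUE)
rho_hat <- opt$maximum
# Step 2: beta by least squares given rho_hat
beta_hat <- lm.fit(X, y - rho_hat * Wy)$coefficients
```

     Here 'rho_hat' and 'beta_hat' play the roles of the 'rho' and
     'coefficients' components of the fitted object; the search
     interval is the default shown in the Usage section.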

_U_s_a_g_e:

     lagsarlm(formula, data = list(), listw, 
             na.action, type="lag", method="eigen", quiet=TRUE, 
             zero.policy=FALSE, interval=c(-1,0.999), tol.solve=1.0e-10, 
             tol.opt=.Machine$double.eps^0.5, withLL=FALSE, 
             fdHess=NULL, optimHess=FALSE, trs=NULL, 
             searchInterval=FALSE)

_A_r_g_u_m_e_n_t_s:

 formula: a symbolic description of the model to be fit. The details 
          of model specification are given for 'lm()'

    data: an optional data frame containing the variables in the model.
          By default the variables are taken from the environment from
          which the function is called.

   listw: a 'listw' object created for example by 'nb2listw'

na.action: a function (default 'options("na.action")'), can also be
          'na.omit' or 'na.exclude' with consequences for residuals and
          fitted values - in these cases the weights list will be
          subsetted to remove NAs in the data. It may be necessary to
          set zero.policy to TRUE because this subsetting may create
          no-neighbour observations. Note that only weights lists
          created without using the glist argument to 'nb2listw' may be
          subsetted.

    type: default "lag", may be set to "mixed"; when "mixed", the
          lagged intercept is dropped for spatial weights style "W",
          that is row-standardised weights, but otherwise included

  method: "eigen" (default) - the Jacobian is computed as the product 
          of (1 - rho*eigenvalue) using 'eigenw', and "spam" or
          "Matrix" for strictly symmetric weights lists of styles "B"
          and "C", or made symmetric by similarity (Ord, 1975, Appendix
          C) if possible for styles "W" and "S", using code from the
          spam or Matrix packages to calculate the determinant. 

   quiet: default=TRUE; if FALSE, reports function values during
          optimization.

zero.policy: if TRUE assign zero to the lagged value of zones without 
          neighbours, if FALSE (default) assign NA - causing
          'lagsarlm()' to terminate with an error

interval: search interval for autoregressive parameter when not using
          method="eigen"; default is c(-1,0.999); method="Matrix" will
          attempt to search for an appropriate interval

tol.solve: the tolerance for detecting linear dependencies in the
          columns of matrices to be inverted - passed to 'solve()'
          (default=1.0e-10). This may be used if necessary to extract
          coefficient standard errors (for instance lowering to 1e-12),
          but errors in 'solve()' may constitute indications of poorly
          scaled variables: if the variables have scales differing much
          from the autoregressive coefficient, the values in this
          matrix may be very different in scale, and inverting such a
          matrix is analytically possible by definition, but
          numerically unstable; rescaling the RHS variables alleviates
          this better than setting tol.solve to a very small value

 tol.opt: the desired accuracy of the optimization - passed to
          'optimize()' (default=square root of double precision machine
          tolerance)

  withLL: default FALSE; if TRUE, calculate likelihood ratio statistics
          for right hand side variables when using sparse matrix
          methods in addition to approximating the coefficient
          covariance matrix with a numerical Hessian

  fdHess: default NULL, then set to (method != "eigen") internally; use
          'fdHess' to compute an approximate Hessian using finite
          differences when using sparse matrix methods; may be used to
          make a coefficient covariance matrix when the number of
          observations is large; may be turned off to save resources if
          need be, but required for impact measures

optimHess: default FALSE, in which case 'fdHess' from 'nlme' is used;
          if TRUE, use 'optim' to calculate the Hessian at the optimum

     trs: default NULL, if given, a vector of powered spatial weights
          matrix traces output by 'trW'; when given, insert the
          asymptotic analytical values into the numerical Hessian
          instead of the approximated values; may be used to get around
          some problems raised when the numerical Hessian is poorly
          conditioned, generating NaNs in subsequent operations; the
          use of trs is recommended

searchInterval: Default FALSE; when the Matrix method is used, a search
          may be made to approximate the ends of the line search
          interval.
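
     The Jacobian term described for method="eigen" relies on the
     identity log|I - rho W| = sum(log(1 - rho * eigenvalue)). The
     following base R sketch checks that identity on a hypothetical
     4-zone, row-standardised weights matrix; the matrix and object
     names are assumptions made for illustration, and 'eigenw' is not
     called.

```r
# Hypothetical binary contiguity for four zones in a line
B <- matrix(c(0, 1, 0, 0,
              1, 0, 1, 0,
              0, 1, 0, 1,
              0, 0, 1, 0), 4, 4, byrow = TRUE)
W <- B / rowSums(B)   # row-standardised, as for style "W"
rho <- 0.4

# W is not symmetric but is similar to a symmetric matrix, so its
# eigenvalues are real; Re() discards any zero imaginary parts
ev <- Re(eigen(W, only.values = TRUE)$values)
logdet_eigen <- sum(log(1 - rho * ev))

# Direct log-determinant for comparison (costly for large n)
logdet_direct <- as.numeric(
  determinant(diag(4) - rho * W, logarithm = TRUE)$modulus)

all.equal(logdet_eigen, logdet_direct)
```

     The eigenvalues are computed once and reused for every trial
     value of rho during the line search, which is why this approach
     suits small to moderate n.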

_D_e_t_a_i_l_s:

     The asymptotic standard error of rho is only computed when
     method="eigen", because the full matrix operations involved would
     be costly for the large n typically associated with the choice of
     method="spam" or "Matrix". The same applies to the coefficient
     covariance matrix. Taken as the asymptotic matrix from the
     literature, it is typically badly scaled: the elements involving
     rho are very small, while other parts of the matrix can be very
     large (often differing by many orders of magnitude). It often
     happens that the 'tol.solve' argument needs to be set to a
     smaller value than the default, or that the RHS variables need to
     be centred or reduced in range.
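
     The effect of scaling on 'solve()' can be illustrated with a
     deliberately extreme, hypothetical 2 x 2 matrix. This is a sketch
     only: the matrix 'A' is not taken from any fitted model, and
     rescaling the matrix here stands in for rescaling the RHS
     variables before fitting.

```r
# Hypothetical badly scaled matrix: one tiny and one moderate element
A <- diag(c(1e-12, 1))

# With the default tolerance from the Usage section, solve() treats
# the matrix as computationally singular and raises an error
res_default <- tryCatch(solve(A, tol = 1e-10), error = function(e) e)
inherits(res_default, "error")

# Lowering the tolerance lets the inversion proceed ...
A_inv <- solve(A, tol = 1e-14)

# ... but rescaling to unit diagonal first avoids the problem at the
# default tolerance; the inverse is then scaled back
S <- diag(1 / sqrt(diag(A)))
A_scaled <- S %*% A %*% S
A_inv_scaled <- S %*% solve(A_scaled, tol = 1e-10) %*% S

all.equal(A_inv, A_inv_scaled)
```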

     Versions of the package from 0.4-38 include numerical Hessian
     values where asymptotic standard errors are not available. This
     change has been introduced to permit the simulation of
     distributions for impact measures. Likelihood ratio test output
     for right hand side variables may be obtained in addition by
     setting withLL=TRUE. The warnings made above with regard to
     variable scaling also apply in this case.

     Note that the fitted() function for the output object assumes that
     the response variable may be reconstructed as the sum of the
     trend, the signal, and the noise (residuals). Since the values of
     the response variable are known, their spatial lags are used to
     calculate signal components (Cressie 1993, p. 564). This differs
     from other software, including GeoDa, which does not use knowledge
     of the response variable in making predictions for the fitting
     data.
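
     The decomposition can be written out in base R. The values of
     'y', 'X', 'rho' and 'beta' below are hypothetical and not taken
     from a real fit; 'pred_reduced' shows one way of predicting
     without the observed response (the reduced form), which is an
     assumption about what the alternative looks like rather than a
     description of any particular package.

```r
# Hypothetical 4-zone, row-standardised weights matrix
B <- matrix(c(0, 1, 0, 0,
              1, 0, 1, 0,
              0, 1, 0, 1,
              0, 0, 1, 0), 4, 4, byrow = TRUE)
W <- B / rowSums(B)

# Hypothetical data and coefficient values (not from a real fit)
y <- c(10, 12, 9, 11)
X <- cbind(1, c(1, 2, 3, 4))
rho <- 0.3
beta <- c(5, 0.5)

trend  <- drop(X %*% beta)       # X beta
signal <- rho * drop(W %*% y)    # rho W y, using the observed y
fit    <- trend + signal         # fitted values as described above
res    <- y - fit                # noise (residuals)

# The response is recovered as trend + signal + noise
all.equal(y, trend + signal + res)

# A prediction that does not use the observed response:
# (I - rho W)^{-1} X beta
pred_reduced <- drop(solve(diag(4) - rho * W, trend))
```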

_V_a_l_u_e:

     A list object of class 'sarlm' 

    type: "lag" or "mixed"

     rho: simultaneous autoregressive lag coefficient

coefficients: GLS coefficient estimates

 rest.se: asymptotic standard errors if ase=TRUE, otherwise approximate
          numerical Hessian-based values

      LL: log likelihood value at computed optimum

      s2: GLS residual variance

     SSE: sum of squared GLS errors

parameters: number of parameters estimated

lm.model: the 'lm' object returned when estimating for rho=0

  method: the method used to calculate the Jacobian

    call: the call used to create this object

residuals: GLS residuals

lm.target: the 'lm' object returned for the GLS fit

fitted.values: Difference between response variable and residuals

  se.fit: Not used yet

 formula: model formula

     ase: TRUE if method="eigen"

     LLs: if ase=FALSE and withLL=TRUE (for method="spam" or "Matrix"),
          the log likelihood values of models estimated dropping each
          of the independent variables in turn, used in the summary
          function as a substitute for variable coefficient
          significance tests

  rho.se: if ase=TRUE, the asymptotic standard error of rho, otherwise
          approximate numerical Hessian-based value

  LMtest: if ase=TRUE, the Lagrange Multiplier test for the absence of
          spatial autocorrelation in the lag model residuals

  resvar: the asymptotic coefficient covariance matrix for (s2, rho, B)

zero.policy: zero.policy for this model

 aliased: the aliased explanatory variables (if any)

listw_style: the style of the spatial weights used

interval: the line search interval used to find rho

  fdHess: the numerical Hessian-based coefficient covariance matrix for
          (rho, B) if computed

optimHess: if TRUE and fdHess returned, 'optim' used to calculate
          Hessian at optimum

  insert: if TRUE and fdHess returned, the asymptotic analytical values
          are inserted into the numerical Hessian instead of the
          approximated values, and its size increased to include the
          first row/column for sigma2

LLNullLlm: Log-likelihood of the null linear model

na.action: (possibly) named vector of excluded or omitted observations
          if non-default na.action argument used


     The internal sar.lag.mixed.* functions return the value of the log
     likelihood function at rho.

_A_u_t_h_o_r(_s):

     Roger Bivand Roger.Bivand@nhh.no, with thanks to Andrew Bernat
     for contributions to the asymptotic standard error code.

_R_e_f_e_r_e_n_c_e_s:

     Cliff, A. D., Ord, J. K. 1981 _Spatial processes_, Pion; Ord, J.
     K. 1975 Estimation methods for models of spatial interaction,
     _Journal of the American Statistical Association_, 70, 120-126;
     Anselin, L. 1988 _Spatial econometrics: methods and models._
     (Dordrecht: Kluwer); Anselin, L. 1995 SpaceStat, a software
     program for the analysis of spatial data, version 1.80. Regional
     Research Institute, West Virginia University, Morgantown, WV
     (<URL: www.spacestat.com>); Anselin L, Bera AK (1998) Spatial
     dependence in linear regression models with an introduction to
     spatial econometrics. In: Ullah A, Giles DEA (eds) Handbook of
     applied economic statistics. Marcel Dekker, New York, pp. 237-289;
     Cressie, N. A. C. 1993 _Statistics for spatial data_, Wiley, New
     York.

_S_e_e _A_l_s_o:

     'lm', 'errorsarlm', 'eigenw', 'predict.sarlm', 'impacts.sarlm',
     'residuals.sarlm'

_E_x_a_m_p_l_e_s:

     data(oldcol)
     COL.lag.eig <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
      nb2listw(COL.nb, style="W"), method="eigen", quiet=FALSE)
     summary(COL.lag.eig, correlation=TRUE)
     COL.lag.eig$fdHess
     COL.lag.eig$resvar
     W <- as(as_dgRMatrix_listw(nb2listw(COL.nb)), "CsparseMatrix")
     trMatc <- trW(W, type="mult")
     COL.lag.eig1 <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
      nb2listw(COL.nb, style="W"), fdHess=TRUE, trs=trMatc)
     COL.lag.eig1$fdHess
     system.time(COL.lag.M <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
      nb2listw(COL.nb), method="Matrix", quiet=FALSE))
     summary(COL.lag.M)
     impacts(COL.lag.M, listw=nb2listw(COL.nb))
     system.time(COL.lag.M <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
      nb2listw(COL.nb), method="Matrix", quiet=FALSE, withLL=TRUE))
     summary(COL.lag.M)
     system.time(COL.lag.sp <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
      nb2listw(COL.nb), method="spam", quiet=FALSE))
     summary(COL.lag.sp)
     COL.lag.B <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
      nb2listw(COL.nb, style="B"))
     summary(COL.lag.B, correlation=TRUE)
     COL.mixed.B <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
      nb2listw(COL.nb, style="B"), type="mixed", tol.solve=1e-9)
     summary(COL.mixed.B, correlation=TRUE)
     COL.mixed.W <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
      nb2listw(COL.nb, style="W"), type="mixed")
     summary(COL.mixed.W, correlation=TRUE)
     NA.COL.OLD <- COL.OLD
     NA.COL.OLD$CRIME[20:25] <- NA
     COL.lag.NA <- lagsarlm(CRIME ~ INC + HOVAL, data=NA.COL.OLD,
      nb2listw(COL.nb), na.action=na.exclude, tol.opt=.Machine$double.eps^0.4)
     COL.lag.NA$na.action
     COL.lag.NA
     resid(COL.lag.NA)
     data(boston)
     gp2mM <- lagsarlm(log(CMEDV) ~ CRIM + ZN + INDUS + CHAS + I(NOX^2) + 
     I(RM^2) +  AGE + log(DIS) + log(RAD) + TAX + PTRATIO + B + log(LSTAT), 
     data=boston.c, nb2listw(boston.soi), type="mixed", method="Matrix")
     summary(gp2mM)
     W <- as(as_dgRMatrix_listw(nb2listw(boston.soi)), "CsparseMatrix")
     trMatb <- trW(W, type="mult")
     gp2mMi <- lagsarlm(log(CMEDV) ~ CRIM + ZN + INDUS + CHAS + I(NOX^2) + 
     I(RM^2) +  AGE + log(DIS) + log(RAD) + TAX + PTRATIO + B + log(LSTAT), 
     data=boston.c, nb2listw(boston.soi), type="mixed", method="Matrix", 
     trs=trMatb)
     summary(gp2mMi)

