mblm                  package:mblm                  R Documentation

_F_i_t_t_i_n_g _M_e_d_i_a_n-_B_a_s_e_d _L_i_n_e_a_r _M_o_d_e_l_s

_D_e_s_c_r_i_p_t_i_o_n:

     Fits a linear model using either the Theil-Sen single median
     method or the Siegel repeated medians method.

_U_s_a_g_e:

     mblm(formula, dataframe, repeated = TRUE)

_A_r_g_u_m_e_n_t_s:

 formula: A formula of type 'y ~ x' (only linear models are accepted)

dataframe: Optional data frame containing the variables in the formula

repeated: If TRUE, the model is computed using repeated medians. If
          FALSE, single median estimators are calculated

_D_e_t_a_i_l_s:

     The Theil-Sen single median method computes the slopes of the lines
     through all pairs of points whose x coordinates differ. This gives
     n(n-1)/2 slopes (the count holds only if all x values are
     distinct), and their median is taken as the slope estimator. Next,
     the intercepts of the n lines that pass through each point with
     the estimated slope are calculated, and their median is taken as
     the intercept estimator.
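
     The single median procedure described above can be sketched in a
     few lines of base R. This is only an illustration of the
     description, not the package's own code; the helper name
     'theil_sen_sketch' and the toy data are assumptions made for this
     example.

```r
# Illustrative helper (hypothetical name, not part of the mblm package).
theil_sen_sketch <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  pairs <- combn(length(x), 2)           # all index pairs i < j, one per column
  dx <- x[pairs[2, ]] - x[pairs[1, ]]
  dy <- y[pairs[2, ]] - y[pairs[1, ]]
  slopes <- dy[dx != 0] / dx[dx != 0]    # skip pairs with equal x
  slope <- median(slopes)                # slope estimator
  intercepts <- y - slope * x            # one line through each point
  list(slope = slope, intercept = median(intercepts),
       slopes = slopes, intercepts = intercepts)
}

# Toy data: y = 2x + 1, except for the last point, which is an outlier.
x <- c(1, 2, 3, 4, 5)
y <- c(3, 5, 7, 9, 100)
est <- theil_sen_sketch(x, y)
est$slope
est$intercept
```

     With five distinct x values there are 5*4/2 = 10 pairwise slopes,
     and the single outlier does not move either median.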

     The Siegel repeated medians method is more involved. For each
     point, the slopes between it and every other point are calculated
     (giving n-1 slopes) and their median is taken. This yields n
     medians, and the median of these medians is the slope estimator.
     The intercept is calculated in a similar way; for details, see the
     function source.
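
     The slope part of the repeated medians procedure can be sketched
     as follows. The helper name 'repeated_median_slope_sketch' and the
     toy data are assumptions made for this example. The intercept is
     deliberately left out, because its exact computation is deferred
     to the function source.

```r
# Illustrative helper (hypothetical name, not part of the mblm package).
repeated_median_slope_sketch <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  # For each point i: median of the slopes from i to every other point.
  per_point <- vapply(seq_along(x), function(i) {
    keep <- x != x[i]                    # drops point i itself and ties in x
    median((y[keep] - y[i]) / (x[keep] - x[i]))
  }, numeric(1))
  # The slope estimator is the median of the n per-point medians.
  list(slope = median(per_point), slopes = per_point)
}

# Toy data: y = 2x + 1, except for the last point, which is an outlier.
x <- c(1, 2, 3, 4, 5)
y <- c(3, 5, 7, 9, 100)
rm_est <- repeated_median_slope_sketch(x, y)
rm_est$slopes   # one median per point
rm_est$slope
```

     Only the per-point median of the outlying point itself is
     distorted, so the outer median still recovers the underlying
     slope.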

     The breakdown point of the Theil-Sen method is about 29%; Siegel's
     repeated medians raise it to 50%, so both regression methods are
     very robust. Additionally, if the errors are normally distributed
     and no outliers are present, the estimates are very similar to
     those from classical least squares.
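
     As a rough illustration of this robustness, the sketch below
     compares a least squares slope from 'lm' with a median of pairwise
     slopes on data containing one gross outlier. The data and variable
     names are assumptions made for this example, and the median is
     computed directly in base R rather than through 'mblm'.

```r
# Exact line y = 2x + 1 with one gross outlier at the last point.
x <- 1:20
y <- 2 * x + 1
y[20] <- 500

# Least squares slope: pulled upwards by the outlier.
ls_slope <- unname(coef(lm(y ~ x))[2])

# Median of all pairwise slopes (x values are distinct here).
idx <- combn(length(x), 2)
ts_slope <- median((y[idx[2, ]] - y[idx[1, ]]) / (x[idx[2, ]] - x[idx[1, ]]))

ls_slope
ts_slope
```

     Only 19 of the 190 pairwise slopes involve the outlier, so their
     median stays at the true slope, while the least squares slope is
     noticeably inflated.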

_V_a_l_u_e:

     An object of class 'c("mblm","lm")', containing the minimal set of
     components needed for basic operations, as for an 'lm' object. In
     addition, the returned object contains two fields:

  slopes: The pairwise slopes (single median) or the per-point medians
          of slopes (repeated medians)

intercepts: The calculated intercepts

_N_o_t_e:

     The returned object is intended to be compatible with all 'lm'
     methods, but it is not guaranteed that they will work or that
     their results will be meaningful (this method is nonparametric).
     The compatibility was introduced only so that some basic 'lm'
     methods can be used without writing new functions.

_A_u_t_h_o_r(_s):

     Lukasz Komsta, some fixes by Sven Garbade

_R_e_f_e_r_e_n_c_e_s:

     Theil, H. (1950). A rank invariant method for linear and
     polynomial regression analysis. Nederl. Akad. Wetensch. Proc. Ser.
     A 53, 386-392 (Part I), 521-525 (Part II), 1397-1412 (Part III).

     Sen, P.K. (1968). Estimates of the Regression Coefficient Based on
     Kendall's tau. J. Am. Stat. Ass. 63, 324, 1379-1389.

     Siegel, A.F. (1982). Robust Regression Using Repeated Medians.
     Biometrika, 69, 1, 242-244.

_S_e_e _A_l_s_o:

     'lm', 'summary.mblm', 'confint.mblm'

_E_x_a_m_p_l_e_s:

     set.seed(1234)
     x <- 1:100 + rnorm(100)
     y <- x + rnorm(100)
     y[100] <- 200              # introduce a single outlier
     fit <- mblm(y ~ x)         # repeated medians (the default)
     fit
     summary(fit)
     fit2 <- lm(y ~ x)          # least squares fit for comparison
     plot(x, y)
     abline(fit)
     abline(fit2, lty = 2)      # dashed line: least squares
     plot(fit)
     residuals(fit)
     fitted(fit)
     plot(density(fit$slopes))
     plot(density(fit$intercepts))
     anova(fit)
     anova(fit2)
     anova(fit, fit2)
     confint(fit)
     AIC(fit, fit2)

