bioenv                 package:vegan                 R Documentation

_B_e_s_t _S_u_b_s_e_t _o_f _E_n_v_i_r_o_n_m_e_n_t_a_l _V_a_r_i_a_b_l_e_s _w_i_t_h
_M_a_x_i_m_u_m (_R_a_n_k) _C_o_r_r_e_l_a_t_i_o_n _w_i_t_h _C_o_m_m_u_n_i_t_y _D_i_s_s_i_m_i_l_a_r_i_t_i_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     The function finds the best subset of environmental variables: the
     subset for which Euclidean distances of the scaled variables have
     the maximum (rank) correlation with community dissimilarities.

_U_s_a_g_e:

     ## Default S3 method:
     bioenv(comm, env, method = "spearman", index = "bray",
            upto = ncol(env), ...)
     ## S3 method for class 'formula':
     bioenv(formula, data, ...)

_A_r_g_u_m_e_n_t_s:

    comm: Community data frame. 

     env: Data frame of continuous environmental variables. 

  method: The correlation method used in 'cor.test'.

   index: The dissimilarity index used for community data in 'vegdist'. 

    upto: Maximum number of variables in the studied subsets.

formula, data: Model 'formula' and data.

     ...: Other parameters passed to function.

_D_e_t_a_i_l_s:

     The function calculates a community dissimilarity matrix using
     'vegdist'.  Then it selects all possible subsets of environmental
     variables, 'scale's the variables, and calculates Euclidean
     distances for this subset using 'dist'.  Then it finds the
     correlation between community dissimilarities and environmental
     distances, and for each size of subsets, saves the best result. 
     There are 2^p-1 non-empty subsets of p variables, so an exhaustive
     search may take a very long time when p is large (argument 'upto'
     offers partial relief by limiting the subset size).
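
     A minimal sketch of the search described above, written in base R
     for illustration only; it is not the vegan source code. The data
     are simulated, the helper 'bray' is a hypothetical stand-in for
     'vegdist', and the names 'comm', 'env' and 'best' are assumptions
     of this sketch.

```r
# Illustrative sketch only: simulated data, base R, not the vegan implementation.
set.seed(1)
comm <- matrix(rpois(10 * 5, lambda = 4), nrow = 10)  # hypothetical: 10 sites x 5 species
env  <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10))  # hypothetical variables

# Hand-written Bray-Curtis dissimilarity (stand-in for vegdist)
bray <- function(x) {
  n <- nrow(x)
  d <- matrix(0, n, n)
  for (i in seq_len(n)) for (j in seq_len(n))
    d[i, j] <- sum(abs(x[i, ] - x[j, ])) / sum(x[i, ] + x[j, ])
  as.dist(d)
}
comdis <- bray(comm)

p <- ncol(env)
n_subsets <- sum(choose(p, seq_len(p)))  # number of non-empty subsets, 2^p - 1

# For each subset size k, try every subset and keep the best correlation
best <- vector("list", p)
for (k in seq_len(p)) {
  subsets <- combn(p, k, simplify = FALSE)
  cors <- sapply(subsets, function(s) {
    envdis <- dist(scale(env[, s, drop = FALSE]))  # Euclidean distances of scaled variables
    cor(as.vector(comdis), as.vector(envdis), method = "spearman")
  })
  i <- which.max(cors)
  best[[k]] <- list(vars = names(env)[subsets[[i]]], cor = cors[i])
}
```

     Each element of 'best' holds the winning variable names and
     correlation for one subset size. Restricting the outer loop to
     fewer sizes would play the role of the 'upto' argument.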

     The function can be called with a model 'formula' where the LHS is
     the community data matrix and the RHS lists the environmental
     variables. The formula interface is practical for selecting or
     transforming environmental variables.
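
     How a formula can select and transform variables may be
     illustrated with base R's 'model.frame'. This is a sketch with a
     hypothetical data frame 'chem'; whether 'bioenv' evaluates its
     formula in exactly this way is an assumption.

```r
# Hypothetical environmental data; only N and P are requested, N on a log scale
chem <- data.frame(N = c(10, 20, 40), P = c(1, 2, 3), K = c(5, 6, 7))

# The right-hand side both selects variables and applies transformations
mf <- model.frame(~ log(N) + P, data = chem)
names(mf)        # columns are named after the formula terms
mf[["log(N)"]]   # the transformed variable
```

     Variables not named in the formula (here 'K') are left out of the
     resulting frame.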

     Clarke & Ainsworth (1993) suggested this method for selecting the
     best subset of environmental variables when interpreting results
     of nonmetric multidimensional scaling (NMDS).
     They recommended a parallel display of NMDS of community
     dissimilarities and NMDS of Euclidean distances from the best
     subset of scaled environmental variables.  They warned against the
     use of Procrustes analysis, but to me this looks like a good way
     of comparing these two ordinations.
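
     The comparison of the two ordinations might be sketched as
     follows, assuming the vegan and MASS packages are installed. The
     three variables used here are an arbitrary placeholder subset,
     not a result returned by 'bioenv'.

```r
# Sketch only: the variable subset below is an arbitrary placeholder.
library(vegan)
library(MASS)
data(varespec)
data(varechem)

comdis <- vegdist(wisconsin(varespec))              # community dissimilarities (Bray-Curtis)
envdis <- dist(scale(varechem[, c("N", "P", "K")])) # Euclidean distances of scaled variables

ord.comm <- isoMDS(comdis, trace = FALSE)  # NMDS of community dissimilarities
ord.env  <- isoMDS(envdis, trace = FALSE)  # NMDS of environmental distances

# Rotate the environmental ordination to fit the community ordination
pro <- procrustes(ord.comm$points, ord.env$points)
pro
```

     Calling 'plot(pro)' then displays how far each site moves between
     the two configurations.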

     Clarke & Ainsworth wrote a computer program BIO-ENV giving the
     name to the current function. Presumably BIO-ENV was later
     incorporated in Clarke's PRIMER software (available for Windows). 
     In addition, Clarke & Ainsworth suggested a novel method of rank
     correlation which is not available in the current function.

_V_a_l_u_e:

     The function returns an object of class 'bioenv' with a 'summary'
     method.

_A_u_t_h_o_r(_s):

     Jari Oksanen. The code for selecting all possible subsets was
     posted to the R mailing list by Prof. B. D. Ripley in 1999.

_R_e_f_e_r_e_n_c_e_s:

     Clarke, K. R. & Ainsworth, M. 1993. A method of linking
     multivariate community structure to environmental variables.
     _Marine Ecology Progress Series_, 92, 205-219.

_S_e_e _A_l_s_o:

     'vegdist', 'dist', 'cor' for underlying routines, 'isoMDS' for
     ordination, 'procrustes' for Procrustes analysis, 'protest' for an
     alternative, and 'rankindex' for studying alternatives to the
     default Bray-Curtis index.

_E_x_a_m_p_l_e_s:

     # The method is very slow for a large number of possible subsets.
     # Therefore only 6 variables are used in this example.
     data(varespec)
     data(varechem)
     sol <- bioenv(wisconsin(varespec) ~ log(N) + P + K + Ca + pH + Al, varechem)
     sol
     summary(sol)

