stepacross               package:vegan               R Documentation

_S_t_e_p_a_c_r_o_s_s _a_s _F_l_e_x_i_b_l_e _S_h_o_r_t_e_s_t _P_a_t_h_s _o_r _E_x_t_e_n_d_e_d _D_i_s_s_i_m_i_l_a_r_i_t_i_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     Function 'stepacross' tries to replace dissimilarities with
     shortest paths stepping across intermediate  sites while regarding
     dissimilarities above a threshold as missing data ('NA'). With
     'path = "shortest"' this is the flexible shortest path (Williamson
     1978, Bradfield & Kenkel 1987), and with 'path = "extended"' an
     approximation known as extended dissimilarities (De'ath 1999). The
     use of 'stepacross' should improve the ordination with high beta
     diversity, when there are many sites with no species in common.

_U_s_a_g_e:

     stepacross(dis, path = "shortest", toolong = 1, trace = TRUE, ...)

_A_r_g_u_m_e_n_t_s:

     dis: Dissimilarity data inheriting from class 'dist' or a an
          object, such as a matrix, that can be converted to a
          dissimilarity matrix. Functions 'vegdist' and 'dist' are some
          functions producing suitable dissimilarity data. 

    path: The method of stepping across (partial match) Alternative
          '"shortest"' finds the shortest paths, and '"extended"' 
          their approximation known as extended dissimilarities.

 toolong: Shortest dissimilarity regarded as 'NA'. The function uses a
          fuzz factor, so that dissimilarities close to the limit will
          be made 'NA', too. 

   trace: Trace the calculations.

     ...: Other parameters (ignored).

_D_e_t_a_i_l_s:

     Williamson (1978) suggested using flexible shortest paths to
     estimate dissimilarities between sites which have nothing in
     common, or no shared species. With 'path = "shortest"' function
     'stepacross' replaces dissimilarities that are 'toolong' or longer
     with 'NA', and tries to find shortest paths between all sites
     using remaining dissimilarities. Several dissimilarity indices are
     semi-metric which means that they do not obey the triangle
     inequality d[ij] <= d[ik] + d[kj], and shortest path algorithm can
     replace these dissimilarities as well, even when they are shorter
     than 'toolong'. 

     De'ath (1999) suggested a simplified method known as extended
     dissimilarities, which are calculated with 'path = "extended"'. In
     this method, dissimilarities that are 'toolong' or longer are
     first made 'NA', and then the function tries replace these 'NA'
     dissimilarities with a path through single stepping stone points.
     If not all 'NA' could be  replaced with one pass, the function
     will make new passes with updated dissimilarities as long as all
     'NA' are replaced with extended dissimilarities. This mean that in
     the second and further passes, the remaining 'NA' dissimilarities
     are allowed to have more than one stepping stone site, but
     previously replaced dissimilarities are not updated. Further, the
     function does not consider dissimilarities shorter than 'toolong',
     although some of these could be replaced with a shorter path in
     semi-metric indices, and used as a part of other paths. In optimal
     cases, the extended dissimilarities are equal to shortest paths,
     but in several cases they are longer.  

     As an alternative to defining too long dissimilarities with
     parameter 'toolong', the input dissimilarities can contain 'NA's.
     If 'toolong' is zero or negative, the function does not make any
     dissimilarities into 'NA'. If there are no 'NA's in the input  and
     'toolong = 0', 'path = "shortest"' will find shorter paths for
     semi-metric indices, and 'path = "extended"' will do nothing.
     Function 'no.shared' can be used to set dissimilarities to 'NA'.

     If the data are disconnected or there is no path between all
     points, the result will contain 'NA's and a warning is issued.
     Several methods cannot handle 'NA' dissimilarities, and this
     warning should be taken seriously. Function 'distconnected' can be
     used to find connected groups and remove rare outlier observations
     or groups of observations.

     Alternative 'path = "shortest"' uses Dijkstra's method for finding
     flexible shortest paths, implemented as priority-first search for
     dense graphs (Sedgewick 1990). Alternative 'path = "extended"'
     follows De'ath (1999), but implementation is simpler than in his
     code.

_V_a_l_u_e:

     Function returns an object of class 'dist' with extended
     dissimilarities (see functions 'vegdist' and 'dist').  The value
     of 'path' is appended to the 'method' attribute.

_N_o_t_e:

     The function changes the original dissimilarities, and not all
     like this. It may be best to  use  the function only when you
     really _must_:  extremely high beta diversity where a large
     proportion of dissimilarities are at their upper limit (no species
     in common). 

     Semi-metric indices vary in their degree of violating the triangle
     inequality. Morisita and Horn-Morisita indices of 'vegdist' may be
     very strongly semi-metric, and shortest paths can change these
     indices very much. Mountford index violates basic rules of
     dissimilarities: non-identical sites have zero dissimilarity if
     species composition of the poorer site is a subset of the richer.
     With Mountford index, you can find three sites i, j, k so that
     d[ik] = 0 and d[jk] = 0, but d[ij] > 0. The results of
     'stepacross' on Mountford index can be very weird. If 'stepacross'
     is needed, it is best to try to use it with more metric indices
     only.

_A_u_t_h_o_r(_s):

     Jari Oksanen

_R_e_f_e_r_e_n_c_e_s:

     Bradfield, G.E. & Kenkel, N.C. (1987). Nonlinear ordination using
     flexible shortest path adjustment of ecological distances.
     _Ecology_ 68, 750-753.

     De'ath, G. (1999). Extended dissimilarity: a method of robust
     estimation of ecological distances from high beta diversity data.
     _Plant Ecol._ 144, 191-199.

     Sedgewick, R. (1990). _Algorithms in C_. Addison Wesley. 

     Williamson, M.H. (1978). The ordination of incidence data. _J.
     Ecol._ 66, 911-920.

_S_e_e _A_l_s_o:

     Function 'distconnected' can find connected groups in disconnected
     data, and function 'no.shared' can be used to set dissimilarities
     as 'NA'.

_E_x_a_m_p_l_e_s:

     # There are no data sets with high beta diversity in vegan, but this
     # should give an idea.
     data(dune)
     dis <- vegdist(dune)
     edis <- stepacross(dis)
     plot(edis, dis, xlab = "Shortest path", ylab = "Original")
     ## Manhattan distance have no fixed upper limit.
     dis <- vegdist(dune, "manhattan")
     is.na(dis) <- no.shared(dune)
     dis <- stepacross(dis, toolong=0)

