cosine                  package:lsa                  R Documentation

_C_o_s_i_n_e _M_e_a_s_u_r_e (_M_a_t_r_i_c_e_s)

_D_e_s_c_r_i_p_t_i_o_n:

     Calculates the cosine measure between two vectors or between all
     column vectors of a matrix.

_U_s_a_g_e:

     cosine(x, y = NULL)

_A_r_g_u_m_e_n_t_s:

       x: A vector or a matrix (e.g., a document-term matrix).

       y: Optional: a vector with dimensions compatible with 'x'. If
          'NULL', the cosine is computed between all pairs of column
          vectors of 'x'.

_D_e_t_a_i_l_s:

     'cosine()' calculates a similarity matrix between all column
     vectors of a matrix 'x'. This matrix might be a document-term
     matrix, so columns would be expected to be documents and rows to
     be terms.

     When executed on two vectors 'x' and 'y', 'cosine()' calculates
     the cosine similarity between them.
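
     As a rough illustration of the two-vector case, the sketch below
     computes the cosine in base R as the dot product divided by the
     product of the Euclidean norms. The helper name 'cosine_sketch'
     is hypothetical and not part of the lsa package; it only mirrors
     the standard definition and may differ from the package's own
     implementation.

```r
# Hypothetical helper (not part of lsa): cosine of the angle between
# two numeric vectors of equal length, per the standard definition.
cosine_sketch <- function(x, y) {
  stopifnot(is.numeric(x), is.numeric(y), length(x) == length(y))
  sum(x * y) / sqrt(sum(x^2) * sum(y^2))
}

a <- c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0)
b <- c(0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0)

# The vectors share one nonzero position, and have 3 and 6 nonzero
# entries respectively, so the result is 1 / sqrt(3 * 6).
cosine_sketch(a, b)
```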

_V_a_l_u_e:

     Returns an n x n similarity matrix of cosine values, comparing
     all n column vectors against each other. When executed on two
     vectors, their cosine similarity value is returned.
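
     To make the shape of the matrix result concrete, here is a
     minimal base R sketch of a pairwise column cosine. The helper
     'cosine_matrix_sketch' and the toy matrix 'm' are assumptions
     made for illustration and are not taken from the lsa package.

```r
# Hypothetical helper (not part of lsa): pairwise cosine between the
# columns of a numeric matrix.
cosine_matrix_sketch <- function(m) {
  stopifnot(is.matrix(m), is.numeric(m))
  norms <- sqrt(colSums(m^2))          # Euclidean norm of each column
  crossprod(m) / outer(norms, norms)   # t(m) %*% m scaled by norm products
}

# Three toy "documents" (columns) over four "terms" (rows).
m <- cbind(d1 = c(1, 1, 0, 0),
           d2 = c(0, 1, 1, 0),
           d3 = c(1, 1, 0, 0))

sim <- cosine_matrix_sketch(m)
dim(sim)           # one row and one column per document
isSymmetric(sim)   # cosine does not depend on argument order
diag(sim)          # each column compared with itself
```

     With n columns the result has n rows and n columns, is
     symmetric, and has ones on the diagonal (up to floating-point
     rounding), provided no column is all zeros.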

_N_o_t_e:

     The cosine measure is closely related to the Pearson correlation
     coefficient ('cor(method="pearson")'): the Pearson correlation
     equals the cosine of the mean-centred vectors. For an
     investigation of the differences in the context of text mining,
     see Leydesdorff (2005).
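
     One way to see how the two measures relate is to compare them on
     a small example. The sketch below uses a hypothetical helper
     'cos_sim' (not part of lsa) and made-up vectors; it relies only
     on the standard definitions, under which the Pearson correlation
     is the cosine of the vectors after subtracting their means.

```r
# Hypothetical helper (not part of lsa): cosine per the standard
# definition.
cos_sim <- function(x, y) sum(x * y) / sqrt(sum(x^2) * sum(y^2))

# Made-up data for illustration only.
x <- c(2, 4, 6, 9)
y <- c(1, 3, 2, 8)

cos_sim(x, y)                        # cosine of the raw vectors
cor(x, y, method = "pearson")        # Pearson correlation
cos_sim(x - mean(x), y - mean(y))    # cosine after mean-centring

# The last two values agree up to floating-point tolerance.
all.equal(cor(x, y), cos_sim(x - mean(x), y - mean(y)))
```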

_A_u_t_h_o_r(_s):

     Fridolin Wild fridolin.wild@wu-wien.ac.at

_R_e_f_e_r_e_n_c_e_s:

     Leydesdorff, L. (2005) _Similarity Measures, Author Cocitation
     Analysis, and Information Theory_. In: JASIST 56(7), pp. 769-772.

_S_e_e _A_l_s_o:

     'cor'

_E_x_a_m_p_l_e_s:

     ## the cosine measure between two vectors

     vec1 = c( 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0 )
     vec2 = c( 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0 )
     cosine(vec1,vec2) 

     ## the cosine measure for all document vectors of a matrix

     vec3 = c( 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0 )
     mat = cbind(vec1, vec2, vec3)   # named 'mat' so base::matrix is not shadowed
     cosine(mat)

