lung                 package:pvclust                 R Documentation

_D_N_A _M_i_c_r_o_a_r_r_a_y _D_a_t_a _o_f _L_u_n_g _T_u_m_o_r_s

_D_e_s_c_r_i_p_t_i_o_n:

     DNA microarray data for 73 lung tissue samples, including 67 lung
     tumors. Expression values for 916 genes are recorded for each
     tissue sample.

_U_s_a_g_e:

     data(lung)

_F_o_r_m_a_t:

     A data frame with 916 rows (genes) and 73 columns (lung tissues).
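
     The stated shape can be checked directly after loading the data.
     The following is a minimal sketch, assuming the pvclust package is
     installed; it also counts any missing expression values, which is
     worth knowing before choosing a distance measure.

```r
# Sketch: confirm the documented shape of 'lung' (assumes pvclust is installed).
if (requireNamespace("pvclust", quietly = TRUE)) {
  data(lung, package = "pvclust")
  stopifnot(is.data.frame(lung))
  print(dim(lung))         # rows are genes, columns are lung tissues
  print(sum(is.na(lung)))  # number of missing expression values, if any
}
```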

_D_e_t_a_i_l_s:

     This dataset has been modified from the original data: for each
     gene that appeared in duplicate, one of the observations has been
     removed. See the 'Source' section of this help page for the
     original data source.
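
     This help page does not say how the duplicates were removed or
     which observation was kept. As an illustration only, the sketch
     below removes duplicate genes from a toy table by keeping the first
     occurrence of each gene name. The table, its 'gene' column and the
     keep-first rule are all assumptions, not the procedure used to
     build 'lung'.

```r
# Toy expression table; the 'gene' column and all values are hypothetical.
expr <- data.frame(
  gene = c("g1", "g2", "g2", "g3"),
  t1   = c(0.5, -1.2, -1.0,  0.3),
  t2   = c(0.1,  0.8,  0.9, -0.4)
)

# duplicated() flags the second and later occurrences of each gene name,
# so negating it keeps exactly one row per gene (the first one).
dedup <- expr[!duplicated(expr$gene), ]

# Use gene names as row names and drop the helper column, giving a
# genes-by-tissues layout like the one described for 'lung'.
rownames(dedup) <- dedup$gene
dedup$gene <- NULL

print(dim(dedup))
```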

_S_o_u_r_c_e:

     <URL: http://genome-www.stanford.edu/lung_cancer/adeno/>

_R_e_f_e_r_e_n_c_e_s:

     Garber, M. E. et al. (2001) "Diversity of gene expression in
     adenocarcinoma of the lung", _Proceedings of the National Academy
     of Sciences_, 98, 13784-13789.

_E_x_a_m_p_l_e_s:

     ## Reading the data
     data(lung)

     ## Multiscale Bootstrap Resampling
     lung.pv <- pvclust(lung, nboot=100)

     ## CAUTION: nboot=100 may be too small for actual use.
     ##          We suggest nboot=1000 or larger.
     ##          plot/print functions will be useful for diagnostics.

     ## Plot the result
     plot(lung.pv, cex=0.8, cex.pv=0.7)

     ask.bak <- par()$ask
     par(ask=TRUE)

     pvrect(lung.pv, alpha=0.9)
     msplot(lung.pv, edges=c(51,62,68,71))

     par(ask=ask.bak)

     ## Print a cluster with high p-value
     lung.pp <- pvpick(lung.pv, alpha=0.9)
     lung.pp$clusters[[2]]

     ## Print its edge number
     lung.pp$edges[2]
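
     ## A hedged sketch of the diagnostics mentioned earlier, assuming
     ## lung.pv from the pvclust() call above: print() lists the AU/BP
     ## p-values for each edge, and seplot() plots p-values against
     ## their standard errors so that poorly estimated edges stand out.
     print(lung.pv, digits=3)
     seplot(lung.pv)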

     ## We recommend parallel computing for large datasets such as this one
     ## Not run: 
     library(snow)
     cl <- makeCluster(10, type="MPI")
     lung.pv <- parPvclust(cl, lung, nboot=1000)
     stopCluster(cl)  # release the worker processes when finished
     ## End(Not run)

