cd                 package:mvbutils                 R Documentation

_O_r_g_a_n_i_z_i_n_g _R _w_o_r_k_s_p_a_c_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     'cd' allows you to set up and move through a
     hierarchically-organized set of R workspaces, each corresponding
     to a directory. While you are working at any level of the
     hierarchy, all higher levels are attached on the search path, so
     you can see objects in the "parents". You can easily switch
     between workspaces in the same session, you can move objects
     around in the hierarchy, and you can do several hierarchy-wide
     things such as searching, even on parts of the hierarchy that
     aren't currently attached.

_U_s_a_g_e:

      cd()
      cd(to)
      cd(to, execute.First = TRUE, execute.Last = TRUE)

_A_r_g_u_m_e_n_t_s:

      to: the path of a task to move to or create, as an unquoted
          string. If omitted, you'll be given a menu. See DETAILS.

 execute.First: should the '.First.task' code be executed on
          attachment? Defaults to 'TRUE'; set to 'FALSE' only to skip
          code that has a bug in it.

 execute.Last: should the '.Last.task' code be executed on detachment?
          Defaults to 'TRUE'; set to 'FALSE' only to skip code that
          has a bug in it.

_D_e_t_a_i_l_s:

     R workspaces can become very cluttered, so that it becomes
     difficult to keep track of what's what (I have seen workspaces
     with over 1000 objects in them...). If you work on several
     different projects, it can be awkward to work out where to put
     "shared" functions, or to remember where things are, if you come
     back to a project after some months away. And if you just want to
     test out a bit of code without leaving permanent clutter, but
     while still being able to "see" your important objects, how do you
     do it? 'cd' helps with all such problems, by letting you organize
     all your projects into a single tree structure, regardless of
     where they are stored on disk. Each workspace is referred to (for
     historical reasons) as a "task".

     To use the 'cd' system, you will need to start R in the _same_
     workspace every time. This will become your ROOT or home task,
     from which all other tasks stem. There need not be much in this
     workspace except for an object called 'tasks' (see below), though
     you can use it for shared functions that you don't want to
     organize into a package or quasi-package. From the ROOT task, your
     first action in a new R session will normally be to use 'cd' to
     switch to a real task. The 'cd' command is used both to switch
     between existing tasks, and to create new ones.

     To set yourself up for working with 'cd', it's probably a good
     idea to make the ROOT task a completely new blank workspace. [In
     MS-Windows, I'd suggest putting it near the top of the disk
     directory structure, too.] Start R in this workspace, type
     'library( mvbutils)', and then start linking your existing
     projects into the hierarchy. To link in a project, just type
     'cd()' and a menu will appear. The first time, there will be only
     one option: "CREATE NEW TASK". Select it (or type 0 to quit if you
     are feeling nervous), and you will be prompted for a "task name",
     by which R will always subsequently refer to the task. Keep the
     name short; it doesn't have to be related to the location of the
     disk directory where the .Rdata lives. Avoid spaces and weird
     characters; use periods as separators. Task names are
     case-sensitive. Next, you'll be asked which disk directory this
     task refers to. By default, 'cd' expects that you are creating a
     new task, and therefore suggests putting the directory immediately
     below the current task directory. However, if you are linking in
     an existing project, you'll need to supply the directory name.
     Next, you'll be returned to the R command prompt- but the prompt
     will have changed, so that the ">" is preceded by the task name.
     If you type 'search()', you'll see your ROOT task in position 2,
     below .GlobalEnv as usual. Despite the name, though, the new
     .GlobalEnv contains the project you've just linked, and if you
     type 'ls()', you should see some familiar objects. Now type
     'cd(0)' to move back to the ROOT task (note the changed prompt),
     type 'search()' and 'ls()' again to orient yourself, and proceed
     as before to link the rest of your pre-existing tasks into the
     hierarchy. When you type 'cd()', the menu will have more choices.
     If you select an existing task rather than creating a new one, you
     will switch straightaway to that workspace; watch the prompt.

     Once you have a hierarchy set up, you can switch the current
     workspace within the hierarchy by calling e.g. 'cd(existing.task)'
     (note the lack of quotes), or by calling 'cd()' and picking off
     the menu. You can move through several levels of the hierarchy at
     once, using a path specifier such as 'cd(mytask/data/funcs)' or
     'cd(../child.of.sibling)'. Path specifiers are just like Unix or
     DOS disk paths with "/" as the separator, so that "." means
     "current task" and ".." means "parent". However, the character 0
     must be used to denote the ROOT task, so that you have to type
     'cd(0/different.task)' rather than 'cd(/different.task)'. You can
     display the entire hierarchy by calling 'cdtree(0)', or graphically
     via 'plot( cdtree( 0))'.
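
     The path rules above can be sketched in plain R. This is only an
     illustration of the semantics, not how 'cd' is implemented:
     'resolve_task_path' is a hypothetical helper, and it takes a
     quoted string, whereas 'cd' itself takes an unquoted path. The
     current location is assumed to be a character vector of task
     names starting just below ROOT.

```r
# Hypothetical helper: apply a "/"-separated path specifier to the
# current location (a character vector of task names below ROOT).
resolve_task_path <- function(current, spec) {
  for (part in strsplit(spec, "/", fixed = TRUE)[[1]]) {
    if (part == "0") {
      current <- character(0)        # 0 denotes the ROOT task
    } else if (part == "..") {
      current <- head(current, -1)   # ".." means "parent"
    } else if (part != ".") {
      current <- c(current, part)    # "." means "current task": no change
    }
  }
  current
}

here <- c("mytask", "data")
resolve_task_path(here, "../child.of.sibling")  # "mytask" "child.of.sibling"
resolve_task_path(here, "0/different.task")     # "different.task"
```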

     When you first set up your task hierarchy, you'll also want to
     create or modify the '.First' function in your ROOT task. At a
     minimum, this should call 'library( mvbutils)', but you may also
     want to set some options controlling the behaviour of 'cd' (see
     the OPTIONS section). If you use other features of 'mvbutils' such
     as the function-editing interface, there will be further options
     to be set in '.First'.
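
     A minimal '.First' for the ROOT task might look like the sketch
     below. It assumes 'mvbutils' is installed and that you want the
     'write.mvb.tasks' option described under OPTIONS; which options
     you set is up to you. The function must be saved in the ROOT
     workspace so that R runs it at startup.

```r
# Sketch of a ROOT-task '.First'; R calls this when the workspace is restored.
.First <- function() {
  library(mvbutils)                # the minimum needed for 'cd'
  options(write.mvb.tasks = TRUE)  # optional; see the OPTIONS section
}
```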

     You can create a fully hierarchical structure, with subtasks
     within subtasks within tasks, etc. Even if your projects don't
     naturally look like this, you may find the facility useful. When I
     create a new project, I tend to start with just one level of
     hierarchy, containing data, function code, and results. When this
     gets unspeakably messy, I often create one (or more) subtasks,
     usually putting the basic data at the top level, and functions and
     results at the lower level. Apart from tidiness, this provides
     some degree of protection against overwriting the original data.
     And when even this gets too messy (in one task, I have more than
     150 functions, and it is very easy to generate 100s of analysis
     results), I create another level, keeping "established" functions
     at the second tier and using the third tier for temporary
     workspace and results. There are no hard-and-fast rules here, of
     course, and different people use R in very different ways.

     A task can have '.First.task' and/or '.Last.task' functions, which
     get called immediately after 'cd'ing into the task from its
     parent, or immediately before 'cd'ing back to its parent,
     respectively (see ARGUMENTS). These can be useful for dynamic
     loading, etc., and facilitate the use of tasks as informal
     libraries/packages (see also 'flatdoc').
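
     As a sketch, a task's hook functions could be as simple as the
     following. The argument list that 'cd' passes to these functions
     is not described here, so the sketch accepts '...' and ignores
     it; the messages are placeholders for whatever setup and
     cleanup (e.g. dynamic loading) the task needs.

```r
# Called after 'cd'ing into the task from its parent.
.First.task <- function(...) {
  message("Entering task: doing task-specific setup")
}

# Called before 'cd'ing back to the parent.
.Last.task <- function(...) {
  message("Leaving task: doing task-specific cleanup")
}
```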

     Formal and semi-formal packages can also be tasks. For example, my
     'mvbutils', 'debug' and 'handy' packages are loaded by 'library'
     when I start R (in my '.First'), but are also linked into my task
     hierarchy (see also 'mlibrary'). This lets me 'cd(debug)' when I
     intend to create or remove a number of functions in 'debug'. It's
     also useful to be able to have subtasks below such packages, e.g.
     so that probably-obsolete functions can be moved there pending a
     final decision. When you 'cd' to something that's already attached
     as a package, 'search()' will show e.g. "PLACEHOLDER:debug" in the
     position formerly occupied by the package; when you 'cd' back up,
     the package will return to the placeholder slot.

     With the 'cd' approach, the idea is really that everything
     (including functions) is stored in a single .Rdata file, rather
     than as separate files to be 'source'd in when you switch to a
     project; this reflects my own preference for how to work. Before
     'cd' moves to a new project, either up or down the hierarchy, the
     current workspace is automatically 'save.image'd (using default
     arguments). Nevertheless, I don't think 'cd' is incompatible with
     other ways of working, as long as the .Rdata file (actually the
     'tasks' object) is not destroyed from session to session. At any
     rate, some people who work by 'source'ing large code files still
     seem to find 'cd' useful. With the .Rdata-only approach, it is
     highly advisable to have some way of keeping separate text
     backups, at least of function code. The 'fixr' editing system is
     geared up to this, and I presume other systems such as ESS are
     too.

_O_p_t_i_o_n_s:

     Various 'options()' can be set, as follows. Remember to put these
     into your '.First' function, too.

     'write.mvb.tasks=TRUE' causes a sourceable text representation of
     the 'tasks' object to be maintained in each directory, in the file
     'tasks.r'. This helps in case you accidentally wipe out the .Rdata
     file and lose track of where the child tasks live. To create these
     text representations for the first time throughout the hierarchy,
     call 'cd.write.mvb.tasks(0)'. You need to put the 'options'
     call in your '.First'.
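
     To show what a "sourceable text representation" means, the
     base-R sketch below writes a made-up 'tasks' vector to a text
     file with 'dump' and restores it with 'source'. The task names
     and directories are hypothetical, and the exact file contents
     that 'mvbutils' writes may differ.

```r
# Hypothetical 'tasks' object: names are task names, elements are directories.
tasks <- c(proj.a = "/home/me/projA", proj.b = "/home/me/projB")

# Write a text version that can be read back with source().
txt_file <- file.path(tempdir(), "tasks.r")
dump("tasks", file = txt_file)

# Simulate losing the object, then recover it from the text file.
rm(tasks)
source(txt_file, local = TRUE)
print(tasks)
```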

     'options(abbreviate.cdprompt=n)' controls the length of the
     prompt string. Only the first 'n' characters of all ancestral task
     names will be shown. For example, 'n=1' would replace the prompt
     'long.task.name/data/funcs>' with 'l/d/funcs>'.
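
     The abbreviation rule can be sketched as follows. The helper
     'abbreviate_prompt' is hypothetical and only mimics the
     behaviour described above: every ancestral task name is cut to
     its first 'n' characters, and the current task name is left
     whole.

```r
# Hypothetical helper: shorten all but the last component of a task path.
abbreviate_prompt <- function(task_path, n) {
  parts <- strsplit(task_path, "/", fixed = TRUE)[[1]]
  k <- length(parts)
  if (k > 1) {
    parts[-k] <- substr(parts[-k], 1, n)  # ancestors only
  }
  paste0(paste(parts, collapse = "/"), ">")
}

abbreviate_prompt("long.task.name/data/funcs", 1)  # "l/d/funcs>"
```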

_N_o_t_e:

     'cd' always calls 'save.image' before attaching a child task on
     top or moving back up the hierarchy. 'cd' also calls 'setwd' so
     that file searches will default to the task directory (see also
     'task.home').

     'cd' is only meant to be called interactively, and has only been
     tested in that context.

     The mechanism underlying the tree structure is very simple: each
     task that has any subtasks will contain a character vector called
     'tasks', whose names are the R names of the tasks, and whose
     elements are the corresponding disk directories. Your ROOT task
     need contain no more than a '.First' function and a 'tasks'
     object. If you decide to move a disk directory, you can manually
     change the corresponding element of 'tasks'. If you are moving a
     whole task hierarchy, e.g. when migrating to a new machine,
     consult 'cd.change.all.paths'.
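
     For illustration, a 'tasks' object and a manual change of one
     directory might look like this. The task names and paths are
     made up; in practice 'cd' creates and maintains the vector for
     you.

```r
# Named character vector: names are task names, elements are directories.
tasks <- c(fish.model = "d:/r/fish.model",
           survey     = "d:/work/survey2003")

# After moving the 'survey' directory on disk, update its entry by name.
tasks["survey"] <- "e:/archive/survey2003"

names(tasks)  # "fish.model" "survey"
```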

     'cd' will issue a warning and refuse to move back up the hierarchy
     if it detects a non-task attached in position 2. You will need to
     manually detach any such objects before 'cd'ing back up. 'cd' is
     not really designed to work with 'attach', and fans of the latter
     may encounter problems; if so, please let me know.

     To make sure that 'library' always loads packages below ROOT, the
     '.First.lib' code in 'mvbutils' makes a minor hack to 'library',
     setting the default 'pos' argument to call 'lib.pos()'.

     Two objects in the 'mvb.session.info' environment (see 'search()')
     help keep track of what parts of the hierarchy are currently
     attached: '.First.top.search' and '.Path'. The former is set when
     'mvbutils' loads, and the latter is updated by 'cd'. Attached
     tasks can be identified by having a 'path' attribute consisting of
     a NAMED character vector. Normal packages also have a 'path'
     attribute, but lacking names.
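
     The named-versus-unnamed distinction can be checked with a few
     lines of base R. 'looks_like_task' is a hypothetical helper, and
     the "task" environment below is faked purely for illustration
     rather than attached by 'cd'.

```r
# Hypothetical check: a task has a 'path' attribute that is a NAMED
# character vector; a normal package's 'path' attribute has no names.
looks_like_task <- function(env) {
  p <- attr(env, "path")
  is.character(p) && !is.null(names(p))
}

# An ordinary attached package does not qualify.
looks_like_task(as.environment("package:stats"))  # FALSE

# A stand-in environment carrying a named 'path' attribute does.
fake_task <- new.env()
attr(fake_task, "path") <- c(my.task = tempdir())
looks_like_task(fake_task)                        # TRUE
```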

_A_u_t_h_o_r(_s):

     Mark Bravington

_S_e_e _A_l_s_o:

     'move', 'task.home', 'cdtree', 'cdfind', 'cditerate',
     'cd.change.all.paths', 'cd.write.mvb.tasks', 'cdprompt', 'fixr'

