R Workspace management

R (the R Project) is an open source scripting language used for statistical computing and data analysis. A wealth of graphics packages helps in visualizing big data. Many vibrant online communities are involved in developing packages, which adds to the strength of R. Google's R Style Guide defines coding standards for R.

R has an interactive GUI window: when R commands are typed in, results are returned immediately. This helps rapid development and eliminates the need for compilation. Sometimes the interactive prompt is inconvenient, for example when a set of commands needs to be repeated with slight modifications. R has a solution for this: the source function. The required commands are written in a .R file, which is then sourced. The source function executes the commands in the .R file one by one and exits once the end of the file is reached.
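As a minimal sketch of sourcing a file, the example below writes two commands to a temporary .R file and then runs them with source(). The temporary file and the variable names x and avg are only illustrative; in practice the script would be a hand-written file.

```r
# Write two commands to a temporary .R file (a stand-in for a hand-written script).
script <- tempfile(fileext = ".R")
writeLines(c(
  "x <- c(2, 4, 6)",
  "avg <- mean(x)"
), script)

# source() evaluates the commands in the file one by one, top to bottom.
source(script)

# Objects created by the script are now available in the workspace.
print(avg)  # [1] 4
```

To repeat the run with a slight modification, edit the file and call source() again.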

The source function is fine for a medium-sized project, but as a project grows, modularity becomes essential for easy maintenance and debugging. Large projects need a standard way of handling files in addition to source. The following conventions help.

1. Write one function per file. The file name and the function name should match, which makes each function easy to locate.

2. Print the version of the file before the function definition, so that the version of the code can be verified while it is being sourced.

3. Include plenty of comments at the beginning of the file: the inputs and outputs of the function, a description of the function, and a change log stating what changed in each version.
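A file following points 1 to 3 might look like the sketch below. The function ComputeMean, its version numbers, and the header layout are hypothetical and only illustrate the convention.

```r
# File        : ComputeMean.R (file name matches the function name)
# Description : Returns the arithmetic mean of a numeric vector, ignoring NA values.
# Input       : values - numeric vector
# Output      : a single numeric value
# Change log  :
#   1.0 - initial version
#   1.1 - ignore NA values

# Printed each time the file is sourced, so the loaded version can be verified.
cat("ComputeMean.R version 1.1\n")

ComputeMean <- function(values) {
  mean(values, na.rm = TRUE)
}
```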

4. Keep everything under a standard RWorkspace directory, laid out as follows.

Lib: Packages used by the code can be stored here, so that when the code is moved to a new system, development or use can start immediately without searching the internet for packages.
RSource: Contains one folder per project. Each project folder holds the R code, with the required data files in a Data subdirectory.
LoadProjectName.R: The LoadProjectName function defined in this file loads (sources) all other functions needed for the project and sets the required environment variables. Calling it loads all updated code into the R environment. This saves a lot of effort and development time, because in a fairly large project it is a common mistake to update a file and forget to load it into the environment.
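One possible shape for the loader is sketched below. It assumes that the project folder is RSource/ProjectName, that the "environment variable" is a global variable named basePath, and that every .R file in the project folder should be sourced; the original layout may differ in these details. A throwaway workspace is built first so that the sketch runs anywhere.

```r
# Build a temporary workspace with one project file (for demonstration only).
workspace  <- file.path(tempdir(), "RWorkspace")
projectDir <- file.path(workspace, "RSource", "ProjectName")
dir.create(file.path(projectDir, "Data"), recursive = TRUE, showWarnings = FALSE)
writeLines('Main <- function() "Main ran"', file.path(projectDir, "Main.R"))

LoadProjectName <- function(workspacePath = getwd()) {
  # Assumption: helper functions read the project location from a global basePath.
  basePath <<- file.path(workspacePath, "RSource", "ProjectName")

  # Source every .R file in the project folder so that edited code is reloaded.
  files <- list.files(basePath, pattern = "\\.R$", full.names = TRUE)
  for (f in files) {
    source(f)
  }
  invisible(files)
}

LoadProjectName(workspace)
Main()
```

Calling LoadProjectName() again after editing any project file reloads all of them, which avoids running stale code.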

5. Utility functions
Common functions, such as those for reading and writing data or opening database connections, can be stored in a utility folder.
ReadInputData.R: A wrapper around read.csv. It concatenates basePath (set in the LoadProjectName function), Data (data files are always stored in this directory), and the file name (its argument), passes the resulting path to read.csv, and returns the result to the caller.
WriteOutputData.R: A wrapper around write.csv. It writes the data frame passed as an argument to a CSV file in the Data folder. The file name combines the name of the argument variable and a time stamp.
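The two wrappers could be sketched as follows. The time stamp format, the row.names = FALSE setting, and the temporary basePath are assumptions made so the example runs on its own; normally basePath would be set by LoadProjectName().

```r
# Stand-in for the basePath that LoadProjectName() would normally set.
basePath <- file.path(tempdir(), "ProjectName")
dir.create(file.path(basePath, "Data"), recursive = TRUE, showWarnings = FALSE)

ReadInputData <- function(fileName, ...) {
  # Build <basePath>/Data/<fileName> and hand it to read.csv.
  read.csv(file.path(basePath, "Data", fileName), ...)
}

WriteOutputData <- function(data) {
  # deparse(substitute()) recovers the name of the variable passed by the caller.
  varName  <- deparse(substitute(data))
  stamp    <- format(Sys.time(), "%Y%m%d_%H%M%S")
  fileName <- paste0(varName, "_", stamp, ".csv")
  write.csv(data, file.path(basePath, "Data", fileName), row.names = FALSE)
  invisible(fileName)  # return the generated name so the file can be found again
}

scores    <- data.frame(id = 1:3, score = c(90, 85, 77))
written   <- WriteOutputData(scores)
roundTrip <- ReadInputData(written)
```

Because the file name comes from the variable name, WriteOutputData(scores) produces a file whose name starts with scores_ followed by the time stamp.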

6. To run the code, R's current working directory needs to be set to the RWorkspace directory. On Windows this is easily done by editing the "Start in" field of the R shortcut; alternatively, the setwd() function can be used to set the working directory in RGui.

source("LoadProjectName.R")
LoadProjectName()
Main()

source("LoadProjectName.R") loads the LoadProjectName function into the R environment. Calling LoadProjectName() then loads all other required functions into the R environment. Main() is the main function of the project, which triggers all other required functions.

Click here for sample RWorkspace