Code
Below are some simple steps to get your lab projects up and running. The instructions assume you are using RStudio and have git installed on your computer. If you have an existing project, start at the second section.
Creating a new project from scratch
- Open RStudio
- Select `File` -> `New Project...`
- Select `New Directory`, then `New Project`
- Choose a sensible project name. Think of this as a forever choice (it can be changed, but it’s a hassle). Use underscores or CamelCase instead of spaces (e.g., `surface_geometry` or `SurfaceGeometry`). Don’t use generic names like `Chapter1`; remember that the whole lab is using the same GitHub organization. If in doubt, consult with another lab member who has experience creating projects.
- Click `Browse...` to choose a place for the project on your computer
- Make sure the `Create a git repository` option is selected; it should be by default
- Click `Create Project`
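The naming advice above can be turned into a quick check. This is only a sketch: `check_project_name` is a hypothetical helper, and both the allowed character set and the list of "too generic" names are assumptions made for illustration, not lab rules.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) if a project name follows
# the naming advice above, and fails with a short message otherwise.
check_project_name() {
    name="$1"
    case "$name" in
        "")
            echo "empty name"; return 1 ;;
        *[!A-Za-z0-9_]*)
            # Assumption: only letters, digits, and underscores are allowed,
            # which rules out spaces.
            echo "'$name': use only letters, digits, and underscores"; return 1 ;;
    esac
    # Assumed examples of names too generic for a shared GitHub organization.
    case "$name" in
        [Cc]hapter[0-9]*|[Pp]roject[0-9]*|[Aa]nalysis|[Tt]hesis)
            echo "'$name': too generic"; return 1 ;;
    esac
    echo "'$name': ok"
}

check_project_name surface_geometry
check_project_name "surface geometry" || true   # rejected: contains a space
check_project_name Chapter1 || true             # rejected: too generic
```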
Starting an RStudio project from an existing project
- Make sure the folder your project is in has a sensible name; this will become your project name. Think of this as a forever choice (it can be changed, but it’s a hassle). Use underscores or CamelCase instead of spaces (e.g., `surface_geometry` or `SurfaceGeometry`). Don’t use generic names like `Chapter1`; remember that the whole lab is using the same GitHub organization. If in doubt, consult with another lab member who has experience creating projects.
- Open RStudio
- Select `File` -> `New Project...`
- Select `Existing Directory`, then `Browse...` to find your existing project folder.
- Click `Create Project`
Organizing a project
- Create (or make sure you have) an `analysis.R` file with `New Script`. Some people like to use R Markdown to blend notes and code, which is a personal preference.
- Create (or make sure you have) a `README.md` file with `New File`. Note the file extension is `.md`
- Create (or make sure you have) the following folders with the `New Folder` button (keep lowercase except for the `R` folder):
Folder | Purpose | Why it’s special |
---|---|---|
`data` | For raw data only | It is read only! Never save or manipulate anything in this folder |
`R` | For R code that is sourced from the main `analysis.R` script | Reduces clutter in the main analysis script |
`output` | For saving anything produced by the analysis script (e.g., tables, cleaned datasets, figures, etc.) | The folder can be deleted and regenerated by the analysis script |
`figs` | This is also an output folder, but is often easier to keep separate. For saving figures only | The folder can be deleted and regenerated by the analysis script |
`doc` | This is for manuscripts and presentations, and typically isn’t version controlled (add `doc` to the `.gitignore`) | A general folder for stuff |
- If this is an existing project, move files into the various folders. You will need to change file paths in your code to load and save in the correct places, so expect a few errors the first time you run your newly rearranged code.
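The same layout can also be created from a terminal. The sketch below builds it in a scratch directory so it is safe to try; the location and the project name `surface_geometry` are placeholders. For a real project, run only the `mkdir` and `touch` lines from the project root.

```shell
#!/bin/sh
# Sketch: build the recommended layout in a scratch directory.
project_dir="$(mktemp -d)/surface_geometry"   # placeholder location and name
mkdir -p "$project_dir"
cd "$project_dir" || exit 1

# Folder names are lowercase except for R.
mkdir -p data R output figs doc

# Create the main script and the README only if they do not exist yet.
[ -e analysis.R ] || touch analysis.R
[ -e README.md ] || touch README.md

# Optional extra (not part of the steps above): remove write permission
# from raw data so it cannot be edited by accident.
# chmod -R a-w data

ls -1
```

Because `output` and `figs` are meant to be regenerated by the analysis script, deleting them and rerunning `analysis.R` is a useful check that nothing in them was edited by hand.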
Versioning a project
- If this was an existing project without git set up (i.e., you don’t see the `Git` tab top-right in RStudio), you’ll need to go to the menu `Tools` -> `Version Control` -> `Project Setup...`. Select `git` as your version control system, then `Yes` to the prompt, and `Yes` to restarting RStudio
- Go to the `Git` tab in the top-right panel
- Look at the staging list. Is there anything weird in there? (e.g., on a Mac you might get `.DS_Store` files). If so, add these to the `.gitignore` file so that they’re not versioned.
- Add anything you don’t want versioned or seen by other people to `.gitignore`. For example, I usually start by adding the `doc` folder, because it tends to get full of unnecessary junk. You can always add it to versioning later.
- IMPORTANT: Add large folders and files to `.gitignore`: anything over ~50-100MB, such as GeoTIFFs and folders with lots of images or video. There are many reasons for this, but the most basic is that GitHub won’t accept files >100MB. If you accidentally add a large file using git and want to reverse it, talk to Josh. Here is a fairly normal `.gitignore`:
.Rproj.user
.Rhistory
.RData
.Ruserdata
.DS_Store
/doc
/data/images
- Once you’re happy, stage everything using the check boxes and click `Commit`. Double check there isn’t any junk being committed in the list, write a commit message like “first commit!” and click `Commit`
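For those who prefer a terminal, the versioning steps above might look roughly like this. It is a sketch run in a scratch repository with stand-in files: the project name, the dummy 60 MB file, and the scratch git identity are all placeholders, and the 50 MB threshold is just the lower end of the range mentioned above. The `M` size suffix for `find` works in GNU and macOS `find` but is not part of POSIX.

```shell
#!/bin/sh
# Sketch of the versioning steps in a scratch repository (requires git).
repo="$(mktemp -d)/surface_geometry"   # placeholder project
mkdir -p "$repo/data/images" "$repo/doc"
cd "$repo" || exit 1
git init -q

# Identity for this scratch repo only; your own git config normally covers this.
git config user.name "Lab Member"
git config user.email "lab.member@example.com"

# Stand-in files: a script, a junk file, and a 60 MB dummy image.
echo 'print("hello")' > analysis.R
touch .DS_Store
dd if=/dev/zero of=data/images/big.tif bs=1048576 count=60 2>/dev/null

# Same entries as the example .gitignore above.
printf '%s\n' .Rproj.user .Rhistory .RData .Ruserdata .DS_Store /doc /data/images > .gitignore

# Report files over 50 MB that git would NOT ignore; ideally prints nothing.
find . -path ./.git -prune -o -type f -size +50M -print | while read -r f; do
    git check-ignore -q "$f" || echo "large and not ignored: $f"
done

# Stage everything, review the staged list, then commit.
git add -A
git status --short
git commit -q -m "first commit!"
```

`git status --short` plays the role of the staging list in the RStudio `Git` tab: anything listed there that looks like junk belongs in `.gitignore` before committing.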
Adding project to GitHub
- For projects done while in the lab, go to jmadinlab. If you want to set the project up outside the lab, go to your own GitHub page.
- Click the green `New` (repository) button
- Use the same project name as for RStudio (e.g., `surface_geometry` or `SurfaceGeometry`). Leave the project as `Private`. Click the green `Create repository` button. (Don’t initialize with a README!)
- Copy the two lines under `...or push an existing repository from the command line`
- Go back to RStudio and click the cog/settings symbol in the `Git` panel, choose `Shell...`
- Paste the two lines from the GitHub webpage and press enter. Note that you’ll get an error if you haven’t yet committed the project at least once (above section).
- That’s it. You’ll notice that the push and pull arrows in the Git panel in RStudio will now be highlighted.
- Finally, add project collaborators to your GitHub project by going to `Settings`, then `Manage Access` and click the green `Invite teams or people` button. Enter their GitHub usernames.
- When your research is published, please make the GitHub repository public (under `Settings` -> `Manage Access`) and make sure the link to a `release` of your repo is published with your paper. To make a `release`, click the `tags` button at the GitHub project page and `create a new release`. Use something like `v1.0.0` for the tag.
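The push and tagging steps can be rehearsed without touching GitHub. In the sketch below a local bare repository stands in for the GitHub remote, so `remote_url`, the project name, and the scratch identity are placeholders. On GitHub, paste the exact lines shown on the new repository's page rather than these. Tagging from the command line is one possible route to the `v1.0.0` tag; the release itself is still created on the GitHub project page.

```shell
#!/bin/sh
# Sketch: connect a committed project to a remote, push, and tag a version.
work="$(mktemp -d)"
remote_url="$work/remote.git"          # stand-in for the URL GitHub gives you
git init -q --bare "$remote_url"

mkdir "$work/surface_geometry"
cd "$work/surface_geometry" || exit 1
git init -q
git config user.name "Lab Member"               # scratch identity only
git config user.email "lab.member@example.com"
echo "# surface_geometry" > README.md
git add README.md
git commit -q -m "first commit!"       # pushing fails without at least one commit

# Lines of this kind are what GitHub suggests for an existing repository.
git remote add origin "$remote_url"
git branch -M main
git push -q -u origin main

# After publication: tag the commit that matches the paper, then push the tag.
git tag -a v1.0.0 -m "Version published with the paper"
git push -q origin v1.0.0
git ls-remote --tags origin
```

The `-u` flag links the local branch to the remote one, which is what makes the push and pull arrows in the RStudio `Git` panel become active.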
Datasets
Most datasets are small enough to be published with the GitHub project (above). Some journals require that data are submitted to online data repositories so they have a DOI, in which case follow the journal’s instructions. Zenodo and FigShare are easy-to-use, free options.
Please also add your data set to the lab’s dataset registry: [Add dataset]