Beginner's guide to R: Get your data into R

In part 2 of our hands-on guide to the hot data-analysis environment, we provide some tips on how to import data in various formats, both local and on the Web.

Categories or values? Because of R's roots as a statistical tool, when you import non-numerical data, R may assume that character strings are statistical factors -- things like "poor," "average" and "good" -- or "success" and "failure."

But your text columns may not be categories that you want to group and measure, just names of companies or employees. If you don't want your text data to be read in as factors, add stringsAsFactors=FALSE to read.table, like this (as of R 4.0.0, FALSE is already the default, so this matters mainly on older versions):

mydata <- read.table("filename.txt", sep="\t", header=TRUE, stringsAsFactors=FALSE)
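
To see what the setting changes, here is a minimal sketch that writes a tiny tab-delimited file to a temporary location and reads it both ways. The company names and ratings are made-up sample data, and the variable names are only for illustration.

```r
# Write a small tab-delimited file to a temporary location (made-up sample data).
tmp <- tempfile(fileext = ".txt")
writeLines(c("company\trating", "Acme\tgood", "Globex\tpoor"), tmp)

# Read the same file twice: once with text as factors, once as plain strings.
as_factors <- read.table(tmp, sep = "\t", header = TRUE, stringsAsFactors = TRUE)
as_strings <- read.table(tmp, sep = "\t", header = TRUE, stringsAsFactors = FALSE)

# Compare how the company column was stored in each data frame.
class(as_factors$company)
class(as_strings$company)
```

The first data frame treats company names as categories; the second keeps them as ordinary text.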

If you'd prefer, RStudio lets you load data with a series of menu clicks instead of typing a read command at the command line as just described. To do this, go to the Workspace tab of RStudio's upper-right window (labeled Environment in more recent RStudio versions), find the "Import Dataset" menu option, then choose a local text file or URL.

When you import data this way, the R command that RStudio generated from your menu clicks appears in the console. If you're doing significant analysis work, you may want to save that data-reading command in a script file so that others -- or you -- can reproduce the work.
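
As a rough sketch of what such a saved script might contain, the example below keeps a data-reading command in one place so it can be rerun. The temporary file and its contents are stand-ins so the example runs on its own; in practice you would paste in the command RStudio showed in your console, which may look different from this one.

```r
# Stand-in data file so this sketch is runnable; replace with your own file path.
data_file <- tempfile(fileext = ".txt")
writeLines(c("name\tscore", "Ann\t90", "Bob\t85"), data_file)

# The data-reading command, saved in the script so the import can be repeated.
# read.delim() expects tab-separated text with a header row by default.
mydata <- read.delim(data_file, stringsAsFactors = FALSE)

# Quick check that the import looks right.
head(mydata)
```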

The 3-minute YouTube video below, recorded by UCLA statistics grad student Miles Chen, shows an RStudio point-and-click data import.

Copying data snippets

If you've got just a small section of data already in a table -- a spreadsheet, say, or a Web HTML table -- you can copy those data to your Windows clipboard with control-C and import them into R.

The command below reads tab-separated clipboard data that includes a header row, and stores the data in a data frame (x):

x <- read.table(file = "clipboard", sep="\t", header=TRUE)

You can read more about using the Windows clipboard in R at the R For Dummies website.
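
If you want to try the same kind of call without touching the clipboard, one option is read.table's text argument, which takes the data as a string. In this sketch the string stands in for what a copied spreadsheet range would look like; the names and scores are made up.

```r
# Stand-in for clipboard contents: tab-separated text with a header row (made-up data).
pasted <- "name\tscore\nAnn\t90\nBob\t85"

# Same sep and header settings as the clipboard command, but reading from a string.
x <- read.table(text = pasted, sep = "\t", header = TRUE)

# Inspect the resulting data frame.
str(x)
```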

On a Mac, pipe("pbpaste") gives R access to data you've copied with command-C, so this does the equivalent of the previous Windows command:

x <- read.table(pipe("pbpaste"), sep="\t", header=TRUE)
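
If you move between Windows and Mac, the two clipboard commands could be wrapped in one small function. This is only a sketch: read_clipboard is a hypothetical helper name, not part of R, and it covers just the two platforms discussed here.

```r
# Hypothetical helper (not part of R): read tab-separated clipboard data
# on Windows or macOS, choosing the source by operating system.
read_clipboard <- function(header = TRUE, sysname = Sys.info()[["sysname"]]) {
  if (sysname == "Windows") {
    src <- "clipboard"            # Windows clipboard, as in the command above
  } else if (sysname == "Darwin") {
    src <- pipe("pbpaste")        # macOS reports its system name as "Darwin"
  } else {
    stop("This sketch only covers Windows and macOS.")
  }
  read.table(src, sep = "\t", header = header)
}
```

After copying a range of cells, calling x <- read_clipboard() would then read it on either platform.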
