Importing a CSV file in R – R tutorial

R can import and export data in many formats, and this tutorial shows how to import CSV data. Creating a data frame in R is easy. Keyboard data entry works well for small datasets. For larger datasets, it is usually better to import data from existing text files, Excel spreadsheets, statistical packages, or database management systems.

I also recommend saving your xlsx or xls file as CSV and importing that file into R. Now let us see how to import CSV data in R.

We can import data from delimited text files or CSV files using the read.table() function, which reads a file in table format and saves it as a data frame. Here is the syntax:

mydataframe <- read.table(file, header=logical_value, sep="delimiter", row.names="name")

Here, header is a logical value indicating whether the first row contains variable names (TRUE or FALSE), sep specifies the delimiter separating data values (most often a comma, ","), and row.names is an optional parameter specifying one or more variables to represent row identifiers.

Example

In the following example, let us import salary data.

salary <- read.table("employee.csv", header=TRUE, sep=",", row.names="EMPLOYEEID")

The above code reads a comma-delimited file named employee.csv from the current working directory, gets the variable names from the first line of the file, specifies the variable EMPLOYEEID as the row identifier, and saves the results as a data frame named salary.
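To try this call without an existing file, the following sketch first writes a small comma-delimited file to a temporary location and then reads it back. The NAME and SALARY columns and all sample values are made up for illustration; only EMPLOYEEID comes from the example above.

```r
# Write a small sample file to a temporary location (sample data is hypothetical)
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "EMPLOYEEID,NAME,SALARY",
  "E001,Asha,52000",
  "E002,Ravi,61000",
  "E003,Meera,58000"
), csv_path)

# Read it back: the first line holds the variable names,
# and the EMPLOYEEID column becomes the row names
salary <- read.table(csv_path, header = TRUE, sep = ",", row.names = "EMPLOYEEID")

str(salary)        # structure of the imported data frame
salary["E002", ]   # rows can now be selected by employee ID
```

Because EMPLOYEEID is used for the row names, it no longer appears as a regular column in the data frame.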

Note that the sep parameter allows you to import files that use a symbol other than a comma to delimit the data values. You could read tab-delimited files with sep="\t". The default is sep="", which denotes one or more spaces, tabs, newlines, or carriage returns.
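As a rough illustration of the two cases, the sketch below reads one tab-delimited file with sep="\t" and one whitespace-delimited file with the default separator. Both files and their contents are hypothetical and are created in a temporary location just for this example.

```r
# Hypothetical tab-delimited sample file
tsv_path <- tempfile(fileext = ".txt")
writeLines(c(
  "EMPLOYEEID\tNAME\tSALARY",
  "E001\tAsha Rao\t52000",
  "E002\tRavi Kumar\t61000"
), tsv_path)

# sep = "\t" splits on tabs only, so the space inside NAME is preserved
salary_tab <- read.table(tsv_path, header = TRUE, sep = "\t")
salary_tab$NAME

# Hypothetical file with a varying number of spaces between values
ws_path <- tempfile(fileext = ".txt")
writeLines(c(
  "ID   SCORE",
  "1    90",
  "2  85"
), ws_path)

# The default sep = "" treats any run of white space as a single delimiter
scores <- read.table(ws_path, header = TRUE)
scores
```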

In versions of R before 4.0.0, character variables are converted to factors by default; from R 4.0.0 onward the default is stringsAsFactors=FALSE and they stay as character vectors. Conversion to factors may not always be desirable (for example, for a variable containing respondents’ comments). Where it applies, you can suppress it in a number of ways. Including the option stringsAsFactors=FALSE turns it off for all character variables. Alternatively, you can use the colClasses option to specify a class (for example, logical, numeric, character, factor) for each column.
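A minimal sketch of both options follows, using a made-up survey file with a free-text COMMENT column. The file name, column names, and values are assumptions for illustration only.

```r
# Hypothetical survey data with a free-text column
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "ID,COMMENT,SCORE",
  "1,Good service,4",
  "2,Too slow,2"
), csv_path)

# Keep all character columns as plain character vectors
survey <- read.table(csv_path, header = TRUE, sep = ",", stringsAsFactors = FALSE)
class(survey$COMMENT)

# Or set the class of each column explicitly, in column order
survey2 <- read.table(csv_path, header = TRUE, sep = ",",
                      colClasses = c("integer", "character", "numeric"))
sapply(survey2, class)
```

Setting stringsAsFactors explicitly makes a script behave the same way regardless of the installed R version, while colClasses gives column-by-column control.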

The read.table() function has many additional options for fine-tuning the data import. Use the command help(read.table) for details.
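As a small illustration of such options, the sketch below uses skip, na.strings, and nrows, which are documented arguments of read.table(). The file layout (a title line above the header, and the text NULL marking missing salaries) is a hypothetical scenario.

```r
# Hypothetical export with a title line above the header
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Exported from a hypothetical HR system",
  "EMPLOYEEID,SALARY",
  "E001,52000",
  "E002,NULL",
  "E003,58000"
), csv_path)

emp <- read.table(csv_path, header = TRUE, sep = ",",
                  skip = 1,             # ignore the first line of the file
                  na.strings = "NULL",  # treat the text NULL as a missing value
                  nrows = 2)            # read at most two data rows
emp
```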

A.Sulthan, Ph.D.,
Author and Assistant Professor in Finance, Ardent fan of Arsenal FC. Always believe "The only good is knowledge and the only evil is ignorance - Socrates"