How do I get the number of rows of a data.frame in R?
After reading a dataset:
dataset <- read.csv("forR.csv")
- How can I get R to give me the number of cases it contains?
- Also, will the returned value include or exclude cases omitted with `na.omit`?
I also recommend taking a look at `str()` as it provides other useful details about your object. Can often explain why a column isn't behaving as it should (factor instead of numeric, etc).
Please read the R guide of Owen first (http://cran.r-project.org/doc/contrib/Owen-TheRGuide.pdf), and if possible, Introduction to R (http://cran.r-project.org/doc/manuals/R-intro.pdf). Both are on the official website of R. You're incredibly lucky you actually get an answer. On the r-help list one would redirect you to the manual in less elegant terms. No offense meant.
@Joris - Point taken (without offence), but it was my impression that SE sites were designed to foster problem/solution learning in a way not afforded by manuals. Additionally, this question will now be available for other beginners. Thanks for the links though.
If you're looking for pure code solutions, stackoverflow might be more appropriate. Although, all the R gurus present @ SO are also here (not counting myself). :)
I disagree with your assertion that this question will be helpful for other beginners, *especially* if they don't skim the manual. They will just create a duplicate question.
@JorisMeys: thanks for the link to the R guide.. hadn't come across that yet in my learning of R and it's exactly what I'd been looking for.
And, four years later, this is the second hit I got on Google trying to find an answer to this question. No need for me to create a duplicate (@JoshuaUlrich).
`dataset` will be a data frame. As I don't have `forR.csv`, I'll make up a small data frame for illustration:
    set.seed(1)
    dataset <- data.frame(A = sample(c(NA, 1:100), 1000, rep = TRUE),
                          B = rnorm(1000))
    > head(dataset)
       A           B
    1 26  0.07730312
    2 37 -0.29686864
    3 57 -1.18324224
    4 91  0.01129269
    5 20  0.99160104
    6 90  1.59396745
To get the number of cases, count the number of rows using `nrow()` or `NROW()`:

    > nrow(dataset)
    [1] 1000
    > NROW(dataset)
    [1] 1000
To count the data after omitting the `NA`, use the same tools, but wrap `dataset` in `na.omit()`:

    > NROW(na.omit(dataset))
    [1] 993
The difference between `NROW()` and `NCOL()` and their lowercase variants (`ncol()` and `nrow()`) is that the lowercase versions will only work for objects that have dimensions (arrays, matrices, data frames). The uppercase versions will also work with vectors, which are treated as if they were a one-column matrix, and are robust if you end up subsetting your data such that R drops an empty dimension.
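To make that difference concrete, here is a minimal sketch. The objects `v`, `m`, and `col1` are toy examples I made up; they are not part of the dataset above.

```r
# Hypothetical toy objects, for illustration only
v <- c(10, 20, 30)            # plain vector: no dim attribute
m <- matrix(1:6, nrow = 3)    # 3 x 2 matrix

nrow(v)   # NULL, because a vector has no dimensions
NROW(v)   # 3, the vector is treated as a one-column matrix
NCOL(v)   # 1

col1 <- m[, 1]                # selecting a single column drops to a vector
nrow(col1)                    # NULL again
NROW(col1)                    # 3
nrow(m[, 1, drop = FALSE])    # 3, drop = FALSE keeps the matrix structure
```

If you would rather keep using `nrow()`, subsetting with `drop = FALSE` is one way to stop R from dropping the dimension in the first place.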
`complete.cases()` returns a logical vector (`TRUE` or `FALSE`) with one element per row, `FALSE` wherever the row contains an `NA`, so summing it gives the same count:

    > sum(complete.cases(dataset))
    [1] 993
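As a small check of how these counts relate, here is a sketch on a made-up data frame (`toy` and the `n_*` names are hypothetical, chosen only for this example). It shows that `nrow()` on its own counts every row, including those with `NA`, and that `na.omit()` and `complete.cases()` both drop a row if any column is missing.

```r
# Hypothetical toy data frame with NAs in different rows and columns
toy <- data.frame(A = c(1, NA, 3, 4), B = c(NA, 2, 3, 4))

n_all      <- nrow(toy)                 # 4: every row is counted, NA or not
n_omit     <- nrow(na.omit(toy))        # 2: rows 1 and 2 each contain an NA
n_complete <- sum(complete.cases(toy))  # 2: same rows, counted from a logical vector
n_A        <- sum(!is.na(toy$A))        # 3: non-missing values in column A only
```

If only one variable matters for your analysis, the per-column count (`n_A` here) may be the number you actually want, since the row-wise tools also discard rows that are missing in other columns.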