How do I get the number of rows of a data.frame in R?
After reading a dataset:
dataset <- read.csv("forR.csv")
- How can I get R to give me the number of cases it contains?
- Also, will the returned value include or exclude cases omitted with `na.omit(dataset)`?
Please read the R guide of Owen first (http://cran.r-project.org/doc/contrib/Owen-TheRGuide.pdf), and if possible, Introduction to R (http://cran.r-project.org/doc/manuals/R-intro.pdf). Both are on the official website of R. You're incredibly lucky you actually get an answer. On the r-help list one would redirect you to the manual in less elegant terms. No offense meant.
@Joris - Point taken (without offence), but it was my impression that SE sites were designed to foster problem/solution learning in a way not afforded by manuals. Additionally, this question will now be available for other beginners. Thanks for the links though.
If you're looking for pure code solutions, stackoverflow might be more appropriate. Although, all the R gurus present @ SO are also here (not counting myself). :)
I disagree with your assertion that this question will be helpful for other beginners, *especially* if they don't skim the manual. They will just create a duplicate question.
@JorisMeys: thanks for the link to the R guide.. hadn't come across that yet in my learning of R and it's exactly what I'd been looking for.
And, four years later, this is the second hit I got on Google trying to find an answer to this question. No need for me to create a duplicate (@JoshuaUlrich).
@Richard Just noticed that (6 years on) this question has 100 upvotes and is consequently well within the top 0.1% of questions on the site. I find this very interesting.
`dataset` will be a data frame. As I don't have `forR.csv`, I'll make up a small data frame for illustration:

    set.seed(1)
    dataset <- data.frame(A = sample(c(NA, 1:100), 1000, rep = TRUE),
                          B = rnorm(1000))
    > head(dataset)
       A           B
    1 26  0.07730312
    2 37 -0.29686864
    3 57 -1.18324224
    4 91  0.01129269
    5 20  0.99160104
    6 90  1.59396745
To get the number of cases, count the number of rows using `nrow()` or `NROW()`:

    > nrow(dataset)
    [1] 1000
    > NROW(dataset)
    [1] 1000
To count the rows that remain after omitting those containing `NA`, use the same tools, but wrap `dataset` in `na.omit()`:

    > NROW(na.omit(dataset))
    [1] 993
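To make the include/exclude question concrete, here is a minimal sketch on a tiny data frame whose counts can be checked by hand. The name `toy` and its contents are made up for illustration and are not part of the original example. It suggests that `nrow()` on the raw data frame counts every row, including those with `NA`, and that rows are only excluded once you explicitly apply `na.omit()`.

```r
# Hypothetical toy data: 5 rows, 2 of which contain an NA
toy <- data.frame(A = c(1, NA, 3, 4, 5),
                  B = c("a", "b", NA, "d", "e"))

n_all      <- nrow(toy)            # counts every row, NA or not
n_complete <- nrow(na.omit(toy))   # counts only rows with no NA in any column
n_dropped  <- n_all - n_complete   # rows that na.omit() removed

n_all       # 5
n_complete  # 3
n_dropped   # 2
```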
The difference between `NROW()` and `NCOL()` and their lowercase variants (`ncol()` and `nrow()`) is that the lowercase versions will only work for objects that have dimensions (arrays, matrices, data frames). The uppercase versions will also work with vectors, which are treated as if they were a 1-column matrix, and are robust if you end up subsetting your data such that R drops an empty dimension.

Alternatively, use `complete.cases()` and `sum()` it. `complete.cases()` returns a logical vector with one element per row: `TRUE` if the row has no `NA` in any column, `FALSE` otherwise. Summing it therefore counts the complete rows:

    > sum(complete.cases(dataset))
    [1] 993
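The distinction between the lowercase and uppercase functions is easiest to see on objects without dimensions. The following is a small sketch using made-up objects (`v`, `m`, `col1`, and `toy` are hypothetical names, not from the question); it also checks that `complete.cases()` and `na.omit()` agree on the number of complete rows.

```r
v <- c(10, 20, 30)          # a plain vector has no dim attribute

nrow(v)                     # NULL
NROW(v)                     # 3
NCOL(v)                     # 1, the vector is treated as a 1-column matrix

m <- matrix(1:6, nrow = 3)  # 3 x 2 matrix
col1 <- m[, 1]              # selecting a single column drops to a vector

nrow(col1)                  # NULL
NROW(col1)                  # 3

# complete.cases() and na.omit() should give the same count of complete rows
toy <- data.frame(A = c(1, NA, 3), B = c(NA, 2, 3))
sum(complete.cases(toy)) == nrow(na.omit(toy))  # TRUE
```

If you want to keep the dimensions when subsetting, `m[, 1, drop = FALSE]` returns a 3 x 1 matrix, on which `nrow()` works as usual.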
License under CC-BY-SA with attribution
Content dated before 6/26/2020 9:53 AM
Chase 10 years ago
I also recommend taking a look at `str()` as it provides other useful details about your object. Can often explain why a column isn't behaving as it should (factor instead of numeric, etc).
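As a rough illustration of that suggestion, here is a sketch with a made-up data frame (`toy` and its columns are hypothetical) in which a column that looks numeric was read in as a factor. `stringsAsFactors = TRUE` is set explicitly because the default changed in R 4.0.0.

```r
# Hypothetical data: 'score' contains a stray "n/a", so it is not numeric
toy <- data.frame(id = 1:3,
                  score = c("1.5", "2.0", "n/a"),
                  stringsAsFactors = TRUE)

str(toy)               # reports the number of rows/columns and each column's type

is.factor(toy$score)   # TRUE

# Convert via character; the unparseable "n/a" becomes NA (normally with a warning)
score_num <- suppressWarnings(as.numeric(as.character(toy$score)))
```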