Opening shapefile in R?
I need to open a shapefile from ArcMap in R to use it for further geostatistical analysis. I've converted it into an ASCII text file, but in R it is recognized as a data.frame. The coordinates() function doesn't work because x and y are recognized as non-numeric.
Could you help me deal with this?
Use the shapefile directly. You can do this easily with the `rgdal` or `sf` packages, and read the shape into an object. For both packages you need to provide `dsn`, the data source, which in the case of a shapefile is the directory, and `layer`, which is the shapefile name, minus extension:
    # Read SHAPEFILE.shp from the current working directory (".")
    require(rgdal)
    shape <- readOGR(dsn = ".", layer = "SHAPEFILE")

    require(sf)
    shape <- read_sf(dsn = ".", layer = "SHAPEFILE")
(For rgdal, on OSX or Linux you can't use the '~' shorthand for the home directory as the data source (`dsn`) directory; otherwise you'll get an unhelpful "Cannot open data source" message. The `sf` package doesn't have this limitation, among some other advantages.)
With rgdal this gives you an object which is a Spatial*DataFrame (points, lines or polygons); `read_sf` returns an sf data frame instead. Either way, the fields of the attribute table are accessible to you in the same way as in an ordinary data frame, e.g. `shape$ID` for the ID column.
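As a minimal sketch of the `dsn`/`layer` pattern that runs without your own data, the following reads the `nc.shp` sample file that ships with the `sf` package. The sample file and its `NAME` column are stand-ins for your shapefile and its fields, not part of the original question.

```r
library(sf)

# Directory holding the sample shapefile bundled with sf
# (stand-in for the folder that contains your own .shp/.dbf/.prj files)
shape_dir <- system.file("shape", package = "sf")

# dsn = directory, layer = file name without the .shp extension
shape <- read_sf(dsn = shape_dir, layer = "nc")

# Attribute columns behave like ordinary data frame columns
head(shape$NAME)

# Geometry type of the layer (polygons for this sample)
unique(st_geometry_type(shape))
```

For your own data you would replace `shape_dir` with the folder that contains the shapefile and `"nc"` with its file name.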
If you want to use the ASCII file you imported, then you should simply convert the text (character) x and y fields to numbers, e.g.:
    shape$x <- as.numeric(as.character(shape$x))
    shape$y <- as.numeric(as.character(shape$y))
    coordinates(shape) <- ~x + y
Edit 2015-01-18: note that rgdal is a bit better than maptools (which I initially suggested here), primarily because it reads and writes projection information automatically.
- The nested `as.numeric(as.character())` functions: if your ASCII text was read as a factor (likely), this ensures that you get the numeric values instead of the factor's internal integer codes.
- `rgdal` and `sf` have confusing ways of accessing different file and database types (e.g. for a GPX file, the dsn is the filename, and the layers are the individual components such as waypoints, trackpoints, etc.), and careful reading of online examples is needed.
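To see why the nested conversion matters, here is a small base-R sketch. The data frame `xy` and its values are made up for illustration; they mimic coordinates that were read from a text file as factors.

```r
# Hypothetical coordinates as they might arrive from an ASCII export,
# read in as factors rather than numbers
xy <- data.frame(x = c("451200.5", "98000.0", "451200.5"),
                 y = c("10.25", "9.5", "11.0"),
                 stringsAsFactors = TRUE)

# Wrong: as.numeric() on a factor returns the internal level codes
codes <- as.numeric(xy$x)

# Right: convert to character first, then to numeric
xy$x <- as.numeric(as.character(xy$x))
xy$y <- as.numeric(as.character(xy$y))

str(xy)
```

Here `codes` holds small integers that index the factor levels, whereas the converted columns hold the actual coordinate values.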
R should parse numeric fields, so I would imagine that there is a special character in x and y. In addition, on import, unless specified differently, character fields will be coerced into a factor. As such, a simple "as.numeric" declaration will not work. I would also use "readOGR" in "rgdal" rather than maptools.
@Jeffrey, readOGR is definitely the better way to go - see some discussions on later R questions here on gis.SE. Good point on factor coercion; will update with nested `as.character` to get around the problem.
You could use ~, but you'd have to call path.expand on the directory, e.g. readOGR(dsn=path.expand("~/Downloads/cb_2016_us_zcta510_500k/"), layer="cb_2016_us_zcta510_500k")
Somehow I still needed a clarification of this (actually correct) answer: `dsn="directory where the shapefile, projection file, etc. are located"` `layer="name of the file without .shp extension"`
I agree with Simbamangu and gissolved in terms of retaining the shapefile but want to direct your attention specifically to the rgdal library. Follow the link suggested by gissolved for the NCEAS and follow through with the directions for rgdal. It can be challenging to install on some machines but it can substantially improve results when it comes to projections.
The maptools library is excellent and allows you to define the projection for the shapefile you are reading in, but to do so you need to know how to specify that projection in the proj4 format. An example might look something like:
    library(maptools)
    # USA Contiguous Equidistant Conic Projection
    project2 <- "+proj=eqdc +lat_0=0 +lon_0=0 +lat_1=33 +lat_2=45 +x_0=0 +y_0=0 +ellps=GRS80 +datum=NAD83 +units=m +no_defs"
    data.shape <- readShapePoly("./MyMap.shp", IDvar = "FIPS", proj4string = CRS(project2))
    plot(data.shape)
If you want to go this route, then I recommend http://spatialreference.org as the place to go to figure out what your projection looks like in the proj4 format. If that looks like a hassle to you, rgdal will make it easy by reading the ESRI shapefile's .prj file (the file that contains ESRI's projection definition for the shapefile). To use rgdal on the same file you would simply write:
    library(rgdal)
    data.shape <- readOGR(dsn = "C:/Directory_Containing_Shapefile", layer = "MyMap")
    plot(data.shape)
You can likely skate by without doing this if you are just working with a single shapefile, but as soon as you start looking at multiple data sources or overlaying with Google Maps, keeping your projections in good shape becomes essential.
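A quick way to check whether projection information actually came through, and to bring layers into a common projection, is sketched below. Note that it uses the `sf` package (mentioned elsewhere in this thread) rather than rgdal or maptools, and the bundled `nc.shp` sample file stands in for your own shapefile; EPSG code 4326 (WGS84 longitude/latitude) is just an example target.

```r
library(sf)

# Sample shapefile shipped with sf (stand-in for your own file)
nc <- read_sf(system.file("shape/nc.shp", package = "sf"))

# Inspect the coordinate reference system read from the .prj file
print(st_crs(nc))

# TRUE here would mean no projection information was found
crs_missing <- is.na(st_crs(nc))

# Reproject to WGS84 longitude/latitude (EPSG:4326), e.g. before
# overlaying with data from another source
nc_wgs84 <- st_transform(nc, 4326)

# The two layers now carry different CRS definitions
same_crs <- st_crs(nc) == st_crs(nc_wgs84)
```

If `crs_missing` is `TRUE` for your own file, the .prj file is probably absent or not named to match the .shp file.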
For some helpful walkthroughs on spatial data in R, including a bunch of stuff on importing and working with point patterns, I have some old course materials online at https://csde.washington.edu/workshop/point-patterns-and-raster-surfaces/ (more workshops can be found here) that might help you see how these methods compare in practice.
+1 for spatial reference information ... especially for emphasizing keeping projections sorted out!
@csfowler, I tried to use the readOGR but it is not importing the .prj file. Any idea why? I am at UW as well, in the biology department.
I think you shouldn't convert the shapefile to ASCII but instead use the shapefile directly with one of the spatial extensions. Here you can find three ways to read (and write) a shapefile: http://www.nceas.ucsb.edu/scicomp/usecases/ReadWriteESRIShapeFiles. The R-spatial project will probably also interest you: http://cran.r-project.org/web/packages/sp/index.html.
You can use the `sf` library to open shapefiles directly in R. It's faster than the `rgdal` library; check here: Simple Features for R - Benchmarks. For further information about the `sf` package, check the project homepage r-spatial.
    # Load library
    library('sf')
    # Load shapefile
    shapename <- read_sf('~/path/to/file.shp')
An easy solution in 2017 is the `shapefile()` function in the `raster` library. As the help file says, it is a "simple wrapper function around readOGR and writeOGR (rgdal package)".
    # Load library
    library(raster)
    # Load shapefile
    shp <- shapefile("myshapefile")
UPDATE: This is still a good option in 2019.
One more alternative is to use the fastshp library, which offers:
Routines for handling of large ESRI shapefiles (.shp). This includes reading, thinning of points and matching of points to containing shapes. The main aim for this package is to provide the speed to support large shapefiles (millions of points). It is several orders of magnitude faster than some other shapefile packages.
Here is my question on SE on how to use it with ggplot2:
I find it a bit annoying that the read.shp function does not result in an sp object. Given that the spatial R community is converging on this as the de facto standard for handling spatial objects, I find this somewhat sloppy. Given sufficient RAM and a 64bit OS, reading large data is not much of an issue. With 8GB RAM I have read 30M points and 2.5M polygons using rgdal with no issues. Here is some direction on using sp objects with ggplot2: https://github.com/hadley/ggplot2/wiki/plotting-polygon-shapefiles