### How to add a non-linear trend line to a scatter plot in R?

• Karina Khusainova

8 years ago

I have a scatter plot. How can I add a non-linear trend line?

Do you already have the equation of the trend curve, or does adding it include computing its equation from the data?

• Jeromy Anglim

8 years ago

Let's create some data.

```
n <- 100
x <- seq(n)
y <- rnorm(n, 50 + 30 * x^(-0.2), 1)
Data <- data.frame(x, y)
```

The following shows how you can add a loess line or the fitted curve of a non-linear regression.

```
plot(y ~ x, Data)

# fit a loess line
loess_fit <- loess(y ~ x, Data)
lines(Data$x, predict(loess_fit), col = "blue")

# fit a non-linear regression
nls_fit <- nls(y ~ a + b * x^(-c), Data,
               start = list(a = 80, b = 20, c = 0.2))
lines(Data$x, predict(nls_fit), col = "red")
```

About the plotting: for those encountering order problems, this advice is useful.
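The "order problems" mentioned above presumably arise because `lines()` connects points in the order they are given, so unsorted `x` values produce a zigzag instead of a curve. A minimal sketch of one way around this, assuming made-up data with `x` in random order (the variable `ord` is only illustrative):

```r
set.seed(1)
x <- runif(100, 1, 100)                  # x values in random order
y <- rnorm(100, 50 + 30 * x^(-0.2), 1)
Data <- data.frame(x, y)

loess_fit <- loess(y ~ x, Data)

plot(y ~ x, Data)
ord <- order(Data$x)                     # indices that sort x in ascending order
lines(Data$x[ord], predict(loess_fit)[ord], col = "blue")
```

Sorting the fitted values by `x` before calling `lines()` leaves the fit itself unchanged; only the drawing order differs.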

• Vincent Zoonekynd

8 years ago

If you use `ggplot2` (the third plotting system, in R, after base R and lattice), this becomes:

```
library(ggplot2)
ggplot(Data, aes(x, y)) + geom_point() + geom_smooth()
```

You can choose how the data is smoothed: see `?stat_smooth` for details and examples.

Nice graph and explanation! But what does the shaded area mean?

The shaded area is the confidence interval around the smoothed line. You could have found this out yourself by opening the R help file for `stat_smooth` with `?stat_smooth`, as Vincent stated. :-)

• Jason Morgan

8 years ago

Without knowing exactly what you are looking for, using the `lattice` package you can easily add a loess curve with `type="smooth"`; e.g.,

```
library(lattice)
x <- rnorm(100)
y <- rnorm(100)
xyplot(y ~ x, type=c("smooth", "p"))
```

See `help("panel.loess")` for arguments that can be passed to the loess fitting routine in order to change, for instance, the degree of the polynomial to use.

Update

To change the color of the loess curve, you can write a small function and pass it as a `panel` parameter to `xyplot`:

```
x <- rnorm(100)
y <- rnorm(100)

panel_fn <- function(x, y, ...)
{
  panel.xyplot(x, y, ...)
  panel.xyplot(x, y, type="smooth", col="red", ...)
}

xyplot(y ~ x, panel=panel_fn)
```

How would you make the line a different color?

@EngrStudent I updated my answer.

• Patrick Caldon

8 years ago

Your question is a bit vague, so I'm going to make some assumptions about what your problem is. It would help a lot if you could put up a scatterplot and describe the data a bit. Please, if I'm making bad assumptions then ignore my answer.

First, it's possible that your data describe some process which you reasonably believe is non-linear. For instance, if you're trying to do regression on the distance for a car to stop with sudden braking vs the speed of the car, physics tells us that the energy of the vehicle is proportional to the square of the velocity - not the velocity itself. So you might want to try polynomial regression in this case, and (in R) you could do something like `model <- lm(d ~ poly(v,2),data=dataset)`. There's a lot of documentation on how to get various non-linearities into the regression model.
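As a rough illustration of the polynomial approach, here is a sketch using R's built-in `cars` dataset (columns `speed` and `dist`) in place of the hypothetical `dataset`, `v` and `d` above. The prediction grid `grid` is an assumption introduced only to draw a smooth curve:

```r
# Quadratic regression of stopping distance on speed
model <- lm(dist ~ poly(speed, 2), data = cars)

# Predict on a fine, ordered grid so the fitted curve is drawn smoothly
grid <- data.frame(speed = seq(min(cars$speed), max(cars$speed),
                               length.out = 200))
grid$dist <- predict(model, newdata = grid)

plot(dist ~ speed, data = cars)
lines(grid$speed, grid$dist, col = "red")
```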

On the other hand, if you've got a line which is "wobbly" and you don't know why it's wobbly, then a good starting point would probably be locally weighted regression, or `loess` in R. This does regression on a small region, as opposed to the whole dataset. It's easiest to imagine a "k nearest-neighbour" version, where to calculate the value of the curve at any point, you find the k points nearest to the point of interest, and average them. Loess is just like that but uses regression instead of a straight average. For this, use `model <- loess(y ~ x, data=dataset, span=...)`, where the `span` argument controls the degree of smoothing.
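To get a feel for `span`, it may help to compare two values on the same data. The sketch below uses simulated data; the names `wiggly` and `smooth` and the span values 0.2 and 0.9 are arbitrary choices for illustration:

```r
set.seed(1)
dataset <- data.frame(x = seq(0, 10, length.out = 200))
dataset$y <- sin(dataset$x) + rnorm(200, sd = 0.3)

# A small span uses fewer neighbours per local fit and follows the data closely
wiggly <- loess(y ~ x, data = dataset, span = 0.2)
# A large span uses more neighbours per local fit and gives a smoother curve
smooth <- loess(y ~ x, data = dataset, span = 0.9)

plot(y ~ x, data = dataset, col = "grey")
lines(dataset$x, predict(wiggly), col = "blue")
lines(dataset$x, predict(smooth), col = "red")
```

Here `x` is already sorted, so the fitted values can be passed to `lines()` directly.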

On the third hand (running out of hands) - you're talking about trends? Is this a temporal problem? If it is, be a little cautious about over-interpreting trend lines and statistical significance. Trends in time series can appear in "autoregressive" processes, and for these processes the randomness of the process can occasionally construct trends out of random noise, and the wrong statistical significance test can tell you it's significant when it's not!
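A small simulation can make this caution concrete. The sketch below builds a random walk, which has no deterministic trend by construction, and fits an ordinary straight line to it. The names `time_idx`, `walk` and `fit` are illustrative only:

```r
set.seed(42)
n <- 200
time_idx <- seq_len(n)
walk <- cumsum(rnorm(n))       # random walk: accumulated independent noise

# Naive linear trend; lm() assumes independent errors, which does not hold here
fit <- lm(walk ~ time_idx)
summary(fit)$coefficients      # the slope's p-value is not trustworthy

plot(time_idx, walk, type = "l")
abline(fit, col = "red")
```

Rerunning this with different seeds tends to show apparent "trends" in either direction, which is why a test that ignores autocorrelation can mislead.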

• Jim Robertson

6 years ago

Putting scatter plot sample points and smooth curve on same graph:

```
library(graphics)
## Create some x,y sample points falling on hyperbola, but with error:
xSample <- seq(0.1, 1.0, 0.1)
ySample <- 1.0 / xSample
numPts <- length(xSample)
ySample <- ySample + 0.5 * rnorm(numPts) ## Add some noise

## Create x,y points for smooth hyperbola:
xCurve <- seq(0.1, 1.0, 0.001)
yCurve <- 1.0 / xCurve

plot(xSample, ySample, ylim = c(0.0, 12.0))   ## Plot the sample points
lines(xCurve, yCurve, col = 'green', lty = 1) ## Plot the curve
```

Licensed under CC-BY-SA with attribution

Content dated before 6/26/2020 9:53 AM