Why Does the Transaction Log Keep Growing or Run Out of Space?
This seems to be a common question in most forums and all over the web. It is asked here in many formats that typically sound like this:
In SQL Server -
- What are some reasons the transaction log grows so large?
- Why is my log file so big?
- What are some ways to prevent this problem from occurring?
- What do I do once I have dealt with the underlying cause and want to bring my transaction log file back to a healthy size?
A Shorter Answer:
You probably either have a long-running transaction (Index maintenance? Big batch delete or update?) or you are in the "default" (more below on what is meant by default) recovery mode of Full and have not taken a log backup (or aren't taking them frequently enough).
If it is a recovery model issue, the simple answer could be: switch to Simple recovery mode if you do not need point-in-time recovery and regular log backups. Many people, though, make that their answer without understanding recovery models. Read on to understand why it matters and then decide what you do. You could also just start taking log backups and stay in Full recovery.
There could be other reasons but these are the most common. This answer begins to dive into the most common two reasons and gives you some background information on the why and how behind the reasons as well as explores some other reasons.
A Longer Answer:
What scenarios can cause the log to keep growing? There are many reasons, but they usually follow one of two patterns: a misunderstanding about recovery models, or long-running transactions. Read on for details.
Top reason 1/2: Not Understanding Recovery Models
(Being in Full Recovery Mode and Not Taking Log Backups - This is the most common reason - the vast majority of those experiencing this issue are.)
While this answer is not a deep dive in SQL Server recovery models, the topic of recovery models is critical to this problem.
In SQL Server, there are three recovery models:
- Full,
- Bulk-Logged and
- Simple.
We'll ignore Bulk-Logged for now. We'll sort of say it is a hybrid model, and most people who are in this model are there for a reason and understand recovery models.
The two we care about, and whose confusion causes the majority of the cases of people having this issue, are Simple and Full.
Intermission: Recovery in General
Before we talk about Recovery Models: let's talk about recovery in general. If you want to go even deeper with this topic, just read Paul Randal's blog and as many posts on it as you want. For this question, though:
One purpose of the transaction log file is crash/restart recovery: rolling forward work that was completed before a crash or restart (redo), and rolling back work that was started but never finished (undo). It is the job of the transaction log to see that a transaction started but never finished (it was rolled back, or a crash/restart happened before the transaction committed). In that situation, it is the log's job to say "Hey.. this never really finished, let's roll it back" during recovery. It is also the log's job to see that you did finish something and that your client application was told it was finished (even if it hadn't yet hardened to your data file) and say "Hey.. this really happened, let's roll it forward, let's make it like the applications think it was" after a restart. There is more to it, but that is the main purpose.
Point in Time Recovery
The other purpose for a transaction log file is to be able to give us the ability to recover to a point in time due to an "oops" in a database or to guarantee a recovery point in the event of a hardware failure involving the data and/or log files of a database. If this transaction log contains the records of transactions that have been started and finished for recovery, SQL Server can and does then use this information to get a database to where it was before an issue happened. But that isn't always an available option for us. For that to work we have to have our database in the right recovery model, and we have to take log backups.
Onto the recovery models:
Simple Recovery Model
So with the above introduction, it is easiest to talk about Simple Recovery model first. In this model, you are telling SQL Server: "I am fine with you using your transaction log file for crash and restart recovery..." (You really have no choice there. Look up ACID properties and that should make sense quickly.) "...but once you no longer need it for that crash/restart recovery purpose, go ahead and reuse the log file."
SQL Server listens to this request in Simple Recovery and it only keeps the information it needs to do crash/restart recovery. Once SQL Server is sure it can recover because data is hardened to the data file (more or less), the data that has been hardened is no longer necessary in the log and is marked for truncation - which means it gets re-used.
Full Recovery Model
With Full Recovery, you are telling SQL Server that you want to be able to recover to a specific point in time, as long as your log file is available, or to a specific point in time that is covered by a log backup. In this case, when SQL Server reaches the point where it would be safe to truncate the log file in Simple Recovery Model, it will not do that. Instead, under normal circumstances, it lets the log file keep growing until you take a log backup (or run out of space on your log file drive).
Switching from Simple to Full has a Gotcha.
There are rules and exceptions here. We'll talk about long running transactions in depth below.
But one caveat to keep in mind for Full Recovery Mode is this: If you just switch into Full Recovery mode, but never take an initial Full Backup, SQL Server will not honor your request to be in Full Recovery model. Your transaction log will continue to operate as it has in Simple until you switch to Full Recovery Model AND take your first Full Backup.
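One way to check whether a database is in this state is sketched below. It assumes a database named yourdb (a placeholder) and relies on the last_log_backup_lsn column of sys.database_recovery_status being NULL until a log backup chain has been established.

```sql
-- Sketch: is the database really behaving as FULL yet?
-- "yourdb" is a placeholder name.
SELECT d.name,
       d.recovery_model_desc,
       drs.last_log_backup_lsn   -- NULL here means no log backup chain exists yet
FROM sys.databases AS d
JOIN sys.database_recovery_status AS drs
  ON drs.database_id = d.database_id
WHERE d.name = N'yourdb';
```

If recovery_model_desc says FULL but last_log_backup_lsn is NULL, the log is most likely still being truncated on checkpoint, as it would be in Simple.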
Full Recovery Model without log backups is bad.
So, what is the most common reason for uncontrolled log growth? Answer: Being in Full Recovery mode without having any log backups.
This happens all the time to people.
Why is this such a common mistake?
Why does it happen all the time? Because each new database gets its initial recovery model setting by looking at the model database.
Model's initial recovery model setting is always Full Recovery Model - until and unless someone changes that. So you could say the "default Recovery Model" is Full. Many people are not aware of this and have their databases running in Full Recovery Model with no log backups, and therefore a transaction log file much larger than necessary. This is why it is important to change defaults when they don't work for your organization and its needs.
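A quick way to see what each database (including model) is currently set to is a query like the following. The commented-out statement at the end is only an illustration of changing the default for new databases; whether that is appropriate depends on your recovery needs.

```sql
-- List the recovery model of every database on the instance
SELECT name, recovery_model_desc
FROM sys.databases
ORDER BY name;

-- Illustration only: change the default inherited by NEW databases.
-- Uncomment only if Simple really is the right default for your environment.
-- ALTER DATABASE model SET RECOVERY SIMPLE;
```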
Full Recovery Model with too few log backups is bad.
You can also get yourself in trouble here by not taking log backups frequently enough.
Taking one log backup a day may sound fine, since it makes a restore require fewer restore commands, but keeping in mind the discussion above, that log file will continue to grow and grow until you take log backups.
How do I find out what log backup frequency I need?
You need to consider your log backup frequency with two things in mind:
- Recovery Needs - This should hopefully be first. In the event that the drive housing your transaction log goes bad or you get serious corruption that affects your log backup, how much data can be lost? If that number is no more than 10-15 minutes, then you need to be taking a log backup every 10-15 minutes, end of discussion.
- Log Growth - If your organization is fine to lose more data because of the ability to easily recreate that day you may be fine to have a log backup much less frequently than 15 minutes. Maybe your organization is fine with every 4 hours. But you have to look at how many transactions you generate in 4 hours. Will allowing the log to keep growing in those four hours make too large of a log file? Will that mean your log backups take too long?
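To compare your actual log backup frequency against those two considerations, you can look at the backup history that SQL Server keeps in msdb. This is a rough sketch; it assumes the history in msdb.dbo.backupset has not been purged and that database names match between the two sources.

```sql
-- Most recent log backup for every database that is not in SIMPLE recovery
SELECT d.name,
       d.recovery_model_desc,
       MAX(b.backup_finish_date) AS last_log_backup   -- NULL = no log backup on record
FROM sys.databases AS d
LEFT JOIN msdb.dbo.backupset AS b
       ON b.database_name = d.name
      AND b.type = 'L'                                -- 'L' = transaction log backup
WHERE d.recovery_model_desc <> 'SIMPLE'
GROUP BY d.name, d.recovery_model_desc
ORDER BY last_log_backup;
```

Databases that show NULL or a very old date at the top of this list are the likely candidates for the first problem described in this answer.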
Top reason 2/2: Long Running Transactions
("My recovery model is fine! The log is still growing!")
This can also be a cause of uncontrolled and unrestrained log growth, no matter the recovery model, but it often comes up as "But I'm in Simple Recovery Model - why is my log still growing?!"
The reason here is simple: if SQL is using this transaction log for recovery purposes as I described above, then it has to see back to the start of a transaction.
If you have a transaction that takes a long time or does a lot of changes, the log cannot truncate on checkpoint for any of the changes that are still in open transactions or that have started since that transaction started.
This means that a big delete, deleting millions of rows in one delete statement, is one transaction, and the log cannot do any truncating until that whole delete is done. In Full Recovery Model, this delete is logged and that could be a lot of log records. Same thing with index optimization work during maintenance windows. It also means that poor transaction management and not watching for and closing open transactions can really hurt you and your log file.
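If you suspect an open transaction is what is pinning the log, something like the following can help find it. This is a sketch: DBCC OPENTRAN reports on the current database only, and the DMV query needs VIEW SERVER STATE permission.

```sql
-- Oldest active transaction in the current database, if any
DBCC OPENTRAN;

-- Active transactions tied to sessions, oldest first
SELECT st.session_id,
       atx.transaction_id,
       atx.name,
       atx.transaction_begin_time
FROM sys.dm_tran_active_transactions AS atx
JOIN sys.dm_tran_session_transactions AS st
  ON st.transaction_id = atx.transaction_id
ORDER BY atx.transaction_begin_time;
```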
What can I do about these long running transactions?
You can save yourself here by:
- Properly sizing your log file to account for the worst case scenario - like your maintenance or known large operations. And when you grow your log file you should look to this guidance (and the two links she sends you to) by Kimberly Tripp. Right sizing is super critical here.
- Watching your usage of transactions. Don't start a transaction in your application server and start having long conversations with SQL Server and risk leaving one open too long.
- Watching the implied transactions in your DML statements. For example:
UPDATE TableName SET Col1 = 'New Value' is a transaction. I didn't put a BEGIN TRAN there and I don't have to; it is still one transaction that just automatically commits when done. So if doing operations on large numbers of rows, consider batching those operations up into more manageable chunks and giving the log time to recover. Or consider the right size to deal with that. Or perhaps look into changing recovery models during a bulk load window.
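The batching idea from the last bullet could look roughly like this. The table dbo.BigTable, the CreatedAt column, the cutoff date and the batch size are all made up for illustration; tune them for your own system.

```sql
-- Sketch: delete in small batches so that each batch is its own transaction
DECLARE @batch INT = 5000;   -- hypothetical batch size
DECLARE @rows  INT = 1;

WHILE @rows > 0
BEGIN
    DELETE TOP (@batch)
    FROM dbo.BigTable                 -- hypothetical table
    WHERE CreatedAt < '20200101';     -- hypothetical predicate

    SET @rows = @@ROWCOUNT;           -- loop ends when nothing is left to delete

    -- In Simple recovery, a checkpoint between batches lets log space be reused:
    -- CHECKPOINT;
END;
```

In Full recovery the space is only freed by log backups, so batching helps most when your regular log backups keep running while the loop works through the rows.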
Do these two reasons also apply to Log Shipping?
Short answer: yes. Longer answer below.
Question: "I'm using log shipping, so my log backups are automated... Why am I still seeing transaction log growth?"
Answer: read on.
What is Log Shipping?
Log shipping is just what it sounds like - you are shipping your transaction log backups to another server for DR purposes. There is some initialization but after that the process is fairly simple:
- A job to backup the log on one server,
- a job to copy that log backup and
- a job to restore it without recovery (either NORECOVERY or STANDBY) on the destination server.
There are also some jobs to monitor and alert if things don't go as you have them planned.
In some cases, you may only want to do the log shipping restore once a day or every third day or once a week. That is fine. But if you make this change on all of the jobs (including the log backup and copy jobs) that means you are waiting all that time to take a log backup. That means you will have a lot of log growth -- because you are in full recovery mode without log backups -- and it probably also means a large log file to copy across. You should only modify the restore job's schedule and let the log backups and copies happen on a more frequent basis, otherwise you will suffer from the first issue described in this answer.
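To confirm that the log backup job on the primary is still running on a frequent schedule, you can look at the log shipping monitor table in msdb. This sketch assumes it is run on the server that holds the monitoring information for the primary.

```sql
-- When was the last log backup taken for each log shipping primary database?
SELECT primary_database,
       last_backup_date,
       backup_threshold   -- minutes allowed between backups before an alert is raised
FROM msdb.dbo.log_shipping_monitor_primary;
```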
General troubleshooting via status codes
There are reasons other than these two, but these are the most common. Regardless of the cause, there is a way to find out why your log is growing or not being truncated.
By querying the sys.databases catalog view you can see information describing the reason your log file may be waiting on truncate/reuse.
There is a column called log_reuse_wait with a lookup ID of the reason code and a log_reuse_wait_desc column with a description of the wait reason. From the referenced Books Online article, here are the majority of the reasons (the ones you are likely to see and the ones we can explain reasons for; the missing ones are either out of use or for internal use), with a few notes about each wait:
0 = Nothing
What it sounds like.. Shouldn't be waiting
1 = Checkpoint
Waiting for a checkpoint to occur. This should happen and you should be fine - but there are some cases to look for here for later answers or edits.
2 = Log backup
You are waiting for a log backup to occur. Either you have them scheduled and it will happen soon, or you have the first problem described here and you now know how to fix it
3 = Active backup or restore
A backup or restore operation is running on the database
4 = Active transaction
There is an active transaction that needs to complete (either way - ROLLBACK or COMMIT) before the log can be truncated. This is the second reason described in this answer.
5 = Database mirroring
Either a mirror is getting behind or under some latency in a high performance mirroring situation or mirroring is paused for some reason
6 = Replication
There can be issues with replication that would cause this - like a log reader agent not running, a database thinking it is marked for replication that no longer is and various other reasons. You can also see this reason and it is perfectly normal because you are looking at just the right time, just as transactions are being consumed by the log reader
7 = Database snapshot creation
You are creating a database snapshot, you'll see this if you look at just the right moment as a snapshot is being created
8 = Log Scan
I have yet to encounter an issue with this running along forever. If you look long enough and frequently enough you can see this happen, but it shouldn't be a cause of excessive transaction log growth, that I've seen.
9 = Availability replica
An AlwaysOn Availability Groups secondary replica is applying transaction log records of this database to a corresponding secondary database. About the clearest description yet..
Page splits will increase logging. A significant cause of large log growth (in my experience) that hasn't been mentioned, and that can lead to frequent shrinking, is poor index design; in a lot of my cases it was resolved by proper index choices, including proper FillFactor management. I use the following settings, with close observation. FF settings: (0/100) tables with high reads/low writes, (90) slightly modified, (80) medium reads/low-medium writes, (70) high writes, (60) I hardly ever reach this level, or something else might be wrong. Then use proper index maintenance schedules matching data volume.
@SnapJag should check out the "Performance considerations" section at https://docs.microsoft.com/en-us/sql/relational-databases/indexes/specify-fill-factor-for-an-index before changing index fill-factor from the default of 0.
Since I'm not really satisfied with any of the answers over on Stack Overflow, including the most heavily up-voted suggestion, and because there are a few things I'd like to address that Mike's answer does not, I thought I would provide my input here too. I placed a copy of this answer there as well.
Making a log file smaller should really be reserved for scenarios where it encountered unexpected growth which you do not expect to happen again. If the log file will grow to the same size again, not very much is accomplished by shrinking it temporarily. Now, depending on the recovery goals of your database, these are the actions you should take.
First, take a full backup
Never make any changes to your database without ensuring you can restore it should something go wrong.
If you care about point-in-time recovery
(And by point-in-time recovery, I mean you care about being able to restore to anything other than a full or differential backup.)
Presumably your database is in FULL recovery mode. If not, then make sure it is:
ALTER DATABASE yourdb SET RECOVERY FULL;
Even if you are taking regular full backups, the log file will grow and grow until you perform a log backup - this is for your protection, not to needlessly eat away at your disk space. You should be performing these log backups quite frequently, according to your recovery objectives. For example, if you have a business rule that states you can afford to lose no more than 15 minutes of data in the event of a disaster, you should have a job that backs up the log every 15 minutes. Here is a script that will generate timestamped file names based on the current time (but you can also do this with maintenance plans etc., just don't choose any of the shrink options in maintenance plans, they're awful).
-- Build a file name like yourdb_YYYYMMDD_HHMMSS.trn
DECLARE @path NVARCHAR(255) = N'\\backup_share\log\yourdb_'
  + CONVERT(CHAR(8), GETDATE(), 112) + '_'
  + REPLACE(CONVERT(CHAR(8), GETDATE(), 108),':','')
  + '.trn';

BACKUP LOG yourdb TO DISK = @path WITH INIT, COMPRESSION;
\\backup_share\ should be on a different machine that represents a different underlying storage device. Backing these up to the same machine (or to a different machine that uses the same underlying disks, or a different VM that's on the same physical host) does not really help you, since if the machine blows up, you've lost your database and its backups. Depending on your network infrastructure it may make more sense to back up locally and then transfer them to a different location behind the scenes; in either case, you want to get them off the primary database machine as quickly as possible.
Now, once you have regular log backups running, it should be reasonable to shrink the log file to something more reasonable than whatever it's blown up to now. This does not mean running SHRINKFILE over and over again until the log file is 1 MB - even if you are backing up the log frequently, it still needs to accommodate the sum of any concurrent transactions that can occur. Log file autogrow events are expensive, since SQL Server has to zero out the files (unlike data files when instant file initialization is enabled), and user transactions have to wait while this happens. You want to do this grow-shrink-grow-shrink routine as little as possible, and you certainly don't want to make your users pay for it.
Note that you may need to back up the log twice before a shrink is possible (thanks Robert).
So, you need to come up with a practical size for your log file. Nobody here can tell you what that is without knowing a lot more about your system, but if you've been frequently shrinking the log file and it has been growing again, a good watermark is probably 10-50% higher than the largest it's been. Let's say that comes to 200 MB, and you want any subsequent autogrowth events to be 50 MB, then you can adjust the log file size this way:
USE [master];
GO
ALTER DATABASE yourdb
  MODIFY FILE (NAME = yourdb_log, SIZE = 200MB, FILEGROWTH = 50MB);
GO
Note that if the log file is currently > 200 MB, you may need to run this first:
USE yourdb;
GO
DBCC SHRINKFILE(yourdb_log, 200);
GO
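Before and after resizing, it can help to see how large the log is and how much of it is actually in use. A minimal sketch, assuming SQL Server 2012 or later (where sys.dm_db_log_space_usage is available) and the placeholder database name yourdb:

```sql
USE yourdb;
GO
-- Current log size and usage for this database
SELECT total_log_size_in_bytes / 1048576.0 AS log_size_mb,
       used_log_space_in_bytes / 1048576.0 AS used_mb,
       used_log_space_in_percent
FROM sys.dm_db_log_space_usage;
GO
```

On older versions, DBCC SQLPERF(LOGSPACE) reports similar numbers for every database on the instance.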
If you don't care about point-in-time recovery
If this is a test database, and you don't care about point-in-time recovery, then you should make sure that your database is in SIMPLE recovery mode.
ALTER DATABASE yourdb SET RECOVERY SIMPLE;
Putting the database in SIMPLE recovery mode will make sure that SQL Server re-uses portions of the log file (essentially phasing out inactive transactions) instead of growing to keep a record of all transactions (like FULL recovery does until you back up the log). CHECKPOINT events will help control the log and make sure that it doesn't need to grow unless you generate a lot of t-log activity between CHECKPOINTs.
Next, you should make absolute sure that this log growth was truly due to an abnormal event (say, an annual spring cleaning or rebuilding your biggest indexes), and not due to normal, everyday usage. If you shrink the log file to a ridiculously small size, and SQL Server just has to grow it again to accommodate your normal activity, what did you gain? Were you able to make use of that disk space you freed up only temporarily? If you need an immediate fix, then you can run the following:
USE yourdb;
GO
CHECKPOINT;
GO
CHECKPOINT; -- run twice to ensure file wrap-around
GO
-- 200 MB
DBCC SHRINKFILE(yourdb_log, 200);
GO
Otherwise, set an appropriate size and growth rate. As per the example in the point-in-time recovery case, you can use the same code and logic to determine what file size is appropriate and set reasonable autogrowth parameters.
Some things you don't want to do
- Back up the log with TRUNCATE_ONLY option and then SHRINKFILE. For one, this TRUNCATE_ONLY option has been deprecated and is no longer available in current versions of SQL Server. Second, if you are in FULL recovery model, this will destroy your log chain and require a new, full backup.
- Detach the database, delete the log file, and re-attach. I can't emphasize enough how dangerous this can be. Your database may not come back up, it may come up as suspect, you may have to revert to a backup (if you have one), etc. etc.
- Use the "shrink database" option. DBCC SHRINKDATABASE and the maintenance plan option to do the same are bad ideas, especially if you really only need to resolve a log problem. Target the file you want to adjust and adjust it independently, using ALTER DATABASE ... MODIFY FILE (examples above).
- Shrink the log file to 1 MB. This looks tempting because, hey, SQL Server will let me do it in certain scenarios, and look at all the space it frees! Unless your database is read only (and if it is, you should mark it as such using ALTER DATABASE), this will absolutely just lead to many unnecessary growth events, as the log has to accommodate current transactions regardless of the recovery model. What is the point of freeing up that space temporarily, just so SQL Server can take it back slowly and painfully?
- Create a second log file. This will provide temporary relief for the drive that has filled your disk, but this is like trying to fix a punctured lung with a band-aid. You should deal with the problematic log file directly instead of just adding another potential problem. Other than redirecting some transaction log activity to a different drive, a second log file really does nothing for you (unlike a second data file), since only one of the files can ever be used at a time. Paul Randal also explains why multiple log files can bite you later.
Instead of shrinking your log file to some small amount and letting it constantly autogrow at a small rate on its own, set it to some reasonably large size (one that will accommodate the sum of your largest set of concurrent transactions) and set a reasonable autogrow setting as a fallback, so that it doesn't have to grow multiple times to satisfy single transactions and so that it will be relatively rare for it to ever have to grow during normal business operations.
The worst possible settings here are 1 MB growth or 10% growth. Funny enough, these are the defaults for SQL Server (which I've complained about and asked for changes to no avail) - 1 MB for data files, and 10% for log files. The former is much too small in this day and age, and the latter leads to longer and longer events every time (say, your log file is 500 MB, first growth is 50 MB, next growth is 55 MB, next growth is 60.5 MB, etc. etc. - and on slow I/O, believe me, you will really notice this curve).
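To find files that are still on those settings, a query along these lines can be used. It is a sketch that assumes growth is reported in 8 KB pages when is_percent_growth = 0, which is how sys.master_files documents it.

```sql
-- Files with percentage growth, or with a fixed growth of 1 MB
SELECT DB_NAME(database_id)             AS database_name,
       name                             AS logical_file_name,
       type_desc,                       -- ROWS (data) or LOG
       CAST(size AS BIGINT) * 8 / 1024  AS size_mb,   -- size is stored in 8 KB pages
       CASE WHEN is_percent_growth = 1
            THEN CAST(growth AS VARCHAR(10)) + '%'
            ELSE CAST(growth * 8 / 1024 AS VARCHAR(10)) + ' MB'
       END                              AS growth_setting
FROM sys.master_files
WHERE is_percent_growth = 1
   OR growth = 128;                     -- 128 pages * 8 KB = 1 MB
```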
Please don't stop here; while much of the advice you see out there about shrinking log files is inherently bad and even potentially disastrous, there are some people who care more about data integrity than freeing up disk space.
You can also see the content of your log file. To do that, you can use the undocumented fn_dblog, or a transaction log reader, such as ApexSQL Log.
It doesn't show index reorganization, but it shows all DML and various DDL events: DROP, trigger enable/disable, grant/revoke permissions, object rename.
Disclaimer: I work for ApexSQL as a Support Engineer
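For the fn_dblog route mentioned above, a minimal sketch looks like this. Because the function is undocumented, its column names and behavior are not guaranteed and may differ between versions; yourdb is a placeholder, and it is best tried on a test system first.

```sql
USE yourdb;
GO
-- Peek at the first 100 records in the active portion of the log
SELECT TOP (100)
       [Current LSN],
       Operation,
       Context,
       [Transaction ID],
       [Begin Time]
FROM fn_dblog(NULL, NULL);   -- NULL, NULL = no start/end LSN filter
GO
```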
This is the most frequently faced issue for almost all DBAs: the log grows and fills up the disk.
• What are some reasons the transaction log grows so large?
- Long active transactions
- Heavily logged operations like index rebuild, reorganize, bulk insert, deletes, etc.
- Any HA feature such as replication or mirroring that holds the log and does not allow it to release the log space
• Why is my log file so big?
Check the log_reuse_wait_desc column in the sys.databases catalog view to know what is holding the log from truncating:
select name, log_reuse_wait_desc from sys.databases
• What are some ways to prevent this problem from occurring?
Log backups will help you control the log growth unless there is something that is holding up the log from being reused.
• What do I do once I have dealt with the underlying cause and want to bring my transaction log file back to a healthy size?
If you have identified what actually is causing it then try to fix it accordingly as explained in below page.
Having proper log backups scheduled is the best way of dealing with log growth, except in unusual situations.