How can I monitor the progress of an import of a large .sql file?

  • I am importing a 7 GB foobar.sql to restore a table in a local database.

    $ mysql -h localhost -u root 'my_data' < foobar.sql
    
    $ mysql --version
    /usr/local/mysql/bin/mysql  Ver 14.12 Distrib 5.0.96, for apple-darwin9.8.0 (i386) using readline 5.1
    

    How can I monitor its progress?

    The answers to this question show that this is a clear deficiency of the mysql client.

  • Rob (accepted answer, 8 years ago)

    If you're just importing from a dump file from the CLI on *nix, e.g.

    mysql -uxxx -pxxx dbname < /sqlfile.sql
    

    then first install Pipe Viewer (pv) on your OS, then try something like this:

    pv sqlfile.sql | mysql -uxxx -pxxxx dbname
    

    which will show a progress bar as the program runs.

    It's very useful and you can also use it to get an estimate for mysqldump progress.
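
    If your dump is gzip-compressed, you can still get a progress bar by letting pv read the compressed file and decompressing in the pipe. A rough sketch (the file name sqlfile.sql.gz is just an example, and the bar tracks compressed bytes read, so treat it as an estimate):

    pv sqlfile.sql.gz | gunzip | mysql -uxxx -pxxx dbname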

    pv reads sqlfile.sql and passes the data to mysql (because of the pipe operator). While it is reading, it shows the progress. The cool thing is that mysql takes the data only as fast as it can process it, so pv can show the progress of the import. I do not have any proof, but it seems so. I guess there is some buffer used, but at some point I think mysql stops reading more data while it is still busy processing.

    Pipe Viewer screenshot

    I would guess that mysql might have a buffer, in which some data can be piped in without being fully "processed" (i.e. if it errors out, pv may have slightly over-reported what actually got in). But in general, this is how pipes work. It's the same reason you can do `sudo hd /dev/sda1 | less` and not have your entire system partition in memory.

    @snapfractalpop `pv` won't be overly accurate in many cases because some chunks of SQL will take more time to process than others. A line that constitutes a simple insert will run a lot faster than one that creates an index on a table that already has many rows, for instance. *But* as a rough idea of progress the output should be helpful, unless the read buffer used by `mysql` is particularly large (for a 7 GB input the buffer would need to be very large to render `pv`'s output not useful at all).

    @DavidSpillett indeed. Your comment mirrors my sentiment. Basically, pv is crude, but effective. What I like most about it is how general it is. Such is the beauty of unix pipes (thank you McIlroy).

    @rob This is awesome dude, could you also provide an example with `mysqldump`?

    Very nice solution! If the password is entered manually, though, pv does not wait for it before displaying its progress.

    How do you get this to work with `mysqlimport`, which requires the table name as the file name?

    @josue-alexander-ibarra You may find a solution to your request in this StackOverflow article

    Works perfectly and is useful.

    I had never used pipes in bash before, so it was difficult for me to understand what the line of code does, but this tutorial made the concept very easy to understand: https://www.youtube.com/watch?v=mTwBlPqRZO8 Also, on Ubuntu it's just `apt install pv` to get Pipe Viewer installed.

    I love it! That really saves my day when importing 50+ GB SQL dumps...

  • If you've already started the import, you can execute this command in another window to see the current size of your databases. This can be helpful if you know the total size of the .sql file you're importing.

    SELECT table_schema "Data Base Name", sum( data_length + index_length ) / 1024 / 1024 "Data Base Size in MiB" 
    FROM information_schema.TABLES GROUP BY table_schema;  
    

    Credit to: http://forums.mysql.com/read.php?108,201578,201578
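
    To see that number grow while the import runs, you could wrap the query in watch from another terminal. A rough sketch (adjust the credentials, e.g. add -p, to match your setup); keep in mind the reported sizes are approximate, as the reference quoted below explains:

    watch -n 10 'mysql -u root -e "SELECT table_schema, SUM(data_length + index_length) / 1024 / 1024 AS size_mib FROM information_schema.TABLES GROUP BY table_schema"'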


    The MySQL 8.0 Reference states the following about the accuracy:

    DATA_LENGTH

    For MyISAM, DATA_LENGTH is the length of the data file, in bytes.

    For InnoDB, DATA_LENGTH is the approximate amount of memory allocated for the clustered index, in bytes. Specifically, it is the clustered index size, in pages, multiplied by the InnoDB page size.

     

    INDEX_LENGTH

    For MyISAM, INDEX_LENGTH is the length of the index file, in bytes.

    For InnoDB, INDEX_LENGTH is the approximate amount of memory allocated for non-clustered indexes, in bytes. Specifically, it is the sum of non-clustered index sizes, in pages, multiplied by the InnoDB page size.

    My table is now at 12 GiB according to the commands from this answer, and it is still importing. My mysqldump file is only 5 GiB. I would be interested in an explanation for this discrepancy.

    @lucidbrot is your sql file gzipped?

  • When you execute a mysqldump of a single database, all tables are dumped in alphabetical order.

    Naturally, the reload of the mysqldump into a database would also be in alphabetical order.

    You could just do a SHOW PROCESSLIST; and find out the DB connection running the reload. When the reload is done, the DB connection will vanish.

    If you want to know what tables are in the dumpfile, run this against foobar.sql

    cat foobar.sql | grep "^CREATE TABLE" | awk '{print $3}'
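
    To check which of those tables have already been created in the target database while the reload runs, one rough approach (assuming bash, and that the target database is my_data as in the question) is to diff that list against SHOW TABLES; names prefixed with < are still waiting to be loaded:

    diff <(grep "^CREATE TABLE" foobar.sql | awk '{print $3}' | tr -d '`' | sort) \
         <(mysql -u root -N -e "SHOW TABLES" my_data | sort)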
    

    UPDATE 2012-05-02 13:53 EDT

    Sorry for not noticing that there is only one table.

    If the table is MyISAM, the only way to monitor is from the OS point of view. The reason? The table is write-locked throughout the reload. What do you look for? The size of the .MYD and .MYI files. Of course, you need to compare that with what the table size was before on the other DB server you imported from.

    If the table is InnoDB and you have innodb_file_per_table enabled, the only way to monitor is from the OS point of view. The reason? The table is write-locked throughout the reload. What do you look for? The size of the .ibd file. Of course, you need to compare that with what the table size was before on the other DB server you imported from.

    If the table is InnoDB and you have innodb_file_per_table disabled, not even the OS point of view can help.
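
    For that "OS point of view" monitoring, something like this works as a sketch, assuming a datadir of /var/lib/mysql and a database named my_data (both may differ on your system); compare the .MYD/.MYI or .ibd sizes against the corresponding files on the source server:

    watch -n 10 'ls -lh /var/lib/mysql/my_data/'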

    UPDATE 2012-05-02 13:56 EDT

    I addressed something like this last year: How do I get % progress for "type db.sql | mysql"

    UPDATE 2012-05-02 14:09 EDT

    Since a standard mysqldump write-locks the table like this:

    LOCK TABLES `a` WRITE;
    /*!40000 ALTER TABLE `a` DISABLE KEYS */;
    INSERT INTO `a` VALUES (123),(451),(199),(0),(23);
    /*!40000 ALTER TABLE `a` ENABLE KEYS */;
    UNLOCK TABLES;
    

    then there is no way to get progress from within mysql until the table lock is released.

    If you can get LOCK TABLES and UNLOCK TABLES commented out of the dumpfile (for example with sed, as sketched after the list below)...

    • if the table is MyISAM, SELECT COUNT(*) would work
    • if the table is InnoDB, SELECT COUNT(*) would probably slow down/halt the load until count is done
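
    A sketch of commenting those statements out with sed (note it rewrites the whole dump file, keeping a foobar.sql.bak copy) and then, while the reload runs in another session, polling the row count; the database name my_data and the table name a are just taken from the question and the excerpt above, so substitute your own:

    sed -i.bak -e 's/^LOCK TABLES/-- LOCK TABLES/' -e 's/^UNLOCK TABLES/-- UNLOCK TABLES/' foobar.sql
    watch -n 30 'mysql -u root my_data -e "SELECT COUNT(*) FROM a"'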

    That worked. Thanks. One last question: from experience, do you know whether the import time is roughly **linear** with respect to the `.MYD` and `.MYI` file sizes?

    Table reload is linear. Index rebuilds are linear. Years ago it was not; I raised this as a question to MySQL ( http://lists.mysql.com/mysql/202489 ) and I mentioned it on DBA StackExchange ( http://dba.stackexchange.com/a/2697/877 ).

  • This will show you the running processes, refreshed every 2 seconds:

    watch 'echo "show processlist;" | mysql -uuser -ppassword';
    

    If you want it to refresh less frequently, add `-n x` where x is the number of seconds. Every 5 seconds would be:

    watch -n 5 'echo "show processlist;" | mysql -uuser -ppassword';
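
    If the processlist is cluttered with idle connections, or the Info column gets truncated, a variation like this (just a sketch) shows only the active threads with the full statement text:

    watch -n 5 'mysql -uuser -ppassword -e "SHOW FULL PROCESSLIST" | grep -v Sleep'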
    

    Can you post an example output? Also, does it just show the process, or does it really indicate the progress of the import, which is what I was really asking for?

    This is such a helpful command. Thank you.

    This is really helpful once the SQL file has been uploaded and the restore queries start running.

    Even though it does not show the progress in %, it does show that the import is actually working and not stuck.

  • If you just want to check whether it is stalled, you can query

    show processlist; 
    

    and see what is being executed.

  • As a solution for someone who can't get pv to work, or for whom pv tells lies: you can monitor the size of the ibdata1 file in /var/lib/mysql, which contains the data. This will end up at (or thereabouts) the same size as the file on your source server.

    If there are many tables, you can also watch them appear one by one in /var/lib/mysql/<database name>.
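
    For example, something along these lines (a sketch; the paths assume the default /var/lib/mysql data directory and need adjusting otherwise):

    watch -n 10 'du -sh /var/lib/mysql/ibdata1 /var/lib/mysql/<database name>'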

    I happened to use this fact recently when a long-term database had built up a log file of around 20 GB over a period of three or four years. I noticed the transfer was taking ages and used this technique to monitor progress.

    I think it is highly unlikely that the day will dawn when a database does not involve a file somewhere or other. In the meantime, you can monitor the file to see how a transfer is progressing. The method I suggested has been possible, in one form or another, since the first SQL database was written. I never intended to suggest that it was any kind of "official" technique that a manual jockey could fall back on. It assumes a general level of proficiency with computers in general and Unix in particular.

  • If your DB is otherwise quiet (i.e. there are no other users active) and you just want to see read/write activity, why not do something like this:

    mysqladmin -h<host> -uroot -p<yourpass> extended -r -i 10 | grep 'row'
    

    You will see the number of reads/writes/inserts/waits/updates.

    If you are inserting, for example, you will see something like:

    Innodb_rows_inserted                          | 28958 
    

    Where 28958 is the number of rows inserted for your interval (10 seconds in my case).

  • For someone who is looking for the Pipe Viewer example using mysqldump, you would just do something like this:

    mysqldump -hxxx -uxxx -p dbname | pv -W > dump.sql
    

    The -W flag just tells pv to wait for the first byte to arrive (i.e. after the password prompt) before showing the progress.
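
    Since pv cannot know in advance how big the dump will be, it only shows throughput and elapsed time here. If you have a rough idea of the final size (say, from a previous dump), you can pass it with -s to get a percentage and ETA; a sketch, where 5G is just a guess:

    mysqldump -hxxx -uxxx -p dbname | pv -W -s 5G > dump.sql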

  • I use https://github.com/Xfennec/progress for this and monitor it via watch:

    watch progress
    

    After firing the import: `zcat example.sql.gz | mysql -u root -proot -h localhost example`

  • I had a 500 MB SQL file to import. It took me around 2 hours. The mysqld CPU usage was close to 100 % at the start of the import process, but after a few minutes the CPU usage dropped to 15 %.

    I tried many tweaks but only this one helped me: innodb_flush_log_at_trx_commit = 0

    After applying this setting and restarting MySQL, the import took just 3 minutes! CPU utilization was 100 % the whole time.

    If you would like to use this setting, you will need to edit the "/etc/mysql/my.cnf" file and restart the MySQL server using "sudo service mysql restart".

    Here are the settings from my "my.cnf" file:

        [mysqld]
        innodb_log_buffer_size = 256M
        innodb_fast_shutdown = 0
        innodb-doublewrite = OFF
        innodb_io_capacity = 1000
        innodb_flush_log_at_trx_commit = 0
    

    Please note: with "innodb_flush_log_at_trx_commit = 0" the log is flushed to disk only about once per second, so it is not ACID-compliant, but that is acceptable for a bulk import. After the import you can set "innodb_flush_log_at_trx_commit" back to 1 and restart your database. Link to the MySQL documentation
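
    If you would rather not edit my.cnf and restart the server, innodb_flush_log_at_trx_commit is a dynamic variable, so as a sketch (requires the SUPER or SYSTEM_VARIABLES_ADMIN privilege) you can toggle it just for the duration of the import:

    # relax log flushing before starting the bulk import
    mysql -u root -p -e "SET GLOBAL innodb_flush_log_at_trx_commit = 0"
    # ... run the import, then restore the durable default
    mysql -u root -p -e "SET GLOBAL innodb_flush_log_at_trx_commit = 1"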

License under CC-BY-SA with attribution

