Convert a .xlsx (MS Excel) file to .csv on command line with semicolon separated fields

  • I realize that this is not an entirely unix/linux related question. But since this is something I'll do on linux, I hope someone has an answer.

    I have an online excel file (.xlsx) which gets updated periodically (by someone else). I want to write a script and put it in as a cronjob in order to to process that excel sheet. But to do that, I need to convert that into a text file (so a .csv) with semicolon separated columns. It can't be comma separated unfortunately since some columns have commas in them. Is it at all possible to do this conversion from shell? I have Open office installed and I can do this by using its GUI, but want to know if it is possible to do this from command line. Thanks!

    PS: I have a Mac machine as well, so if some solution can work there, thats good as well. :)

  • OpenOffice comes with the unoconv program to perform format conversions on the command line.

    unoconv -f csv filename.xlsx
    

    For more complex requirements, you can parse XLSX files with Spreadsheet::XLSX in Perl or openpyxl in Python. For example, here's a quickie script to print out a worksheet as a semicolon-separated CSV file (warning: untested, typed directly in the browser):

    perl -MSpreadsheet::XLSX -e '
        $\ = "\n"; $, = ";";
        my $workbook = Spreadsheet::XLSX->new()->parse($ARGV[0]);
        my $worksheet = ($workbook->worksheets())[0];
        my ($row_min, $row_max) = $worksheet->row_range();
        my ($col_min, $col_max) = $worksheet->col_range();
        for my $row ($row_min..$row_max) {
            print map {$worksheet->get_cell($row,$_)->value()} ($col_min..$col_max);
        }
    ' filename.xlsx >filename.csv
    

    unoconv did not come with my OO, but I installed it and it works great (converts to comma separated file, not semicolon though)! Thanks! I will still need to figure out how I will get my fields that contain commas. But thanks anyways.

    @allrite Oh, I'd missed the requirement of semicolons as separators. My suggestion to do the processing in Python or Perl still stands. But I've also added a script (untested) to convert to CSV with `;` as the separator.

    Thanks! I used Spreadsheet::XLSX, but used the code in the CPAN link you provided. It works :)

License under CC-BY-SA with attribution


Content dated before 6/26/2020 9:53 AM