Thursday, December 6, 2012

PHP + OpenOffice|LibreOffice = EXCEL & PDF Generator

To use OpenOffice from command-line to convert files is like this:

soffice -invisible -convert-to xls myfile.xml

In this case convert a xml file to xls. If you want a PDF file just change to:

soffice -invisible -convert-to pdf myfile.doc

You can use any kind of files supported by OpenOffice as input, and can define the parameter -convert-to to any format supported by Open Office.

I will now just talk about xml to xls, because is very most trick. And to PDF is simple like that above, nothing more to say about it.

To generate a xml valid to Open Office is not easy because the character encode, in my case portuguese iso-8859-15. I had a nightmare, but at end was more simple than seemed.

I save my data in DB utf8_encode then I have to use this function stringToExcelXML to decode and convert to XML special characters format.

Below my PHP code that build the XML compatible with OpenOffice and MSOffice:


error_reporting(E_ALL);
include '../include/config.php';

try {
 $rows = MySQL::Create('SELECT * FROM `Client` WHERE `Name` LIKE ?name ORDER BY `Name`')
  ->Parameter('name', '%'.$_GET['name'].'%', 'string')
  ->Query();
} catch(MySQLException $ex) {
 echo $ex->getMessage();
}

$outputFileName = 'export_'. date ('Ymd_His');

function stringToExcelXML($string){
    $string = htmlspecialchars(utf8_decode($string));
    $string = str_replace("\x01", "\x20", $string);
    return $string;
}

$f = fopen('../export/'. $outputFileName .'.xml', "w");
fwrite($f, '<?xml version="1.0" encoding="iso-8859-15"?>'. chr(13));
fwrite($f, '<?mso-application progid="Excel.Sheet"?>'. chr(13));
fwrite($f, '<Workbook');
fwrite($f, '   xmlns="urn:schemas-microsoft-com:office:spreadsheet"');
fwrite($f, '   xmlns:o="urn:schemas-microsoft-com:office:office"');
fwrite($f, '   xmlns:x="urn:schemas-microsoft-com:office:excel"');
fwrite($f, '   xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"');
fwrite($f, '   xmlns:html="http://www.w3.org/TR/REC-html40">'. chr(13));
fwrite($f, '  <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">'. chr(13));
fwrite($f, '    <Author>Candidatures</Author>'. chr(13));
fwrite($f, '    <LastAuthor>Candidatures</LastAuthor>'. chr(13));
fwrite($f, '    <Created>'. date('Y-m-d\TH:i:s\Z') .'</Created>'. chr(13));
fwrite($f, '    <Company>Candidatures</Company>'. chr(13));
fwrite($f, '    <Version>1</Version>'. chr(13));
fwrite($f, '  </DocumentProperties>'. chr(13));
fwrite($f, '  <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">'. chr(13));
fwrite($f, '    <WindowHeight>6795</WindowHeight>'. chr(13));
fwrite($f, '    <WindowWidth>8460</WindowWidth>'. chr(13));
fwrite($f, '    <WindowTopX>120</WindowTopX>'. chr(13));
fwrite($f, '    <WindowTopY>15</WindowTopY>'. chr(13));
fwrite($f, '    <ProtectStructure>False</ProtectStructure>'. chr(13));
fwrite($f, '    <ProtectWindows>False</ProtectWindows>'. chr(13));
fwrite($f, '  </ExcelWorkbook>'. chr(13));
fwrite($f, '  <Styles>'. chr(13));
fwrite($f, '    <Style ss:ID="Default" ss:Name="Normal">'. chr(13));
fwrite($f, '      <Alignment ss:Vertical="Bottom" />'. chr(13));
fwrite($f, '      <Borders />'. chr(13));
fwrite($f, '      <Font />'. chr(13));
fwrite($f, '      <Interior />'. chr(13));
fwrite($f, '      <NumberFormat />'. chr(13));
fwrite($f, '      <Protection />'. chr(13));
fwrite($f, '    </Style>'. chr(13));
fwrite($f, '    <Style ss:ID="s21">'. chr(13));
fwrite($f, '      <Font x:Family="Swiss" ss:Bold="1" />'. chr(13));
fwrite($f, '    </Style>'. chr(13));
fwrite($f, '  </Styles>'. chr(13));
fwrite($f, '  <Worksheet ss:Name="Clients">'. chr(13));
fwrite($f, '    <Table ss:ExpandedColumnCount="2" ss:ExpandedRowCount="'. (isset($rows) ? count($rows) : 0) .'"');
fwrite($f, '         x:FullColumns="1" x:FullRows="1">'. chr(13));
fwrite($f, '      <Row>'. chr(13));
fwrite($f, '        <Cell ss:StyleID="s21">'. chr(13));
fwrite($f, '          <Data ss:Type="String">Id</Data>'. chr(13));
fwrite($f, '        </Cell>'. chr(13));
fwrite($f, '        <Cell ss:StyleID="s21">'. chr(13));
fwrite($f, '          <Data ss:Type="String">Name</Data>'. chr(13));
fwrite($f, '        </Cell>'. chr(13));
fwrite($f, '      </Row>'. chr(13));
if (isset($rows)) {
  foreach($rows as $row) {
    fwrite($f, '      <Row>'. chr(13));
    fwrite($f, '        <Cell ss:StyleID="s21">'. chr(13));
    fwrite($f, '          <Data ss:Type="String">'. $row['Id'] .'</Data>'. chr(13));
    fwrite($f, '        </Cell>'. chr(13));
    fwrite($f, '        <Cell ss:StyleID="s21">'. chr(13));
    fwrite($f, '          <Data ss:Type="String">'. stringToExcelXML($row['Name']) .'</Data>'. chr(13));
    fwrite($f, '        </Cell>'. chr(13));
    fwrite($f, '      </Row>'. chr(13));
  }
}
fwrite($f, '    </Table>'. chr(13));
fwrite($f, '    <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">'. chr(13));
fwrite($f, '      <Print>'. chr(13));
fwrite($f, '        <ValidPrinterInfo />'. chr(13));
fwrite($f, '        <HorizontalResolution>600</HorizontalResolution>'. chr(13));
fwrite($f, '        <VerticalResolution>600</VerticalResolution>'. chr(13));
fwrite($f, '      </Print>'. chr(13));
fwrite($f, '      <Selected />'. chr(13));
fwrite($f, '      <Panes>'. chr(13));
fwrite($f, '        <Pane>'. chr(13));
fwrite($f, '          <Number>3</Number>'. chr(13));
fwrite($f, '          <ActiveRow>5</ActiveRow>'. chr(13));
fwrite($f, '          <ActiveCol>1</ActiveCol>'. chr(13));
fwrite($f, '        </Pane>'. chr(13));
fwrite($f, '      </Panes>'. chr(13));
fwrite($f, '      <ProtectObjects>False</ProtectObjects>'. chr(13));
fwrite($f, '      <ProtectScenarios>False</ProtectScenarios>'. chr(13));
fwrite($f, '    </WorksheetOptions>'. chr(13));
fwrite($f, '  </Worksheet>'. chr(13));

fclose($f);

$excelFile = '../export/'. $outputFileName .'.xls';

while (true) {
  sleep(1);
  if (file_exists($excelFile)) {
    sleep(100);
    break;
  }
}

header('Content-Disposition: attachment; filename='. $outputFileName .'.xls');
header('Content-Type: application/octet-stream');
header('Expires: 0');
header('Cache-Control: must-revalidate');
header('Pragma: public');
header('Content-length: '. filesize($excelFile));
ob_clean();
flush();
readfile($excelFile);
unlink($excelFile);

exit();
?>

Attention with worksheet name <Worksheet ss:Name="Clients"> and the number of columns <Table ss:ExpandedColumnCount="2" ... and the encoding="iso-8859-15".

This block of code below will wait the OpenOffice finish the creation of file XLS.

$excelFile = '../export/'. $outputFileName .'.xls';

while (true) {
  sleep(1);
  if (file_exists($excelFile)) {
    sleep(100);
    break;
  }
}

This service made with shell script need still running all time to build XLSs using OpenOffice:

/var/www/services/openoffice_to_xls/openoffice_to_xls.sh
#!/bin/bash

cd /var/www/htdocs/export

while true
do
  for f in `ls *.xml`
  do
    sleep 1
    echo "$f"
    soffice -invisible -convert-to xls "$f"
    sleep 15
    chmod a+rwx *.xls
    rm -f "$f"
  done
  sleep 1
done

Turn the core.sh file as executable:

chmod +x /var/www/services/openoffice_to_xls/core.sh

A start script will be useful:

/var/www/services/openoffice_to_xls/start.sh
#!/bin/bash

rm -f /var/www/services/openoffice_to_xls/nohup.out
nohup /var/www/services/openoffice_to_xls/core.sh &

Turn the start.sh file as executable:

chmod +x /var/www/services/openoffice_to_xls/start.sh

Now just start the service:

/var/www/services/openoffice_to_xls/start.sh

All of this is very trick and to works with MSOffice 2010 and OpenXML... and to this work exists the PHPExcel that is much more elegant:

phpexcel.codeplex.com

But to build office legacy files perhaps these tricks will be useful to do workarounds.

Have fun and good nightmares.