Core installation

The installation of the core Docxpresso package is pretty simple and can be done in a few steps:

  • Download the package from the website.
  • Uncompress it somewhere in your server directory tree. It is strongly recommended that the library should not be directly accesible to outside users for obvious security reasons.
  • Check that the [path/Docxpresso package]/log folder is writable.
  • Edit the config.ini file that you can find in the package root directory:
    • Include the license code, if you already have one, in the key variable. You can always do that at a later time if you just intend to test the library or use it for personal means.

You should also make sure that the following modules and extensions are installed:

  • PHP 5.3 or higher (although we recommend PHP 7.*)
  • ZipArchive (usually installed by default)
  • DOMDocument (usually installed by default)
  • Tidy (although not strictly required is highly recommended).

Run any of the tutorial examples (explicitely setting the real path to the library) to check that everything work as expected. If you have already installed a reasonable recent PHP version (5.3 or newer) that should be it and you may start right away to generate your own documents!!

Full installation

If you need:

  • to generate documents in PDF, RTF or DOCX format or
  • update automatically the Table of Contents (TOC) in any format,

you need a “full installation” so you should follow these additional steps:

  • Install a recent version of LibreOffice (recommended) or OpenOffice (no support for .docx output) directly from their websites.
  • Further edit the config.ini file:
    • Set the os option to WINDOWS.
    • Set the path to your LibreOffice or OpenOffice soffice executable file, usually C:Program Files (x86)LibreOffice 4programsoffice for LibreOffice and 64 bit systems or C:Program FilesLibreOffice 4programsoffice for Libre Office and 32 bit systems (change LibreOffice by OpenOffice or OpenOffice Beta if you are using Open Office), although you should always check the real path in your system to avoid malfunctioning of the application.
  • Install the corresponding Docxpresso.oxt extension via the unopkg command (the unopkg executable is located in the same directory as soffice):
    • This file can be found either in the extensions/LibreOffice or extensions/OpenOffice folder of the package, depending on the conversion engine that you plan to use.
    • Open a shell with administration privileges and navigate to the folder of your LibreOffice or OpenOffice installation.
    • Make sure that there is no currently any Libre or Open Office instance running on your system.
    • Run the following command: unopkg add –shared [pathDocxpresso package]extensions[LibreOffice or OpenOffice]Docxpresso.oxt. Where [pathDocxpresso package] stands for the full path to your Docxpresso installation. In order to do so you need administration privileges. You may ommit the –shared flag (not recommended) in the command and avoid the need to use administration privileges but in that case the extension will only be available to the user that performed the installation.

After all of these we are almost done. We only need to:

  • Rename the docxpresso._vbs_, docxpresso._bat_ and soffice._vbs_ to docxpresso.vbs, docxpresso.bat and soffice.vbs respectively (we have renamed these files to avoid virus warnings while downloading the package).
  • Navigate to the package root directory and run docxpresso.vbs “C:Program FilesLibreOffice 4programsoffice” or whatever may be the real path to your soffice executable:
    • Close the command line interface and check with the task manager that there is a php.exe process running (docxpresso.php) with description CLI
    • Check that the [path/Docxpresso package]csv folder is writable.

if you are upgrading your current Docxpresso installation you should first remove the previous version of the Docxpresso.oxt extension. You may do so with the
command:
sudo unopkg remove –shared Docxpresso.oxt
If you did not install the extension as a shared extension you may simply remove the
“sudo” and “–shared” bits of the above command.

And that should be it!!

Why LibreOffice or OpenOffice

As explained in the introduction the Docxpresso package internally uses the ODF standard (.odt) for document generation so unless one is only interested in generating .odt files (or .doc files compatible with recent versions of Word) one needs to convert the ODF XML code into PDF, DOCX or RTF format.

Docxpresso uses the powerful conversion engine available in LibreOffice and/or OpenOffice to carry out that task.

Therefore if you want, for example, to generate PDF output you need a reasonably recent version of LibreOffice or OpenOffice installed in your system.

We strongly recommend to use LibreOffice 4.3 or higher.

You may also install two versions of the package on the same server, one with LibreOffice and the other with OpenOffice, and use them indistinctly depending on the process.

How to boost performance

As explained above the generation of PDF and other formats different from .odt or .doc requires the use of Libre or Open Office. This task is controlled via the docxpresso.php script that should be run in the background as previously explained. The inner workings of this format transformation are as follows:

  1. The “core” of the Docxpresso package generates a temporary .odt document that is stored in the final destination of the requested document.
  2. The package creates a .csv file in the csv library folder with all the required info: requested file name and format as well as other document options.
  3. The unnatended docxpresso.php process scans that csv directory for documents to be generated with the requested options and makes a call to the Office macros installed with the Docxpresso.oxt extension.
  4. Those macros do the final processing and generate the final document.
  5. The docxpresso.php do the final cleaning of all temporary files.

Therefore the Docxpresso package should call Libre or Open Office for each generation of documents in .pdf, .docx or .rtf formats with the corresponding time, CPU and RAM consumption. For example, although this may depend on the platform, there is an average 0.4s delay in loading the Libre or Open Office package.

This time may be dramatically decreased if there is already a running copy of Libre or Open Office in the system.

Although this can be manually done, the library can do the job for you just by setting the auto config parameter to true.

This may have a little cost in the form of a continuous use of RAM by the application (around 50MB) but completely eliminates the 0.4s time overhead allowing that the generation of a typical pdf of a few pages takes only a few hundredths of a second.

There is another configurable parameter in config.ini: refresh that can be manually tuned and it is very conservatively set to 20 by default. We now pass to explain its origins.

Libre and Open Office do not clean its Graphics Memory Cache until they are closed. So if we have a copy of the suite running indefinetily in the background the memory consumption may grow in time and ultimately eat all of our precious RAM resources.

The refresh parameter control after how many document generation cycles the RAM memory assigned to Libre or Open Office should be flushed. This is done by closing and reopening the Office package, a process that may take a few hundrendths of a second (typically around 0.15s)

Although the added RAM consumption for a simple document may be very small and the refresh parameter could be securely set to, let us say, 1.000 with only few MBs of added memory consumption a huge pdf with hundreds of high quality images can rapidly increase the RAM consumption after 10/20 runs so we have set up a default value of 20 as a conservative default estimate.

Monitoring Docxpresso

Although maximum care has been taken to offer a stable production environment, whenever there are processes in the background that run uninterruptedly, like the format conversor, for long periods of time once in a while problems may pop up.

Docxpresso for Windows comes bundle with an optional monitoring script that periodically checks that the package is properly working by detecting if there are pending conversions.

There are three parameters that govern the monitoring and that can be tuned in monitor.vbs (beware that to avoid warnings while downloading all required vbs files have been renamed with the “_vbs_” extension so you should change it before running any of them):

  • dir: that is the full path to the csv dir folder included in the package.
  • sleepTimer: the time in milliseconds between consecutive checks.
  • maxRepetitions: an integer that bounds the number of times that the monitoring software detects a pending conversion before relaunching the library processes.

Whenever there is a conversion running for more than sleepTimer * (maxRepetitions + 1) milliseconds (this can be due to either the hanging of Libre or Open Office or simply because the conversion is taken too long because of a unexpectedly large file or because the system has somehow slowed down) the monitoring script assumes that there is a problem with the conversion and restart the docxpresso and soffice executables.

For example, setting sleepTimer to 2000 and maxRepetitions to 14 we make sure that the system will never spend more than 30 seconds in a single process.

Whenever the application is restarted automatically by the monitoring script an entry with all the required data is written in monitor.log (that can be found in the same csv folder).

In order to run the monitoring script in Windows (please, first check that Docxpresso is properly running) you should:

  1. Rename monitor._vbs_ and windowsMonitor._vbs_ as monitor.vbs and windowsMonitor.vbs
  2. Open windowsMonitor.vbs and edit:
    • The full path to the csv folder: dir parameter.
    • The sleepTimer and maxRepetitions parameters to suit your particular needs.
  3. Run monitor.vbs from the command line. it is very important that the process should be run by the same user that run the docxpresso.vbs script to avoid conflicts.

You may also, of course, launch monitor.vbs as an automatic task but always taking care that it has the required permissions to relaunch the application and to read and write in the csv folder.

Common issues

Although, in principle, the installation of the package should be rather straightforward if you follow all the above instructions there may pop up from time to time certain difficulties that should not be difficult to overcome.

Please, follow these steps by order:

  1. Check core functionality by first generating a very simple, i.e. standard Hello World script, .odt file with full PHP error reporting (error_reporting(E_ALL)). If the document is not generated check that:
    • Your PHP version is 5.3 or higher.
    • There is no lacking any required module like ZipArchive, that although standard by default your installation may be lacking.
    • The folder where you are trying to store the document is writable.
    • The package log folder is writable.
  2. Test full functionality by, for example, trying to generate the same document as before but in .pdf format. If the system fails check that:
    • The path to soffice set in config.ini is the correct one.
    • The Docxpresso.odt extension has been properly installed by checking the extensions in the GUI or running in the command line: unopkg list –shared Docxpresso.oxt (if you did not installed the extension for all users you may simply run unopkg list Docxpresso.oxt). If you get the message: ERROR: there is no such extension deployed try to install it again (be careful typing the full path to the required extension).
    • The docxpresso.php script is running (use the Task manager or the tasklist command). If it is not running, restart it again via de docxpresso.vbs command.
    • The user running the docxpresso.php script has read and write access to the csv package folder and to the folder where the final document should be stored.
    • The user running the docxpresso.php script has permission to run Libre or Open Office.
    • There are no other Libre or Open Office instances running in headless mode. To make sure just kill all current soffice processes with the help of the Task manager or the taskkill /fi “imagename eq soffice*” /f command.

In the unlikely case that the problem persists you may run directly docpresso.php in the foreground and check if there is any error thrown in the console.