PDF to ZUGFeRD & FACTUR-X conversion tool

ZUGFeRD (an acronym for Zentraler User Guide des Forums elektronische Rechnung Deutschland) is a file format for digital invoices that are both human- and machine-readable. The human-readable part is a PDF/A-3 file, in which a machine-readable XML representation of the same information is embedded. The second version of the ZUGFeRD specification, which is currently under joint German/French development, is also called FACTUR-X.

Mimotek is drawing on its experience of extracting content from PDF files, and of PDF manipulation in general, to develop a new tool (Mimotek Groom) that creates ZUGFeRD/FACTUR-X files from plain PDF invoices. The starting point is a PDF invoice file, which is opened in Mimotek Groom. The interactive software analyses the document and makes its content accessible to the operator, who associates the various data values with the XML tags defined in the ZUGFeRD/FACTUR-X schema. Once the XML tags have been populated, the XML invoice is created and embedded in the PDF. Finally, the PDF is made compliant with the PDF/A-3 specification and the required metadata is added, to create a valid ZUGFeRD/FACTUR-X invoice.
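The mapping step described above can be illustrated with a minimal sketch. It is not Mimotek Groom's implementation: the `FIELD_MAP` dictionary, the `build_invoice_xml` function, the sample values and the element names (`Invoice`, `Seller`, `Total` and so on) are all hypothetical placeholders, not tags from the real ZUGFeRD/FACTUR-X schema. The sketch only shows how values that an operator has associated with tag paths could be assembled into an XML document, using Python's standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical result of the operator's work: each extracted value is
# associated with a tag path. These names are placeholders and are NOT
# taken from the real ZUGFeRD/FACTUR-X schema.
FIELD_MAP = {
    "Invoice/Number": "2017-0042",
    "Invoice/IssueDate": "2017-05-31",
    "Invoice/Seller/Name": "Example GmbH",
    "Invoice/Buyer/Name": "Exemple SARL",
    "Invoice/Total": "119.00",
}


def build_invoice_xml(field_map):
    """Build an XML string from a {tag path: value} mapping."""
    root = None
    for path, value in field_map.items():
        parts = path.split("/")
        if root is None:
            # The first path segment becomes the document root.
            root = ET.Element(parts[0])
        elif root.tag != parts[0]:
            raise ValueError("all paths must share the same root tag")
        node = root
        for tag in parts[1:]:
            # Reuse an existing child element, or create it on first use.
            child = node.find(tag)
            if child is None:
                child = ET.SubElement(node, tag)
            node = child
        node.text = value
    if root is None:
        raise ValueError("no fields have been mapped")
    return ET.tostring(root, encoding="unicode")


xml_text = build_invoice_xml(FIELD_MAP)
print(xml_text)
```

The later steps (embedding the XML in the PDF, converting the PDF to PDF/A-3 and adding the required metadata) need a PDF library and are not covered by this sketch.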

The tool is currently in beta testing and will be useful for organisations that receive or generate PDF invoices that must be converted to ZUGFeRD/FACTUR-X. For more information, or to apply to join the beta programme, please contact us.

Mimotek announces the release of Mimotek Structuriser 2.0

The second generation of Mimotek’s Structuriser software is now shipping. This upgrade builds on the success of the original version but offers a significant improvement in productivity, both through its feature set and its performance. The key changes are: Mimotek Structuriser Server 2.0 shows a significant increase in throughput compared with version 1. Mimotek Structuriser…

Structured PDF or Tagged PDF

‘Tagged PDF’ and ‘Structured PDF’ are both terms that describe flavours of PDF that not only allow a digital document to be displayed and/or printed, but also allow meaningful content to be reliably extracted. The details of the two definitions are similar in that they both require (amongst other things) that fonts are embedded, characters…

PDF validation

The validation of PDF files was a major theme of the recent PDF Association technical conferences. While there are organisational and political issues to be solved (involving who would provide the tools and what guarantees could be given as to their accuracy), it would clearly be valuable if those processing PDF files could have access to…