Is it safe to remove XFA?
Are there any issues that can come up from removing the XFA format from a PDF form?
If I don't drop XFA, I can see the pre-filled fields in Acrobat Reader but not in Acrobat Pro. Other viewers, such as Ubuntu Document Viewer, present the file correctly. I don't mind dropping XFA, but I want to check whether there are issues I'm not aware of.
Posted on StackOverflow on Apr 14, 2015 by user3501223
There are three types of forms in PDF:
Forms using AcroForm technology. In this case, each field corresponds with one or more widgets with fixed positions on specific pages. The form is described using nothing but PDF syntax.
Dynamic forms using the XML Forms Architecture (XFA). In this case, the PDF file is nothing but a container for an XML file that describes the whole form. We refer to this as dynamic XFA, because the form can expand or shrink based on the data that is added: a 1-page form can turn into a 100-page form by adding more data.
Hybrid forms that combine AcroForm and XFA technology. In this case, the form is described twice: once using PDF objects; once using XML. Obviously, such a form is not dynamic: the AcroForm part still defines widget annotations that are defined at absolute positions on specific pages. The form can't adapt to its data.
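As a rough illustration of the three categories above, the following Python sketch classifies a form from two facts: whether its form dictionary has an XFA entry and whether it defines any AcroForm fields. The plain-dict model of the form dictionary and the name classify_form are assumptions made for this example, not part of any PDF library, and the "XFA present but no fields" test is only a heuristic for spotting dynamic XFA.

```python
def classify_form(acroform):
    """Classify a form dictionary modeled as a plain dict (or None if absent)."""
    if acroform is None:
        return "no form"
    has_xfa = "/XFA" in acroform
    has_fields = len(acroform.get("/Fields", [])) > 0
    if has_xfa and has_fields:
        return "hybrid"        # described twice: PDF objects and XML
    if has_xfa:
        return "dynamic XFA"   # the PDF is only a container for the XML
    if has_fields:
        return "AcroForm"      # described using nothing but PDF syntax
    return "no form"


print(classify_form({"/Fields": ["name"], "/XFA": "<xdp>...</xdp>"}))
print(classify_form({"/Fields": [], "/XFA": "<xdp>...</xdp>"}))
print(classify_form({"/Fields": ["name"]}))
```

In a real file you would read these two facts with a PDF library rather than build the dict by hand.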
If you have a dynamic XFA form, dropping the XML will remove the complete form. There won't be anything left.
However, it seems that you are confronted with a hybrid form that consists of both AcroForm and XFA syntax. Hybrid forms are a pain because they often lead to confusion. For instance, a viewer that is not XFA-aware will show you the data as stored in the AcroForm, whereas a viewer that is XFA-aware can give preference to the data as stored in the XFA form. What's the problem, you might ask? Aren't both forms equivalent?
Ideally, both versions of the form are indeed equivalent, but:
If the form isn't filled out correctly, the AcroForm can be different from the XFA form.
XFA has more functionality than AcroForm technology. For instance, a text field in an XFA form can be justified (similar to <p align="justify"> in HTML). This option doesn't exist in an AcroForm text field, which only supports left, center, or right alignment. Hence, if text is justified in the XFA form but you only look at the AcroForm, the text won't be justified.
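To make the alignment mismatch concrete, here is a small Python sketch. It assumes the AcroForm alignment is stored as the quadding value (0 for left, 1 for center, 2 for right) and that the XFA side uses alignment names such as "justify". The function name acroform_quadding is hypothetical, and returning None for "no equivalent" is a choice made for this example; what a given tool actually falls back to is not specified here.

```python
# AcroForm text fields only know three alignments (quadding values).
ACROFORM_QUADDING = {"left": 0, "center": 1, "right": 2}


def acroform_quadding(xfa_h_align):
    """Return the AcroForm quadding for an XFA alignment, or None if there is no equivalent."""
    return ACROFORM_QUADDING.get(xfa_h_align)


for align in ("left", "center", "right", "justify"):
    print(align, "->", acroform_quadding(align))
```

The "justify" case is the one that gets lost when only the AcroForm part of a hybrid form is kept.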
This is a long answer to explain that, if you have a hybrid form, it is in most cases OK to throw away the XFA part. You may have small differences, but if you are OK with what the form looks like in Ubuntu Document Viewer (a viewer that doesn't support XFA), then you should be fine.
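The effect of dropping XFA in both situations can be sketched as follows. This is only a model: the form dictionary is represented as a plain Python dict, and remove_xfa is a hypothetical helper, not a library call. A real PDF library would perform the equivalent change on the actual file.

```python
def remove_xfa(acroform):
    """Return a copy of the modeled form dictionary without its /XFA entry."""
    stripped = dict(acroform)
    stripped.pop("/XFA", None)  # no error if the form has no XFA part
    return stripped


# Hybrid form: the fields are also described as PDF objects, so they survive.
hybrid = {"/Fields": ["name", "email"], "/XFA": "<xdp>...</xdp>"}
# Dynamic XFA form: the XML was the whole form, so nothing is left.
dynamic = {"/Fields": [], "/XFA": "<xdp>...</xdp>"}

print(remove_xfa(hybrid))   # {'/Fields': ['name', 'email']}
print(remove_xfa(dynamic))  # {'/Fields': []}
```

The helper copies the dict instead of mutating it, so the original model stays available for comparison.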
See "Is it safe to remove XFA?" if you want to see how to answer this question in iText 5.
Creating XFA forms required Adobe LiveCycle Designer; however, Adobe LiveCycle was discontinued in March 2018. In addition, XFA forms were deprecated with the publication of the ISO PDF 2.0 specification in 2017.
We developed the pdfXFA add-on for iText Core to assist with existing XFA workflows. However, for new workflows we recommend alternative solutions such as Fluent, which offers dynamic data-driven document generation in a wide range of document formats, including PDF, PDF/A and PDF/UA.