DocumentUltimate doesn't produce correct ligatures for Persian Farsi when converting to PDF
Problem reported by Allen Drennan - 1/5/2022 at 7:56 AM
Resolved
For some reason, the DocumentUltimate conversion doesn't generate correct ligatures in the resulting PDF for some of the languages we have tested, such as Persian Farsi.

However, when you export the PPTX from Microsoft PowerPoint to a PDF, the ligatures are correct in the resulting PDF. Has anyone else experienced this issue, and is there a workaround?


We have the "distribution" license for DocumentUltimate 4.x and upgraded to 5.x, hoping that the issue would be resolved. We also tested with the 6.x trial, and the issue still exists.

6 Replies

Allen Drennan Replied
Some more details on this issue:
1. It works for Arabic, but not Persian Farsi.
2. Word documents convert to PDF correctly, PowerPoints convert incorrectly.

An example word in Persian Farsi (meaning "Cost"):

هزينه

When converted to PDF in GleamTech's DocumentUltimate, the word is displayed incorrectly. When converted to PDF using PowerPoint's Export to PDF feature, it displays correctly. In the above example, the issue is the handling of the Persian letter "Yeh". If we deconstruct the PDFs created by DocumentUltimate for both Word and PowerPoint, we see the following:

The Word output used the following Unicode codepoints for the word for “Cost”:
FEEA: Arabic Letter Heh Final Form
FEE8: Arabic Letter Noon Medial Form
FBFE: Arabic Letter Farsi Yeh Initial Form
FEB0: Arabic Letter Zain Final Form
FEEB: Arabic Letter Heh Initial Form

While the PowerPoint output is:
FEEA: Arabic Letter Heh Final Form
FEE8: Arabic Letter Noon Medial Form
06CC: Arabic Letter Farsi Yeh
FEB0: Arabic Letter Zain Final Form
FEEB: Arabic Letter Heh Initial Form

The only difference is in the 3rd codepoint, where PowerPoint uses U+06CC (Arabic Letter Farsi Yeh) and Word uses U+FBFE (Arabic Letter Farsi Yeh Initial Form).
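
The comparison above can be reproduced with a short script. This is only a sketch, assuming Python's standard `unicodedata` module; the two lists are copied from the dumps above, and the helper names (`describe`, `differences`, `base_letters`) are illustrative, not part of any DocumentUltimate API.

```python
import unicodedata

# Codepoints copied from the two dumps above, in the order listed there.
word_output = [0xFEEA, 0xFEE8, 0xFBFE, 0xFEB0, 0xFEEB]
powerpoint_output = [0xFEEA, 0xFEE8, 0x06CC, 0xFEB0, 0xFEEB]


def describe(cp):
    """Format a codepoint as 'U+XXXX NAME' using the Unicode database."""
    return f"U+{cp:04X} {unicodedata.name(chr(cp))}"


def differences(a, b):
    """Return (index, description_a, description_b) for each mismatching position."""
    return [
        (i, describe(x), describe(y))
        for i, (x, y) in enumerate(zip(a, b))
        if x != y
    ]


def base_letters(codepoints):
    """Fold presentation forms back to their base letters via NFKC."""
    text = "".join(chr(cp) for cp in codepoints)
    return [ord(ch) for ch in unicodedata.normalize("NFKC", text)]


diffs = differences(word_output, powerpoint_output)
for index, word_cp, ppt_cp in diffs:
    print(f"position {index}: Word = {word_cp} | PowerPoint = {ppt_cp}")

# Both sequences reduce to the same base letters, so only the positional
# form of the Yeh differs between the two outputs.
print(base_letters(word_output) == base_letters(powerpoint_output))
```

If both sequences fold to the same base letters, the underlying text is identical and only the positional form chosen for the Yeh differs. That would be consistent with the letter not being shaped in the PowerPoint conversion path, although the script alone cannot confirm where in the pipeline that happens.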

There are other Persian letters that are not handled properly.  I have attached an example PowerPoint slide that shows the issues.
slide03ppt.pptx
slide03pdf_gleamtech.pdf
slide03pdf_powerpoint.pdf
Cem Alacayir Replied
Employee Post
This could be a font issue. You installed Microsoft Office on your desktop computer, so you have a wide range of new fonts. However, on a Windows Server where you run DocumentUltimate, these fonts will be missing.
In that case, DocumentUltimate will try to substitute the closest available font, which may not be a good match for the Persian language.

For example:
“Arial Unicode MS” may not exist on a Windows Server out of the box.
I guess it’s installed only when MS Office is installed:
https://docs.microsoft.com/en-us/typography/font-list/arial-unicode-ms
Check the list of fonts in C:\Windows\Fonts on your server.
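
One quick way to run that check is a small script. This is a minimal sketch, assuming Python is available on the server; `list_font_files` is a hypothetical helper, and it matches on file names only, which do not always match the font family name shown in applications.

```python
from pathlib import Path

# File extensions treated as font files in this sketch (an assumption,
# not an exhaustive list).
FONT_EXTENSIONS = {".ttf", ".ttc", ".otf", ".fon"}


def list_font_files(font_dir, name_contains=""):
    """Return sorted font file names in font_dir whose name contains the filter."""
    root = Path(font_dir)
    if not root.is_dir():
        return []  # directory missing, e.g. when run on a non-Windows machine
    needle = name_contains.lower()
    return sorted(
        p.name
        for p in root.iterdir()
        if p.is_file()
        and p.suffix.lower() in FONT_EXTENSIONS
        and needle in p.name.lower()
    )


# List every font file whose name contains "arial" (case-insensitive).
for name in list_font_files(r"C:\Windows\Fonts", "arial"):
    print(name)
```

Running the same script on the desktop and on the server and comparing the two listings shows which font files exist on only one of the machines.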
Allen Drennan Replied
It also happens locally on machines with Microsoft Office installed.  Please look over the information we provided.
Cem Alacayir Replied
Employee Post Marked As Resolution
FYI, this issue is now fixed in Version 6.1.0 - February 22, 2021:

  • Fixed: Incorrect ligatures for Persian Farsi when converting PPTX to PDF.

Allen Drennan Replied
We appreciate you looking into this issue. We retested with the newest 6.1 version, and unfortunately the issue with Persian Farsi ligature conversion remains. This is only a PowerPoint conversion issue; the conversion for Word documents is correct. Please see the attached files. The PDF is the DocumentUltimate output of the PowerPoint.

Persian test.docx
Allen Drennan Replied
Any update on this issue? It still exists in the latest version and in the online convert demo. If you simply take the PPTX and convert it, you get a different result. Word conversion works; PowerPoint conversion does not. See Persian test.pptx.
