[build2] Official build2 Fedora packages
Matthew Krupcale
mkrupcale at matthewkrupcale.com
Mon Jun 24 14:53:08 UTC 2019
On Mon, Jun 24, 2019 at 9:04 AM Boris Kolpackov <boris at codesynthesis.com> wrote:
>
> I dug a bit deeper into this and I don't believe converting PS to UTF-8
> with iconv(1) is the correct thing to do: PostScript has a set of
> encodings and its ISO-8859-1 equivalent is called "Latin 1"[1].
Thanks for digging deeper into that. Maybe then no encoding
conversions should be performed at all. The currently shipped
PostScript does appear to be correctly encoded ISO-8859-1.
I don't recall whether it was correct before the directory tree
listing characters were changed from UTF-8 to ASCII. The title page
appears correct since the doc/doc.html2ps config file is already ASCII
and uses HTML character references (e.g. for ©). The reason I'm not
sure about the directory tree listings is that html2ps didn't
specify[1] the HTML input file encoding when opening[2] the file, so
it may have just been reading the raw bytes and thus not generating
correctly encoded ISO-8859-1 PostScript previously.
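To illustrate what I mean by reading the raw bytes, here is a rough
Python sketch. It is not html2ps's actual Perl code, and the sample
string and variable names are made up for illustration. It assumes
that reading without a declared encoding amounts to treating each byte
as one character:

```python
# Hypothetical sample: one line of a directory tree listing that uses
# UTF-8 box-drawing characters, as the CLI docs previously did.
tree_line = "\u251c\u2500\u2500 doc/"  # U+251C U+2500 U+2500, then " doc/"

utf8_bytes = tree_line.encode("utf-8")

# Assumption: reading the file without declaring its encoding treats
# each byte as one character, which is what decoding as ISO-8859-1 does.
as_raw_bytes = utf8_bytes.decode("iso-8859-1")

# Decoding with the declared encoding recovers the original text.
as_utf8 = utf8_bytes.decode("utf-8")

# Box-drawing characters have no ISO-8859-1 equivalent at all.
try:
    tree_line.encode("iso-8859-1")
    representable = True
except UnicodeEncodeError:
    representable = False

print(len(tree_line), len(as_raw_bytes), as_utf8 == tree_line, representable)
```

Each box-drawing character occupies three bytes in UTF-8, so the
byte-wise reading yields more characters than the original line, and
none of them are the intended glyphs.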
> If you grep the .ps file you will see a few references to Latin1.
Ah, yes that's probably important :).
> Also, after doing the conversion the result certainly looks off for me when viewed
> with gv(1).
Yeah, I can reproduce this on the converted PostScript as well,
whereas the original latin1 PostScript looks correct.
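My guess at why the converted file looks off, sketched in Python. The
sample string and the render_as_latin1 helper are hypothetical, and
the byte-per-glyph model of a Latin 1 encoded PostScript font is a
simplifying assumption rather than what gv(1) literally does:

```python
# Hypothetical sample: a title-page string containing the copyright sign.
text = "Copyright \u00a9 build2"

# What the original, correctly encoded .ps would contain for this string.
latin1_bytes = text.encode("iso-8859-1")

# Equivalent of: iconv -f ISO-8859-1 -t UTF-8
converted_bytes = latin1_bytes.decode("iso-8859-1").encode("utf-8")


def render_as_latin1(data: bytes) -> str:
    """Model a byte-per-glyph Latin 1 font encoding (an assumption)."""
    return data.decode("iso-8859-1")


original_view = render_as_latin1(latin1_bytes)
garbled = render_as_latin1(converted_bytes)

print(latin1_bytes)     # the copyright sign is the single byte 0xA9
print(converted_bytes)  # after conversion it is the two bytes 0xC2 0xA9
print(ascii(garbled))   # an extra U+00C2 now precedes the copyright sign
```

Under that assumption every non-ASCII character gains a stray leading
glyph once the file has been converted, while the unconverted file
renders as intended.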
> I also don't see why we need to change the encoding of the .ps files
> since (after fixing up the tree output) they only use characters from
> ISO-8859-1. Is Fedora requiring all the documentation to be in UTF-8?
This was another warning from the rpmlint(1) process and was
specifically mentioned in the review process. I actually couldn't find
any reference to such a requirement in the official packaging
documentation, however, and I was really only doing this to get past
the review. I personally don't really work with PostScript except EPS
for graphics and tend to just use the PDF.
Sorry for all the trouble with this then. But if it is desirable to
include UTF-8 characters in the CLI docs (and thereby in the generated
HTML), as was previously done for the directory tree listings, this
may still be an issue when using html2ps since the encoding is not
specified when opening the HTML file. When converting UTF-8 CLI docs
to LaTeX you need to specify the input encoding and make sure that the
Unicode characters are declared with e.g. \DeclareUnicodeCharacter (as
done in the pmboxdraw package for directory tree listings).
Best,
Matthew
[1] https://perldoc.perl.org/functions/open.html
[2] https://github.com/sveinbjornt/sagadb.org/blob/d951bf8c02664be807d33eb7ee02cccf66a7149b/html2ps/html2ps#L2289