Fixed map from {\'{E}} to É rather than Έ #12

fuhrmanator · 2017-06-16T04:30:55Z

No description provided.

pcooksey

If you search for the string, there are actually two "\'{E}" entries, and the first one already has the value "\u00C9". So I think removing the second one, rather than changing it, is better. I looked online, and it seems that Unicode \u0388 and \u00c9 both produce the same LaTeX (https://www.johndcook.com/unicode_latex.html).
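To illustrate why a repeated key matters, here is a minimal Python sketch. The `PAIRS` excerpt and the `find_duplicate_keys` helper are hypothetical and only stand in for the project's table; the point is that in a plain key/value table the entry listed last usually wins without any warning.

```python
from collections import defaultdict

# Hypothetical excerpt of a LaTeX-to-Unicode table (not the project's code),
# kept as pairs so that repeated keys can still be inspected.
PAIRS = [
    ("\\'{E}", "\u00c9"),  # Latin capital E with acute
    ("\\'{I}", "\u00cd"),  # Latin capital I with acute
    ("\\'{E}", "\u0388"),  # Greek capital epsilon with tonos (the duplicate)
]


def find_duplicate_keys(pairs):
    """Return {key: [values, ...]} for every key listed more than once."""
    seen = defaultdict(list)
    for key, value in pairs:
        seen[key].append(value)
    return {key: values for key, values in seen.items() if len(values) > 1}


duplicates = find_duplicate_keys(PAIRS)

# Building a dict from the pairs keeps only the last value for each key.
lookup = dict(PAIRS)
```

Under these assumptions, `duplicates` reports both values recorded for `\'{E}`, while `lookup` keeps only the Greek one, which would match the symptom of É rendering as Έ.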

This reverts commit 58ef266.

Based on the translation for Ί U+038A to \'{}{I}, I added a set of empty {} after the \' for several capital Greek (with tonos) characters, such as Ό U+038C and Έ U+0388. Note that these LaTeX commands won't produce the true Unicode character, but rather an apostrophe followed by the capital letter. Not being an expert in Greek, I'm not sure how likely these encodings are to appear in a BibTeX file.
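As a rough sketch of the effect of that change, the Python snippet below checks that keys with the empty group no longer collide with the Latin accent key. Both dictionaries are hypothetical excerpts, and the `\'{}{O}` spelling for Ό is an assumption that simply follows the `\'{}{I}` pattern.

```python
import unicodedata

# Hypothetical entries following the \'{}{I} convention (assumed spellings).
GREEK_TONOS = {
    "\\'{}{E}": "\u0388",  # Greek capital epsilon with tonos
    "\\'{}{I}": "\u038a",  # Greek capital iota with tonos
    "\\'{}{O}": "\u038c",  # Greek capital omicron with tonos
}

# Hypothetical Latin entry that previously shared its key with the Greek one.
LATIN_ACUTE = {"\\'{E}": "\u00c9"}  # Latin capital E with acute

# With the empty group in the Greek keys, the two key sets do not overlap.
overlap = set(GREEK_TONOS) & set(LATIN_ACUTE)

# Official Unicode names make it easy to confirm which character each key maps to.
names = {key: unicodedata.name(char) for key, char in GREEK_TONOS.items()}
```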

fuhrmanator · 2017-06-22T04:16:03Z

I changed some things regarding the duplicates.

BibTeX's encoding of accents is technically quite limited. The Unicode-LaTeX translation site you linked seems ambitious, and likely beyond the scope of BibTeX (the original encodings date from 1988!).

If the goal is to take a raw BibTeX file and convert it to a web page, it might make sense to assume very little is going on in the BibTeX. Currently, latex_to_unicode is ambitious, e.g., \Iota won't compile in a basic LaTeX document (maybe if you've loaded the right package?). \alpha exists in basic LaTeX, but \Alpha does not (since LaTeX expects you to use a capital Roman A). To see more about Greek letters in LaTeX, read here.
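One way to picture the "assume very little" approach is a decoder that only understands a handful of classic accent commands and leaves everything else alone. The Python sketch below is an illustration under that assumption, not the project's latex_to_unicode; the `COMBINING` table and `decode_accents` are hypothetical names, and only the braced form `\'{E}` is handled.

```python
import re
import unicodedata

# Combining marks for a few classic accent commands (illustrative subset).
COMBINING = {
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "^": "\u0302",  # circumflex
    '"': "\u0308",  # diaeresis
    "~": "\u0303",  # tilde
}

# Matches an accent command applied to one braced ASCII letter, e.g. \'{E}.
ACCENT = re.compile(r"""\\(['`^"~])\{([A-Za-z])\}""")


def decode_accents(text):
    """Replace supported accent commands with precomposed characters."""
    def repl(match):
        mark = COMBINING[match.group(1)]
        # NFC composes letter + combining mark into one code point when one exists.
        return unicodedata.normalize("NFC", match.group(2) + mark)
    return ACCENT.sub(repl, text)
```

With this sketch, `decode_accents("\\'{E}")` yields É (U+00C9), while unsupported input such as `\Alpha` passes through unchanged, so anything outside the small table stays visible instead of being guessed.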

JabRef converts BibTeX files in Unicode to standard LaTeX encoding, so that might be a place to start. Lots of eyes have looked at https://github.com/JabRef/jabref/blob/master/src/main/java/org/jabref/logic/util/strings/HTMLUnicodeConversionMaps.java, for example. However, I see it has mappings to things like \Alpha as well, so maybe it's also trying to do more than basic LaTeX.

tobiasdiez · 2017-12-19T00:24:19Z

JabRef recently switched to latex2unicode, which covers more cases and handles exceptions better than our own conversion algorithm. Maybe the following map is helpful: https://github.com/tomtung/latex2unicode/blob/master/src/main/scala/com/github/tomtung/latex2unicode/helper/Escape.scala

Fixed map from {\'{E}} to É rather than Έ

f5c4401

fuhrmanator mentioned this pull request Jun 18, 2017

Some diacritics e.g., É not rendering in HTML properly #11

Open

pcooksey requested changes Jun 21, 2017

Cris Fuhrman added 3 commits June 21, 2017 22:42

Removed duplicate mapping for \'{E}

58ef266

Revert "Removed duplicate mapping for \'{E}"

818fecc

This reverts commit 58ef266.
