Fixed map from {\'{E}} to É rather than Έ #12

Open
wants to merge 4 commits into
base: master

fuhrmanator

No description provided.

Owner

@pcooksey pcooksey left a comment

If you search for the string, there are actually two entries for "\'{E}", and the first one already has the value "\u00C9". So I think removing the second one is better than changing it. I looked online, and it seems that Unicode \u0388 and \u00C9 both map to the same LaTeX (https://www.johndcook.com/unicode_latex.html).
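
A duplicate like this is easy to miss by eye. As a minimal sketch (the `PAIRS` list and `find_conflicts` helper are hypothetical and not part of this repository), the table can be written as key/value pairs and scanned for keys that map to more than one value. It assumes the real table behaves like a dictionary in which a later entry silently overrides an earlier one:

```python
from collections import defaultdict

# Hypothetical excerpt of a LaTeX-to-Unicode table, written as pairs so that
# repeated keys stay visible (a dict literal would silently keep the last one).
PAIRS = [
    ("\\'{E}", "\u00C9"),  # Latin capital E with acute
    ("\\'{E}", "\u0388"),  # Greek capital epsilon with tonos
    ("\\'{e}", "\u00E9"),  # Latin small e with acute
]


def find_conflicts(pairs):
    """Return {key: [values...]} for every key mapped to more than one value."""
    seen = defaultdict(list)
    for key, value in pairs:
        if value not in seen[key]:
            seen[key].append(value)
    return {key: values for key, values in seen.items() if len(values) > 1}


conflicts = find_conflicts(PAIRS)
for key, values in conflicts.items():
    print(key, "maps to", [f"U+{ord(v):04X}" for v in values])
```

Running a check of this kind before editing the table shows which of the two entries actually takes effect and which one is dead.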

Cris Fuhrman added 3 commits June 21, 2017 22:42
Based on the translation for Ί U+038A to \'{}{I}, I added a set of empty
{} after the \' for several capital Greek (with tonos) characters, such
as Ό U+038C and Έ U+0388. Note that these LaTeX commands won't produce
the true Unicode character, but rather an apostrophe followed by the
capital letter. Not being an expert in Greek, I'm not sure how likely
these encodings are to appear in a BibTeX file.
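
The convention described in this commit message can be sketched as follows. The table and the `convert` helper are hypothetical; only the `\'{}{I}` spelling comes from the message itself, and the keys for Έ and Ό are assumed to follow the same pattern:

```python
# Hypothetical table following the convention in the commit message:
# an empty {} after \' marks the Greek capital with tonos, so the plain
# \'{E} form stays free for the Latin letter.
LATEX_TO_UNICODE = {
    "\\'{E}": "\u00C9",    # Latin capital E with acute
    "\\'{}{E}": "\u0388",  # Greek capital epsilon with tonos (key assumed)
    "\\'{}{I}": "\u038A",  # Greek capital iota with tonos (from the message)
    "\\'{}{O}": "\u038C",  # Greek capital omicron with tonos (key assumed)
}


def convert(text, table=LATEX_TO_UNICODE):
    """Replace every known LaTeX sequence in text with its Unicode character."""
    # Longer keys go first so a short key can never match inside a longer one.
    for key in sorted(table, key=len, reverse=True):
        text = text.replace(key, table[key])
    return text


print(convert("\\'{E}cole"))   # Latin form
print(convert("\\'{}{E}na"))   # Greek form
```

With distinct keys, both characters can live in the same table without one overriding the other.
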
@fuhrmanator
Author

fuhrmanator commented Jun 22, 2017

I changed some things regarding the duplicates.

BibTeX's encoding of accents is technically very limited. The Unicode-LaTeX translation site you linked seems ambitious and is likely beyond the scope of BibTeX (the original encodings date from 1988!).

If the goal is to take a raw BibTeX file and convert it to a web page, it might make sense to assume that very little is going on in the BibTeX. Currently, latex_to_unicode is ambitious: for example, \Iota won't compile in a basic LaTeX document (maybe it would if you've loaded the right package?). \alpha exists in basic LaTeX, but \Alpha does not (since LaTeX expects you to use a capital Roman A). To see more about Greek letters in LaTeX, read here.
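
To illustrate what "assuming very little" could look like, here is a minimal sketch that handles only a handful of classic accent commands applied to a single ASCII letter and leaves everything else (such as \alpha or \Iota) untouched. The names `COMBINING`, `ACCENT`, and `accents_to_unicode` are hypothetical, and the choice of five accents is an assumption, not the actual scope of latex_to_unicode:

```python
import re
import unicodedata

# Combining marks for a deliberately small, assumed subset of accent commands.
COMBINING = {
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "^": "\u0302",  # circumflex
    '"': "\u0308",  # diaeresis
    "~": "\u0303",  # tilde
}

# Matches \'{E} and \'E style sequences with one ASCII letter as the argument.
ACCENT = re.compile(r"""\\(['`^"~])(?:\{([A-Za-z])\}|([A-Za-z]))""")


def accents_to_unicode(text):
    """Convert the supported accent commands; leave all other LaTeX as-is."""
    def repl(match):
        letter = match.group(2) or match.group(3)
        # Compose letter + combining mark into a single code point where possible.
        return unicodedata.normalize("NFC", letter + COMBINING[match.group(1)])
    return ACCENT.sub(repl, text)


print(accents_to_unicode("\\'{E}cole"))
print(accents_to_unicode('G\\"odel'))
print(accents_to_unicode("\\alpha"))
```

Because the letter is restricted to ASCII, this approach never produces Greek characters from an accent command, which sidesteps the É versus Έ ambiguity at the cost of covering fewer cases.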

JabRef converts BibTeX files in Unicode to standard LaTeX encoding, so that might be a place to start. Lots of eyes have looked at https://github.com/JabRef/jabref/blob/master/src/main/java/org/jabref/logic/util/strings/HTMLUnicodeConversionMaps.java, for example. However, I see it has mappings to things like \Alpha as well, so maybe it's also trying to do more than basic LaTeX.

@tobiasdiez

JabRef recently switched to latex2unicode, which covers more cases and handles exceptions better than our own conversion algorithm. Maybe the following map is helpful: https://github.com/tomtung/latex2unicode/blob/master/src/main/scala/com/github/tomtung/latex2unicode/helper/Escape.scala

3 participants