Bad codepoint range #423

ccleve · 2022-10-31T17:01:43Z

I'm trying to compile the exact example here:

I've copied unicode_categories.re into the test directory and attempted to compile the code using re2rust. I get this error:

tests/pipeline/tokenizers/re2c/unicode_categories.re:2:68: error: bad code point range: '0xF8 - 0x2C1'

What am I doing wrong?

Also: I see that unicode_categories.re is three years old at this point. Should it be regenerated with a more recent version of unicode?

skvadrik · 2022-10-31T17:11:46Z

> What am I doing wrong?

Can you provide your command line? Did you forget the --utf8 argument?

> Also: I see that unicode_categories.re is three years old at this point. Should it be regenerated with a more recent version of unicode?

Yes, it should. We even had a project for rewriting the generator (#235 (comment)) but that somehow got stuck.

ccleve · 2022-10-31T17:14:39Z

Yes, I did forget the --utf8 argument. Thank you, it works now.

I'll take a look at the regeneration code. Maybe I can help.
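
For anyone hitting the same error: as far as I understand it, without an encoding flag re2c assumes a one-byte code space (0 to 0xFF), so a range ending at 0x2C1 falls outside it, while --utf8 widens the code space to 0x10FFFF. Below is a rough sketch of that kind of bounds check. The `Encoding` enum and the `max_code_point` and `check_range` functions are hypothetical names made up for illustration; this is not re2c's actual implementation.

```rust
/// Hypothetical model of two input encodings; not re2c's real types.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Encoding {
    /// No encoding flag given: one-byte code space (assumed 0..=0xFF).
    Default8Bit,
    /// The --utf8 flag: full Unicode code space.
    Utf8,
}

/// Largest code point representable under the given encoding.
fn max_code_point(enc: Encoding) -> u32 {
    match enc {
        Encoding::Default8Bit => 0xFF,
        Encoding::Utf8 => 0x10FFFF,
    }
}

/// Reject a range that is inverted or whose upper bound exceeds the code space.
/// The message format imitates the error quoted in this thread.
fn check_range(lo: u32, hi: u32, enc: Encoding) -> Result<(), String> {
    if lo > hi || hi > max_code_point(enc) {
        Err(format!("bad code point range: '0x{:X} - 0x{:X}'", lo, hi))
    } else {
        Ok(())
    }
}

fn main() {
    // Without --utf8, the upper bound 0x2C1 is above 0xFF, so the range is rejected.
    println!("{:?}", check_range(0xF8, 0x2C1, Encoding::Default8Bit));
    // With --utf8, the same range fits in the code space.
    println!("{:?}", check_range(0xF8, 0x2C1, Encoding::Utf8));
}
```

Under these assumptions the range from the error message fails the check only in the default case, which matches what happened here.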

ccleve mentioned this issue Nov 2, 2022

Add script to regen unicode file #425

Closed

skvadrik closed this as completed Nov 11, 2022