Do we really need ASCII-only text output? #540

mpusz · 2024-01-04T12:21:20Z

Besides the Unicode text output, mp-units provides the ability to output ASCII-only text as well.

Standardizing such ASCII-only text output will be hard, as the ISO and SI standards do not specify alternative ASCII spellings for symbols that use non-ASCII characters. This means we would have to guess and pick arbitrary replacements. Moreover, this complicates the design (e.g., it requires an additional unit_symbol class template that stores two fixed_string objects).

Please let us know if you have issues with removing support for ASCII-only output and what is the rationale for keeping it.


JohelEGP · 2024-01-04T14:26:09Z

We can follow [time.duration.io]:

(1.5)
Otherwise, if Period::type is micro, it is implementation-defined whether units-suffix is "μs" ("\u00b5\u0073") or "us".

mpusz · 2024-01-04T14:38:59Z

Yes, we could, but I do not think that is a good idea. For chrono it was one exception case. For our library there are plenty of cases like that.

mpusz · 2024-01-04T14:39:10Z

See: https://github.com/sg16-unicode/sg16-meetings#november-29th-2023.

JohelEGP · 2024-01-04T14:45:45Z

Our support for ASCII can be one exception case in the specification.
Rather than specifying how each string representing a dimension, unit, and eventually quantity, maps to ASCII,
just specify that the format specifier for ASCII does an implementation-defined mapping of the Unicode equivalent.

mpusz · 2024-01-04T14:54:20Z

I think that is not an option. The alternative symbol for each Unicode sign has to be explicitly provided so that text logs from one application can then be read as input by another (see #541).

JohelEGP · 2024-01-04T15:02:08Z

Does it?
What do scnlib or WG21 say about round-tripping the one case in std::chrono?

tahonermann · 2024-01-06T18:28:45Z

From a standardization perspective, symbols that utilize only characters from the basic literal character set are required since the complete set of Unicode characters is not supported by all character encodings allowed by the C++ standard. I think the question posed in this issue is therefore misguided.

I believe the desired design is for a unit specification to have a preferred symbol selected from all of the characters available in the Unicode standard as well as a fallback symbol selected from the basic literal character set ([lex.charset]p7). By default, the preferred symbol would be used if the target encoding supports the full range of Unicode characters and the fallback symbol used otherwise. For those that wish to restrict output to ASCII-only, an option should be provided to use the fallback symbol in cases where the preferred symbol could otherwise be used but is not desired.
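The selection rule described above can be illustrated with a minimal sketch. All names here (`symbol_sketch`, `select_symbol`, the two example symbols) are hypothetical and come neither from mp-units nor from D3045R0; the preferred spellings are written as explicit UTF-8 bytes, and the fallback spellings are illustrative choices only.

```cpp
#include <string_view>

// Hypothetical type: one symbol with two spellings.
struct symbol_sketch {
  std::string_view preferred;  // may use any Unicode character (UTF-8 bytes here)
  std::string_view fallback;   // restricted to the basic literal character set
};

// Use the preferred spelling only when the target encoding can represent it
// and the user has not explicitly asked for the fallback.
constexpr std::string_view select_symbol(const symbol_sketch& s,
                                         bool encoding_supports_unicode,
                                         bool force_fallback = false)
{
  return (encoding_supports_unicode && !force_fallback) ? s.preferred : s.fallback;
}

// Illustrative symbols; the fallback spellings are assumptions.
inline constexpr symbol_sketch ohm_symbol{"\xCE\xA9", "ohm"};          // U+03A9
inline constexpr symbol_sketch microsecond_symbol{"\xC2\xB5s", "us"};  // U+00B5 + 's'
```

With this shape, `select_symbol(ohm_symbol, true, true)` models the explicit ASCII-only request, while `select_symbol(ohm_symbol, false)` models a target encoding that cannot represent the preferred spelling.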

mpusz · 2024-01-07T10:29:38Z

Exactly! I tried to form a question so that most C++ developers would understand it. I believe that most have heard about ASCII but may have no clue what "basic literal character set" means 😉

Anyway, the main question remains. Do we want to limit the implementation to Unicode characters only, or do we also want to provide a fallback option? Having both complicates the design and potential support for text input in the future, but may be required by some users, and I would love to hear about such cases.

mpusz · 2024-01-07T10:32:58Z

@ChrisRyan98008 stated on LinkedIn:

... from a general engineering opinion I would like to keep the ascii version. I could foresee uses for it. It is just sometimes too hard to type special unicode characters so I presume it would maintain symmetry with that input method.

mpusz · 2024-01-07T10:35:07Z

@ChrisRyan98008 also suggested:

Maybe you could just do the unicode output but with a units translations output utility layer to ascii. Maybe this would open up the translation output option for other formats like LaTeX.

For now, we do not plan to provide a translation layer for text output, but a user could probably do something on their own to implement it. Please let us know in case someone has a good idea of how to incorporate such a feature into the framework.
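A user-side translation layer of the kind suggested above might look roughly like the following sketch. It is not part of mp-units: the function name `to_ascii`, the `replacement` struct, and the tiny mapping table are all hypothetical, and the chosen ASCII spellings are illustrative. The input is assumed to be UTF-8.

```cpp
#include <string>
#include <string_view>

// Hypothetical mapping entry: a UTF-8 byte sequence and its ASCII spelling.
struct replacement {
  std::string_view from;
  std::string_view to;
};

// Deliberately tiny, illustrative table; a real layer would need many more entries.
inline constexpr replacement ascii_replacements[] = {
  {"\xC2\xB5", "u"},    // U+00B5 MICRO SIGN
  {"\xCE\xA9", "ohm"},  // U+03A9 GREEK CAPITAL LETTER OMEGA
  {"\xC2\xB2", "^2"},   // U+00B2 SUPERSCRIPT TWO
};

// Post-processes already formatted UTF-8 text, replacing known sequences.
// Bytes without a table entry are copied through unchanged.
std::string to_ascii(std::string_view text)
{
  std::string out;
  out.reserve(text.size());
  while (!text.empty()) {
    bool replaced = false;
    for (const auto& r : ascii_replacements) {
      if (text.substr(0, r.from.size()) == r.from) {
        out.append(r.to);
        text.remove_prefix(r.from.size());
        replaced = true;
        break;
      }
    }
    if (!replaced) {
      out.push_back(text.front());
      text.remove_prefix(1);
    }
  }
  return out;
}
```

Because the function works on finished text rather than on unit definitions, the same approach could in principle target other output formats by swapping the table, at the cost of losing any guarantee that the result can be parsed back.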

tahonermann · 2024-01-07T17:14:00Z

Anyway, the main question remains. Do we want to limit the implementation to Unicode characters only, or do we also want to provide a fallback option?

A fallback symbol is required for standardization since there is no guarantee that characters outside the basic literal character set are representable at all. That fallback symbol is needed regardless of whether the proposed std::format grammar includes an option to explicitly opt-out of use of symbols that potentially include characters from outside the basic literal character set.

The question to be posed is, is the units-text-encoding grammar option currently present in the D3045R0 draft needed or does it suffice for the implementation to determine on its own when to use the fallback symbol. The responses so far suggest that the grammar option would be used and appreciated. I don't see a reason not to provide that option.

kwikius · 2024-02-07T11:50:18Z

Exactly! I tried to form a question so that most C++ developers would understand it. I believe that most have heard about ASCII but may have no clue what "basic literal character set" means 😉

Anyway, the main question remains. Do we want to limit the implementation to Unicode characters only, or do we also want to provide a fallback option? Having both complicates the design and potential support for text input in the future, but may be required by some users, and I would love to hear about such cases.

Use case: I use my quantities library on an 8-bit MCU (ATmega328), e.g. https://github.com/kwikius/ultrasonic_wind_sensor/blob/master/libraries/UltrasonicWindSensor/wind_sensor_impl.cpp. For that type of use, the serial port is often used for output with ASCII text.

mpusz · 2024-02-28T07:57:35Z

Based on the feedback we got, we decided to keep ASCII-only text output.

mpusz added the "help wanted", "question", and "design" labels on Jan 4, 2024

mpusz closed this as completed Feb 28, 2024
