Allow process.detect_types to match last type instead of the first #34
Comments
Interesting.... so by default, Lines 278 to 283 in 524b2fd
If you have 17 floats with The general issue is, "what to do anytime content matches a higher Lines 264 to 266 in 524b2fd
Lines 160 to 169 in 524b2fd
E.g.,
>>> records = it.cycle(
... [
... {"int": 1, "float": "1.0"},
... {"int": 1, "float": "10.00"},
... {"int": 0, "float": "0.0"}
... ]
... )
>>> detect_types(records)[1]['types']
[{'id': 'int', 'type': 'bool'}, {'id': 'float', 'type': 'int'}]
>>> records = it.cycle(
... [
... {"int": 1, "float": "1.0"},
... {"int": 1, "float": "10.00"},
... {"int": 0, "float": "0.0"},
... {"int": 2, "float": "0.1"}
... ]
... )
>>> detect_types(records)[1]['types']
[{'id': 'int', 'type': 'int'}, {'id': 'float', 'type': 'float'}]
One potential solution is an upcast option: detect_types(records, upcast=True)[1]['types']
Float values with zero as the fractional component, e.g. '0.0', '0.1', '1.00', are detected as int instead of float. This is because they can be parsed as int according to fntools.is_int. Although the data could be interpreted as an integer, given that the source has a decimal place I would argue that detect_types should not perform casting to an integer. For example, database reports may include data from float/decimal columns which, just by chance, have no fractional component; this doesn't mean they should not be treated as floats.
Example Test Case
Fails with:
Potential Solutions
An option on detect_types that is passed down to is_int, changing the behaviour from "can this be parsed as an int" to "this is definitely an int".
I am happy to implement the changes required after a decision is made on the correct behaviour 😄
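To make the difference between the two behaviours concrete, here is a minimal sketch. It does not use the library's real code: is_int_lenient, is_int_strict, detect_type and the strict flag are all hypothetical names invented for illustration, and the lenient check is only an approximation of how fntools.is_int appears to behave from the report above.

```python
def is_int_lenient(content):
    """Hypothetical "can this be parsed as an int?" check: '1.0' passes."""
    try:
        return float(content).is_integer()
    except (TypeError, ValueError):
        return False


def is_int_strict(content):
    """Hypothetical "this is definitely an int" check: '1.0' is rejected."""
    if isinstance(content, bool):
        return False
    if isinstance(content, int):
        return True
    if isinstance(content, str):
        text = content.strip()
        if text and text[0] in "+-":
            text = text[1:]
        # Only plain ASCII digits count; a decimal point disqualifies the value.
        return text.isascii() and text.isdigit()
    return False


def is_float(content):
    """True if the value can be parsed as a float at all."""
    try:
        float(content)
        return True
    except (TypeError, ValueError):
        return False


def detect_type(values, strict=False):
    """Hypothetical single-column detector; `strict` is an assumed flag name."""
    check_int = is_int_strict if strict else is_int_lenient
    if all(check_int(v) for v in values):
        return "int"
    if all(is_float(v) for v in values):
        return "float"
    return "text"


column = ["1.0", "10.00", "0.0"]
print(detect_type(column))               # lenient: treated as int
print(detect_type(column, strict=True))  # strict: treated as float
```

With the strict check, a column whose source text carries a decimal point stays a float even when every fractional part happens to be zero, while plain digit strings such as "1" are still reported as int.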