Detection breaks on good file #99

Open
hyperknot opened this issue Apr 26, 2023 · 1 comment

@hyperknot

The following file is from Google Sheets. One column contains Markdown-formatted multiline text. The problem is that CleverCSV detects the wrong dialect for this file, which breaks parsing.

Essentially, it selects a very odd star (*) based delimiter, which then breaks the whole file.

Running normal form detection ...
Not normal, has potential escapechar.
Running data consistency measure ...
SimpleDialect(',', '', ''):	P =       14.309419	T =        0.672613	Q =        9.624698
SimpleDialect(',', '', '/'):	P =       14.268794	T =        0.615974	Q =        8.789203
SimpleDialect(',', '"', ''):	P =       37.647059	T =        0.942647	Q =       35.487889
SimpleDialect(',', '"', '/'):	P =       18.751838	skip.
SimpleDialect('', '', ''):	P =        0.313000	skip.
SimpleDialect('', '"', ''):	P =        0.040000	skip.
SimpleDialect(' ', '', ''):	P =       45.500250	T =        0.332927	Q =       15.148254
SimpleDialect(' ', '"', ''):	P =       13.000500	skip.
SimpleDialect('#', '', ''):	P =       26.065333	skip.
SimpleDialect('#', '"', ''):	P =        0.040000	skip.
SimpleDialect('*', '', ''):	P =       93.639500	T =        0.843074	Q =       78.945071
SimpleDialect('*', '"', ''):	P =        0.040000	skip.
SimpleDialect('-', '', ''):	P =       39.078500	skip.
SimpleDialect('-', '"', ''):	P =        0.040000	skip.
SimpleDialect(':', '', ''):	P =       21.732000	skip.
SimpleDialect(':', '"', ''):	P =        9.750500	skip.
SimpleDialect('_', '', ''):	P =        0.406000	skip.
SimpleDialect('_', '"', ''):	P =        0.269500	skip.

CSV file attached.
csv_good_dialect_star.csv

Link to Google Sheets: https://docs.google.com/spreadsheets/d/1pbU8Fe0h-NvCc5Cxxbg_nonJgYZB4mHdHNsrmva57CE/edit?usp=sharing

@ws-garcia

In this case, @hyperknot, I suggest providing a list of candidate delimiters that excludes the star character, so that the star-based dialects are never considered.
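
To illustrate what excluding the star would change, here is a minimal sketch that re-ranks the scores from the log above. It only parses the printed log and does not call CleverCSV; the names `parse`, `best_scored`, and `unresolved` are made up for this example. It relies on two assumptions read off the log rather than confirmed in this thread: Q appears to be the product P × T, and a candidate seems to be marked `skip.` when its P is already below the best Q seen so far.

```python
import ast
import re

# Rows copied from the verbose log in this issue (whitespace normalised).
LOG = """\
SimpleDialect(',', '', ''):  P = 14.309419  T = 0.672613  Q = 9.624698
SimpleDialect(',', '', '/'):  P = 14.268794  T = 0.615974  Q = 8.789203
SimpleDialect(',', '"', ''):  P = 37.647059  T = 0.942647  Q = 35.487889
SimpleDialect(',', '"', '/'):  P = 18.751838  skip.
SimpleDialect('', '', ''):  P = 0.313000  skip.
SimpleDialect('', '"', ''):  P = 0.040000  skip.
SimpleDialect(' ', '', ''):  P = 45.500250  T = 0.332927  Q = 15.148254
SimpleDialect(' ', '"', ''):  P = 13.000500  skip.
SimpleDialect('#', '', ''):  P = 26.065333  skip.
SimpleDialect('#', '"', ''):  P = 0.040000  skip.
SimpleDialect('*', '', ''):  P = 93.639500  T = 0.843074  Q = 78.945071
SimpleDialect('*', '"', ''):  P = 0.040000  skip.
SimpleDialect('-', '', ''):  P = 39.078500  skip.
SimpleDialect('-', '"', ''):  P = 0.040000  skip.
SimpleDialect(':', '', ''):  P = 21.732000  skip.
SimpleDialect(':', '"', ''):  P = 9.750500  skip.
SimpleDialect('_', '', ''):  P = 0.406000  skip.
SimpleDialect('_', '"', ''):  P = 0.269500  skip.
"""

ROW = re.compile(
    r"SimpleDialect\((?P<dialect>.*?)\):\s+P =\s+(?P<P>[\d.]+)\s+"
    r"(?:T =\s+(?P<T>[\d.]+)\s+Q =\s+(?P<Q>[\d.]+)|skip\.)"
)


def parse(log):
    """Turn each log row into a dict; T and Q are None for skipped rows."""
    rows = []
    for m in ROW.finditer(log):
        # The printed arguments form a valid tuple literal of three strings.
        dialect = ast.literal_eval("(" + m["dialect"] + ")")
        rows.append({
            "dialect": dialect,  # (delimiter, quotechar, escapechar)
            "P": float(m["P"]),
            "T": float(m["T"]) if m["T"] else None,
            "Q": float(m["Q"]) if m["Q"] else None,
        })
    return rows


def best_scored(rows, excluded=()):
    """Highest-Q row among those that report a Q and use an allowed delimiter."""
    scored = [
        r for r in rows
        if r["Q"] is not None and r["dialect"][0] not in excluded
    ]
    return max(scored, key=lambda r: r["Q"])


rows = parse(LOG)
winner = best_scored(rows)
without_star = best_scored(rows, excluded={"*"})

# Skipped candidates whose P exceeds the new best Q: their T was never
# computed in this log, so the log alone cannot rule them out.
unresolved = [
    r["dialect"] for r in rows
    if r["Q"] is None and r["dialect"][0] != "*" and r["P"] > without_star["Q"]
]

print("winner in the log:     ", winner["dialect"], winner["Q"])
print("best scored without *: ", without_star["dialect"], without_star["Q"])
print("would need scoring:    ", unresolved)
```

Among the rows that report a Q, dropping the star leaves the comma delimiter with double-quote quoting on top. Note that the `-` candidate has a P above that Q, so if the skip assumption holds it would be scored rather than skipped in a rerun; it may be worth leaving `-` out of the candidate list as well. For the actual rerun, CleverCSV's sniffer mirrors the standard library's `csv.Sniffer` and, as far as I know, accepts a `delimiters` argument in `sniff`, but please check the documentation for the exact call.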
