Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to specify the path of the error file #195

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

RatserX
Copy link
Contributor

@RatserX RatserX commented May 31, 2024

This change enables the use of the -e option for bcp (https://learn.microsoft.com/en-us/sql/tools/bcp-utility?view=sql-server-ver16#-e-err_file). This allows the creation of error files that contain detailed information about the rows that the bcp command couldn't process for whatever reason when executed.

Also included is a change to the bcp_path param typing to add the pathlib.Path type just because the bcp() function already allows it.

@RatserX
Copy link
Contributor Author

RatserX commented Jun 2, 2024

Actually, let me try a different approach. I think it would be better to store the error file in a temp file every time there's an error, read the tempfile and append the full text to the BCPandasException.

Nevermind, that idea didn't work that well. The error file can actually get pretty big when there are a lot of errors in the input file. If we try to read and print that those errors, they can actually slow down the to_csv function. Not only that, but this adds too much clutter to the original BCPandasException; even if we log print it, it is still hard to understand without proper formatting.

So, to keep the changes simple, I think my current approach to use the BCP parameter -e and generate an error file if a path is provided works just fine.

@yehoshuadimarsky
Copy link
Owner

looks good, can you please add some unit test

@RatserX
Copy link
Contributor Author

RatserX commented Dec 14, 2024

I added 2 unit tests, one for the identity_insert option and another one that tests the err_file. I've also updated the README based on what I had to do to run the tests locally. Furthermore, I've also added the TrustServerCertificate=yes option in certain unit test configs, which fixes an SSL issue when using Microsoft ODBC Driver 18.

@yehoshuadimarsky
Copy link
Owner

Let's split up the 2 unrelated issues, the SSL with ODBC 18, and your proposed change, can you make a new PR with just the SSL change?

Also we use uv now for everything, not pip - see https://docs.astral.sh/uv/

@RatserX
Copy link
Contributor Author

RatserX commented Dec 17, 2024

Let's split up the 2 unrelated issues, the SSL with ODBC 18, and your proposed change, can you make a new PR with just the SSL change?

Also we use uv now for everything, not pip - see https://docs.astral.sh/uv/

Gotcha. I reverted the changes that add the TrustServerCertificate option. I'll leave it up to you to update it since it only affects the tests, the package itself is working fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants