Optimal batch size for SQL Server #121

Open
premsai opened this issue Dec 13, 2022 · 9 comments

premsai commented Dec 13, 2022

Hi,
We are in the process of evaluating your library for our firm. Our tech stack is .NET Core, C#, and SQL Server.
I have a massive data-load problem I am trying to solve: about 5 million rows to insert. What is the best approach in terms of list/batch size, connection handling, etc.? We want the load to finish in a reasonable time, around 30 minutes; using the library I was able to get it done in an hour and 30 minutes. Can you help?

Thanks
P
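
For reference, the discussion below centers on SqlBulkCopy, the mechanism .NET uses for loads of this size and the one the library drives internally. A minimal baseline sketch (the destination table, connection string, and DataTable source are hypothetical) showing the batch-size and timeout knobs the question is about:

```csharp
// Minimal SqlBulkCopy baseline sketch. The destination table, connection string,
// and DataTable source are placeholders; only the tuning knobs matter here.
using System.Data;
using Microsoft.Data.SqlClient;

static class BulkLoadBaseline
{
    public static void Load(DataTable rows, string connectionString)
    {
        using var connection = new SqlConnection(connectionString);
        connection.Open();

        using var bulkCopy = new SqlBulkCopy(connection)
        {
            DestinationTableName = "dbo.Invoice", // hypothetical destination table
            BatchSize = 10_000,                   // rows per round trip; tune against table width
            BulkCopyTimeout = 120                 // seconds before the copy times out
        };

        bulkCopy.WriteToServer(rows); // streams all rows in batches of BatchSize
    }
}
```

Whatever batch-size option the library exposes ultimately maps onto this same mechanism, which is why the maintainer's reply below focuses on how the SqlBulkCopy is executed rather than on the batch size alone.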

JonathanMagnan self-assigned this Dec 13, 2022
JonathanMagnan (Member)

Hello @premsai,

Is it possible to get your SQL table definition, including indexes and triggers? You can send it in private here: [email protected]

It will be easier for us to give you a better answer with that information, as 1h30 for only 5 million rows looks very slow at first sight.

Best Regards,

Jon

JonathanMagnan (Member)

Hello @premsai

Since our last conversation, we haven't heard from you.

Don't hesitate to contact us if you need further information.

Best regards,

Jon

premsai commented Dec 16, 2022 via email

JonathanMagnan (Member)

Hello @premsai,

Thank you for the additional information.

This is what I expected: your table has hundreds of columns, which explains why the load takes so long.

In your case, one thing to make sure of is that the SqlBulkCopy is executed directly against the destination table and not a temporary table, as the operation will take much longer otherwise.

By making sure the SqlBulkCopy is executed in the destination table, you will gain a performance increase. However, you will not be able to output values such as the identity value.

So any option that outputs values such as the identity must be removed (you can specify the identity as a key instead).

See the following example: https://dotnetfiddle.net/bdiM7D ; in the returned log you will see the following lines:

SqlBulkCopy: [Invoice]
-- BulkCopyTimeout:120

So in this case, we know that the SqlBulkCopy has been executed directly against the destination table [Invoice].
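
For illustration, here is a minimal sketch of that configuration, assuming the library under evaluation is Entity Framework Extensions (which the dotnetfiddle example suggests); InvoiceContext and Invoice are placeholder types, and the option names should be checked against the version you are using:

```csharp
// Hedged sketch of a BulkInsert call that lets SqlBulkCopy target the destination
// table directly; InvoiceContext and Invoice stand in for your own model.
using System;
using System.Collections.Generic;

static class BulkInsertSketch
{
    public static void Load(List<Invoice> invoices)
    {
        using var context = new InvoiceContext();

        context.BulkInsert(invoices, options =>
        {
            // Do not map generated values (e.g. the identity) back to the entities,
            // so the bulk copy can go straight into the destination table.
            options.AutoMapOutputDirection = false;

            // If a key still needs to be specified, map it explicitly instead of
            // outputting the identity (assumption):
            // options.ColumnPrimaryKeyExpression = x => x.InvoiceID;

            // Write the generated operations out to confirm the destination table,
            // e.g. the "SqlBulkCopy: [Invoice]" line shown above.
            options.Log += s => Console.WriteLine(s);
        });
    }
}
```

If the log instead shows the copy going into a staging table, one of the output-related options is still enabled and is worth removing first.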

Let me know if you see any performance gains or if the SqlBulkCopy was already being executed against the destination table.

Best Regards,

Jon

premsai commented Dec 19, 2022 via email

JonathanMagnan (Member)

Hello @premsai,

Since our last conversation, we haven't heard from you.

How is your evaluation going?

Let me know if you need further assistance.

Best regards,

Jon

premsai commented Jan 10, 2023 via email

JonathanMagnan (Member)

Hello @premsai,

Sure! You can contact us directly at [email protected]

I will be able to assist you with any questions or concerns.

Best regards,

Jon

premsai commented Jan 20, 2023 via email
