Use pagination to import data #74
Comments
Hello @romainguerrero and thank you for this wonderful and well documented issue. It is indeed a good idea to have more arguments when using the import command, to choose the number of objects to be processed or to choose the collection to index. I've created a PR in order to add these arguments: #75. I need some tests before merging it. A good start would be to add cases in
Hi @npotier, I was trying to import about 400k entries from my database, and the command failed about 20 seconds after being validated. I tried the import using two versions of the bundle and got the same result. I've made some changes locally and the problem seems to be solved, so I thought it would be great to share it with you. Here is the fact: you must detach objects from Doctrine by clearing the entity manager; that is the change I made in the ImportCommand class.
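A minimal sketch of that idea, assuming the command iterates over collections with a Doctrine `EntityManagerInterface`; the names here are illustrative, not the bundle's actual code:

```php
<?php
// Illustrative sketch only; not the actual ImportCommand code.
use Doctrine\ORM\EntityManagerInterface;

function importAllCollections(EntityManagerInterface $em, array $collectionClasses, $typesenseClient): void
{
    foreach ($collectionClasses as $class) {
        $entities = $em->getRepository($class)->findAll();
        $typesenseClient->import($class, $entities); // assumed import helper

        // Without this, Doctrine's identity map keeps every hydrated entity
        // in memory; with ~400k rows the process eventually runs out of memory.
        $em->clear();
    }
}
```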
Hope it could help.
Is your feature request related to a problem? Please describe.
Currently, the `typesense:import` command parses all collections and, for each one, fetches all entities and sends them to the Typesense instance. But if a collection holds a lot of entities, the call can fail with a `413 Request Entity Too Large` error, depending on the server configuration.
Describe the solution you'd like
To avoid this issue, I suggest updating the ImportCommand to send data in batches of, say, 100 documents, like it's done by the FOSElasticaBundle. As in the `import` method of that bundle's `AsyncPagerPersister` (see [here](https://github.com/FriendsOfSymfony/FOSElasticaBundle/blob/master/src/Persister/AsyncPagerPersister.php#L55)), the best solution could be to use a (configurable?) pager for requesting entities from the database and sending them in batches.
Describe alternatives you've considered
An easier solution could be to still fetch all entities at once, as is done already, but send the data in batches of 100 documents (or any configurable batch size?) inside the foreach loop. But I fear it could result in a memory limit error when the entities are very numerous or very large.
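Sketched with PHP's `array_chunk()`, reusing the illustrative names from the sketch above; note that the full result set is still hydrated up front, which is exactly where the memory risk comes from:

```php
<?php
// Illustrative sketch of the simpler alternative: chunk an already-loaded result set.
$entities = $em->getRepository($class)->findAll(); // still loads every entity into memory

foreach (array_chunk($entities, 100) as $batch) {
    // Each HTTP call stays small enough to avoid the 413 error,
    // but the full entity list stays in memory for the whole loop.
    $typesenseClient->import($class, $batch);
}
```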
Additional context
Just for information, here's the list of available options in the foselastica populate command: