Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request]: Add proper async support #10590

Closed
JamesHutchison opened this issue Feb 12, 2024 · 5 comments
Closed

[Feature Request]: Add proper async support #10590

JamesHutchison opened this issue Feb 12, 2024 · 5 comments
Labels
enhancement New feature or request triage Issue needs to be triaged/prioritized

Comments

@JamesHutchison
Copy link

JamesHutchison commented Feb 12, 2024

Feature Description

Async doesn't seem to be supported. Calling VectorStoreIndex.from_documents(documents, storage_context, use_async=True) results in an error about using asyncio.run in a nested context. If you use nest_asyncio you get an error about the coroutine that actually does the work not being awaited. Or maybe you get an error about how uvloop isn't patchable.

If you muck around with things, you eventually find that it does all this in a sync session, so now you have to maintain logic for both sync and async sessions, which is needed if you need to, for example, provide the cert of the server you're connecting to.

    def _initialize(self) -> None:
        if not self._is_initialized:
            self._connect()
            if self.perform_setup:
                self._create_extension()
                self._create_schema_if_not_exists()
                self._create_tables_if_not_exists()
            self._is_initialized = True

Reason

In general, the Pythonic way to do async support is to create an async version of something, either a class or method. Keyword arguments are not standard practice and are counter-productive to static typing when it works correctly (which it does not here). The issue is that for some reason its using asyncio as an implementation detail rather than approaching the problem as "I'm running in an async context".

Value of Feature

I use an async framework.

@JamesHutchison JamesHutchison added enhancement New feature or request triage Issue needs to be triaged/prioritized labels Feb 12, 2024
Copy link

dosubot bot commented Feb 12, 2024

🤖

It's great to see your interest in adding proper async support to the repository! Your insights into the current implementation and the Pythonic way of handling async are valuable. If you're willing to contribute a solution or propose a fix for this feature, it would be a fantastic addition to the repository. Your expertise in async frameworks could really benefit the community. Keep up the good work!

About Dosu

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

@logan-markewich
Copy link
Collaborator

The error about nested asyncio can be resolved with

import nest_asyncio
nest_asyncio.apply()

The error about uvloop is solved by telling fastapi to set the loop-type to asyncio

I agree, a proper afrom_documents() method would be nice

@logan-markewich
Copy link
Collaborator

As a workaround, the ingestion pipeline has async support which works nicely

https://docs.llamaindex.ai/en/stable/module_guides/loading/ingestion_pipeline/root.html

@ZachHandley
Copy link

ZachHandley commented Feb 24, 2024

I'd actually like to chime in and say this issue could be expanded to the replacement of nest_asyncio

erdewit/nest_asyncio#36

from what I can tell, nest_asyncio https://stackoverflow.com/questions/59740704/correct-use-constraints-of-use-of-nest-asyncio is not a real proper implementation of this, and I get that there's not a ton of better solutions --- I did find https://sdiehl.github.io/gevent-tutorial/ which seems to be what it does in a way --- cause as far as I understand it, it's because it's trying to provide optional async support. This however breaks other packages and non-asyncio loops.

I'm not quite sure on the perfect setup, but requiring nest_asyncio is breaking discord.py and I worry about the longevity of using a patch for async solutions.

erdewit/nest_asyncio#90 I made an issue there, I'd like to see what's causing it.

@dosubot dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Jun 1, 2024
@dosubot dosubot bot closed this as not planned Won't fix, can't repro, duplicate, stale Jun 8, 2024
@dosubot dosubot bot removed the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Jun 8, 2024
@ZachHandley
Copy link

Just to add to this, there's definitely some funky crap going on with Python's async implementation. If you'd like to experience it, make a discord bot, make it async (using discord.py for instance which runs off of asyncio) and then make a bunch of calls, it'll eventually start to do this weird lagging thing where it's doing things, and not erroring, but it's like it's spawning a bunch of child processes it's not finishing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request triage Issue needs to be triaged/prioritized
Projects
None yet
Development

No branches or pull requests

3 participants