Handling interaction like scrolling to load the full page #793
OK, would you like to implement it? We will accept the PR. |
I have near zero experience with this field or library. Do you have a recommendation on how best to implement this, to point me in the right direction? Do you agree that allowing the source input to graphs to be a Selenium or Playwright driver is an appropriate approach? I think this would allow actions to be taken on a page prior to scraping, enabling iteration between LLM calls and interaction steps. |
Stumbled upon this issue and was interested in working on a solution. It would be nice to be able to pass an existing Playwright browser context into a scraper. This could be useful for leveraging |
OK guys, so for this you need to use something like Selenium to interact with JavaScript to press buttons or do infinite scrolls. I can also volunteer to help if the developers would guide me a bit on when to start. |
@aflansburg and @aleenprd would you like to set up a meeting to discuss the design? |
@VinciGit00 I'll reach out to you on LinkedIn for my contact |
Yes interested! |
@aflansburg I wouldn't mind sitting in as well as an observer, end user, if you don't think it'll slow you down. |
I did a small dive into the project ahead of the call, and this was a useful exercise to at least learn about the project. I'm not sure how 'clean' it is, but I think a relatively simple way (minimal refactoring) to expose the page would be to also store it in the graph state:

```python
state["original_html"] = document
state.update({self.output[0]: compressed_document})
if page:
    state["page"] = page
return state
```

I'm unsure if this bit will work; however, I assume something like this could accomplish additional interaction with the page (such as scrolling):

```python
graph = OmniScraperGraph(
    app_config.prompt,
    app_config.url,
    graph_config,
)
results_a = graph.run()
page = graph.final_state.get("page")
# call relevant `page` methods -> https://playwright.dev/docs/api/class-page
results_b = graph.run()  # subsequent run
```

As a consequence, this would require the user of the library to close the page (doable essentially via `page.close()`).

For my issue (authentication & cookies in a separate Chromium instance), I was able to determine that threading the storage state through `browser.new_context` works:

```python
context = await browser.new_context(
    java_script_enabled=True,
    storage_state=self.storage_state,
    user_agent=self.user_agent,
)
```

Then I was able to leverage session state from a separate invocation of Playwright, i.e.:

```python
import os
import time

from playwright.async_api import Page, Browser

...

async def async_run_login(browser: Browser):
    browser_state_file = app_config.browser_state_file
    user_agent = app_config.user_agent_str
    # check if the state file exists and if it is less than 24 hours old
    if (
        os.path.exists(browser_state_file)
        and os.path.getmtime(browser_state_file) > time.time() - 24 * 60 * 60
    ):
        print("Using existing state file.")
    else:
        browser_state_file = None
        print(
            "No existing state file found or it is older than 24 hours. I will create a new state file."
        )
    context = await browser.new_context(
        user_agent=user_agent,
        storage_state=browser_state_file,
    )
    page = await context.new_page()
    if await _is_logged_in(page):
        print("Already logged in.")
        await page.close()
        await browser.close()
        return
    logged_in = await _login(page)
    if not logged_in:
        await page.close()
        await browser.close()
        raise Exception("Unable to login. Time to debug!")
    await page.close()
    await browser.close()
```
|
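To sketch what the "call relevant `page` methods" step might look like for the scrolling use case, here is a minimal, hypothetical helper (not part of ScrapeGraphAI): it only assumes Playwright's async `page.evaluate` API, and keeps scrolling until the document height stops growing.

```python
import asyncio

async def scroll_to_bottom(page, max_rounds=20, pause_s=1.0):
    """Scroll until document height stabilizes; returns the final height.

    `page` is assumed to expose Playwright's async `evaluate` method:
    https://playwright.dev/docs/api/class-page
    """
    last_height = await page.evaluate("document.body.scrollHeight")
    for _ in range(max_rounds):
        await page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
        await asyncio.sleep(pause_s)  # give lazy-loaded content time to render
        new_height = await page.evaluate("document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was loaded; assume the page is complete
        last_height = new_height
    return last_height
```

One could call this on the page pulled out of `graph.final_state` between runs; the `max_rounds` cap guards against pages whose height never settles.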
This is quite close to the use case I was initially describing. To make it more concrete, here is a snippet of something I am feeding into my prompt that should give a good idea of the functionality I am hoping for. I prompt the LLM agent to output a field called
|
This feature is still desired |
Yeah, if we are able to enable auto-scrolling, that would be very handy; it would make the scraper truly able to scrape everything. |
Is your feature request related to a problem? Please describe.
I am frustrated when I query a website for information but it is not completely loaded. This is often the case for Shopify websites with a large number of elements: you must scroll down to load the rest of the elements on the page.
Describe the solution you'd like
I'm uncertain what the best solution is. I think some basic agentic commands would be extremely useful. Perhaps not as exhaustive as LaVague, but some basic ability to send commands to Selenium or similar. Something as simple as being able to pass a Selenium driver into a graph class would be great.
Describe alternatives you've considered
LaVague, custom LangChain. I've just started to play with using the node pieces separately, but it is challenging.
Additional context
For a simple demo, try to load all of the different coffees from this webpage: https://georgehowellcoffee.com/collections/all-coffee
You'll find that it only loads the first dozen or so.
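For reference, the simplest form of the interaction requested here could be sketched with a Selenium driver (a hypothetical helper, not part of the library; `execute_script` is the standard Selenium WebDriver call):

```python
import time

def load_full_page(driver, max_rounds=30, pause_s=1.5):
    """Repeatedly scroll a Selenium WebDriver to the bottom of the page.

    Stops when the document height stops growing (or after `max_rounds`),
    so lazily loaded elements get a chance to appear. Returns the final height.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_rounds):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause_s)  # wait for lazy-loaded elements to render
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # height is stable: assume everything is loaded
        last_height = new_height
    return last_height
```

Run against the demo page above before extracting the HTML, a loop like this would surface the products beyond the first dozen.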