Multiple scrollable elements #276
Comments
It would be nice if Stagehand could try emitting scrollwheel events. Stagehand (and LLM output code) prefers Stagehand could check if window height == viewport height and, if so, try the scrollwheel. This does work, but it would be better for Stagehand to reach for this tool as needed than to have it coded statically.
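The "window height == viewport height" check suggested above could look roughly like the sketch below. This is only an illustration: `PageMetrics`, `ScrollStrategy`, and `chooseScrollStrategy` are hypothetical names, not part of Stagehand, and the metrics are assumed to be read in the page (for example from `document.documentElement.scrollHeight` and `window.innerHeight`).

```typescript
// Hypothetical helper, not part of Stagehand: decide whether scrolling the
// window can work at all, or whether a mouse wheel event should be tried.
interface PageMetrics {
  documentHeight: number; // assumed source: document.documentElement.scrollHeight
  viewportHeight: number; // assumed source: window.innerHeight
}

type ScrollStrategy = "windowScroll" | "mouseWheel";

function chooseScrollStrategy(
  metrics: PageMetrics,
  tolerancePx = 1 // small slack for rounding differences
): ScrollStrategy {
  // If the document is no taller than the viewport, the window itself has
  // nothing to scroll, so the content probably lives in an inner container.
  const windowCanScroll =
    metrics.documentHeight - metrics.viewportHeight > tolerancePx;
  return windowCanScroll ? "windowScroll" : "mouseWheel";
}
```

When the result is `"mouseWheel"`, the caller would fall back to emitting wheel events instead of scrolling the window.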
Patching the handler to add new tools is possible. This one, for example, enables Stagehand to scroll with the mousewheel. A few usage notes:
/*
Monkeypatches actHandler._performPlaywrightMethod(), the Stagehand method that handles tool-use calls from LLM responses.
This adds a 'scrollDownALittle' tool that emits mousewheel events and works on dynamic SPAs.
You need to tell the LLM about this new tool (in the prompt) or it won't use it.
*/
function patchScrollBehavior(stagehand: any) {
  // Get the act handler instance
  const actHandler = Reflect.get(stagehand, 'actHandler');
  if (!actHandler) {
    throw new Error('Could not access actHandler');
  }
  const proto = Object.getPrototypeOf(actHandler);
  const originalMethod = proto._performPlaywrightMethod;
  // Monkeypatch to add a tool
  proto._performPlaywrightMethod = async function(
    method: string,
    args: unknown[],
    xpath: string,
    domSettleTimeoutMs?: number
  ) {
    if (method === 'scrollDownALittle') {
      // viewportSize() may return null; 800 is an arbitrary fallback height
      const viewport = await this.stagehand.page.viewportSize();
      const scrollY = (viewport?.height ?? 800) * 0.9;
      // Scroll by 90% of the viewport so consecutive chunks overlap slightly
      await this.stagehand.page.mouse.wheel(0, scrollY);
      await this.waitForSettledDom(domSettleTimeoutMs);
      return;
    }
    // Passthrough any other tool calls to the original implementation
    return originalMethod.call(this, method, args, xpath, domSettleTimeoutMs);
  };
}

Usage example:

await stagehand.init();
patchScrollBehavior(stagehand); // <-------------
await stagehand.page.goto("http://localhost/");
await stagehand.act({ action: "Scroll to the bottom of the page. Only use 'scrollDownALittle' for this." });
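A wheel event is generally delivered to whatever sits under the pointer, so the same idea could be aimed at one specific scrollable container by moving the mouse over it first. The sketch below is an assumption-laden illustration, not Stagehand code: `MouseLike`, `Box`, `wheelPlan`, and `wheelOver` are hypothetical names, with `MouseLike` mirroring only the `move` and `wheel` methods of Playwright's `Mouse`, and `Box` mirroring the shape returned by a locator's `boundingBox()`.

```typescript
// Structural stand-in for the two Playwright Mouse methods used here.
interface MouseLike {
  move(x: number, y: number): Promise<void>;
  wheel(deltaX: number, deltaY: number): Promise<void>;
}

// Same shape as a Playwright bounding box.
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Hypothetical helper: point at the center of the container and scroll by a
// fraction of its height (0.9 mirrors the factor used in the patch above).
function wheelPlan(box: Box, fraction = 0.9) {
  return {
    x: box.x + box.width / 2,
    y: box.y + box.height / 2,
    deltaY: box.height * fraction,
  };
}

// Move the pointer over the container, then emit the wheel event there.
async function wheelOver(mouse: MouseLike, box: Box): Promise<void> {
  const plan = wheelPlan(box);
  await mouse.move(plan.x, plan.y);
  await mouse.wheel(0, plan.deltaY);
}
```

With Playwright this might be called as `await wheelOver(page.mouse, box)`, where `box` comes from `boundingBox()` on the target element and has been checked for `null`.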
Some pages' main scroll bar is not on the main `window`, but in some inner `div`. Other pages may even have many scrollable elements, the main `window` being one of them or not.

It seems all scrolls made in the API calls to read more `chunks` assume a single scroll on the main `window` (e.g. when counting `chunks` and `scrollToHeight()`). This makes all content inside scrollable areas other than the main `window` invisible to the API.

Additionally, scrollable elements are not eligible as "interactive" if they lack such a `role` attribute (which is not very common practice afaik), and thus cannot be included in the DOM elements provided as context to the LLM. That means one cannot expect the LLM to decide to scroll them on its own, as it's unaware of them.
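One way scrollable containers might be detected without relying on a `role` attribute is to compare each element's content height against its visible height and check its computed `overflow-y`. The following is a minimal sketch under those assumptions, not how Stagehand currently selects elements; `ElementLike`, `isScrollable`, and `findScrollable` are hypothetical names, and the types are structural so the logic also runs outside a browser.

```typescript
// Structural subset of a DOM Element: only the two sizes the check needs.
interface ElementLike {
  scrollHeight: number; // full height of the content
  clientHeight: number; // visible height of the box
}

// An inner container scrolls vertically when its overflow-y permits it and
// its content is taller than its visible box.
function isScrollable(el: ElementLike, overflowY: string): boolean {
  const allowsScroll = overflowY === "auto" || overflowY === "scroll";
  return allowsScroll && el.scrollHeight > el.clientHeight;
}

// Filter a collection down to the scrollable elements. The overflow lookup is
// injected so the function can be exercised without a real DOM.
function findScrollable<T extends ElementLike>(
  elements: Iterable<T>,
  getOverflowY: (el: T) => string
): T[] {
  const result: T[] = [];
  for (const el of elements) {
    if (isScrollable(el, getOverflowY(el))) {
      result.push(el);
    }
  }
  return result;
}

// In a browser context this could be called roughly as:
//   findScrollable(document.querySelectorAll("*"),
//                  (el) => getComputedStyle(el).overflowY);
```

The elements found this way could then be surfaced to the LLM alongside the "interactive" ones, or scrolled individually when reading chunks. The main `window` is not covered by this check, since the root can scroll even with `overflow-y: visible`.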