After we promote these new schemas to be the new default, we need to update agent processing.
We should be able to just do SELECT * when copying data from crawl_staging to crawl in the crawl_complete pipeline.
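A minimal sketch of what that copy step could look like once the schemas are aligned. The helper names (`schemas_match`, `build_copy_query`) and the `date` filter column are assumptions for illustration, not the actual crawl_complete pipeline code. The point is that a plain SELECT * only works if both tables list the same columns, in the same order, with the same types.

```python
from typing import List, Tuple

# (column name, BigQuery type); hypothetical representation of a table schema
Column = Tuple[str, str]


def schemas_match(staging: List[Column], target: List[Column]) -> bool:
    """Return True when both tables have the same columns, order, and types."""
    return staging == target


def build_copy_query(table: str, crawl_date: str) -> str:
    """Build an INSERT ... SELECT * statement for one table and one crawl date.

    Assumes both tables have a `date` column to filter on (an assumption here).
    """
    return (
        f"INSERT INTO crawl.{table}\n"
        f"SELECT *\n"
        f"FROM crawl_staging.{table}\n"
        f"WHERE date = '{crawl_date}'"
    )


if __name__ == "__main__":
    print(build_copy_query("requests", "2024-11-01"))
```

If `schemas_match` fails for a table, the copy would still need an explicit column list or a transformation, which is the case this issue is trying to get rid of.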
max-ostapenko changed the title from "Align agent data structure to be written to BQ with the new all dataset schema." to "Update pages and requests schemas written by agent to BQ to a new one" on Oct 14, 2024
The transformation of crawl_staging.requests into crawl.requests took ~13h for the Nov 2024 crawl (not including failed attempts).
@pmeenan let's update wptagent so that the crawl_staging schema matches the crawl schema.
As agreed, we will no longer update the legacy table, so it is ready for cleanup.
The older data schema is being reprocessed using these queries: