diff --git a/src/content/docs/reference/blobs/lighthouse.mdx b/src/content/docs/reference/blobs/lighthouse.mdx
index 41f726c..be9f299 100644
--- a/src/content/docs/reference/blobs/lighthouse.mdx
+++ b/src/content/docs/reference/blobs/lighthouse.mdx
@@ -3,8 +3,13 @@ title: Lighthouse blob
 description: Reference docs for the Lighthouse blob
 ---
 
-_Appears in: [`pages` table](/reference/tables/pages/)_
+_Appears in: [`pages`](/reference/tables/pages/) table_\
+_As: [`lighthouse`](/reference/tables/pages/#lighthouse)_
 
 JSON-encoded blob of Lighthouse data for the page.
 
-**The actual schema of the Lighthouse object is liable to change depending on the page and Lighthouse version.**
\ No newline at end of file
+**The actual schema of the Lighthouse object is liable to change depending on the page and Lighthouse version.**
+
+## Schema
+
+TODO
diff --git a/src/content/docs/reference/blobs/page-metadata.mdx b/src/content/docs/reference/blobs/page-metadata.mdx
index 8cc7b19..29a37e5 100644
--- a/src/content/docs/reference/blobs/page-metadata.mdx
+++ b/src/content/docs/reference/blobs/page-metadata.mdx
@@ -3,13 +3,13 @@ title: Page metadata blob
 description: Reference docs for the HAR page metadata blob
 ---
 
-_Appears in: [`pages` table](/reference/tables/pages/)_\
-_As: `metadata`_
+_Appears in: [`pages`](/reference/tables/pages/) table_\
+_As: [`metadata`](/reference/tables/pages/#metadata)_
 
 JSON-encoded HTTP Archive metadata about the page that was tested.
 
-Here's an example of the decoded object:
-
+
+An example of the decoded object
 ```json
 {
   "crawl_depth": 1,
@@ -28,6 +28,7 @@ Here's an example of the decoded object:
   ]
 }
 ```
+
 ## Schema
 
diff --git a/src/content/docs/reference/structs/page-payload.md b/src/content/docs/reference/blobs/page-payload.mdx
similarity index 50%
rename from src/content/docs/reference/structs/page-payload.md
rename to src/content/docs/reference/blobs/page-payload.mdx
index 6c9d42e..f7d99c9 100644
--- a/src/content/docs/reference/structs/page-payload.md
+++ b/src/content/docs/reference/blobs/page-payload.mdx
@@ -1,15 +1,15 @@
 ---
-title: Pages payload struct
-description: Reference docs for the page payload struct
+title: Page payload blob
+description: Reference docs for the page payload blob
 ---
 
-_Appears in: [`pages` table](/reference/tables/pages/#payload)_\
-_As: `payload`_
+_Appears in: [`pages` table](/reference/tables/pages/)_\
+_As: [`payload`](/reference/tables/pages/#payload)_
 
 JSON-encoded WebPageTest result data for a page.
 
-Here's an example of the decoded object:
-
+
+An example of the decoded object
 ```json
 {
   "_LargestContentfulPaintNodeType": "P",
@@ -285,435 +285,404 @@ Here's an example of the decoded object:
   "title": "Run 1, First View for https://www.example.com/"
 }
 ```
+
## Schema -Field | Type | Description --- | -- | -- -[`_LargestContentfulPaintNodeType`](#_largestcontentfulpaintnodetype) | `string` | The node type of the largest contentful paint -[`_LargestContentfulPaintType`](#_largestcontentfulpainttype) | `string` | The type of the largest contentful paint -[`_LastInteractive`](#_lastinteractive) | `int` | The time when the page was last interactive in milliseconds -[`_PerformancePaintTiming.first-contentful-paint`](#_performancepainttimingfirst-contentful-paint) | `float` | The time when the first contentful paint occurred in milliseconds -[`_PerformancePaintTiming.first-paint`](#_performancepainttimingfirst-paint) | `float` | The time when the first paint occurred in milliseconds -[`_SpeedIndex`](#_speedindex) | `int` | The Speed Index score -[`_TTFB`](#_ttfb) | `int` | The time to first byte in milliseconds -[`_TTIMeasurementEnd`](#_ttimeasurementend) | `int` | The time when the TTI measurement ended in milliseconds -[`_URL`](#_url) | `string` | The URL of the page -[`_aft`](#_aft) | `int` | The above-the-fold time in milliseconds -[`_audit_issues`](#_audit_issues) | `array` | Audit issues -[`_basePageSSLTime`](#_basepagessltime) | `int` | The time spent on SSL for the base page in milliseconds -[`_base_page_cdn`](#_base_page_cdn) | `string` | The CDN used for the base page -[`_base_page_cname`](#_base_page_cname) | `string` | The CNAME used for the base page -[`_base_page_dns_server`](#_base_page_dns_server) | `string` | The DNS server used for the base page -[`_base_page_ip_ptr`](#_base_page_ip_ptr) | `string` | The IP PTR used for the base page -[`_browserVersion`](#_browserversion) | `string` | The browser version -[`_browser_name`](#_browser_name) | `string` | The browser name -[`_browser_version`](#_browser_version) | `string` | The browser version -[`_bytesIn`](#_bytesin) | `int` | The number of bytes received in -[`_bytesInDoc`](#_bytesindoc) | `int` | The number of bytes received in the document 
-[`_bytesOut`](#_bytesout) | `int` | The number of bytes sent out -[`_bytesOutDoc`](#_bytesoutdoc) | `int` | The number of bytes sent out in the -[`_cached`](#_cached) | `int` | Whether the page was cached -[`_chromeUserTiming`](#_chromeusertiming) | `array` | Chrome user timing -[`_chromeUserTiming.CumulativeLayoutShift`](#_chromeusertimingcumulativelayoutshift) | `int` | The cumulative layout shift -[`_chromeUserTiming.LargestContentfulPaint`](#_chromeusertiminglargestcontentfulpaint) | `int` | The largest contentful paint -[`_chromeUserTiming.LargestTextPaint`](#_chromeusertiminglargesttextpaint) | `int` | The largest text paint -[`_chromeUserTiming.TotalLayoutShift`](#_chromeusertimingtotallayoutshift) | `int` | The total layout shift -[`_chromeUserTiming.commitNavigationEnd`](#_chromeusertimingcommitnavigationend) | `int` | The commit navigation end -[`_chromeUserTiming.domComplete`](#_chromeusertimingdomcomplete) | `int` | The DOM complete -[`_chromeUserTiming.domContentLoadedEventEnd`](#_chromeusertimingdomcontentloadedeventend) | `int` | The DOM content loaded event end -[`_chromeUserTiming.domContentLoadedEventStart`](#_chromeusertimingdomcontentloadedeventstart) | `int` | The DOM content loaded event start -[`_chromeUserTiming.domInteractive`](#_chromeusertimingdominteractive) | `int` | The DOM interactive -[`_chromeUserTiming.domLoading`](#_chromeusertimingdomloading) | `int` | The DOM loading -[`_chromeUserTiming.fetchStart`](#_chromeusertimingfetchstart) | `int` | The fetch start -[`_chromeUserTiming.firstContentfulPaint`](#_chromeusertimingfirstcontentfulpaint) | `int` | The first contentful paint -[`_chromeUserTiming.firstMeaningfulPaint`](#_chromeusertimingfirstmeaningfulpaint) | `int` | The first meaningful paint -[`_chromeUserTiming.firstMeaningfulPaintCandidate`](#_chromeusertimingfirstmeaningfulpaintcandidate) | `int` | The first meaningful paint candidate -[`_chromeUserTiming.firstPaint`](#_chromeusertimingfirstpaint) | `int` | The first paint 
-[`_chromeUserTiming.loadEventEnd`](#_chromeusertimingloadeventend) | `int` | The load event end -[`_chromeUserTiming.loadEventStart`](#_chromeusertimingloadeventstart) | `int` | The load event start -[`_chromeUserTiming.markAsMainFrame`](#_chromeusertimingmarkasmainframe) | `int` | The mark as main frame -[`_chromeUserTiming.navigationStart`](#_chromeusertimingnavigationstart) | `int` | The navigation start -[`_chromeUserTiming.responseEnd`](#_chromeusertimingresponseend) | `int` | The response end -[`_chromeUserTiming.unloadEventEnd`](#_chromeusertimingunloadeventend) | `int` | The unload event end -[`_chromeUserTiming.unloadEventStart`](#_chromeusertimingunloadeventstart) | `int` | The unload event start -[`_connections`](#_connections) | `int` | The number of connections -[`_consoleLog`](#_consolelog) | `array` | Console logs -[`_cpu.CommitLoad`](#_cpucommitload) | `int` | The CPU time spent on commit load in milliseconds -[`_cpu.EventDispatch`](#_cpueventdispatch) | `int` | The CPU time spent on event dispatch in milliseconds -[`_cpu.FunctionCall`](#_cpufunctioncall) | `int` | The CPU time spent on function call in milliseconds -[`_cpu.HTMLDocumentParser::FetchQueuedPreloads`](#_cpuhtmldocumentparserfetchqueuedpreloads) | `int` | The CPU time spent on HTML document parser fetch queued preloads in milliseconds -[`_cpu.Idle`](#_cpuidle) | `int` | The CPU time spent on idle in milliseconds -[`_cpu.Layerize`](#_cpulayerize) | `int` | The CPU time spent on layerize in milliseconds -[`_cpu.Layout`](#_cpulayout) | `int` | The CPU time spent on layout in milliseconds -[`_cpu.MarkDOMContent`](#_cpumarkdomcontent) | `int` | The CPU time spent on marking DOM content in milliseconds -[`_cpu.MarkLoad`](#_cpumarkload) | `int` | The CPU time spent on marking load in milliseconds -[`_cpu.Paint`](#_cpupaint) | `int` | The CPU time spent on paint in milliseconds -[`_cpu.ParseHTML`](#_cpuparsehtml) | `int` | The CPU time spent on parsing HTML in milliseconds 
-[`_cpu.PrePaint`](#_cpuprepaint) | `int` | The CPU time spent on pre-paint in milliseconds -[`_cpu.ResourceFetcher::requestResource`](#_cpuresourcefetcherrequestresource) | `int` | The CPU time spent on resource fetcher request resource in milliseconds -[`_cpu.UpdateLayoutTree`](#_cpuupdatelayouttree) | `int` | The CPU time spent on updating layout tree in milliseconds -[`_cpu.V8.GC_TIME_TO_SAFEPOINT`](#_cpuv8gc_timeto_safepoint) | `int` | The CPU time spent on V8 GC time to safepoint in milliseconds -[`_cpu.largestContentfulPaint::Candidate`](#_cpulargestcontentfulpaintcandidate) | `int` | The CPU time spent on largest contentful paint candidate in milliseconds -[`_cpuTimes`](#_cputimes) | `object` | CPU times -[`_cpuTimesDoc`](#_cputimesdoc) | `object` | CPU times for the document -[`_date`](#_date) | `float` | The date in Unix timestamp format -[`_docTime`](#_doctime) | `int` | The document time in milliseconds -[`_document_URL`](#_document_url) | `string` | The URL of the document -[`_document_hostname`](#_document_hostname) | `string` | The hostname of the document -[`_document_origin`](#_document_origin) | `string` | The origin of the document -[`_domComplete`](#_domcomplete) | `int` | The DOM complete in milliseconds -[`_domContentLoadedEventEnd`](#_domcontentloadedeventend) | `int` | The DOM content loaded event end in milliseconds -[`_domContentLoadedEventStart`](#_domcontentloadedeventstart) | `int` | The DOM content loaded event start in milliseconds -[`_domElements`](#_domelements) | `int` | The number of DOM elements -[`_domInteractive`](#_dominteractive) | `int` | The DOM interactive in milliseconds -[`_domLoading`](#_domloading) | `int` | The DOM loading in milliseconds -[`_domTime`](#_domtime) | `int` | The DOM time in milliseconds -[`_edge-processed`](#_edge-processed) | `boolean` | Whether the page was processed by Edge -[`_effectiveBps`](#_effectivebps) | `int` | The effective BPS -[`_eventName`](#_eventname) | `string` | The event name 
-[`_execution_contexts`](#_execution_contexts) | `array` | Execution contexts -[`_final_base_page_request`](#_final_base_page_request) | `int` | The final base page request -[`_final_base_page_request_id`](#_final_base_page_request_id) | `string` | The final base page request ID -[`_final_url`](#_final_url) | `string` | The final URL -[`_firstContentfulPaint`](#_firstcontentfulpaint) | `int` | The first contentful paint in milliseconds -[`_firstMeaningfulPaint`](#_firstmeaningfulpaint) | `int` | The first meaningful paint in milliseconds -[`_firstPaint`](#_firstpaint) | `float` | The first paint in milliseconds -[`_fullyLoaded`](#_fullyloaded) | `int` | The fully loaded time in milliseconds -[`_fullyLoadedCPUms`](#_fullyloadedcpums) | `int` | The fully loaded CPU time in milliseconds -[`_fullyLoadedCPUpct`](#_fullyloadedcpupct) | `float` | The fully loaded CPU percentage -[`_gzip_savings`](#_gzip_savings) | `int` | The bytes saved by gzip compression -[`_gzip_total`](#_gzip_total) | `int` | The total bytes in gzip compression -[`_image_savings`](#_image_savings) | `int` | The bytes saved by image compression -[`_image_total`](#_image_total) | `int` | The total bytes in image compression -[`_interactivePeriods`](#_interactiveperiods) | `array` | Interactive periods -[`_largestPaints`](#_largestpaints) | `array` | Largest paints -[`_lastVisualChange`](#_lastvisualchange) | `int` | The time of the last visual change in milliseconds -[`_lighthouse.Accessibility`](#_lighthouseaccessibility) | `float` | The Lighthouse accessibility score -[`_lighthouse.BestPractices`](#_lighthousebestpractices) | `float` | The Lighthouse best practices score -[`_lighthouse.Performance`](#_lighthouseperformance) | `float` | The Lighthouse performance score -[`_lighthouse.Performance.cumulative-layout-shift`](#_lighthouseperformancecumulative-layout-shift) | `int` | The Lighthouse cumulative layout shift 
-[`_lighthouse.Performance.first-contentful-paint`](#_lighthouseperformancefirst-contentful-paint) | `float` | The Lighthouse first contentful paint -[`_lighthouse.Performance.largest-contentful-paint`](#_lighthouseperformancelargest-contentful-paint) | `float` | The Lighthouse largest contentful paint -[`_lighthouse.Performance.speed-index`](#_lighthouseperformancespeed-index) | `int` | The Lighthouse speed index -[`_lighthouse.Performance.total-blocking-time`](#_lighthouseperformancetotal-blocking-time) | `int` | The Lighthouse total blocking time -[`_lighthouse.SEO`](#_lighthouseseo) | `float` | The Lighthouse SEO score -[`_loadEventEnd`](#_loadeventend) | `int` | The load event end in milliseconds -[`_loadEventStart`](#_loadeventstart) | `int` | The load event start in milliseconds -[`_loadTime`](#_loadtime) | `int` | The load time in milliseconds -[`_main_frame`](#_main_frame) | `string` | The main frame -[`_minify_savings`](#_minify_savings) | `int` | The bytes saved by minification -[`_minify_total`](#_minify_total) | `int` | The total bytes in minification -[`_optimization_checked`](#_optimization_checked) | `int` | Whether optimization checks were performed -[`_origin_dns`](#_origin_dns) | `object` | Origin DNS -[`_osPlatform`](#_osplatform) | `string` | The OS platform -[`_osVersion`](#_osversion) | `string` | The OS version -[`_os_version`](#_os_version) | `string` | The OS version -[`_render`](#_render) | `int` | The render time in milliseconds -[`_renderBlockingCSS`](#_renderblockingcss) | `int` | The render blocking CSS time in milliseconds -[`_renderBlockingJS`](#_renderblockingjs) | `int` | The render blocking JS time in milliseconds -[`_requests`](#_requests) | `int` | The number of requests -[`_requestsDoc`](#_requestsdoc) | `int` | The number of requests in the document -[`_requestsFull`](#_requestsfull) | `int` | The number of full requests -[`_responses_200`](#_responses_200) | `int` | The number of 200 responses 
-[`_responses_404`](#_responses_404) | `int` | The number of 404 responses -[`_responses_other`](#_responses_other) | `int` | The number of other responses -[`_result`](#_result) | `int` | The result -[`_run`](#_run) | `int` | The run -[`_score_cache`](#_score_cache) | `int` | The cache score -[`_score_cdn`](#_score_cdn) | `int` | The CDN score -[`_score_combine`](#_score_combine) | `int` | The combine score -[`_score_compress`](#_score_compress) | `int` | The compress score -[`_score_cookies`](#_score_cookies) | `int` | The cookies score -[`_score_etags`](#_score_etags) | `int` | The etags score -[`_score_gzip`](#_score_gzip) | `int` | The gzip score -[`_score_keep-alive`](#_score_keep-alive) | `int` | The keep-alive score -[`_score_minify`](#_score_minify) | `int` | The minify score -[`_score_progressive_jpeg`](#_score_progressive_jpeg) | `int` | The progressive JPEG score -[`_server_rtt`](#_server_rtt) | `int` | The server RTT -[`_start_epoch`](#_start_epoch) | `float` | The start epoch in Unix timestamp format -[`_step`](#_step) | `int` | The step -[`_testID`](#_testid) | `string` | The test ID -[`_testStartOffset`](#_teststartoffset) | `int` | The test start offset -[`_testUrl`](#_testurl) | `string` | The test URL -[`_test_run_time_ms`](#_test_run_time_ms) | `int` | The test run time in milliseconds -[`_tester`](#_tester) | `string` | The tester -[`_titleTime`](#_titletime) | `int` | The title time in milliseconds -[`_v8Stats`](#_v8stats) | `object` | V8 stats -[`_viewport`](#_viewport) | `object` | The viewport -[`_visualComplete`](#_visualcomplete) | `int` | The visual complete time in milliseconds -[`_visualComplete85`](#_visualcomplete85) | `int` | The 85th percentile visual complete time in milliseconds -[`_visualComplete90`](#_visualcomplete90) | `int` | The 90th percentile visual complete time in milliseconds -[`_visualComplete95`](#_visualcomplete95) | `int` | The 95th percentile visual complete time in milliseconds 
-[`_visualComplete99`](#_visualcomplete99) | `int` | The 99th percentile visual complete time in milliseconds -[`id`](#id) | `string` | The page ID -[`pageTimings`](#pagetimings) | `object` | Page timings -[`startedDateTime`](#starteddatetime) | `string` | The start date and time of the page in Unix timestamp format -[`testID`](#testid) | `string` | The test ID -[`title`](#title) | `string` | The page title - ### `_LargestContentfulPaintNodeType` +Type: `string` + The node type of the largest contentful paint ### `_LargestContentfulPaintType` +Type: `string` + The type of the largest contentful paint ### `_LastInteractive` +Type: `int` + The time when the page was last interactive in milliseconds -### `_PerformancePaintTimingfirst-contentful-paint` +### `_PerformancePaintTiming.first-contentful-paint` + +Type: `float` The time when the first contentful paint occurred in milliseconds -### `_PerformancePaintTimingfirst-paint` +### `_PerformancePaintTiming.first-paint` + +Type: `float` The time when the first paint occurred in milliseconds ### `_SpeedIndex` +Type: `int` + The Speed Index score ### `_TTFB` +Type: `int` + The time to first byte in milliseconds ### `_TTIMeasurementEnd` +Type: `int` + The time when the TTI measurement ended in milliseconds ### `_URL` +Type: `string` + The URL of the page ### `_aft` +Type: `int` + The above-the-fold time in milliseconds ### `_audit_issues` +Type: `array` + Audit issues ### `_basePageSSLTime` +Type: `int` + The time spent on SSL for the base page in milliseconds ### `_base_page_cdn` +Type: `string` + The CDN used for the base page ### `_base_page_cname` +Type: `string` + The CNAME used for the base page ### `_base_page_dns_server` +Type: `string` + The DNS server used for the base page ### `_base_page_ip_ptr` +Type: `string` + The IP PTR used for the base page ### `_browserVersion` +Type: `string` + The browser version ### `_browser_name` +Type: `string` + The browser name ### `_browser_version` +Type: `string` + The 
browser version ### `_bytesIn` +Type: `int` + The number of bytes received in ### `_bytesInDoc` +Type: `int` + The number of bytes received in the document ### `_bytesOut` +Type: `int` + The number of bytes sent out ### `_bytesOutDoc` +Type: `int` + The number of bytes sent out in the document ### `_cached` +Type: `int` + Whether the page was cached ### `_chromeUserTiming` +Type: `array` + Chrome user timing -### `_chromeUserTimingCumulativeLayoutShift` +### `_chromeUserTiming.CumulativeLayoutShift` + +Type: `int` The cumulative layout shift -### `_chromeUserTimingLargestContentfulPaint` +### `_chromeUserTiming.LargestContentfulPaint` + +Type: `int` The largest contentful paint -### `_chromeUserTimingLargestTextPaint` +### `_chromeUserTiming.LargestTextPaint` + +Type: `int` The largest text paint -### `_chromeUserTimingTotalLayoutShift` +### `_chromeUserTiming.TotalLayoutShift` + +Type: `int` The total layout shift -### `_chromeUserTimingcommitNavigationEnd` +### `_chromeUserTiming.commitNavigationEnd` + +Type: `int` The commit navigation end -### `_chromeUserTimingdomComplete` +### `_chromeUserTiming.domComplete` + +Type: `int` The DOM complete -### `_chromeUserTimingdomContentLoadedEventEnd` +### `_chromeUserTiming.domContentLoadedEventEnd` + +Type: `int` The DOM content loaded event end -### `_chromeUserTimingdomContentLoadedEventStart` +### `_chromeUserTiming.domContentLoadedEventStart` + +Type: `int` The DOM content loaded event start -### `_chromeUserTimingdomInteractive` +### `_chromeUserTiming.domInteractive` + +Type: `int` The DOM interactive -### `_chromeUserTimingdomLoading` +### `_chromeUserTiming.domLoading` + +Type: `int` The DOM loading in milliseconds -### `_chromeUserTimingfetchStart` +### `_chromeUserTiming.fetchStart` + +Type: `int` The fetch start in milliseconds -### `_chromeUserTimingfirstContentfulPaint` +### `_chromeUserTiming.firstContentfulPaint` + +Type: `int` The first contentful paint -### `_chromeUserTimingfirstMeaningfulPaint` +### 
`_chromeUserTiming.firstMeaningfulPaint` + +Type: `int` The first meaningful paint -### `_chromeUserTimingfirstMeaningfulPaintCandidate` +### `_chromeUserTiming.firstMeaningfulPaintCandidate` + +Type: `int` The first meaningful paint candidate -### `_chromeUserTimingfirstPaint` +### `_chromeUserTiming.firstPaint` + +Type: `int` The first paint -### `_chromeUserTimingloadEventEnd` +### `_chromeUserTiming.loadEventEnd` + +Type: `int` The load event end -### `_chromeUserTimingloadEventStart` +### `_chromeUserTiming.loadEventStart` + +Type: `int` The load event start -### `_chromeUserTimingmarkAsMainFrame` +### `_chromeUserTiming.markAsMainFrame` + +Type: `int` The mark as main frame -### `_chromeUserTimingnavigationStart` +### `_chromeUserTiming.navigationStart` + +Type: `int` The navigation start -### `_chromeUserTimingresponseEnd` +### `_chromeUserTiming.responseEnd` + +Type: `int` The response end -### `_chromeUserTimingunloadEventEnd` +### `_chromeUserTiming.unloadEventEnd` + +Type: `int` The unload event end -### `_chromeUserTimingunloadEventStart` +### `_chromeUserTiming.unloadEventStart` + +Type: `int` The unload event start ### `_connections` +Type: `int` + The number of connections ### `_consoleLog` +Type: `array` + Console logs -### `_cpuCommitLoad` +### `_cpu.CommitLoad` + +Type: `int` The CPU time spent on commit load in milliseconds -### `_cpuEventDispatch` +### `_cpu.EventDispatch` + +Type: `int` The CPU time spent on event dispatch in milliseconds -### `_cpuFunctionCall` +### `_cpu.FunctionCall` + +Type: `int` The CPU time spent on function call in milliseconds -### `_cpuHTMLDocumentParserFetchQueuedPreloads` +### `_cpu.HTMLDocumentParser::FetchQueuedPreloads` + +Type: `int` The CPU time spent on HTML document parser fetch queued preloads in milliseconds -### `_cpuIdle` +### `_cpu.Idle` + +Type: `int` The CPU time spent on idle in milliseconds -### `_cpuLayerize` +### `_cpu.Layerize` + +Type: `int` The CPU time spent on layerize in milliseconds -### 
`_cpuLayout` +### `_cpu.Layout` + +Type: `int` The CPU time spent on layout in milliseconds -### `_cpuMarkDOMContent` +### `_cpu.MarkDOMContent` + +Type: `int` The CPU time spent on marking DOM content in milliseconds -### `_cpuMarkLoad` +### `_cpu.MarkLoad` + +Type: `int` The CPU time spent on marking load in milliseconds -### `_cpuPaint` +### `_cpu.Paint` + +Type: `int` The CPU time spent on paint in milliseconds -### `_cpuParseHTML` +### `_cpu.ParseHTML` + +Type: `int` The CPU time spent on parsing HTML in milliseconds -### `_cpuPrePaint` +### `_cpu.PrePaint` + +Type: `int` The CPU time spent on pre-paint in milliseconds -### `_cpuResourceFetcherrequestResource` +### `_cpu.ResourceFetcher::requestResource` + +Type: `int` The CPU time spent on resource fetcher request resource in milliseconds -### `_cpuUpdateLayoutTree` +### `_cpu.UpdateLayoutTree` + +Type: `int` The CPU time spent on updating layout tree in milliseconds -### `_cpuV8GC_TIMETO_SAFEPOINT` +### `_cpu.V8.GC_TIME_TO_SAFEPOINT` + +Type: `int` The CPU time spent on V8 GC time to safepoint in milliseconds -### `_cpulargestContentfulPaintCandidate` +### `_cpu.largestContentfulPaint::Candidate` + +Type: `int` The CPU time spent on largest contentful paint candidate in milliseconds ### `_cpuTimes` +Type: `object` + CPU times ### `_cpuTimesDoc` @@ -742,356 +711,534 @@ The origin of the document ### `_domComplete` +Type: `int` + The DOM complete in milliseconds ### `_domContentLoadedEventEnd` +Type: `int` + The DOM content loaded event end in milliseconds ### `_domContentLoadedEventStart` +Type: `int` + The DOM content loaded event start in milliseconds ### `_domElements` +Type: `int` + The number of DOM elements ### `_domInteractive` +Type: `int` + The DOM interactive in milliseconds ### `_domLoading` +Type: `int` + The DOM loading in milliseconds ### `_domTime` +Type: `int` + The DOM time in milliseconds ### `_edge-processed` +Type: `boolean` + Whether the page was processed by Edge ### `_effectiveBps` 
+Type: `int` + The effective BPS ### `_eventName` +Type: `string` + The event name ### `_execution_contexts` +Type: `array` + Execution contexts ### `_final_base_page_request` +Type: `int` + The final base page request ### `_final_base_page_request_id` +Type: `string` + The final base page request ID ### `_final_url` +Type: `string` + The final URL ### `_firstContentfulPaint` +Type: `int` + The first contentful paint in milliseconds ### `_firstMeaningfulPaint` +Type: `int` + The first meaningful paint in milliseconds ### `_firstPaint` +Type: `float` + The first paint in milliseconds ### `_fullyLoaded` +Type: `int` + The fully loaded time in milliseconds ### `_fullyLoadedCPUms` +Type: `int` + The fully loaded CPU time in milliseconds ### `_fullyLoadedCPUpct` +Type: `float` + The fully loaded CPU percentage ### `_gzip_savings` +Type: `int` + The bytes saved by gzip compression ### `_gzip_total` +Type: `int` + The total bytes in gzip compression ### `_image_savings` +Type: `int` + The bytes saved by image compression ### `_image_total` +Type: `int` + The total bytes in image compression ### `_interactivePeriods` +Type: `array` + Interactive periods in milliseconds ### `_largestPaints` +Type: `array` + Largest paints ### `_lastVisualChange` +Type: `int` + The time of the last visual change in milliseconds -### `_lighthouseAccessibility` +### `_lighthouse.Accessibility` + +Type: `float` The Lighthouse accessibility score -### `_lighthouseBestPractices` +### `_lighthouse.BestPractices` + +Type: `float` The Lighthouse best practices score -### `_lighthousePerformance` +### `_lighthouse.Performance` + +Type: `float` The Lighthouse performance score -### `_lighthousePerformancecumulative-layout-shift` +### `_lighthouse.Performance.cumulative-layout-shift` + +Type: `int` The Lighthouse cumulative layout shift -### `_lighthousePerformancefirst-contentful-paint` +### `_lighthouse.Performance.first-contentful-paint` + +Type: `float` The Lighthouse first contentful paint -### 
`_lighthousePerformancelargest-contentful-paint` +### `_lighthouse.Performance.largest-contentful-paint` + +Type: `float` The Lighthouse largest contentful paint -### `_lighthousePerformancespeed-index` +### `_lighthouse.Performance.speed-index` + +Type: `int` The Lighthouse speed index -### `_lighthousePerformancetotal-blocking-time` +### `_lighthouse.Performance.total-blocking-time` + +Type: `int` The Lighthouse total blocking time -### `_lighthouseSEO` +### `_lighthouse.SEO` + +Type: `float` The Lighthouse SEO score ### `_loadEventEnd` +Type: `int` + The load event end in milliseconds ### `_loadEventStart` +Type: `int` + The load event start in milliseconds ### `_loadTime` +Type: `int` + The load time in milliseconds ### `_main_frame` +Type: `string` + The main frame ### `_minify_savings` +Type: `int` + The bytes saved by minification ### `_minify_total` +Type: `int` + The total bytes in minification ### `_optimization_checked` +Type: `int` + Whether optimization checks were performed ### `_origin_dns` +Type: `object` + Origin DNS ### `_osPlatform` +Type: `string` + The OS platform ### `_osVersion` +Type: `string` + The OS version ### `_os_version` +Type: `string` + The OS version ### `_render` +Type: `int` + The render time in milliseconds ### `_renderBlockingCSS` +Type: `int` + The render blocking CSS time in milliseconds ### `_renderBlockingJS` +Type: `int` + The render blocking JS time in milliseconds ### `_requests` +Type: `int` + The number of requests ### `_requestsDoc` +Type: `int` + The number of requests in the document ### `_requestsFull` +Type: `int` + The number of full requests ### `_responses_200` +Type: `int` + The number of 200 responses ### `_responses_404` +Type: `int` + The number of 404 responses ### `_responses_other` +Type: `int` + The number of other responses ### `_result` +Type: `int` + The result code of the test run ### `_run` +Type: `int` + The run number ### `_score_cache` +Type: `int` + The cache score ### `_score_cdn` +Type: `int` 
+ The CDN score ### `_score_combine` +Type: `int` + The combine score ### `_score_compress` +Type: `int` + The compress score ### `_score_cookies` +Type: `int` + The cookies score ### `_score_etags` +Type: `int` + The etags score ### `_score_gzip` +Type: `int` + The gzip score ### `_score_keep-alive` +Type: `int` + The keep-alive score ### `_score_minify` +Type: `int` + The minify score ### `_score_progressive_jpeg` +Type: `int` + The progressive JPEG score ### `_server_rtt` +Type: `int` + The server RTT ### `_start_epoch` +Type: `float` + The start epoch in Unix timestamp format ### `_step` +Type: `int` + The step number ### `_testID` +Type: `string` + The test ID ### `_testStartOffset` +Type: `int` + The test start offset ### `_testUrl` +Type: `string` + The test URL ### `_test_run_time_ms` +Type: `int` + The test run time in milliseconds ### `_tester` +Type: `string` + The tester ### `_titleTime` +Type: `int` + The title time in milliseconds ### `_v8Stats` +Type: `object` + V8 stats ### `_viewport` +Type: `object` + The viewport dimensions ### `_visualComplete` +Type: `int` + The visual complete time in milliseconds ### `_visualComplete85` +Type: `int` + The 85th percentile visual complete time in milliseconds ### `_visualComplete90` +Type: `int` + The 90th percentile visual complete time in milliseconds ### `_visualComplete95` +Type: `int` + The 95th percentile visual complete time in milliseconds ### `_visualComplete99` +Type: `int` + The 99th percentile visual complete time in milliseconds ### `id` +Type: `string` + The page ID ### `pageTimings` +Type: `object` + Page timings ### `startedDateTime` +Type: `string` + The start date and time of the page in Unix timestamp format ### `testID` +Type: `string` + The test ID ### `title` +Type: `string` + The page title diff --git a/src/content/docs/reference/structs/page-summary.md b/src/content/docs/reference/blobs/page-summary.mdx similarity index 94% rename from src/content/docs/reference/structs/page-summary.md 
rename to src/content/docs/reference/blobs/page-summary.mdx
index 36f1ad4..69e841a 100644
--- a/src/content/docs/reference/structs/page-summary.md
+++ b/src/content/docs/reference/blobs/page-summary.mdx
@@ -1,15 +1,15 @@
 ---
-title: Page summary struct
-description: Reference docs for the page summary struct
+title: Page summary blob
+description: Reference docs for the page summary blob
 ---
 
-_Appears in: [`pages` table](/reference/tables/pages/)_\
-_As: `summary`_
+_Appears in: [`pages`](/reference/tables/pages/) table_\
+_As: [`summary`](/reference/tables/pages/#summary)_
 
 JSON-encoded summarization of the page-level data.
 
-Here's an example of the decoded object:
-
+
+An example of the decoded object
 ```json
 {
   "SpeedIndex": 400,
@@ -97,6 +97,7 @@ Here's an example of the decoded object:
   "visualComplete": 400
 }
 ```
+
 ## Schema
 
diff --git a/src/content/docs/reference/structs/request-payload.md b/src/content/docs/reference/blobs/request-payload.mdx
similarity index 51%
rename from src/content/docs/reference/structs/request-payload.md
rename to src/content/docs/reference/blobs/request-payload.mdx
index 54607d1..d2782fc 100644
--- a/src/content/docs/reference/structs/request-payload.md
+++ b/src/content/docs/reference/blobs/request-payload.mdx
@@ -1,17 +1,17 @@
 ---
-title: Request payload struct
-description: Reference docs for the request payload struct
+title: Request payload blob
+description: Reference docs for the request payload blob
 ---
 
-_Appears in: [`requests` table](/reference/tables/requests/#payload)_\
-_As: `payload`_
+_Appears in: [`requests`](/reference/tables/requests/) table_\
+_As: [`payload`](/reference/tables/requests/#payload)_
 
 JSON-encoded WebPageTest result data for a request.
 
-**The actual schema of the WebPageTest result data is liable to change, depending on a request.**
-
-Here's an example of the decoded object:
+**The actual schema is liable to change, depending on a request.**
 
+
+An example of the decoded object ```json { "_all_end": 234, @@ -247,788 +247,942 @@ Here's an example of the decoded object: } } ``` +
## Schema -| Field | Type | Description | -| ---------------------------------------------------- | -------- | ------------------------------------------------- | -| `_all_end` | `int` | End time of all operations. | -| `_all_ms` | `int` | Total time taken for all operations. | -| `_all_start` | `int` | Start time of all operations. | -| `_body_file` | `string` | File containing the body of the request. | -| `_bytesIn` | `int` | Number of bytes received. | -| `_bytesOut` | `int` | Number of bytes sent. | -| `_cache_time` | `int` | Cache time. | -| `_cacheControl` | `string` | Cache control header value. | -| `_cached` | `int` | Indicates if the request was cached (0 or 1). | -| `_cdn_provider` | `string` | CDN provider used. | -| `_certificates` | `array` | Certificates used. | -| `_chunks` | `array` | Array of chunks received. | -| `_chunks[].bytes` | `int` | Size of the chunk. | -| `_chunks[].inflated` | `int` | Size of the inflated chunk. | -| `_chunks[].ts` | `int` | Timestamp of the chunk. | -| `_connect_end` | `int` | Connection end time. | -| `_connect_ms` | `int` | Connection time in milliseconds. | -| `_connect_start` | `int` | Connection start time. | -| `_contentEncoding` | `string` | Content encoding of the response. | -| `_contentType` | `string` | Content type of the response. | -| `_created` | `int` | Creation time of the request. | -| `_dns_end` | `int` | DNS end time. | -| `_dns_info` | `object` | DNS information. | -| `_dns_info.results.aliases` | `array` | Aliases for the domain. | -| `_dns_info.results.canonical_names` | `array` | Canonical names for the domain. | -| `_dns_info.results.endpoint_metadatas` | `array` | Endpoint metadata. | -| `_dns_info.results.expiration` | `string` | Expiration date of the DNS query. | -| `_dns_info.results.host_ports` | `array` | Host ports. | -| `_dns_info.results.hostname_results` | `array` | Hostname results. | -| `_dns_info.results.ip_endpoints[].endpoint_address` | `string` | IP address of the endpoint. 
| -| `_dns_info.results.ip_endpoints[].endpoint_port` | `int` | Port of the endpoint. | -| `_dns_info.results.ip_endpoints` | `array` | IP endpoints. | -| `_dns_info.results.text_records` | `array` | Text records. | -| `_dns_info.results` | `object` | Results of the DNS query. | -| `_dns_info.secure` | `int` | Indicates if the DNS query is secure. | -| `_dns_info.transactions_needed[].dns_query_type` | `string` | Type of DNS query. | -| `_dns_info.transactions_needed` | `array` | Transactions needed for DNS query. | -| `_dns_ms` | `int` | DNS lookup time in milliseconds. | -| `_dns_start` | `int` | DNS start time. | -| `_documentURL` | `string` | Document URL of the request. | -| `_download_end` | `int` | Download end time. | -| `_download_ms` | `int` | Download time in milliseconds. | -| `_download_start` | `int` | Download start time. | -| `_expires` | `string` | Expiry date of the request. | -| `_final_base_page` | `int` | Indicates if the request is the final base page. | -| `_frame_id` | `string` | Frame ID where the request was made. | -| `_full_url` | `string` | Full URL of the request. | -| `_gzip_save` | `int` | Size saved due to gzip compression. | -| `_gzip_total` | `int` | Total size of the gzip-compressed content. | -| `_host` | `string` | Host of the request. | -| `_http2_server_settings` | `object` | HTTP/2 server settings. | -| `_http2_stream_dependency` | `int` | HTTP/2 stream dependency. | -| `_http2_stream_exclusive` | `int` | HTTP/2 stream exclusivity. | -| `_http2_stream_id` | `int` | HTTP/2 stream ID. | -| `_http2_stream_weight` | `int` | HTTP/2 stream weight. | -| `_id` | `string` | Unique identifier for the request. | -| `_image_save` | `int` | Size saved due to image optimization. | -| `_image_total` | `int` | Total size of images. | -| `_index` | `int` | Index of the request. | -| `_initial_priority` | `string` | Initial priority of the request. | -| `_initiator_column` | `string` | Column number of the initiator. 
| -| `_initiator_line` | `string` | Line number of the initiator. | -| `_initiator_type` | `string` | Type of initiator (e.g., script). | -| `_initiator` | `string` | Initiator of the request. | -| `_ip_addr` | `string` | IP address of the requested server. | -| `_is_base_page` | `int` | Indicates if the request is the base page. | -| `_is_secure` | `int` | Indicates if the request is secure (0 or 1). | -| `_load_end` | `int` | Load end time. | -| `_load_ms` | `int` | Load time in milliseconds. | -| `_load_start_float` | `float` | Precise start time of load. | -| `_load_start` | `int` | Start time of load in milliseconds. | -| `_method` | `string` | HTTP method used for the request. | -| `_minify_save` | `int` | Size saved due to minification. | -| `_minify_total` | `int` | Total size of minified content. | -| `_netlog_id` | `int` | Netlog ID. | -| `_number` | `int` | Number of the request. | -| `_objectSize` | `int` | Size of the object received. | -| `_objectSizeUncompressed` | `int` | Uncompressed size of the object received. | -| `_priority` | `string` | Priority of the request. | -| `_protocol` | `string` | Protocol used for the request. | -| `_raw_id` | `string` | Raw ID for the request. | -| `_request_id` | `string` | Identifier for the original request. | -| `_request_type` | `string` | Type of the request (e.g., Document). | -| `_responseCode` | `int` | HTTP response code. | -| `_run` | `int` | The run number of the test. | -| `_score_cache` | `int` | Cache score. | -| `_score_cdn` | `int` | CDN score. | -| `_score_combine` | `int` | Combine score. | -| `_score_compress` | `int` | Compression score. | -| `_score_cookies` | `int` | Cookies score. | -| `_score_etags` | `int` | ETags score. | -| `_score_gzip` | `int` | Gzip compression score. | -| `_score_keep-alive` | `int` | Keep-alive score. | -| `_score_minify` | `int` | Minification score. | -| `_securityDetails` | `object` | Security details of the request. 
| -| `_securityDetails.certificateId` | `int` | Certificate ID. | -| `_securityDetails.certificateTransparencyCompliance` | `string` | Certificate transparency compliance. | -| `_securityDetails.cipher` | `string` | Cipher used. | -| `_securityDetails.encryptedClientHello` | `int` | Indicates if the client hello is encrypted. | -| `_securityDetails.issuer` | `string` | Issuer of the certificate. | -| `_securityDetails.keyExchange` | `string` | Key exchange used. | -| `_securityDetails.keyExchangeGroup` | `string` | Key exchange group used. | -| `_securityDetails.protocol` | `string` | Security protocol used. | -| `_securityDetails.sanList` | `array` | Subject alternative names. | -| `_securityDetails.serverSignatureAlgorithm` | `int` | Server signature algorithm. | -| `_securityDetails.signedCertificateTimestampList` | `array` | List of signed certificate timestamps. | -| `_securityDetails.subjectName` | `string` | Subject name of the certificate. | -| `_securityDetails.validFrom` | `int` | Valid from date of the certificate. | -| `_securityDetails.validTo` | `int` | Valid to date of the certificate. | -| `_server_count` | `int` | Number of servers used. | -| `_server_port` | `string` | Server port. | -| `_server_rtt` | `int` | Server round-trip time. | -| `_socket_group` | `string` | Socket group. | -| `_socket` | `int` | Socket used for the request. | -| `_ssl_end` | `int` | SSL handshake end time. | -| `_ssl_ms` | `int` | SSL handshake time in milliseconds. | -| `_ssl_start` | `int` | SSL handshake start time. | -| `_tls_cipher_suite` | `int` | Cipher suite used. | -| `_tls_next_proto` | `string` | Next protocol used. | -| `_tls_resumed` | `string` | Indicates if the TLS session was resumed. | -| `_tls_version` | `string` | TLS version used. | -| `_ttfb_end` | `int` | Time to first byte end time. | -| `_ttfb_ms` | `int` | Time to first byte in milliseconds. | -| `_ttfb_start` | `int` | Time to first byte start time. 
| -| `_type` | `int` | Type identifier for the request. | -| `_url` | `string` | URL path of the request. | -| `cache` | `object` | Cache details (empty in this example). | -| `pageref` | `string` | Reference to the page containing this request. | -| `request` | `object` | Details of the request. | -| `request.bodySize` | `int` | Size of the request body. | -| `request.cookies` | `array` | Cookies sent with the request. | -| `request.headersSize` | `int` | Size of the request headers. | -| `request.httpVersion` | `string` | HTTP version used for the request. | -| `request.method` | `string` | HTTP method used for the request. | -| `request.queryString` | `array` | Query string parameters. | -| `request.url` | `string` | URL of the requested resource. | -| `response` | `object` | Details of the response. | -| `response.bodySize` | `int` | Size of the response body. | -| `response.content.mimeType` | `string` | MIME type of the content. | -| `response.content.size` | `int` | Size of the content. | -| `response.content` | `object` | Content details of the response. | -| `response.cookies` | `array` | Cookies received with the response. | -| `response.headersSize` | `int` | Size of the response headers. | -| `response.httpVersion` | `string` | HTTP version used for the response. | -| `response.status` | `int` | HTTP response status code. | -| `response.statusText` | `string` | Status text of the response. | -| `startedDateTime` | `string` | Start time of the request. | -| `time` | `int` | Total time taken for the request in milliseconds. | -| `timings` | `object` | Timing details of various stages of the request. | -| `timings.blocked` | `int` | Time spent in blocking. | -| `timings.connect` | `int` | Time spent in establishing a connection. | -| `timings.dns` | `int` | Time spent in DNS lookup. | -| `timings.receive` | `int` | Time spent in receiving the response. | -| `timings.send` | `int` | Time spent in sending the request. 
| -| `timings.ssl` | `int` | Time spent in SSL handshake. | -| `timings.wait` | `int` | Time spent in waiting for the response. | - ### `_all_end` +Type: `int` + End time of all operations. ### `_all_ms` +Type: `int` + Total time taken for all operations. ### `_all_start` +Type: `int` + Start time of all operations. ### `_body_file` +Type: `string` + File containing the body of the request. ### `_bytesIn` +Type: `int` + Number of bytes received. ### `_bytesOut` +Type: `int` + Number of bytes sent. ### `_cacheControl` +Type: `string` + Cache control header value. ### `_cache_time` +Type: `int` + Cache time. ### `_cached` +Type: `int` + Indicates if the request was cached (0 or 1). ### `_cdn_provider` +Type: `string` + CDN provider used. ### `_certificates` -array | Certificates used. +Type: `array` + +Certificates used. ### `_chunks` -array | Array of chunks received. +Type: `array` + +Array of chunks received. - #### `_chunks[].bytes` + Type: `int` + Size of the chunk. - #### `_chunks[].inflated` + Type: `int` + Size of the inflated chunk. - #### `_chunks[].ts` + Type: `int` + Timestamp of the chunk. ### `_connect_end` +Type: `int` + Connection end time. ### `_connect_ms` +Type: `int` + Connection time in milliseconds. ### `_connect_start` +Type: `int` + Connection start time. ### `_contentEncoding` +Type: `string` + Content encoding of the response. ### `_contentType` +Type: `string` + Content type of the response. ### `_created` +Type: `int` + Creation time of the request. ### `_dns_end` +Type: `int` + DNS end time. ### `_dns_info` +Type: `object` + DNS information. - #### `_dns_info.results` + Type: `object` + Results of the DNS query. - #### `_dns_info.results.aliases` + Type: `array` + Aliases for the domain. - #### `_dns_info.results.canonical_names` + Type: `array` + Canonical names for the domain. - #### `_dns_info.results.endpoint_metadatas` + Type: `array` + Endpoint metadata. 
- #### `_dns_info.results.expiration` + Type: `string` + Expiration date of the DNS query. - #### `_dns_info.results.host_ports` + Type: `array` + Host ports. - #### `_dns_info.results.hostname_results` + Type: `array` + Hostname results. - #### `_dns_info.results.ip_endpoints` + Type: `array` + IP endpoints. - #### `_dns_info.results.ip_endpoints[].endpoint_address` + Type: `string` + IP address of the endpoint. - #### `_dns_info.results.ip_endpoints[].endpoint_port` + Type: `int` + Port of the endpoint. - #### `_dns_info.results.text_records` + Type: `array` + Text records. - #### `_dns_info.secure` + Type: `int` + Indicates if the DNS query is secure. - #### `_dns_info.transactions_needed` + Type: `array` + Transactions needed for DNS query. - #### `_dns_info.transactions_needed[].dns_query_type` + Type: `string` + Type of DNS query. ### `_dns_ms` +Type: `int` + DNS lookup time in milliseconds. ### `_dns_start` +Type: `int` + DNS start time. ### `_documentURL` +Type: `string` + Document URL of the request. ### `_download_end` +Type: `int` + Download end time. ### `_download_ms` +Type: `int` + Download time in milliseconds. ### `_download_start` +Type: `int` + Download start time. ### `_expires` +Type: `string` + Expiry date of the request. ### `_final_base_page` +Type: `int` + Indicates if the request is the final base page. ### `_frame_id` +Type: `string` + Frame ID where the request was made. ### `_full_url` +Type: `string` + Full URL of the request. ### `_gzip_save` +Type: `int` + Size saved due to gzip compression. ### `_gzip_total` +Type: `int` + Total size of the gzip-compressed content. ### `_host` +Type: `string` + Host of the request. ### `_http2_server_settings` +Type: `object` + HTTP/2 server settings. ### `_http2_stream_dependency` +Type: `int` + HTTP/2 stream dependency. ### `_http2_stream_exclusive` +Type: `int` + HTTP/2 stream exclusivity. ### `_http2_stream_id` +Type: `int` + HTTP/2 stream ID. 
### `_http2_stream_weight` +Type: `int` + HTTP/2 stream weight. ### `_id` +Type: `string` + Unique identifier for the request. ### `_image_save` +Type: `int` + Size saved due to image optimization. ### `_image_total` +Type: `int` + Total size of images. ### `_index` +Type: `int` + Index of the request. ### `_initial_priority` +Type: `string` + Initial priority of the request. ### `_initiator` +Type: `string` + Initiator of the request. ### `_initiator_column` +Type: `string` + Column number of the initiator. ### `_initiator_line` +Type: `string` + Line number of the initiator. ### `_initiator_type` +Type: `string` + Type of initiator (e.g., script). ### `_ip_addr` +Type: `string` + IP address of the requested server. ### `_is_base_page` +Type: `int` + Indicates if the request is the base page. ### `_is_secure` +Type: `int` + Indicates if the request is secure (0 or 1). ### `_load_end` +Type: `int` + Load end time. ### `_load_ms` +Type: `int` + Load time in milliseconds. ### `_load_start` +Type: `int` + Start time of load in milliseconds. ### `_load_start_float` -float | Precise start time of load. +Type: `float` + +Precise start time of load. ### `_method` +Type: `string` + HTTP method used for the request. ### `_minify_save` +Type: `int` + Size saved due to minification. ### `_minify_total` +Type: `int` + Total size of minified content. ### `_netlog_id` +Type: `int` + Netlog ID. ### `_number` +Type: `int` + Number of the request. ### `_objectSize` +Type: `int` + Size of the object received. ### `_objectSizeUncompressed` +Type: `int` + Uncompressed size of the object received. ### `_priority` +Type: `string` + Priority of the request. ### `_protocol` +Type: `string` + Protocol used for the request. ### `_raw_id` +Type: `string` + Raw ID for the request. ### `_request_id` +Type: `string` + Identifier for the original request. ### `_request_type` +Type: `string` + Type of the request (e.g., Document). ### `_responseCode` +Type: `int` + HTTP response code. 
### `_run` +Type: `int` + The run number of the test. ### `_score_cache` +Type: `int` + Cache score. ### `_score_cdn` +Type: `int` + CDN score. ### `_score_combine` +Type: `int` + Combine score. ### `_score_compress` +Type: `int` + Compression score. ### `_score_cookies` +Type: `int` + Cookies score. ### `_score_etags` +Type: `int` + ETags score. ### `_score_gzip` +Type: `int` + Gzip compression score. ### `_score_keep-alive` +Type: `int` + Keep-alive score. ### `_score_minify` +Type: `int` + Minification score. ### `_securityDetails` +Type: `object` + Security details of the request. - #### `_securityDetails.certificateId` + Type: `int` + Certificate ID. - #### `_securityDetails.certificateTransparencyCompliance` + Type: `string` + Certificate transparency compliance. - #### `_securityDetails.cipher` + Type: `string` + Cipher used. - #### `_securityDetails.encryptedClientHello` + Type: `int` + Indicates if the client hello is encrypted. - #### `_securityDetails.issuer` + Type: `string` + Issuer of the certificate. - #### `_securityDetails.keyExchange` + Type: `string` + Key exchange used. - #### `_securityDetails.keyExchangeGroup` + Type: `string` + Key exchange group used. - #### `_securityDetails.protocol` + Type: `string` + Security protocol used. - #### `_securityDetails.sanList` + Type: `array` + Subject alternative names. - #### `_securityDetails.serverSignatureAlgorithm` + Type: `int` + Server signature algorithm. - #### `_securityDetails.signedCertificateTimestampList` + Type: `array` + List of signed certificate timestamps. - #### `_securityDetails.subjectName` + Type: `string` + Subject name of the certificate. - #### `_securityDetails.validFrom` + Type: `int` + Valid from date of the certificate. - #### `_securityDetails.validTo` + Type: `int` + Valid to date of the certificate. ### `_server_count` +Type: `int` + Number of servers used. ### `_server_port` +Type: `string` + Server port. ### `_server_rtt` +Type: `int` + Server round-trip time. 
### `_socket` +Type: `int` + Socket used for the request. ### `_socket_group` +Type: `string` + Socket group. ### `_ssl_end` +Type: `int` + SSL handshake end time. ### `_ssl_ms` +Type: `int` + SSL handshake time in milliseconds. ### `_ssl_start` +Type: `int` + SSL handshake start time. ### `_tls_cipher_suite` +Type: `int` + Cipher suite used. ### `_tls_next_proto` +Type: `string` + Next protocol used. ### `_tls_resumed` +Type: `string` + Indicates if the TLS session was resumed. ### `_tls_version` +Type: `string` + TLS version used. ### `_ttfb_end` +Type: `int` + Time to first byte end time. ### `_ttfb_ms` +Type: `int` + Time to first byte in milliseconds. ### `_ttfb_start` +Type: `int` + Time to first byte start time. ### `_type` +Type: `int` + Type identifier for the request. ### `_url` +Type: `string` + URL path of the request. ### `cache` +Type: `object` + Cache details (empty in this example). ### `pageref` +Type: `string` + Reference to the page containing this request. ### `request` +Type: `object` + Details of the request. - #### `request.bodySize` + Type: `int` + Size of the request body. - #### `request.cookies` + Type: `array` + Cookies sent with the request. - #### `request.headersSize` + Type: `int` + Size of the request headers. - #### `request.httpVersion` + Type: `string` + HTTP version used for the request. - #### `request.method` + Type: `string` + HTTP method used for the request. - #### `request.queryString` + Type: `array` + Query string parameters. - #### `request.url` + Type: `string` + URL of the requested resource. ### `response` +Type: `object` + Details of the response. - #### `response.bodySize` + Type: `int` + Size of the response body. - #### `response.content` + Type: `object` + Content details of the response. - #### `response.content.mimeType` + Type: `string` + MIME type of the content. - #### `response.content.size` + Type: `int` + Size of the content. - #### `response.cookies` + Type: `array` + Cookies received with the response. 
- #### `response.headersSize` + Type: `int` + Size of the response headers. - #### `response.httpVersion` + Type: `string` + HTTP version used for the response. - #### `response.status` + Type: `int` + HTTP response status code. - #### `response.statusText` + Type: `string` + Status text of the response. ### `startedDateTime` +Type: `string` + Start time of the request. ### `time` +Type: `int` + Total time taken for the request in milliseconds. ### `timings` +Type: `object` + Timing details of various stages of the request. - #### `timings.blocked` + Type: `int` + Time spent in blocking. - #### `timings.connect` + Type: `int` + Time spent in establishing a connection. - #### `timings.dns` + Type: `int` + Time spent in DNS lookup. - #### `timings.receive` + Type: `int` + Time spent in receiving the response. - #### `timings.send` + Type: `int` + Time spent in sending the request. - #### `timings.ssl` + Type: `int` + Time spent in SSL handshake. - #### `timings.wait` + Type: `int` + Time spent in waiting for the response. diff --git a/src/content/docs/reference/structs/request-summary.md b/src/content/docs/reference/blobs/request-summary.mdx similarity index 85% rename from src/content/docs/reference/structs/request-summary.md rename to src/content/docs/reference/blobs/request-summary.mdx index f2fcd28..d383dcc 100644 --- a/src/content/docs/reference/structs/request-summary.md +++ b/src/content/docs/reference/blobs/request-summary.mdx @@ -1,15 +1,15 @@ --- -title: Request summary struct -description: Reference docs for the request summary struct +title: Request summary blob +description: Reference docs for the request summary blob --- -_Appears in: [`requests` table](/reference/tables/requests/#summary)_\ -_As: `summary`_ +_Appears in: [`requests` table](/reference/tables/requests/)_\ +_As: [`summary`](/reference/tables/requests/#summary)_ JSON-encoded summarization of request data. -Here's an example of the decoded object: - +
+An example of the decoded object ```json { "_cdn_provider": "Edgecast", @@ -33,6 +33,7 @@ Here's an example of the decoded object: "type": "html" } ``` +
## Schema diff --git a/src/content/docs/reference/custom-metrics/ads.md b/src/content/docs/reference/custom-metrics/ads.md deleted file mode 100644 index e121df0..0000000 --- a/src/content/docs/reference/custom-metrics/ads.md +++ /dev/null @@ -1,162 +0,0 @@ ---- -title: Ads custom metric -description: Reference docs for the feature struct ---- - -_Appears in: [`custom_metrics` struct](/reference/structs/custom-metrics/)_\ -_As: [`ads`](/reference/structs/custom-metrics/#ads)_ - -## Schema - -| Field name | Type | Description | -| ------------------------------------------------ | ------------- | -------------------------------------------------------------------------------------------- | -| `ads` | object | Contains information about the ads.txt file. | -| `ads.present` | boolean | Indicates if the ads.txt file is present. | -| `ads.status` | integer | HTTP status code of the ads.txt file response. | -| `ads.redirected` | boolean | Indicates if the ads.txt file request was redirected. | -| `ads.redirected_to` | string | URL to which the ads.txt resource was redirected. | -| `ads.account_count` | integer | Number of advertising accounts listed in the ads.txt file. | -| `ads.account_types` | object | Types of accounts (direct or reseller) listed in the ads.txt file. | -| `ads.account_types.direct` | object | Information about direct advertising accounts. | -| `ads.account_types.direct.domains` | array | List of domains with direct advertising accounts. | -| `ads.account_types.direct.account_count` | integer | Number of direct advertising accounts. | -| `ads.account_types.direct.domain_count` | integer | Number of unique domains with direct advertising accounts. | -| `ads.account_types.reseller` | object | Information about reseller advertising accounts. | -| `ads.account_types.reseller.domains` | array | List of domains with reseller advertising accounts. | -| `ads.account_types.reseller.account_count` | integer | Number of reseller advertising accounts. 
| -| `ads.account_types.reseller.domain_count` | integer | Number of unique domains with reseller advertising accounts. | -| `ads.line_count` | integer | Total number of lines in the ads.txt file. | -| `ads.variables` | array | List of variables found in the ads.txt file. | -| `ads.variable_count` | integer | Number of variables found in the ads.txt file. | -| `app_ads` | object | Contains information about the app-ads.txt file. | -| `app_ads.present` | boolean | Indicates if the app-ads.txt file is present. | -| `app_ads.status` | integer | HTTP status code of the app-ads.txt file response. | -| `app_ads.redirected` | boolean | Indicates if the app-ads.txt file request was redirected. | -| `app_ads.redirected_to` | string | URL to which the app-ads.txt resource was redirected. | -| `app_ads.account_count` | integer | Number of advertising accounts listed in the app-ads.txt file. | -| `app_ads.account_types` | object | Types of accounts (direct or reseller) listed in the app-ads.txt file. | -| `app_ads.account_types.direct` | object | Information about direct advertising accounts. | -| `app_ads.account_types.direct.domains` | array | List of domains with direct advertising accounts. | -| `app_ads.account_types.direct.account_count` | integer | Number of direct advertising accounts. | -| `app_ads.account_types.direct.domain_count` | integer | Number of unique domains with direct advertising accounts. | -| `app_ads.account_types.reseller` | object | Information about reseller advertising accounts. | -| `app_ads.account_types.reseller.domains` | array | List of domains with reseller advertising accounts. | -| `app_ads.account_types.reseller.account_count` | integer | Number of reseller advertising accounts. | -| `app_ads.account_types.reseller.domain_count` | integer | Number of unique domains with reseller advertising accounts. | -| `app_ads.line_count` | integer | Total number of lines in the app-ads.txt file. 
| -| `app_ads.variables` | array | List of variables found in the app-ads.txt file. | -| `app_ads.variable_count` | integer | Number of variables found in the app-ads.txt file. | -| `sellers` | object | Contains information about the sellers.json file. | -| `sellers.present` | boolean | Indicates if the sellers.json file is present. | -| `sellers.status` | integer | HTTP status code of the sellers.json file response. | -| `sellers.redirected` | boolean | Indicates if the sellers.json file request was redirected. | -| `sellers.redirected_to` | string | URL to which the sellers.json resource was redirected. | -| `sellers.seller_count` | integer | Number of sellers listed in the sellers.json file. | -| `sellers.seller_types` | object | Types of sellers (publisher, intermediary, both) listed in the sellers.json file. | -| `sellers.seller_types.publisher` | object | Information about publisher sellers. | -| `sellers.seller_types.publisher.domains` | array | List of domains associated with publisher sellers. | -| `sellers.seller_types.publisher.seller_count` | integer | Number of publisher sellers. | -| `sellers.seller_types.publisher.domain_count` | integer | Number of unique domains associated with publisher sellers. | -| `sellers.seller_types.intermediary` | object | Information about intermediary sellers. | -| `sellers.seller_types.intermediary.domains` | array | List of domains associated with intermediary sellers. | -| `sellers.seller_types.intermediary.seller_count` | integer | Number of intermediary sellers. | -| `sellers.seller_types.intermediary.domain_count` | integer | Number of unique domains associated with intermediary sellers. | -| `sellers.seller_types.both` | object | Information about sellers who are both publishers and intermediaries. | -| `sellers.seller_types.both.domains` | array | List of domains associated with sellers who are both publishers and intermediaries. 
| -| `sellers.seller_types.both.seller_count` | integer | Number of sellers who are both publishers and intermediaries. | -| `sellers.seller_types.both.domain_count` | integer | Number of unique domains associated with sellers who are both publishers and intermediaries. | -| `sellers.passthrough_count` | integer | Number of passthrough sellers listed in the sellers.json file. | -| `sellers.confidential_count` | integer | Number of confidential sellers listed in the sellers.json file. | - -Here's an example of the decoded object from `https://www.amazon.com/` page crawl: - -```json -{ - "ads": { - "present": true, - "status": 200, - "redirected": false, - "account_count": 1, - "account_types": { - "direct": { - "domains": [ - "placeholder.example.com" - ], - "account_count": 1, - "domain_count": 1 - }, - "reseller": { - "domains": [], - "account_count": 0, - "domain_count": 0 - } - }, - "line_count": 10, - "variables": [], - "variable_count": 0 - }, - "app_ads": { - "present": true, - "status": 200, - "redirected": false, - "account_count": 1, - "account_types": { - "direct": { - "domains": [ - "placeholder.example.com" - ], - "account_count": 1, - "domain_count": 1 - }, - "reseller": { - "domains": [], - "account_count": 0, - "domain_count": 0 - } - }, - "line_count": 10, - "variables": [], - "variable_count": 0 - }, - "sellers": { - "present": true, - "redirected": true, - "status": 200, - "seller_count": 2732, - "seller_types": { - "publisher": { - "domains": [ - "cumuli.com", - "realself.com", - "trendscatchers.io", - ... - ], - "seller_count": 2199, - "domain_count": 1923 - }, - "intermediary": { - "domains": [ - "bidsxchange.com", - "vuukle.com", - "vdo.ai", - ... - ], - "seller_count": 232, - "domain_count": 172 - }, - "both": { - "domains": [ - "gourmetads.com", - "freestar.com", - "shinez.io", - ... 
- ], - "seller_count": 148, - "domain_count": 134 - } - }, - "passthrough_count": 0, - "confidential_count": 2 - } -} -``` diff --git a/src/content/docs/reference/custom-metrics/ads.md b/src/content/docs/reference/custom-metrics/ads.mdx new file mode 100644 index 0000000..f4856a4 --- /dev/null +++ b/src/content/docs/reference/custom-metrics/ads.mdx @@ -0,0 +1,446 @@ +--- +title: Ads custom metric +description: Reference docs for the ads custom metric +--- + +_Appears in: [`custom_metrics.other`](/reference/custom-metrics/other/) struct_\ +_As: [`ads`](/reference/custom-metrics/other/#ads)_ + +
+An example of the decoded object from `https://www.amazon.com/` page crawl +```json +{ + "ads": { + "present": true, + "status": 200, + "redirected": false, + "account_count": 1, + "account_types": { + "direct": { + "domains": [ + "placeholder.example.com" + ], + "account_count": 1, + "domain_count": 1 + }, + "reseller": { + "domains": [], + "account_count": 0, + "domain_count": 0 + } + }, + "line_count": 10, + "variables": [], + "variable_count": 0 + }, + "app_ads": { + "present": true, + "status": 200, + "redirected": false, + "account_count": 1, + "account_types": { + "direct": { + "domains": [ + "placeholder.example.com" + ], + "account_count": 1, + "domain_count": 1 + }, + "reseller": { + "domains": [], + "account_count": 0, + "domain_count": 0 + } + }, + "line_count": 10, + "variables": [], + "variable_count": 0 + }, + "sellers": { + "present": true, + "redirected": true, + "status": 200, + "seller_count": 2732, + "seller_types": { + "publisher": { + "domains": [ + "cumuli.com", + "realself.com", + "trendscatchers.io", + ... + ], + "seller_count": 2199, + "domain_count": 1923 + }, + "intermediary": { + "domains": [ + "bidsxchange.com", + "vuukle.com", + "vdo.ai", + ... + ], + "seller_count": 232, + "domain_count": 172 + }, + "both": { + "domains": [ + "gourmetads.com", + "freestar.com", + "shinez.io", + ... + ], + "seller_count": 148, + "domain_count": 134 + } + }, + "passthrough_count": 0, + "confidential_count": 2 + } +} +``` +
+ +## Schema + +### `ads` + +Type: `object` + +Contains information about the ads.txt file. + +#### `ads.present` + +Type: `boolean` + +Indicates if the ads.txt file is present. + +#### `ads.status` + +Type: `integer` + +HTTP status code of the ads.txt file response. + +#### `ads.redirected` + +Type: `boolean` + +Indicates if the ads.txt file request was redirected. + +#### `ads.redirected_to` + +Type: `string` + +URL to which the ads.txt resource was redirected. + +#### `ads.account_count` + +Type: `integer` + +Number of advertising accounts listed in the ads.txt file. + +#### `ads.account_types` + +Type: `object` + +Types of accounts (direct or reseller) listed in the ads.txt file. + +#### `ads.account_types.direct` + +Type: `object` + +Information about direct advertising accounts. + +##### `ads.account_types.direct.domains` + +Type: `array` + +List of domains with direct advertising accounts. + +##### `ads.account_types.direct.account_count` + +Type: `integer` + +Number of direct advertising accounts. + +##### `ads.account_types.direct.domain_count` + +Type: `integer` + +Number of unique domains with direct advertising accounts. + +#### `ads.account_types.reseller` + +Type: `object` + +Information about reseller advertising accounts. + +##### `ads.account_types.reseller.domains` + +Type: `array` + +List of domains with reseller advertising accounts. + +##### `ads.account_types.reseller.account_count` + +Type: `integer` + +Number of reseller advertising accounts. + +##### `ads.account_types.reseller.domain_count` + +Type: `integer` + +Number of unique domains with reseller advertising accounts. + +#### `ads.line_count` + +Type: `integer` + +Total number of lines in the ads.txt file. + +#### `ads.variables` + +Type: `array` + +List of variables found in the ads.txt file. + +#### `ads.variable_count` + +Type: `integer` + +Number of variables found in the ads.txt file. + +### `app_ads` + +Type: `object` + +Contains information about the app-ads.txt file. 
+ +#### `app_ads.present` + +Type: `boolean` + +Indicates if the app-ads.txt file is present. + +#### `app_ads.status` + +Type: `integer` + +HTTP status code of the app-ads.txt file response. + +#### `app_ads.redirected` + +Type: `boolean` + +Indicates if the app-ads.txt file request was redirected. + +#### `app_ads.redirected_to` + +Type: `string` + +URL to which the app-ads.txt resource was redirected. + +#### `app_ads.account_count` + +Type: `integer` + +Number of advertising accounts listed in the app-ads.txt file. + +#### `app_ads.account_types` + +Type: `object` + +Types of accounts (direct or reseller) listed in the app-ads.txt file. + +#### `app_ads.account_types.direct` + +Type: `object` + +Information about direct advertising accounts. + +##### `app_ads.account_types.direct.domains` + +Type: `array` + +List of domains with direct advertising accounts. + +##### `app_ads.account_types.direct.account_count` + +Type: `integer` + +Number of direct advertising accounts. + +##### `app_ads.account_types.direct.domain_count` + +Type: `integer` + +Number of unique domains with direct advertising accounts. + +#### `app_ads.account_types.reseller` + +Type: `object` + +Information about reseller advertising accounts. + +##### `app_ads.account_types.reseller.domains` + +Type: `array` + +List of domains with reseller advertising accounts. + +##### `app_ads.account_types.reseller.account_count` + +Type: `integer` + +Number of reseller advertising accounts. + +##### `app_ads.account_types.reseller.domain_count` + +Type: `integer` + +Number of unique domains with reseller advertising accounts. + +#### `app_ads.line_count` + +Type: `integer` + +Total number of lines in the app-ads.txt file. + +#### `app_ads.variables` + +Type: `array` + +List of variables found in the app-ads.txt file. + +#### `app_ads.variable_count` + +Type: `integer` + +Number of variables found in the app-ads.txt file. 
+ +### `sellers` + +Type: `object` + +Contains information about the sellers.json file. + +#### `sellers.present` + +Type: `boolean` + +Indicates if the sellers.json file is present. + +#### `sellers.status` + +Type: `integer` + +HTTP status code of the sellers.json file response. + +#### `sellers.redirected` + +Type: `boolean` + +Indicates if the sellers.json file request was redirected. + +#### `sellers.redirected_to` + +Type: `string` + +URL to which the sellers.json resource was redirected. + +#### `sellers.seller_count` + +Type: `integer` + +Number of sellers listed in the sellers.json file. + +#### `sellers.seller_types` + +Type: `object` + +Types of sellers (publisher, intermediary, both) listed in the sellers.json file. + +#### `sellers.seller_types.publisher` + +Type: `object` + +Information about publisher sellers. + +##### `sellers.seller_types.publisher.domains` + +Type: `array` + +List of domains associated with publisher sellers. + +##### `sellers.seller_types.publisher.seller_count` + +Type: `integer` + +Number of publisher sellers. + +##### `sellers.seller_types.publisher.domain_count` + +Type: `integer` + +Number of unique domains associated with publisher sellers. + +#### `sellers.seller_types.intermediary` + +Type: `object` + +Information about intermediary sellers. + +##### `sellers.seller_types.intermediary.domains` + +Type: `array` + +List of domains associated with intermediary sellers. + +##### `sellers.seller_types.intermediary.seller_count` + +Type: `integer` + +Number of intermediary sellers. + +##### `sellers.seller_types.intermediary.domain_count` + +Type: `integer` + +Number of unique domains associated with intermediary sellers. + +#### `sellers.seller_types.both` + +Type: `object` + +Information about sellers who are both publishers and intermediaries. + +##### `sellers.seller_types.both.domains` + +Type: `array` + +List of domains associated with sellers who are both publishers and intermediaries. 
+ +##### `sellers.seller_types.both.seller_count` + +Type: `integer` + +Number of sellers who are both publishers and intermediaries. + +##### `sellers.seller_types.both.domain_count` + +Type: `integer` + +Number of unique domains associated with sellers who are both publishers and intermediaries. + +#### `sellers.passthrough_count` + +Type: `integer` + +Number of passthrough sellers listed in the sellers.json file. + +#### `sellers.confidential_count` + +Type: `integer` + +Number of confidential sellers listed in the sellers.json file. + diff --git a/src/content/docs/reference/custom-metrics/other.mdx b/src/content/docs/reference/custom-metrics/other.mdx new file mode 100644 index 0000000..1883d16 --- /dev/null +++ b/src/content/docs/reference/custom-metrics/other.mdx @@ -0,0 +1,163 @@ +--- +title: Other custom metric +description: Reference docs for the other custom metric +--- + +_Appears in: [`custom_metrics` struct](/reference/structs/custom-metrics/)_\ +_As: [`other`](/reference/structs/custom-metrics/#other)_ + +## Schema + +### `ads` + +Advertising technology and usage. See the [ads](/reference/custom-metrics/ads/) custom metric for more information. + +### `almanac` + +Metrics defined in the early versions of Web Almanac crawls. + +### `aurora` + +Project Aurora. + +### `avg_dom_depth` + +The average DOM depth of a page. + +### `Colordepth` + +Color depth of a screen. + +### `crawl_links` + +The links found during a crawl. + +### `css` + +CSS usage. + +### `doctype` + +Document type declaration. + +### `document_height` + +Height of the document. + +### `document_width` + +Width of the document. + +### `Dpi` + +Dots per inch (DPI) of a screen. + +### `event-names` + +Event names used in JavaScript. + +### `fugu-apis` + +Usage of Fugu APIs. + +### `generated-content` + +Client-side generated content. + +### `has_shadow_root` + +Presence of shadow DOM roots. + +### `Images` + +Images usage. + +### `img-loading-attr` + +Image loading attributes. 
+ +### `initiators` + +Resource initiators. + +### `inline_style_bytes` + +Type: `integer` + +Size of inline styles. + +### `lib-detector-version` + +Libraries detector version. + +### `localstorage_size` + +Size of local storage. + +### `meta_viewport` + +Meta viewport tag. + +### `num_iframes` + +Type: `integer` + +Number of iframes on a page. + +### `num_scripts` + +Type: `integer` + +Number of script tags. + +### `num_scripts_async` + +Type: `integer` + +Number of asynchronous scripts. + +### `num_scripts_sync` + +Type: `integer` + +Number of synchronous scripts. + +### `observers` + +Metrics related to the usage of observer APIs. + +### `privacy-sandbox` + +Privacy Sandbox initiative usage. + +### `pwa` + +Progressive Web Apps. + +### `quirks_mode` + +Usage of quirks mode in browsers. + +### `Resolution` + +Resolution of a screen. + +### `robots_meta` + +Robots meta tag. + +### `sass` + +Usage of Sass. + +### `sessionstorage_size` + +Size of session storage. + +### `usertiming` + +User Timing API. + +### `valid-head` + +Validity of the head element. 
diff --git a/src/content/docs/reference/custom-metrics/performance.mdx b/src/content/docs/reference/custom-metrics/performance.mdx deleted file mode 100644 index ebb27f5..0000000 --- a/src/content/docs/reference/custom-metrics/performance.mdx +++ /dev/null @@ -1,12 +0,0 @@ ---- -title: Performance custom metric -description: Reference docs for the feature struct ---- - -_Appears in: [`custom_metrics` struct](/reference/structs/custom-metrics/)_ - -TODO - -## Schema - -TODO diff --git a/src/content/docs/reference/custom-metrics/privacy.md b/src/content/docs/reference/custom-metrics/privacy.md deleted file mode 100644 index ec84277..0000000 --- a/src/content/docs/reference/custom-metrics/privacy.md +++ /dev/null @@ -1,125 +0,0 @@ ---- -title: Privacy custom metric -description: Reference docs for the feature struct ---- - -_Appears in: [`custom_metrics` struct](/reference/structs/custom-metrics/)_\ -_As: [`privacy`](/reference/structs/custom-metrics/#privacy)_ - -## Schema - -| Field name | Type | Description | -| -------------------------------------------- | ------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `privacy_wording_links` | array | Links related to privacy policy. | -| `privacy_wording_links[i].text` | string | Title of the link. | -| `iab_tcf_v1` | object | IAB TCF v1 settings. | -| `iab_tcf_v1.present` | boolean | Presence of IAB TCF v1. | -| `iab_tcf_v1.data` | object | TCF v1 vendor consents. [VendorConsents](https://github.com/InteractiveAdvertisingBureau/GDPR-Transparency-and-Consent-Framework/blob/master/CMP%20JS%20API%20v1.1%20Final.md#vendorconsents-) | -| `iab_tcf_v1.compliant_setup` | boolean | Verifies compliance of TCF v1 vendor consents. | -| `iab_tcf_v2` | object | IAB TCF v2 settings. | -| `iab_tcf_v2.present` | boolean | Presence of IAB TCF v2. 
| -| `iab_tcf_v2.data` | object | TCF v2 vendor consents. [TCData](https://github.com/InteractiveAdvertisingBureau/GDPR-Transparency-and-Consent-Framework/blob/master/TCFv2/IAB%20Tech%20Lab%20-%20CMP%20API%20v2.md#tcdata) | -| `iab_tcf_v2.compliant_setup` | boolean | Verifies compliance of TCF v2 vendor consents. | -| `iab_usp` | object | Shows the presence of IAB U.S. Privacy String. | -| `iab_usp.present` | boolean | Shows the presence of IAB U.S. Privacy String. | -| `iab_usp.privacy_string` | string | IAB U.S. Privacy String. | -| `navigator_doNotTrack` | boolean | Indicates whether the browser's "Do Not Track" setting is enabled. | -| `navigator_globalPrivacyControl` | boolean | Indicates whether the browser's Global Privacy Control setting is enabled. | -| `document_permissionsPolicy` | boolean | Indicates the presence of the Permissions Policy. | -| `document_featurePolicy` | boolean | Indicates the presence of the Feature Policy. | -| `referrerPolicy` | object | Specifies the referrer policy for the entire document and individual requests. | -| `referrerPolicy.entire_document_policy` | string | Referrer policy for the entire document. | -| `referrerPolicy.individual_requests` | string | Referrer policy for individual requests. | -| `referrerPolicy.link_relations` | string | Referrer policy for link relations. | -| `media_devices` | object | Tracks the usage of media device APIs like `enumerateDevices` and `getUserMedia`. | -| `media_devices["API_NAME"]` | boolean | Indicates usage of a particular API. | -| `geolocation` | object | Tracks the usage of geolocation APIs like `getCurrentPosition` and `watchPosition`. | -| `geolocation["API_NAME"]` | boolean | Indicates usage of a particular API. | -| `fingerprinting` | object | Tracks potential fingerprinting attempts by counting API calls and listing likely fingerprinting scripts. | -| `fingerprinting.counts` | object | Counts of fingerprinting-related API calls. 
| -| `fingerprinting.counts["API_NAME"]` | integer | Counts of fingerprinting-related API calls. | -| `fingerprinting.likelyFingerprintingScripts` | array | List of likely fingerprinting script URLs. | -| `request_hostnames_with_cname` | object | Lists hostnames with their corresponding CNAME records. | -| `request_hostnames_with_cname.["HOSTNAME"]` | array | CNAME records for a given hostname. | -| `ccpa_link` | object | California Consumer Privacy Act (CCPA) compliance. | -| `ccpa_link.hasCCPALink` | boolean | Presence of a CCPA link. | -| `ccpa_link.CCPALinkPhrases` | array | Related CCPA link phrases. | - -Here's an example of the decoded object from `https://www.google.com/` page crawl: - -```json -{ - "privacy_wording_links": [ - { - "text": "Privacy" - } - ], - "iab_tcf_v1": { - "present": false, - "data": null, - "compliant_setup": null - }, - "iab_tcf_v2": { - "present": false, - "data": null, - "compliant_setup": null - }, - "iab_usp": { - "present": false, - "privacy_string": null - }, - "navigator_doNotTrack": false, - "navigator_globalPrivacyControl": false, - "document_permissionsPolicy": false, - "document_featurePolicy": false, - "referrerPolicy": { - "entire_document_policy": "origin", - "individual_requests": null, - "link_relations": null - }, - "media_devices": { - "navigator_mediaDevices_enumerateDevices": false, - "navigator_mediaDevices_getUserMedia": true, - "navigator_mediaDevices_getDisplayMedia": false - }, - "geolocation": { - "navigator_geolocation_getCurrentPosition": false, - "navigator_geolocation_watchPosition": false - }, - "fingerprinting": { - "counts": { - "prefers-contrast": 4, - "forced-colors": 15, - "devicememory": 1, - "hardwareconcurrency": 2, - "localstorage": 5, - "screen.width": 7, - "screen.height": 5, - "sessionstorage": 1, - "gettimezoneoffset": 5, - "maxtouchpoints": 5, - "ontouchstart": 5, - "navigator.vendor": 1, - "getchanneldata": 4, - "navigator.platform": 1 - }, - "likelyFingerprintingScripts": [ - 
"https://www.google.com/", - "https://www.gstatic.com/og/_/js/k=og.qtm.en_US.ftxzKLuybBw.2019.O/rt=j/m=qabr,q_d,qcwid,qapid,qald,q_dg/exm=qaaw,qadd,qaid,qein,qhaw,qhba,qhbr,qhch,qhga,qhid,qhin/d=1/ed=1/rs=AA2YrTsOEv0aSAP39vut5xzjLXfdU4aRbQ", - ... - ] - }, - "request_hostnames_with_cname": { - "ogs.google.com": [ - "www3.l.google.com" - ], - "apis.google.com": [ - "plus.l.google.com" - ] - }, - "ccpa_link": { - "hasCCPALink": false, - "CCPALinkPhrases": [] - } -} -``` diff --git a/src/content/docs/reference/custom-metrics/privacy.mdx b/src/content/docs/reference/custom-metrics/privacy.mdx new file mode 100644 index 0000000..9c17847 --- /dev/null +++ b/src/content/docs/reference/custom-metrics/privacy.mdx @@ -0,0 +1,293 @@ +--- +title: Privacy custom metric +description: Reference docs for the privacy custom metric +--- + +_Appears in: [`custom_metrics`](/reference/structs/custom-metrics/) struct_\ +_As: [`privacy`](/reference/structs/custom-metrics/#privacy)_ + +
+An example of the decoded object from `https://www.google.com/` page crawl +```json +{ + "privacy_wording_links": [ + { + "text": "Privacy" + } + ], + "iab_tcf_v1": { + "present": false, + "data": null, + "compliant_setup": null + }, + "iab_tcf_v2": { + "present": false, + "data": null, + "compliant_setup": null + }, + "iab_usp": { + "present": false, + "privacy_string": null + }, + "navigator_doNotTrack": false, + "navigator_globalPrivacyControl": false, + "document_permissionsPolicy": false, + "document_featurePolicy": false, + "referrerPolicy": { + "entire_document_policy": "origin", + "individual_requests": null, + "link_relations": null + }, + "media_devices": { + "navigator_mediaDevices_enumerateDevices": false, + "navigator_mediaDevices_getUserMedia": true, + "navigator_mediaDevices_getDisplayMedia": false + }, + "geolocation": { + "navigator_geolocation_getCurrentPosition": false, + "navigator_geolocation_watchPosition": false + }, + "fingerprinting": { + "counts": { + "prefers-contrast": 4, + "forced-colors": 15, + "devicememory": 1, + "hardwareconcurrency": 2, + "localstorage": 5, + "screen.width": 7, + "screen.height": 5, + "sessionstorage": 1, + "gettimezoneoffset": 5, + "maxtouchpoints": 5, + "ontouchstart": 5, + "navigator.vendor": 1, + "getchanneldata": 4, + "navigator.platform": 1 + }, + "likelyFingerprintingScripts": [ + "https://www.google.com/", + "https://www.gstatic.com/og/_/js/k=og.qtm.en_US.ftxzKLuybBw.2019.O/rt=j/m=qabr,q_d,qcwid,qapid,qald,q_dg/exm=qaaw,qadd,qaid,qein,qhaw,qhba,qhbr,qhch,qhga,qhid,qhin/d=1/ed=1/rs=AA2YrTsOEv0aSAP39vut5xzjLXfdU4aRbQ", + ... + ] + }, + "request_hostnames_with_cname": { + "ogs.google.com": [ + "www3.l.google.com" + ], + "apis.google.com": [ + "plus.l.google.com" + ] + }, + "ccpa_link": { + "hasCCPALink": false, + "CCPALinkPhrases": [] + } +} +``` +
+ +## Schema + +### `privacy_wording_links` + +Type: `array` + +Links related to privacy policy. + +#### `privacy_wording_links[i].text` + +Type: `string` + +Title of the link. + +### `iab_tcf_v1` + +Type: `object` + +IAB TCF v1 settings. + +#### `iab_tcf_v1.present` + +Type: `boolean` + +Presence of IAB TCF v1. + +#### `iab_tcf_v1.data` + +Type: `object` + +TCF v1 vendor consents. [VendorConsents](https://github.com/InteractiveAdvertisingBureau/GDPR-Transparency-and-Consent-Framework/blob/master/CMP%20JS%20API%20v1.1%20Final.md#vendorconsents-) + +#### `iab_tcf_v1.compliant_setup` + +Type: `boolean` + +Verifies compliance of TCF v1 vendor consents. + +### `iab_tcf_v2` + +Type: `object` + +IAB TCF v2 settings. + +#### `iab_tcf_v2.present` + +Type: `boolean` + +Presence of IAB TCF v2. + +#### `iab_tcf_v2.data` + +Type: `object` + +TCF v2 vendor consents. [TCData](https://github.com/InteractiveAdvertisingBureau/GDPR-Transparency-and-Consent-Framework/blob/master/TCFv2/IAB%20Tech%20Lab%20-%20CMP%20API%20v2.md#tcdata) + +#### `iab_tcf_v2.compliant_setup` + +Type: `boolean` + +Verifies compliance of TCF v2 vendor consents. + +### `iab_usp` + +Type: `object` + +Shows the presence of IAB U.S. Privacy String. + +#### `iab_usp.present` + +Type: `boolean` + +Shows the presence of IAB U.S. Privacy String. + +#### `iab_usp.privacy_string` + +Type: `string` + +IAB U.S. Privacy String. + +### `navigator_doNotTrack` + +Type: `boolean` + +Indicates whether the browser's "Do Not Track" setting is enabled. + +### `navigator_globalPrivacyControl` + +Type: `boolean` + +Indicates whether the browser's Global Privacy Control setting is enabled. + +### `document_permissionsPolicy` + +Type: `boolean` + +Indicates the presence of the Permissions Policy. + +### `document_featurePolicy` + +Type: `boolean` + +Indicates the presence of the Feature Policy. + +### `referrerPolicy` + +Type: `object` + +Specifies the referrer policy for the entire document and individual requests. 
+ +#### `referrerPolicy.entire_document_policy` + +Type: `string` + +Referrer policy for the entire document. + +#### `referrerPolicy.individual_requests` + +Type: `string` + +Referrer policy for individual requests. + +#### `referrerPolicy.link_relations` + +Type: `string` + +Referrer policy for link relations. + +### `media_devices` + +Type: `object` + +Tracks the usage of media device APIs like `enumerateDevices` and `getUserMedia`. + +#### `media_devices["API_NAME"]` + +Type: `boolean` + +Indicates usage of a particular API. + +### `geolocation` + +Type: `object` + +Tracks the usage of geolocation APIs like `getCurrentPosition` and `watchPosition`. + +#### `geolocation["API_NAME"]` + +Type: `boolean` + +Indicates usage of a particular API. + +### `fingerprinting` + +Type: `object` + +Tracks potential fingerprinting attempts by counting API calls and listing likely fingerprinting scripts. + +#### `fingerprinting.counts` + +Type: `object` + +Counts of fingerprinting-related API calls. + +#### `fingerprinting.counts["API_NAME"]` + +Type: `integer` + +Counts of fingerprinting-related API calls. + +#### `fingerprinting.likelyFingerprintingScripts` + +Type: `array` + +List of likely fingerprinting script URLs. + +### `request_hostnames_with_cname` + +Type: `object` + +Lists hostnames with their corresponding CNAME records. + +#### `request_hostnames_with_cname.["HOSTNAME"]` + +Type: `array` + +CNAME records for a given hostname. + +### `ccpa_link` + +Type: `object` + +California Consumer Privacy Act (CCPA) compliance. + +#### `ccpa_link.hasCCPALink` + +Type: `boolean` + +Presence of a CCPA link. + +#### `ccpa_link.CCPALinkPhrases` + +Type: `array` + +Related CCPA link phrases. 
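The schema above can be exercised with a small decoding sketch. This is only an illustration, assuming the `privacy` blob has already been fetched as a JSON string; the `summarize_privacy` helper and the trimmed sample object are hypothetical and not part of HTTP Archive's tooling.

```python
import json

# Trimmed, hypothetical copy of the example object; a real blob has more fields.
PRIVACY_JSON = """
{
  "iab_tcf_v2": {"present": false, "data": null, "compliant_setup": null},
  "fingerprinting": {
    "counts": {"forced-colors": 15, "screen.width": 7, "localstorage": 5},
    "likelyFingerprintingScripts": ["https://www.google.com/"]
  },
  "request_hostnames_with_cname": {
    "ogs.google.com": ["www3.l.google.com"],
    "apis.google.com": ["plus.l.google.com"]
  }
}
"""


def summarize_privacy(blob: str) -> dict:
    """Reduce a decoded privacy blob to a few headline values."""
    data = json.loads(blob)
    # Any field may be missing or null, so fall back to empty containers.
    counts = (data.get("fingerprinting") or {}).get("counts") or {}
    cnames = data.get("request_hostnames_with_cname") or {}
    return {
        "uses_tcf_v2": bool((data.get("iab_tcf_v2") or {}).get("present")),
        "fingerprinting_api_calls": sum(counts.values()),
        "most_called_api": max(counts, key=counts.get) if counts else None,
        # Flatten {hostname: [cname, ...]} into sorted (hostname, cname) pairs.
        "cname_pairs": sorted(
            (host, cname) for host, names in cnames.items() for cname in names
        ),
    }


summary = summarize_privacy(PRIVACY_JSON)
print(summary["fingerprinting_api_calls"], summary["most_called_api"])
```

The fallbacks to empty containers reflect the note that the shape of these objects can vary between pages; whether a given field is absent or `null` in practice is an assumption here.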
diff --git a/src/content/docs/reference/structs/custom-metrics.mdx b/src/content/docs/reference/structs/custom-metrics.mdx index 328fae9..ea76422 100644 --- a/src/content/docs/reference/structs/custom-metrics.mdx +++ b/src/content/docs/reference/structs/custom-metrics.mdx @@ -3,152 +3,127 @@ title: Custom metrics struct description: Reference docs for the custom metrics struct --- -_Appears in: [`pages` table](/reference/tables/pages/)_\ +_Appears in: [`pages`](/reference/tables/pages/) table_\ _As [`custom_metrics`](/reference/tables/pages/#custom_metrics)_ ## Schema -| Field name | Type | Description | -|--------------------|---------|-------------------------------------------------| -| `a11y` | JSON | Accessibility. | -| `cms` | JSON | Content Management Systems. | -| `cookies` | JSON | Cookie usage. | -| `css_variables` | JSON | Use of CSS variables. | -| `ecommerce` | JSON | E-commerce features. | -| `element_count` | JSON | Number of elements on a page. | -| `javascript` | JSON | JavaScript usage. | -| `markup` | JSON | HTML markup. | -| `media` | JSON | Media elements. | -| `origin_trials` | JSON | Origin Trials. | -| `performance` | JSON | Web performance. | -| `privacy` | JSON | Privacy settings and policies. | -| `responsive_images`| JSON | Responsive image techniques. | -| `robots_txt` | JSON | robots.txt file. | -| `security` | JSON | Security features. | -| `structured_data` | JSON | Structured data. | -| `third_parties` | JSON | Third-party resources. | -| `well_known` | JSON | well-known URIs. | -| `wpt_bodies` | JSON | Metrics derived from WebPageTest bodies object. | -| `other` | JSON | Other custom metrics. | - ### `a11y` +Type: `JSON` + Accessibility. ### `cms` +Type: `JSON` + Content Management Systems. ### `cookies` +Type: `JSON` + Cookie usage. ### `css_variables` +Type: `JSON` + Use of CSS variables. ### `ecommerce` +Type: `JSON` + E-commerce features. ### `element_count` +Type: `JSON` + Number of elements on a page. 
### `javascript` +Type: `JSON` + JavaScript usage. ### `markup` +Type: `JSON` + HTML markup. ### `media` +Type: `JSON` + Media elements. ### `origin_trials` +Type: `JSON` + Origin Trials. ### `performance` -Web performance. See the [`performance` custom metric](/reference/custom-metrics/performance/) for more information. +Type: `JSON` + +Web performance. ### `privacy` -Privacy settings and policies. See the [`privacy`](/reference/custom-metrics/privacy/) custom metric for more information. +Type: `JSON` + +Privacy settings and policies. See the [`privacy`](/reference/custom-metrics/privacy/) custom metric for more information. ### `responsive_images` +Type: `JSON` + Responsive image techniques. ### `robots_txt` +Type: `JSON` + robots.txt file. ### `security` +Type: `JSON` + Security features. ### `structured_data` +Type: `JSON` + Structured data. ### `third_parties` +Type: `JSON` + Third-party resources. ### `well_known` +Type: `JSON` + well-known URIs. ### `wpt_bodies` +Type: `JSON` + Metrics derived from WebPageTest bodies object. ### `other` -Other custom metrics. - -List: - -| Field name | Description | -|--------------------|---------------------------------------------------| -| `ads` | Advertising technology and usage. See the [`ads` custom metric](/reference/custom-metrics/ads/) for more information. | -| `almanac` | Metrics defined in the early versions of Web Almanac crawls. | -| `aurora` | Project Aurora. | -| `avg_dom_depth` | The average DOM depth of a page. | -| `Colordepth` | Color depth of a screen. | -| `crawl_links` | The links found during a crawl. | -| `css` | CSS usage. | -| `doctype` | Document type declaration. | -| `document_height` | Height of the document. | -| `document_width` | Width of the document. | -| `Dpi` | Dots per inch (DPI) of a screen. | -| `event-names` | Event names used in JavaScript. | -| `fugu-apis` | Usage of Fugu APIs. | -| `generated-content`| Client-side generated content. 
| -| `has_shadow_root` | Presence of shadow DOM roots. | -| `Images` | Images usage. | -| `img-loading-attr` | Image loading attributes. | -| `initiators` | Resource initiators. | -| `inline_style_bytes` | Size of inline styles. | -| `lib-detector-version` | Libraries detector version. | -| `localstorage_size`| Size of local storage. | -| `meta_viewport` | Meta viewport tag. | -| `num_iframes` | Number of iframes on a page. | -| `num_scripts` | Number of script tags. | -| `num_scripts` | Number of script tags. | -| `num_scripts_async`| Number of asynchronous scripts. | -| `num_scripts_sync` | Number of synchronous scripts. | -| `observers` | Metrics related to the usage of observer APIs. | -| `privacy-sandbox` | Privacy Sandbox initiative usage. | -| `pwa` | Progressive Web Apps. | -| `quirks_mode` | Usage of quirks mode in browsers. | -| `Resolution` | Resolution of a screen. | -| `robots_meta` | Robots meta tag. | -| `sass` | Usage of Sass. | -| `sessionstorage_size` | Size of session storage. | -| `usertiming` | User Timing API. | -| `valid-head` | Validity of the head element. | +Type: `JSON` + +See the [other](/reference/custom-metrics/other/) custom metrics for more information. diff --git a/src/content/docs/reference/structs/technology.mdx b/src/content/docs/reference/structs/technology.mdx index 0d473c1..7f5cdec 100644 --- a/src/content/docs/reference/structs/technology.mdx +++ b/src/content/docs/reference/structs/technology.mdx @@ -3,10 +3,31 @@ title: Technology struct description: Reference docs for the technology struct --- -_Appears in: [`pages` table](/reference/tables/pages/)_ +_Appears in: [`pages`](/reference/tables/pages/) table_\ +_As: [`technologies`](/reference/tables/pages/#technologies)_ Technologies are detected by [Wappalyzer](https://www.wappalyzer.com/). 
Refer to HTTP Archive's fork of the [Wappalyzer repository](https://github.com/HTTPArchive/wappalyzer) on GitHub to request a new technology detection or to browse the source code of existing detections. +## Schema + +### `technology` + +Type: `STRING` + +Name of the detected technology. + +### `categories` + +Type: `ARRAY` + +List of categories to which this technology belongs. + +### `info` + +Type: `ARRAY` + +Additional metadata about the detected technology, e.g. version number. + ## Example queries ### Pages using WordPress in the top 5k @@ -46,7 +67,7 @@ GROUP BY cms ORDER BY sites DESC -LIMIT +LIMIT 10 ``` @@ -71,23 +92,3 @@ WHERE date = '2023-09-01' AND t.technology = 'WordPress' ``` - -## Schema - -Field name | Type | Description ----|---|--- -`technology` | `STRING` | Name of the detected technology -`categories` | `ARRAY` | List of categories to which this technology belongs -`info` | `ARRAY` | Additional metadata about the detected technology, ie version number - -### `technology` - -Name of the detected technology - -### `categories` - -List of categories to which this technology belongs - -### `info` - -Additional metadata about the detected technology, ie version number diff --git a/src/content/docs/reference/tables/pages.mdx b/src/content/docs/reference/tables/pages.mdx index 320b0d1..e40f2c4 100644 --- a/src/content/docs/reference/tables/pages.mdx +++ b/src/content/docs/reference/tables/pages.mdx @@ -7,6 +7,108 @@ import { Tabs, TabItem } from '@astrojs/starlight/components'; [`httparchive.crawl.pages`](https://console.cloud.google.com/bigquery?ws=!1m5!1m4!4m3!1shttparchive!2scrawl!3spages) is a partitioned and clustered table containing one row per page tested in the HTTP Archive. Pages are tested on a monthly basis and as of April 2022, both the root page and one secondary page are tested. 
+## Schema + +Field name | Type | Description +---|---|--- +[`date`](#date) | `DATE` | YYYY-MM-DD format of the HTTP Archive monthly crawl +[`client`](#client) | `STRING` | Test environment: `'desktop'` or `'mobile'` +[`page`](#page) | `STRING` | The URL of the page being tested +[`is_root_page`](#is_root_page) | `BOOLEAN` | Whether the page is the root of the origin +[`root_page`](#root_page) | `STRING` | The URL of the root page being tested, the origin followed by `/` +[`rank`](#rank) | `INTEGER` | Site popularity rank, from CrUX +[`wptid`](#wptid) | `STRING` | ID of the WebPageTest results +[`payload`](#payload) | [`HAR`](/reference/blobs/har/) | JSON-encoded WebPageTest results for the page +[`summary`](#summary) | [`Page summary`](/reference/structs/page-summary/) | JSON-encoded summarization of the page-level data +[`custom_metrics`](#custom_metrics) | [`Custom metrics`](/reference/structs/custom-metrics/) | JSON-encoded test results of the custom metrics +[`lighthouse`](#lighthouse) | [`Lighthouse`](/reference/blobs/lighthouse/) | JSON-encoded Lighthouse report +[`features`](#features) | ARRAY<Feature> | Blink features detected at runtime +[`technologies`](#technologies) | ARRAY<Technology> | Technologies detected at runtime +[`metadata`](#metadata) | [`Page metadata`](/reference/blobs/page-metadata/) | Additional metadata about the test + +### `date` + +**This field is required for all queries over the `pages` table.** + +YYYY-MM-DD format of the HTTP Archive monthly crawl. + + Example: `date = '2023-06-01'` + +### `client` + +Test environment: `'desktop'` or `'mobile'`. + +### `page` + +The URL of the page being tested. + +Example: `page = 'https://har.fyi/'` + +### `is_root_page` + +Whether the page is the root of the origin. + +### `root_page` + +The URL of the root page being tested, the origin followed by `/`. 
+ +Example: `root_page = 'https://har.fyi/'` + +### `rank` + +Site popularity rank, from CrUX. + +### `wptid` + +ID of the WebPageTest results, for example `wptid = '230509_Dx20W_FMHK5'`. + +The ID starts with the date of the test in YYMMDD format. The date is followed by an underscore and a `D` or `M` character indicating whether it was a desktop or mobile test. The rest of the ID is randomly generated. In the example above we can tell that the page was tested on May 9, 2023, and that it was a desktop test. + +You can view the WebPageTest results in the browser by visiting `https://webpagetest.httparchive.org/result//`, e.g. https://webpagetest.httparchive.org/result/230509_Dx20W_FMHK5/. This is HTTP Archive's own private instance of WebPageTest, which is required to view any of the results. + +### `payload` + +JSON-encoded WebPageTest results for the page. + +For a full example value, see [payload.json](/payload.json). + +See the [Page payload](/reference/blobs/page-payload/) reference for more details. + +### `summary` + +JSON-encoded summarization of the page-level data. + +See the [Page summary](/reference/blobs/page-summary/) reference for more details. + +### `custom_metrics` + +JSON-encoded test results of the custom metrics. + +See the [Custom metrics](/reference/structs/custom-metrics/) reference for more details. + +### `lighthouse` + +JSON-encoded Lighthouse report. + +See the [Lighthouse](/reference/blobs/lighthouse/) reference for more details. + +### `features` + +Blink features detected at runtime (see https://chromestatus.com/features). + +See the [Features](/reference/structs/features/) reference for more details. + +### `technologies` + +Technologies detected at runtime (see https://www.wappalyzer.com/). + +See the [Technology](/reference/structs/technology/) reference for more details. + +### `metadata` + +Additional metadata about the test. + +See the [Page metadata](/reference/blobs/page-metadata/) reference for more details. 
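The `wptid` layout described above (a YYMMDD date, an underscore, then `D` or `M`) can be decoded with a few lines of Python. This is a minimal sketch based only on that description; the `parse_wptid` helper is a hypothetical name, not part of any HTTP Archive or WebPageTest library.

```python
from datetime import date, datetime
from typing import Tuple

# Assumed mapping, taken from the description of the `wptid` field.
CLIENTS = {"D": "desktop", "M": "mobile"}


def parse_wptid(wptid: str) -> Tuple[date, str]:
    """Split a WebPageTest ID into its test date and client type."""
    date_part, _, rest = wptid.partition("_")
    if not rest or rest[0] not in CLIENTS:
        raise ValueError(f"unexpected wptid format: {wptid!r}")
    # The first six characters encode the test date as YYMMDD.
    test_date = datetime.strptime(date_part, "%y%m%d").date()
    return test_date, CLIENTS[rest[0]]


test_date, client = parse_wptid("230509_Dx20W_FMHK5")
print(test_date.isoformat(), client)  # 2023-05-09 desktop
```

The remainder of the ID is randomly generated, so the sketch deliberately ignores everything after the client character.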
## Example queries @@ -16,7 +118,6 @@ Here are some common operations you can perform with the `pages` table. - ```sql /* This query will process 1 GB when run. */ SELECT @@ -27,17 +128,14 @@ FROM `httparchive.crawl.pages` WHERE date = '2024-05-01' GROUP BY client, is_root_page ``` - - client | is_root_page | pages_total -- | -- | -- mobile | false | 13998652 mobile | true | 16193055 desktop | true | 12900240 desktop | false | 11585746 - @@ -45,7 +143,6 @@ desktop | false | 11585746 - ```sql /* This query will process 1.12 GB when run. */ WITH pages AS ( @@ -62,15 +159,12 @@ SELECT FROM pages GROUP BY client ``` - - client | median_page_weight -- | -- mobile | 1776291 desktop | 2029751 - @@ -88,7 +182,6 @@ Also note that for demonstration purposes, this query processes a 1% sample of t - ```sql /* This query will process 0.6 GB when run. */ WITH pages_summary AS ( @@ -110,116 +203,12 @@ SELECT APPROX_QUANTILES(reqTotal, 100)[SAFE_ORDINAL(95)] p95_requests FROM pages_summary ``` - - pages | avg_requests | p25_requests | p50_requests | p75_requests | p95_requests -- | -- | -- | -- | -- | -- 306151 | 92.53 | 37 | 65 | 107 | 212 The median number of requests per page is 65. The average is in fact skewed by outliers. Also, since the 25th percentile is 37 requests and the 75th percentile is 107 requests, that tells us that 50% of the 300K pages tracked by the HTTP Archive have between 37 and 107 requests. This is also known as the [interquartile range](https://en.wikipedia.org/wiki/Interquartile_range). 
- - -## Schema - -Field name | Type | Description ----|---|--- -[`date`](#date) | `DATE` | YYYY-MM-DD format of the HTTP Archive monthly crawl -[`client`](#client) | `STRING` | Test environment: `'desktop'` or `'mobile'` -[`page`](#page) | `STRING` | The URL of the page being tested -[`is_root_page`](#is_root_page) | `BOOLEAN` | Whether the page is the root of the origin -[`root_page`](#root_page) | `STRING` | The URL of the root page being tested, the origin followed by `/` -[`rank`](#rank) | `INTEGER` | Site popularity rank, from CrUX -[`wptid`](#wptid) | `STRING` | ID of the WebPageTest results -[`payload`](#payload) | [`HAR`](/reference/blobs/har/) | JSON-encoded WebPageTest results for the page -[`summary`](#summary) | [`Page summary`](/reference/structs/page-summary/) | JSON-encoded summarization of the page-level data -[`custom_metrics`](#custom_metrics) | [`Custom metrics`](/reference/structs/custom-metrics/) | JSON-encoded test results of the custom metrics -[`lighthouse`](#lighthouse) | [`Lighthouse`](/reference/blobs/lighthouse/) | JSON-encoded Lighthouse report -[`features`](#features) | ARRAY<Feature> | Blink features detected at runtime -[`technologies`](#technologies) | ARRAY<Technology> | Technologies detected at runtime -[`metadata`](#metadata) | [`Page metadata`](/reference/blobs/page-metadata/) | Additional metadata about the test - -### `date` - -**This field is required for all queries over the `pages` table.** - -YYYY-MM-DD format of the HTTP Archive monthly crawl. - - Example: `date = '2023-06-01'` - -### `client` - -Test environment: `'desktop'` or `'mobile'`. - -### `page` - -The URL of the page being tested. - -Example: `page = 'https://har.fyi/'` - -### `is_root_page` - -Whether the page is the root of the origin. - -### `root_page` - -The URL of the root page being tested, the origin followed by `/`. 
- -Example: `root_page = 'https://har.fyi/'` - -### `rank` - -Site popularity rank, from CrUX - -### `wptid` - -ID of the WebPageTest results, for example `wptid = '230509_Dx20W_FMHK5'`. - -The ID encodes the date of the test at the start in YYMMDD format. The date is followed by an underscore and a `D` or `M` character indicating whether it was a desktop or mobile test. The rest of the ID is randomly generated. In the example above we can tell that the page was tested on May 9, 2023, and that it was a desktop test. - -You can view the WebPageTest results in the browser by visiting `https://webpagetest.httparchive.org/result//`, eg https://webpagetest.httparchive.org/result/230509_Dx20W_FMHK5/. This is HTTP Archive's own private instance of WebPageTest, which is required to view any of the results. - -### `payload` - -JSON-encoded WebPageTest results for the page. - -For a full example value, see [payload.json](/payload.json). - -See the [`har`](/reference/blobs/har/) reference for more details. - -### `summary` - -JSON-encoded summarization of the page-level data - -See the [`summary`](/reference/structs/page-summary/) reference for more details. - -### `custom_metrics` - -JSON-encoded test results of the custom metrics. - -See the [`custom metrics`](/reference/structs/custom-metrics/) reference for more details. - -### `lighthouse` - -JSON-encoded Lighthouse report. - -See the [`lighthouse`](/reference/blobs/lighthouse/) reference for more details. - -### `features` - -Blink features detected at runtime (see https://chromestatus.com/features) - -### `technologies` - -Technologies detected at runtime (see https://www.wappalyzer.com/) - -See the [`technology`](/reference/structs/technology/) reference for more details. - -### `metadata` - -Additional metadata about the test - -See the [`metadata`](/reference/blobs/page-metadata/) reference for more details. 
diff --git a/src/content/docs/reference/tables/requests.mdx b/src/content/docs/reference/tables/requests.mdx index 67eff63..9cf4491 100644 --- a/src/content/docs/reference/tables/requests.mdx +++ b/src/content/docs/reference/tables/requests.mdx @@ -7,6 +7,97 @@ import { Tabs, TabItem } from '@astrojs/starlight/components'; [`httparchive.crawl.requests`](https://console.cloud.google.com/bigquery?ws=!1m5!1m4!4m3!1shttparchive!2scrawl!3srequests) is a partitioned and clustered table containing one row per request per page tested in the HTTP Archive. Pages are tested on a monthly basis and as of April 2022, both the root page and one secondary page are tested. +## Schema + +Field name | Type | Description +---|---|--- +[`date`](#date) | `DATE` | YYYY-MM-DD format of the HTTP Archive monthly crawl +[`client`](#client) | `STRING` | Test environment: `'desktop'` or `'mobile'` +[`page`](#page) | `STRING` | The URL of the page being tested +[`is_root_page`](#is_root_page) | `BOOLEAN` | Whether the page is the root of the origin +[`root_page`](#root_page) | `STRING` | The URL of the root page being tested, the origin followed by `/` +[`url`](#url) | `STRING` | The URL of the request +[`is_main_document`](#is_main_document) | `BOOLEAN` | Whether this request corresponds with the main HTML document of the page, which is the first HTML request after redirects +[`type`](#type) | `STRING` | Simplified description of the type of resource (script, html, css, text, other, etc) +[`index`](#index) | `INTEGER` | The sequential 0-based index of the request +[`payload`](#payload) | [`Request payload`](/reference/blobs/request-payload/) | JSON-encoded WebPageTest result data for this request +[`summary`](#summary) | [`Request summary`](/reference/blobs/request-summary/) | JSON-encoded summarization of request data +[`request_headers`](#request_headers) | ARRAY<Header> | Request headers +[`response_headers`](#response_headers) | ARRAY<Header> | Response headers 
+[`response_body`](#response_body) | `STRING` | Text-based response body + +### `date` + +**This field is required for all queries over the `requests` table.** + +YYYY-MM-DD format of the HTTP Archive monthly crawl. + + Example: `date = '2023-06-01'` + +### `client` + +Test environment: `'desktop'` or `'mobile'`. + +### `page` + +The URL of the page being tested. + +Example: `page = 'https://har.fyi/'` + +### `is_root_page` + +Whether the page is the root of the origin. + +### `root_page` + +The URL of the root page being tested, the origin followed by `/`. + +Example: `root_page = 'https://har.fyi/'` + +### `url` + +The URL of the request + +### `is_main_document` + +Whether this request corresponds with the main HTML document of the page, which is the first HTML request after redirects + +### `type` + +Simplified description of the type of resource (script, image, css, html, other, font, text, video, xml, audio, wasm, etc) + +### `index` + +The sequential 0-based index of the request + +### `payload` + +JSON-encoded WebPageTest result data for this request + +See the [Request payload](/reference/blobs/request-payload/) reference for more details. + +### `summary` + +JSON-encoded summarization of request data + +See the [Request summary](/reference/blobs/request-summary/) reference for more details. + +### `request_headers` + +Request headers + +See the [Header](/reference/structs/header/) reference for more details. + +### `response_headers` + +Response headers + +See the [Header](/reference/structs/header/) reference for more details. + +### `response_body` + +Text-based response body + ## Example queries Here are some common operations you can perform with the `requests` table. @@ -15,7 +106,6 @@ Here are some common operations you can perform with the `requests` table. - ```sql /* This query will process 85 GB when run. */ SELECT @@ -26,17 +116,14 @@ Here are some common operations you can perform with the `requests` table. 
WHERE date = '2024-05-01' group by client, is_root_page ``` - - client | is_root_page | requests_total -- | -- | -- mobile | true | 1517364094 desktop | true | 1299394354 mobile | false | 1216156430 desktop | false | 1093804725 - @@ -46,7 +133,6 @@ Let's check the size of individual requests served from websites across the enti - ```sql /* This query will process 26 GB when run. */ WITH requests AS ( @@ -70,10 +156,8 @@ Let's check the size of individual requests served from websites across the enti ORDER BY responseSize100KB ASC LIMIT 10 ``` - - responseSize100KB | requests | pct_requests -- | -- | -- 100.0 | 10113115 | 0.90864138408777051 @@ -86,7 +170,6 @@ Let's check the size of individual requests served from websites across the enti 800.0 | 19817 | 0.0017805143428575023 900.0 | 24519 | 0.0022029788147814046 1000.0 | 11787 | 0.0010590363102014118 - @@ -98,7 +181,6 @@ Let's filter out all of the non-Image content and examine the popularity of vari - ```sql /* This query will process 8 GB when run. */ WITH requests AS ( @@ -125,10 +207,8 @@ Let's filter out all of the non-Image content and examine the popularity of vari GROUP BY format ORDER BY requests DESC ``` - - format | requests | pages | percent_image_requests | percent_pages -- | -- | -- | -- | -- jpg | 1644804 | 1310081 | 0.38 | 0.43 @@ -140,102 +220,9 @@ Let's filter out all of the non-Image content and examine the popularity of vari avif | 29226 | 25794 | 0.01 | 0.01 | | 4405 | 3938 | 0.0 | 0.0 heic | 395 | 382 | 0.0 | 0.0 - :::note It's important to understand the bias in the data when doing this type of analysis. While 1.3 million page views is a very diverse set - the technology used to parse these pages is Chrome browsers (both Desktop and Emulated mobile). Because of this, some formats may be under-represented - since Chrome supports webp but not jpeg-xr or jpeg2000. 
There may be cases like this with other type of technologies as well - for example custom web font types that vary based on browser support. ::: - -## Schema - -Field name | Type | Description ----|---|--- -[`date`](#date) | `DATE` | YYYY-MM-DD format of the HTTP Archive monthly crawl -[`client`](#client) | `STRING` | Test environment: `'desktop'` or `'mobile'` -[`page`](#page) | `STRING` | The URL of the page being tested -[`is_root_page`](#is_root_page) | `BOOLEAN` | Whether the page is the root of the origin -[`root_page`](#root_page) | `STRING` | The URL of the root page being tested, the origin followed by `/` -[`url`](#url) | `STRING` | The URL of the request -[`is_main_document`](#is_main_document) | `BOOLEAN` | Whether this request corresponds with the main HTML document of the page, which is the first HTML request after redirects -[`type`](#type) | `STRING` | Simplified description of the type of resource (script, html, css, text, other, etc) -[`index`](#index) | `INTEGER` | The sequential 0-based index of the request -[`payload`](#payload) | [`Request payload`](/reference/structs/request-payload/) | JSON-encoded WebPageTest result data for this request -[`summary`](#summary) | [`Request summary`](/reference/structs/request-summary/) | JSON-encoded summarization of request data -[`request_headers`](#request_headers) | ARRAY<Header> | Request headers -[`response_headers`](#response_headers) | ARRAY<Header> | Response headers -[`response_body`](#response_body) | `STRING` | Text-based response body - -### `date` - -**This field is required for all queries over the `requests` table.** - -YYYY-MM-DD format of the HTTP Archive monthly crawl. - - Example: `date = '2023-06-01'` - -### `client` - -Test environment: `'desktop'` or `'mobile'`. - -### `page` - -The URL of the page being tested. - -Example: `page = 'https://har.fyi/'` - -### `is_root_page` - -Whether the page is the root of the origin. 
- -### `root_page` - -The URL of the root page being tested, the origin followed by `/`. - -Example: `root_page = 'https://har.fyi/'` - -### `url` - -The URL of the request - -### `is_main_document` - -Whether this request corresponds with the main HTML document of the page, which is the first HTML request after redirects - -### `type` - -Simplified description of the type of resource (script, image, css, html, other, font, text, video, xml, audio, wasm, etc) - -### `index` - -The sequential 1-based index of the request - -### `payload` - -JSON-encoded WebPageTest result data for this request - -See the [`payload`](/reference/structs/request-payload/) reference for more details. - -### `summary` - -JSON-encoded summarization of request data - -See the [`summary`](/reference/structs/request-summary/) reference for more details. - -### `request_headers` - -Request headers - -See the [Header](/reference/structs/header/) reference for more details. - -### `response_headers` - -Response headers - -See the [Header](/reference/structs/header/) reference for more details. - -### `response_body` - -Text-based response body -