Reference
Data dictionary
Mediabuyer.site is a free, no-login native ad spy tool for affiliate marketers and media buyers, indexing live creatives from Taboola, Outbrain, MGID, and RevContent.
This page describes every attribute carried by a record in the corpus. Each section corresponds to one record type — native-ad creatives, landing-page captures, operator clusters, and the editorial affiliate- network directory. The browsing surfaces that render each type are linked from the section headers.
Native ad networks covered
A native ad network is a content-recommendation marketplace that places sponsored articles and product cards alongside editorial content on publisher sites. The corpus covers Taboola, Outbrain, MGID, and RevContent. A native ad network is distinct from an affiliate network (which recruits affiliates and pays commissions) and from a tracking platform (which records the click and conversion).
Refresh cadence
The crawlers run on a daily cadence. New creatives observed in the last crawl appear on the live feed within hours of the next build. The landing-page capture queue runs independently — each new click-out URL is fetched, redirected through, screenshotted, and fingerprinted.
Native-ad creative fields
Rendered by the live feed and the per-creative detail pages under /spy/.
| Field | Description | Example |
|---|---|---|
id | Stable per-network creative identifier. Used as the URL slug for the creative detail page. | tab:9f4c1ab2d3e8 |
network | The native ad network the creative was observed on. One of the supported native-ad surfaces. Distinct from any tracking platform the advertiser uses. | taboola | outbrain | mgid | revcontent |
advertiser | Display name of the advertiser, as branded on the creative. Sourced from the network's branding text, not from the destination domain. | Smart Choice Daily |
advertiser_slug | URL-safe slug of the advertiser display name. Stable across the corpus so per-advertiser hubs reuse the same path. | smart-choice-daily |
title | Headline copy as displayed in the content-recommendation widget. Verbatim from the network surface. | |
thumbnail | Absolute URL of the creative thumbnail. Proxied through the mediabuyer image surface to avoid hotlinking penalties on the source CDNs. | |
advertiser_url | The click-out destination — full URL, including all query parameters present on the network surface. Click parameters are preserved so the landing page renders identically to what the user would see. | |
lp_host | Hostname of advertiser_url with the leading www. stripped. Useful for grouping creatives by landing-page domain. | offer.example.com |
countries | ISO 3166-1 alpha-2 country codes where the creative was observed in a content-recommendation widget. Multi-country. | US, GB, DE |
languages | Detected language codes for the creative copy. Derived from the headline and landing-page content. | en, fr, pl |
vertical | Affiliate vertical inferred from the creative copy and landing-page text. Coarse-grained — nutra, leadgen, ecommerce, content-arbitrage, sweepstakes, finance, and similar groupings. | |
first_seen | Date the creative was first observed in any content-recommendation widget. ISO 8601 date. | 2024-09-12 |
last_seen | Date the creative was most recently observed. ISO 8601 date. Used together with first_seen to compute days_running. | |
days_running | Inclusive day count between first_seen and last_seen. A coarse proxy for creative longevity. | |
longevity_decile | Per-network rank of days_running from 0 (shortest) to 9 (longest). Lets the wall surface long-runners without exposing raw counts. |
Landing-page capture fields
Rendered by the per-host landing-page hubs under /lps.
| Field | Description | Example |
|---|---|---|
host | Landing-page hostname after the full redirect chain resolves. The www. prefix is stripped for consistency with lp_host on the creative record. | |
redirect_chain | Ordered list of HTTP redirects from the click-out URL to the final landing page, including the status code of each hop. | |
tech_stack | Detected technologies on the landing page — CMS, frameworks, analytics, anti-bot, and CDN. Sourced from runtime heuristics, not advertised attribution. | |
trackers | Per-page tracker / pixel fingerprint — AdSense publisher IDs, GTM containers, Facebook Pixel IDs, common attribution platforms, and other URL-pattern detections. | |
tracking_platform | Affiliate tracking platform inferred from outbound links and tracker fingerprint. Examples: Voluum, Cake, Tune, Everflow. A tracking platform is the software that records the click and conversion — distinct from an affiliate network, which is the company recruiting affiliates and paying out commissions. | |
screenshot_url | Full-page screenshot of the landing page rendered in a headless browser. Hosted on the mediabuyer image surface. | |
captured_at | Timestamp of the most recent successful capture. ISO 8601 datetime. |
Operator-graph fields
Rendered by the operator directory.
| Field | Description | Example |
|---|---|---|
operator_id | Internal cluster identifier. Stable across builds; used as the URL slug for the operator detail page. | |
cluster_signals | Signals that bound the cluster together — shared AdSense publisher ID, shared GTM container, shared footer legal-entity string, and reused asset hashes. | |
member_creatives | List of creative IDs attributed to this operator cluster. | |
member_lps | List of landing-page hostnames attributed to this operator cluster. | |
first_seen | Date the first member creative was observed. | |
last_seen | Date the most recent member creative was observed. |
Affiliate-network directory fields
Rendered by the affiliate-network directory.
| Field | Description | Example |
|---|---|---|
name | Network display name as used in marketing material. | |
url | Network homepage URL. | |
verticals | Verticals the network is known to support. An affiliate network recruits affiliates and pays commissions on conversions — distinct from a native ad network, which is the traffic source. | |
payout_terms | Net-N payment cadence as stated by the network. | |
minimum_payout | Minimum withdrawal threshold as stated by the network. | |
tracking_platform | Affiliate tracking software the network runs on (Voluum, Cake, HasOffers / Tune, Everflow, in-house, etc.). |
Access methods
Every record is accessible via the public web interface — no login and no API key. Sitemaps at /sitemap.xml and /sitemap-ai.xml enumerate the canonical URLs. Editorial feeds are at /feed.xml and /atom.xml.
Looking for the dataset-level catalog instead? See /data.
// Last updated: