Reference

Data dictionary

Mediabuyer.site is a free, no-login native ad spy tool for affiliate marketers and media buyers, indexing live creatives from Taboola, Outbrain, MGID, and RevContent.

This page describes every attribute carried by a record in the corpus. Each section corresponds to one record type: native-ad creatives, landing-page captures, operator clusters, and the editorial affiliate-network directory. The browsing surfaces that render each type are linked from the section headers.

Native ad networks covered

A native ad network is a content-recommendation marketplace that places sponsored articles and product cards alongside editorial content on publisher sites. The corpus covers Taboola, Outbrain, MGID, and RevContent. A native ad network is distinct from an affiliate network (which recruits affiliates and pays commissions) and from a tracking platform (which records the click and conversion).

Refresh cadence

The crawlers run on a daily cadence. New creatives observed in the last crawl appear on the live feed within hours of the next build. The landing-page capture queue runs independently — each new click-out URL is fetched, redirected through, screenshotted, and fingerprinted.

Native-ad creative fields

Rendered by the live feed and the per-creative detail pages under /spy/.

| Field | Description | Example |
| --- | --- | --- |
| id | Stable per-network creative identifier. Used as the URL slug for the creative detail page. | tab:9f4c1ab2d3e8 |
| network | The native ad network the creative was observed on. One of the supported native-ad surfaces. Distinct from any tracking platform the advertiser uses. | taboola, outbrain, mgid, or revcontent |
| advertiser | Display name of the advertiser, as branded on the creative. Sourced from the network's branding text, not from the destination domain. | Smart Choice Daily |
| advertiser_slug | URL-safe slug of the advertiser display name. Stable across the corpus so per-advertiser hubs reuse the same path. | smart-choice-daily |
| title | Headline copy as displayed in the content-recommendation widget. Verbatim from the network surface. | |
| thumbnail | Absolute URL of the creative thumbnail. Proxied through the mediabuyer image surface to avoid hotlinking penalties on the source CDNs. | |
| advertiser_url | The click-out destination: full URL, including all query parameters present on the network surface. Click parameters are preserved so the landing page renders identically to what the user would see. | |
| lp_host | Hostname of advertiser_url with the leading www. stripped. Useful for grouping creatives by landing-page domain. | offer.example.com |
| countries | ISO 3166-1 alpha-2 country codes where the creative was observed in a content-recommendation widget. Multi-country. | US, GB, DE |
| languages | Detected language codes for the creative copy. Derived from the headline and landing-page content. | en, fr, pl |
| vertical | Affiliate vertical inferred from the creative copy and landing-page text. Coarse-grained: nutra, leadgen, ecommerce, content-arbitrage, sweepstakes, finance, and similar groupings. | |
| first_seen | Date the creative was first observed in any content-recommendation widget. ISO 8601 date. | 2024-09-12 |
| last_seen | Date the creative was most recently observed. ISO 8601 date. Used together with first_seen to compute days_running. | |
| days_running | Inclusive day count between first_seen and last_seen. A coarse proxy for creative longevity. | |
| longevity_decile | Per-network rank of days_running from 0 (shortest) to 9 (longest). Lets the wall surface long-runners without exposing raw counts. | |
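
The derived fields above (lp_host, days_running, longevity_decile) can be sketched in a few lines of Python. This is an illustrative reading of the field descriptions, not the site's actual implementation: the function names are hypothetical, and the decile bucketing rule (share of strictly shorter-running creatives on the same network) is an assumption, since the page only states the 0 to 9 range.

```python
from bisect import bisect_left
from datetime import date
from urllib.parse import urlsplit


def lp_host(advertiser_url: str) -> str:
    """Hostname of the click-out URL with a leading 'www.' removed."""
    host = (urlsplit(advertiser_url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def days_running(first_seen: str, last_seen: str) -> int:
    """Inclusive day count between two ISO 8601 dates."""
    delta = date.fromisoformat(last_seen) - date.fromisoformat(first_seen)
    return delta.days + 1  # +1 makes the count inclusive of both end dates


def longevity_deciles(days_by_id: dict[str, int]) -> dict[str, int]:
    """Map creative id -> decile 0 (shortest) to 9 (longest) for ONE network.

    Assumed rule: a creative's decile is the fraction of creatives with a
    strictly smaller days_running, scaled to 0..9. Ties share a decile.
    """
    values = sorted(days_by_id.values())
    n = len(values)
    return {cid: bisect_left(values, d) * 10 // n for cid, d in days_by_id.items()}
```

A creative first and last seen on the same date has days_running of 1 under this reading, because the count is inclusive. The decile function should be called once per network, since the rank is described as per-network.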

Landing-page capture fields

Rendered by the per-host landing-page hubs under /lps.

| Field | Description | Example |
| --- | --- | --- |
| host | Landing-page hostname after the full redirect chain resolves. The www. prefix is stripped for consistency with lp_host on the creative record. | |
| redirect_chain | Ordered list of HTTP redirects from the click-out URL to the final landing page, including the status code of each hop. | |
| tech_stack | Detected technologies on the landing page: CMS, frameworks, analytics, anti-bot, and CDN. Sourced from runtime heuristics, not advertised attribution. | |
| trackers | Per-page tracker / pixel fingerprint: AdSense publisher IDs, GTM containers, Facebook Pixel IDs, common attribution platforms, and other URL-pattern detections. | |
| tracking_platform | Affiliate tracking platform inferred from outbound links and tracker fingerprint. Examples: Voluum, Cake, Tune, Everflow. A tracking platform is the software that records the click and conversion, distinct from an affiliate network, which is the company recruiting affiliates and paying out commissions. | |
| screenshot_url | Full-page screenshot of the landing page rendered in a headless browser. Hosted on the mediabuyer image surface. | |
| captured_at | Timestamp of the most recent successful capture. ISO 8601 datetime. | |
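
The relationship between redirect_chain and host can be illustrated with a small Python sketch. The shape of a hop (requested URL, status code, Location value) is an assumption made for illustration; the page does not specify how each hop is serialized, and the Hop class and final_host function are hypothetical names.

```python
from dataclasses import dataclass
from urllib.parse import urljoin, urlsplit


@dataclass(frozen=True)
class Hop:
    """One redirect in the chain (hypothetical shape, not the site's schema)."""
    url: str       # URL that was requested
    status: int    # HTTP status code returned for that URL, e.g. 301 or 302
    location: str  # Location header value; may be relative to `url`


def final_host(click_url: str, chain: list[Hop]) -> str:
    """Follow the recorded hops and return the final hostname without 'www.'."""
    current = click_url
    for hop in chain:
        # Resolve relative Location values against the URL that issued them.
        current = urljoin(hop.url, hop.location)
    host = (urlsplit(current).hostname or "").lower()
    return host.removeprefix("www.")
```

With an empty chain the function falls back to the hostname of the click-out URL itself, which matches the lp_host definition on the creative record.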

Operator-graph fields

Rendered by the operator directory.

| Field | Description | Example |
| --- | --- | --- |
| operator_id | Internal cluster identifier. Stable across builds; used as the URL slug for the operator detail page. | |
| cluster_signals | Signals that bound the cluster together: shared AdSense publisher ID, shared GTM container, shared footer legal-entity string, and reused asset hashes. | |
| member_creatives | List of creative IDs attributed to this operator cluster. | |
| member_lps | List of landing-page hostnames attributed to this operator cluster. | |
| first_seen | Date the first member creative was observed. | |
| last_seen | Date the most recent member creative was observed. | |
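
One plausible way to form clusters from shared signals is a union-find pass that merges any two landing pages carrying the same signal value. The sketch below assumes that approach; the page does not document the actual clustering method, and the function name and the "kind:value" signal strings are hypothetical.

```python
from collections import defaultdict


def cluster_by_shared_signals(signals_by_host: dict[str, set[str]]) -> list[set[str]]:
    """Group hosts that share at least one signal, directly or transitively."""
    parent = {host: host for host in signals_by_host}

    def find(x: str) -> str:
        # Walk to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a: str, b: str) -> None:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Remember the first host seen for each signal and merge later hosts into it.
    first_host_for_signal: dict[str, str] = {}
    for host, signals in signals_by_host.items():
        for signal in signals:
            if signal in first_host_for_signal:
                union(first_host_for_signal[signal], host)
            else:
                first_host_for_signal[signal] = host

    clusters: dict[str, set[str]] = defaultdict(set)
    for host in signals_by_host:
        clusters[find(host)].add(host)
    return list(clusters.values())
```

Merging is transitive under this scheme: if host A shares a GTM container with B, and B shares an AdSense publisher ID with C, all three land in one cluster even though A and C share nothing directly. A production system would likely weight or filter signals to avoid over-merging on common values.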

Affiliate-network directory fields

Rendered by the affiliate-network directory.

| Field | Description | Example |
| --- | --- | --- |
| name | Network display name as used in marketing material. | |
| url | Network homepage URL. | |
| verticals | Verticals the network is known to support. An affiliate network recruits affiliates and pays commissions on conversions, distinct from a native ad network, which is the traffic source. | |
| payout_terms | Net-N payment cadence as stated by the network. | |
| minimum_payout | Minimum withdrawal threshold as stated by the network. | |
| tracking_platform | Affiliate tracking software the network runs on (Voluum, Cake, HasOffers / Tune, Everflow, in-house, etc.). | |

Access methods

Every record is accessible via the public web interface — no login and no API key. Sitemaps at /sitemap.xml and /sitemap-ai.xml enumerate the canonical URLs. Editorial feeds are at /feed.xml and /atom.xml.
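
Because the sitemaps enumerate the canonical URLs, a reader can list them with the Python standard library. The sketch below assumes the files follow the standard sitemaps.org XML schema, which this page does not confirm; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol (assumed to apply here).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_locs(xml_text: str) -> list[str]:
    """Return every <loc> value found in a sitemap or sitemap index document."""
    root = ET.fromstring(xml_text)
    return [
        el.text.strip()
        for el in root.iter(f"{SITEMAP_NS}loc")
        if el.text and el.text.strip()
    ]
```

To use it against the live site, fetch /sitemap.xml with any HTTP client (for example urllib.request.urlopen) and pass the response body to sitemap_locs. If the file turns out to be a sitemap index, the returned values are child sitemap URLs that need the same treatment.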

Looking for the dataset-level catalog instead? See /data.
