mediabuyer

Open dataset · CC BY 4.0

Native-ad creatives — open derived dataset

A free, machine-readable dataset of derived facts about native-ad creatives observed publicly across Taboola, Outbrain, MGID and RevContent. It carries the words and the numbers — headline text, advertiser, network, vertical, the countries a creative was seen in, first/last seen, longevity, a Scaling / Evergreen / Burned-out verdict, the landing-page host, a detected tracker, and a link back to the source page. It contains no creative images: these are facts about ads, not the copyrighted ads themselves.

Observation window: Apr 11 – Jun 11, 2026. The export is refreshed on every build.

JSON Lines (.jsonl)

One JSON record per line. Stream it.

15,000 rows · 8.9 MB

sha256: 5ca8e044d3cad145d3eae50792d57d6e4f6812404b90a59c0f745af34e60336c

Download ↓

CSV (.csv)

Spreadsheet-ready. Countries joined by |.

15,000 rows · 3.8 MB

sha256: 9a78c0090082c65acdbe20685debafab9dd836cd0a44b2c43f45aa8aac5201c5

Download ↓
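
The published digests can be used to confirm that a download arrived intact. The following is a minimal sketch, not part of the dataset tooling: the function names are hypothetical, and the digests are the ones listed above, which will change whenever the export is refreshed.

```python
import hashlib
from pathlib import Path

# Digests copied from the download cards above; they change on every refresh.
PUBLISHED_SHA256 = {
    "jsonl": "5ca8e044d3cad145d3eae50792d57d6e4f6812404b90a59c0f745af34e60336c",
    "csv": "9a78c0090082c65acdbe20685debafab9dd836cd0a44b2c43f45aa8aac5201c5",
}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 of a file, reading it in fixed-size chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path, kind: str) -> bool:
    """Compare a local file with the digest published for 'jsonl' or 'csv'."""
    return sha256_of(path) == PUBLISHED_SHA256[kind]
```

Calling matches_published(Path("creatives.jsonl"), "jsonl") returns True only for a byte-identical copy of the snapshot described here; the local filename is an assumption.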

What’s inside

The row-level export is capped at the top 15,000 creatives, ranked by confirmed multi-day longevity, which keeps the download size manageable. The aggregate State of Native Advertising report is computed over all 689,322 creatives observed in the window. By design, no row includes a thumbnail, image, or video URL for any creative: the dataset contains facts about ads, never the copyrighted creative itself.

Schema

Field             Type         Description
creative_id       string       Stable per-network creative id.
network           string       Source native network slug (taboola | outbrain | mgid | revcontent).
network_label     string       Human-readable network name.
advertiser        string|null  Observed advertiser / brand name.
advertiser_slug   string|null  URL slug for the advertiser page.
headline          string|null  Ad headline TEXT (the words, not the image).
vertical          string       Editorial vertical slug.
vertical_label    string       Human-readable vertical.
countries         string[]     ISO-3166 country codes where the creative was observed.
geo_count         number       Distinct country count.
ad_type           string|null  Observed ad format (text | image | video …).
first_seen        date         First observation date (YYYY-MM-DD).
last_seen         date         Most recent observation date (YYYY-MM-DD).
days_active       number|null  Active days within the 30-day window.
days_running      number       First-to-last observed span in days.
verdict           string       Longevity verdict: scaling | evergreen | active | burned-out | new (same logic as the live pages).
lp_host           string|null  Landing-page hostname the creative clicks through to.
detected_tracker  string|null  Detected tracking platform/network on the landing page, when identified.
variant_count     number       Count of detected sibling/variant creatives.
page              url          Link to the creative's page on mediabuyer.site.
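
A consumer may want to sanity-check rows against this schema before analysis. The sketch below is illustrative only. The enumerated values come from the schema, but two of the checks rest on assumptions the page does not state outright: that geo_count equals the number of distinct codes in countries, and that days_running is the plain difference between last_seen and first_seen. The function name check_record is hypothetical.

```python
from datetime import date

# Allowed values as enumerated in the schema.
NETWORKS = {"taboola", "outbrain", "mgid", "revcontent"}
VERDICTS = {"scaling", "evergreen", "active", "burned-out", "new"}


def check_record(rec: dict) -> list[str]:
    """Return a list of problems found in one row; an empty list means none."""
    problems = []
    if rec.get("network") not in NETWORKS:
        problems.append(f"unknown network: {rec.get('network')!r}")
    if rec.get("verdict") not in VERDICTS:
        problems.append(f"unknown verdict: {rec.get('verdict')!r}")

    # Assumption: geo_count counts the distinct codes listed in countries.
    countries = rec.get("countries") or []
    if rec.get("geo_count") != len(set(countries)):
        problems.append("geo_count does not match countries")

    first = date.fromisoformat(rec["first_seen"])
    last = date.fromisoformat(rec["last_seen"])
    if last < first:
        problems.append("last_seen is earlier than first_seen")
    # Assumption: days_running is the plain date difference (not inclusive).
    elif rec.get("days_running") != (last - first).days:
        problems.append("days_running does not match the date span")
    return problems
```

If either assumption turns out not to hold for the real export, the corresponding check should be relaxed rather than treated as a data error.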

Sample record

One line of the JSONL export has the following shape. The record is wrapped across several lines here for readability, and … marks elided values:

{"creative_id":"…","network":"taboola","network_label":"Taboola",
 "advertiser":"…","advertiser_slug":"…","headline":"…",
 "vertical":"general","vertical_label":"General",
 "countries":["US","CA","GB"],"geo_count":3,"ad_type":"text",
 "first_seen":"2026-04-12","last_seen":"2026-06-11",
 "days_active":60,"days_running":60,"verdict":"evergreen",
 "lp_host":"example.com","detected_tracker":null,"variant_count":5,
 "page":"https://mediabuyer.site/spy/taboola/…"}

Prefer to read the first lines directly? Open the JSONL file or the CSV.
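
For programmatic access, both formats can be read row by row. This is a minimal sketch under a few assumptions: the CSV has a header row whose column names match the schema, empty CSV cells stand for null, and the function names are hypothetical. Note that the CSV reader leaves numeric columns as strings.

```python
import csv
import json
from typing import Iterable, Iterator


def iter_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty JSONL line without loading the whole file."""
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_no}: invalid JSON ({exc.msg})") from exc


def iter_csv(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per CSV row, splitting the pipe-joined countries column."""
    for row in csv.DictReader(lines):
        raw = row.get("countries") or ""
        # An empty cell becomes an empty list rather than [""].
        row["countries"] = [code for code in raw.split("|") if code]
        yield row
```

Both functions accept an open text file, for example open(path, encoding="utf-8") for the JSONL and open(path, encoding="utf-8", newline="") for the CSV, so a 15,000-row export never has to sit in memory at once.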

By the numbers

A few figures this snapshot supports (full report linked below):

  • Median confirmed creative lifespan: 10 days (across creatives observed running at least one day).
  • Longest-lived network by median lifespan: RevContent, at 25 days.
  • Share of creatives currently Scaling (fresh + live): 2.8%.
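
Figures of this kind can be approximated from the row-level export, with two caveats: the export holds only the top 15,000 creatives by longevity, so results will not match the report, which covers the full observed set; and the sketch assumes that "lifespan" corresponds to days_running, which the page does not confirm. Function names are hypothetical.

```python
from collections import defaultdict
from statistics import median
from typing import Iterable


def median_span_by_network(records: Iterable[dict]) -> dict[str, float]:
    """Median days_running per network, over rows that ran at least one day."""
    spans = defaultdict(list)
    for rec in records:
        # Assumption: days_running stands in for "confirmed lifespan".
        if rec["days_running"] >= 1:
            spans[rec["network"]].append(rec["days_running"])
    return {network: median(values) for network, values in spans.items()}


def verdict_share(records: Iterable[dict], verdict: str) -> float:
    """Fraction of rows carrying the given verdict (0.0 for empty input)."""
    total = hits = 0
    for rec in records:
        total += 1
        hits += rec["verdict"] == verdict
    return hits / total if total else 0.0
```

Because the export is ranked by longevity, medians computed this way will skew higher than the report's figures.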

Read the full State of Native Advertising report →

License & how to cite

CC BY 4.0 (attribution). These are DERIVED FACTS about third-party ads, not the ads themselves. No raw creative images are included or redistributed. Facts are free to reuse with attribution; the underlying creatives remain their advertisers' copyright.

When you use these figures, please credit mediabuyer.site and link back to this page or the specific creative page. Suggested citation:

“MediaBuyer Native Ads — Open Derived Dataset, mediabuyer.site (observed Apr 11 – Jun 11, 2026).”

Methodology

Every field is derived from repeated public observation of served native-ad creatives — no spend, CTR, or ROI is claimed or inferred. The verdict (Scaling / Evergreen / Burned-out / Active / New) uses the same longevity logic as the live creative pages. Dates and the observation window are derived from the data, not from the build clock, so the export is reproducible from its inputs. Full method and definitions: methodology · data dictionary · limitations.
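
To illustrate what deriving the window from the data rather than from the build clock can look like, here is a minimal sketch. It assumes the window is simply the earliest first_seen and the latest last_seen across rows; the actual derivation is described in the linked methodology and may differ. The function name is hypothetical.

```python
from datetime import date
from typing import Iterable, Optional, Tuple


def observed_window(records: Iterable[dict]) -> Optional[Tuple[date, date]]:
    """Return (earliest first_seen, latest last_seen), or None for no rows."""
    start = end = None
    for rec in records:
        first = date.fromisoformat(rec["first_seen"])
        last = date.fromisoformat(rec["last_seen"])
        if start is None or first < start:
            start = first
        if end is None or last > end:
            end = last
    return None if start is None else (start, end)
```

The result depends only on the input rows, so running it twice on the same file gives the same window regardless of when it runs. Applied to the 15,000-row export rather than the full observed set, it may yield a narrower window than the one stated on this page.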