Industry analysis

What 7 million scraped ads tell us about creative trends in 2025-2026

Original analysis from a 7.4M-ad aggregate snapshot collected for mediabuyer.site — the hook patterns that dominate, the offer categories that rotate by quarter, language drift, and the image-style trends shaping native ad creative right now.

By Eyal Rosenthal | May 6, 2026 | 17 min read | AI-assisted research

The interesting part about running a native-ad aggregator is that you can answer questions nobody else can answer. When operator forums argue about whether "you won't believe" headlines still work, or whether before-and-after photos are dead, or whether AI-generated images are now the dominant creative style, the arguments are mostly vibes. The data is out there. We have it. This piece is the first of a recurring series at mediabuyer.site where I pull aggregate, anonymized findings out of the corpus and write them up — explicitly cited as our own data, with the methodology disclosed honestly so other operators can poke at it.

The data

The data set behind this piece is a 7.4 million-row snapshot of native and search-arbitrage ads collected between June 2024 and April 2026, normalized into BigQuery, deduplicated against a perceptual-hash threshold (pHash distance ≤ 6), and categorized by an automated classifier that I will describe below. Of the 7.4M raw captures, roughly 2.1M are unique creatives after dedupe; the rest are recaptures of the same creative across countries, devices, and weeks.
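The dedupe step can be sketched roughly as follows, assuming each capture has already been reduced to a 64-bit perceptual hash stored as an integer. The function names and the greedy first-seen grouping are illustrative assumptions, not necessarily how the production pipeline works; only the distance threshold of 6 comes from the description above.

```python
from __future__ import annotations

from typing import Iterable

PHASH_MAX_DISTANCE = 6  # threshold stated in the article


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def dedupe(hashes: Iterable[int], max_distance: int = PHASH_MAX_DISTANCE) -> list[int]:
    """Greedy dedupe: keep a hash only if it is farther than max_distance
    from every hash already kept. This is O(n * kept), which is fine for a
    sketch; a multi-million-row corpus would need an index such as a BK-tree."""
    kept: list[int] = []
    for h in hashes:
        if all(hamming_distance(h, k) > max_distance for k in kept):
            kept.append(h)
    return kept


# Hypothetical hashes, for illustration only.
captures = [
    0xFFFF0000FFFF0000,  # creative A
    0xFFFF0000FFFF0003,  # creative A recaptured, 2 bits differ
    0x0123456789ABCDEF,  # creative B
]
unique = dedupe(captures)
print(len(unique))  # 2
```

Note that greedy grouping depends on capture order: two creatives that are each within distance 6 of a third, but not of each other, can end up grouped differently depending on which one is seen first.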

The capture surface is roughly:

~52% Outbrain inventory (mix of premium and long-tail publishers)
~31% Taboola inventory
~11% RevContent and MGID
~6% other (push networks, miscellaneous direct publishers)

Geo coverage is heavily US-skewed (~67%), with the other big buckets being UK, Canada, Australia, Germany, and Brazil. We do not yet have meaningful coverage of LATAM (other than Brazil), MENA, or APAC outside India. I am acknowledging this up front because every chart below should be read as "what's true on roughly tier-1 English-language native inventory."

Our methodology constraints, the same ones I described in "Why most native ad spy data is wrong," apply. Treat these numbers as direction, not census.

Finding 1: The "you won't believe" headline is dead. The "what's wrong with this picture" hook is alive.

The classic clickbait-era headline templates ("You won't believe what happened next," "Doctors hate this one trick," "Local mom discovers") have collapsed in share-of-corpus. In a 2018-style native-ad snapshot from any of the major aggregators, this template family was 12-18% of running creatives. In our 2025-2026 corpus it is under 2%. These headlines still occasionally get approved on RevContent, but on Outbrain they are rejected immediately under the policies described in Outbrain's Acceptable Use Policy and the FTC's native advertising guide.

The replacement, by far the largest hook category in our 2025-2026 data, is what I'll call "specificity bait" — headlines that anchor on a specific number, location, or detail. Examples in heavy rotation:

"Why 4 in 10 [city] homeowners are switching to [thing]"
"The new [year] rule that just changed [vertical] for [demographic]"
"[Number] [thing] that surprised even [authority figure type]"
"Inside the [city] [vertical] that [specific outcome]"

This pattern accounts for roughly 27% of all unique creative headlines in our corpus, up from the roughly 8% share a similar pattern held in the 2018-vintage spy-tool data I have seen referenced. The reason it works is that it threads the needle: it sounds editorial enough to clear native-network policy, specific enough to drive curiosity, and vague enough that the lander can deliver something only loosely connected to the headline. The NAD's decisions database now includes cases that cite specificity-bait headlines where the lander does not deliver on the implicit promise, but these cases are still rare and the format remains broadly compliant.

The second-largest hook category is "loss-framed warnings" — "Don't [thing] before reading this," "[Demographic] are losing [thing] because of [cause]," "The mistake [demographic] keep making about [thing]." This accounts for roughly 18% of unique creatives. It is a 2024-2025 surge — it was about 11% in our earliest 2024 data.
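To make the two hook families concrete, here is a minimal sketch of template matching with regular expressions. The patterns are hypothetical simplifications written for this example, not the real taxonomy (which has about 80 hand-built templates), and the example headlines are invented.

```python
import re

# Hypothetical, deliberately small pattern set for illustration.
HOOK_PATTERNS = {
    "specificity_bait": [
        re.compile(r"\b\d+ in \d+\b", re.I),                # "4 in 10 ..."
        re.compile(r"^the new (19|20)\d{2} rule\b", re.I),   # "The new 2026 rule ..."
        re.compile(r"^\d+ .+ that surprised even\b", re.I),  # "7 things that surprised even ..."
    ],
    "loss_framed_warning": [
        re.compile(r"^don'?t .+ before reading this", re.I),
        re.compile(r"\bare losing\b.+\bbecause of\b", re.I),
        re.compile(r"^the mistake .+ keep making\b", re.I),
    ],
}


def classify_hook(headline: str) -> str:
    """Return the first hook family with a matching pattern, else 'other'."""
    text = headline.strip()
    for family, patterns in HOOK_PATTERNS.items():
        if any(p.search(text) for p in patterns):
            return family
    return "other"


print(classify_hook("Why 4 in 10 Austin homeowners are switching to heat pumps"))
print(classify_hook("Don't refinance before reading this"))
print(classify_hook("You won't believe what happened next"))
```

Real headline data would also need normalization before matching (curly apostrophes, stray whitespace, emoji), which this sketch skips.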

Finding 2: Offer category rotation by quarter is real and predictable

If you've been in affiliate long enough, you know that vertical strength rotates seasonally. The data confirms this. Our quarterly-share analysis — which offer categories produced the most unique creative variants, by quarter — shows clear cycles. Caveat: this measures creative volume, not advertiser revenue. The two correlate but not perfectly.

Q1 (Jan-Mar) 2025 and 2026: Tax-prep, debt-relief, weight-loss/fitness "new year" offers. Weight-loss creative variants in our corpus surged 240% Jan 1-Feb 28 vs. their December baseline. Tax-prep creatives spiked 180% in the same window. This is the textbook January reset cycle.

Q2 (Apr-Jun): Home services (HVAC, roofing, solar), travel, and certain finance offers (refinance, insurance shopping). This corresponds to the spring DIY/home improvement and pre-summer travel cycles. Solar creative variants in our corpus surged about 130% Q2 vs. Q1.

Q3 (Jul-Sep): Back-to-school education and online-degree offers, certain insurance categories (auto, especially in states with mid-year renewal cycles), Medicare-related offers ramping into AEP. Education-vertical creative variants peaked in late August.

Q4 (Oct-Dec): Medicare AEP (Annual Enrollment Period — Oct 15-Dec 7) dominates. In our data, Medicare-and-senior-targeted creatives accounted for almost 19% of all running creative variants in the second half of October 2025 — by far the largest single-vertical concentration we observed in two years of data. ACA Marketplace open enrollment overlaps. Holiday gift-guide and ecommerce affiliate creatives also surge but are a much smaller share than the AEP wave.

The actionable read: if you are a buyer who runs the same vertical year-round, you are competing for inventory hardest in your category's peak quarter. If you have flex on vertical, you can rotate.

Finding 3: AI-generated imagery has crossed the 30% threshold

This was the surprise of the corpus when I ran it. We tagged image creatives by a combination of EXIF metadata, perceptual signatures known to indicate diffusion-model output (specific mid-frequency noise patterns described in the Stanford / Adobe research on AI-image detection and in open-source tools like AI-Image-Detector), and visual review of a stratified sample. Our best estimate is that 31% of unique image creatives in our 2026 corpus are AI-generated or substantially AI-edited, up from ~6% in mid-2024.

Caveat: detection error here is real. Diffusion-model output detection is an active research area and accuracy is in the 70-90% range depending on the detector and the model that generated the image. The ~31% number is best read as "somewhere between 22% and 38%."
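One standard way to reason about how detector error moves a headline share is the Rogan-Gladen correction, which adjusts an observed positive rate for a classifier's sensitivity and specificity. The sketch below is an illustration of that technique, not the calculation behind the 22-38% band: the sensitivity and specificity pairs are hypothetical, and it treats the 31% figure as a raw detector rate, which may not match how the estimate was actually calibrated.

```python
def rogan_gladen(observed_rate: float, sensitivity: float, specificity: float) -> float:
    """Estimate true prevalence from an observed positive rate.

    true = (observed + specificity - 1) / (sensitivity + specificity - 1),
    clamped to [0, 1].
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("detector is no better than chance; cannot correct")
    estimate = (observed_rate + specificity - 1.0) / denom
    return min(1.0, max(0.0, estimate))


observed = 0.31  # share flagged as AI-generated, treated here as a raw detector rate
# Hypothetical (sensitivity, specificity) pairs; the article does not report these.
for sens, spec in [(0.80, 0.90), (0.85, 0.90), (0.90, 0.95)]:
    print(sens, spec, round(rogan_gladen(observed, sens, spec), 3))
```

The useful intuition is that false positives (low specificity) pull the corrected share down, while missed detections (low sensitivity) push it up, so the two error types partly offset.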

The compositions that are dominating:

AI-generated "lifestyle photos" that look like stock — middle-aged person in kitchen, nurse in scrubs, financial advisor at desk. The classic "stock photo" aesthetic but cheaper and customizable.
AI-generated illustrations for explainer-style creatives (especially in fintech and insurance). Flat-design "diagram" looks.
AI-edited real photos — real product photography with backgrounds replaced or text/objects added. Often the most effective combination.

What's notably not yet dominant: AI-generated faces of specific (real or fictional) people. This is the area Outbrain and Taboola have been most aggressive in flagging. Their public statements do not describe the detection in detail, but the Outbrain AUP does explicitly prohibit deceptive use of likeness, and the FTC's enforcement priorities for AI make this category high-risk.

Finding 4: Headline length has compressed

Median creative headline length in our 2026 corpus is 9 words. In our 2024 data it was 11 words. In 2018-vintage public data sets I have referenced, it was 13-14 words. Headlines are getting shorter.

The cause is roughly a chain: mobile-first inventory dominates supply, mobile screens fit fewer characters, shorter headlines test better on mobile, and optimizers reward the higher CTR. The Taboola Trends product, which is publicly accessible and a useful sanity check, has shown the same trend in its public dashboards.

The implication for media buyers: long-headline creative templates from older swipe files are increasingly out-of-distribution. Cutting headlines to 7-10 words and front-loading the specificity (Finding 1) is the dominant working approach.

Finding 5: Language drift toward calmer, more "editorial" tone

Sentiment scoring on the headline corpus (using the publicly available VADER lexicon, see the original paper for methodology) shows median headline sentiment intensity has dropped from 2024 to 2026. The aggressive, exclamation-heavy "STOP DOING THIS" voice is a smaller share. The calm, magazine-editorial "Why this happens, and what to do about it" voice is a larger share.

This is consistent with platform policy pressure (Outbrain and Taboola both state in their own creative best practices that "natively-styled, editorial-toned" creative outperforms) and with consumer fatigue. Whatever the cause, the trend is large.

The shift is not toward less manipulation — the underlying psychological levers (curiosity, loss aversion, social proof) are unchanged. The shift is toward a smoother voice on top of those levers.

Finding 6: The "color of the corpus" has shifted

Not as important as the other findings but interesting. We pulled the dominant color (k-means k=3) of every image creative in the corpus and looked at how the distribution moved.

2024: dominant palette skewed toward saturated reds, blues, and yellows — classic "ad" colors. ~28% of creatives had a saturated red as their dominant color.

2026: dominant palette has shifted noticeably toward muted, earth-tone, and pastel ranges. Saturated red as dominant color is down to ~17%. Muted greens, beiges, and pale blues are up substantially.

Possible drivers: Instagram-aesthetic creep into native ad design, the rise of AI-generated images (which tend toward more naturalistic palettes by default), and platform-policy pressure against high-saturation "ad-looking" images.
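For readers who want to reproduce the dominant-color step, here is a minimal pure-Python sketch of k-means with k=3 over RGB pixels, returning the centroid of the largest cluster. It assumes pixels arrive as a non-empty list of (r, g, b) tuples in the 0-255 range. The "saturated red" test at the end uses hypothetical hue and saturation thresholds chosen for this example; the article does not state the thresholds used.

```python
import colorsys
import random
from typing import List, Sequence, Tuple

RGB = Tuple[float, float, float]


def _dist2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance in RGB space."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _assign(pixels: List[RGB], centroids: List[RGB]) -> List[List[RGB]]:
    """Put each pixel in the cluster of its nearest centroid."""
    clusters: List[List[RGB]] = [[] for _ in centroids]
    for p in pixels:
        nearest = min(range(len(centroids)), key=lambda i: _dist2(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def dominant_color(pixels: List[RGB], k: int = 3, iters: int = 20, seed: int = 0) -> RGB:
    """Lloyd's algorithm; returns the centroid of the largest cluster."""
    distinct = list(dict.fromkeys(pixels))  # seed centroids from distinct colors only
    rng = random.Random(seed)
    centroids = rng.sample(distinct, min(k, len(distinct)))
    for _ in range(iters):
        clusters = _assign(pixels, centroids)
        updated = [
            tuple(sum(channel) / len(members) for channel in zip(*members)) if members else old
            for members, old in zip(clusters, centroids)
        ]
        if updated == centroids:  # converged
            break
        centroids = updated
    sizes = [len(members) for members in _assign(pixels, centroids)]
    return centroids[sizes.index(max(sizes))]


def is_saturated_red(rgb: RGB, min_saturation: float = 0.6, hue_window: float = 15 / 360) -> bool:
    """Hypothetical thresholds: hue within 15 degrees of pure red, saturation >= 0.6."""
    h, s, _ = colorsys.rgb_to_hsv(*(c / 255 for c in rgb))
    return s >= min_saturation and (h <= hue_window or h >= 1 - hue_window)


# Toy "image": 60% saturated red, 30% beige, 10% blue.
pixels = [(200, 30, 30)] * 60 + [(215, 200, 170)] * 30 + [(30, 60, 200)] * 10
color = dominant_color(pixels)
print(color, is_saturated_red(color))
```

On real images it is usually worth downsampling first (for example to 64x64) so the per-image cost stays small across millions of creatives.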

Finding 7: The same offer is being sold by an increasing number of advertisers

This is the consolidation finding. We tagged each creative by inferred offer (using a classifier trained on landing-page content) and looked at how many distinct advertiser entities were running creatives for the top 50 offers.

In 2024, the top 50 offers had a median of 8 distinct advertiser entities running creatives.

In 2026, that median is 14.

In other words: more affiliates are competing for the same offers. The "secret offer that nobody else is running" claim — staple of the affiliate-course pitch — is harder to substantiate now than it was two years ago. Offers leak quickly and the corpus reflects it.
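The median-advertisers figure can be computed along these lines. This is a sketch under stated assumptions: the row shape (offer, advertiser, creative) is hypothetical, and ranking the "top" offers by unique creative count is my assumption for the example, since the ranking criterion is not spelled out above.

```python
from collections import defaultdict
from statistics import median


def median_advertisers_per_offer(rows, top_n=50):
    """rows: iterable of (offer_id, advertiser_id, creative_id) tuples.

    Ranks offers by number of unique creatives (an assumption), then returns
    the median count of distinct advertisers across the top_n offers.
    """
    advertisers = defaultdict(set)
    creatives = defaultdict(set)
    for offer, advertiser, creative in rows:
        advertisers[offer].add(advertiser)
        creatives[offer].add(creative)
    top = sorted(creatives, key=lambda o: len(creatives[o]), reverse=True)[:top_n]
    return median(len(advertisers[o]) for o in top)


# Hypothetical rows for illustration.
rows = [
    ("offer_a", "adv1", "c1"),
    ("offer_a", "adv2", "c2"),
    ("offer_a", "adv3", "c3"),
    ("offer_b", "adv1", "c4"),
    ("offer_b", "adv1", "c5"),
    ("offer_c", "adv4", "c6"),
]
print(median_advertisers_per_offer(rows, top_n=2))  # 2.0
```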

Finding 8: Creative half-life is shorter

We measured creative half-life by tracking how long, after first-capture, a creative remained in active rotation (i.e., continued to be re-captured). In 2024, the median creative had a 21-day half-life. In 2026, it's 14 days. The 90th percentile has held roughly steady — the long-running winners run as long as they used to — but the median is decaying faster.

The cause is some mix of: faster optimization on the network side, more advertisers iterating creatives more frequently, and platform fatigue against repetition. The implication for buyers is that creative production cadence matters more than it did two years ago. A buyer producing one creative a week in 2024 was probably keeping pace with replacement; in 2026, keeping pace takes two to four a week.
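As defined above, the half-life is the median time a creative stays in rotation, which is the point by which half of a cohort has stopped being recaptured. A minimal sketch, assuming captures arrive as (creative_id, capture_date) pairs; the function names and sample data are hypothetical.

```python
from datetime import date
from statistics import median


def lifetimes_in_days(captures):
    """captures: iterable of (creative_id, capture_date).

    Returns {creative_id: days between first and last capture}.
    """
    first, last = {}, {}
    for creative_id, day in captures:
        first[creative_id] = min(first.get(creative_id, day), day)
        last[creative_id] = max(last.get(creative_id, day), day)
    return {cid: (last[cid] - first[cid]).days for cid in first}


def median_half_life(captures):
    """Median lifetime in days across all creatives."""
    return median(lifetimes_in_days(captures).values())


# Hypothetical captures for illustration.
captures = [
    ("a", date(2026, 1, 1)), ("a", date(2026, 1, 15)),
    ("b", date(2026, 1, 1)), ("b", date(2026, 1, 8)),
    ("c", date(2026, 1, 3)), ("c", date(2026, 2, 2)),
]
print(median_half_life(captures))  # 14
```

This sketch ignores censoring: creatives still running when the snapshot ends have lifetimes that are cut short, so a careful version would either exclude recent first-captures or use a survival estimator.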

What we couldn't measure cleanly

A few questions I'd love to answer from this data and could not:

Conversion rates by creative type. We see the impressions but not the post-click outcomes. That data lives in advertiser trackers, not in scraping output.
Spend per creative. Same problem. We can infer popularity from re-capture frequency but not actual ad spend.
Geo-conditional effectiveness. Capturing the same creative across countries doesn't tell us if it converted equally.
Lander consistency. Same headline can route to different landers; we capture the creative, not the post-click flow comprehensively.

For this kind of analysis you need either advertiser-side reporting (which is privately held) or a cross-network attribution provider with massive coverage (which doesn't really exist as a clean public source).

Finding 9: Vertical-level concentration is increasing

A finding I almost cut from the piece because it's harder to explain than the others, but it's important: at the vertical level, the share of total creative volume controlled by the top 10 advertisers in each vertical has gone up. In our 2024 corpus, the top 10 advertisers in the typical vertical accounted for roughly 24-32% of unique creative variants in that vertical. In our 2026 corpus, that share is more typically 32-44%.

The cause appears to be a combination of: (a) consolidation among scaled affiliate operators, (b) the rise of programmatic creative production tools that let scaled operators produce far more variants than smaller operators, and (c) compliance attrition that disproportionately removes smaller operators with less infrastructure for clean creative.

The implication for new entrants is uncomfortable: the gap between scaled operators and small operators is widening on creative output, even before considering bid power and account trust. The "find an underserved offer and run it" strategy was easier in 2018 than it is in 2026.
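The concentration metric itself is simple to compute once each unique creative is attributed to an advertiser. A minimal sketch, assuming one advertiser id per unique creative within a single vertical; the input shape and sample data are hypothetical.

```python
from collections import Counter


def top_n_share(creative_advertisers, n=10):
    """Share of unique creatives in a vertical that belong to its n most
    prolific advertisers.

    creative_advertisers: iterable with one advertiser id per unique creative.
    """
    counts = Counter(creative_advertisers)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no creatives in this vertical")
    top = sum(count for _, count in counts.most_common(n))
    return top / total


# Hypothetical vertical: advertiser "a" has 5 creatives, "b" has 3, "c" has 2.
vertical = ["a"] * 5 + ["b"] * 3 + ["c"] * 2
print(top_n_share(vertical, n=2))  # 0.8
```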

Finding 10: Mobile-vertical-format dominance

The split between traditional landscape image creatives and mobile-first vertical/square formats has tilted decisively toward mobile-first. In 2024, our corpus was roughly 58% landscape (1200x628, roughly a 1.91:1 ratio) and 42% mobile-first (square or portrait). In 2026, it's 71% mobile-first, 29% landscape. This tracks with Outbrain's published mobile-vs-desktop traffic mix and Taboola's filings, but the corpus shift is sharper than the underlying traffic shift, suggesting that creative production has overshot the actual move to mobile traffic.

The practical read for buyers: producing landscape-only creative in 2026 is leaving inventory on the table. The 1080x1080 and 1080x1350 formats are now table stakes.
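The format split comes down to bucketing each creative by aspect ratio. A minimal sketch, with "mobile-first" meaning square or portrait as in the text above; the 5% tolerance for treating near-square images as square is a hypothetical choice for this example.

```python
def creative_format(width: int, height: int, tolerance: float = 0.05) -> str:
    """Classify a creative as 'square', 'landscape', or 'portrait'."""
    ratio = width / height
    if abs(ratio - 1.0) <= tolerance:
        return "square"
    return "landscape" if ratio > 1.0 else "portrait"


def mobile_first_share(sizes) -> float:
    """Share of creatives that are square or portrait.

    sizes: non-empty list of (width, height) pairs.
    """
    formats = [creative_format(w, h) for w, h in sizes]
    return sum(f != "landscape" for f in formats) / len(formats)


# Hypothetical sample: two landscape, one square, one portrait.
sizes = [(1200, 628), (1080, 1080), (1080, 1350), (1200, 628)]
print(mobile_first_share(sizes))  # 0.5
```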

Finding 11: Lander-page complexity has increased

This isn't strictly a creative-trends finding, but our crawl pipeline captures the linked landers and we measure them. Median lander page weight (excluding ads) has increased by approximately 1.6x from 2024 to 2026. Median above-the-fold time-to-interactive has increased only modestly over the same period, a sign that operators are building heavier landers but loading them more efficiently.

The composition shift: more above-the-fold content (typically intro paragraphs designed to clear FTC native-ad disclosure standards), more interstitial content (especially in financial verticals where state-license compliance content has expanded), and more dynamic personalization scripting (lander-A/B-test infrastructure, geo-customization).

The compliance read: scaled affiliate operators have responded to the FTC's tightening on native-ad disclosure by building more substantive landers. The result is that the "thin lander" pattern that dominated 2018-2022 affiliate is decreasingly viable. The bar for what passes both compliance and conversion is higher.

The methodology, in brief

For anyone who wants to push back: the corpus is built from custom scrapers running on residential proxies in 12 countries, on rotating mobile and desktop fingerprints, with TLS fingerprint randomization. Capture cadence is approximately every 4 hours per (country × device × publisher) tuple, with longer gaps on lower-priority publishers. Deduplication is pHash with distance ≤ 6. Classification of headline patterns uses a hand-built taxonomy of about 80 hook templates plus a few transformer-based classifiers for thematic tagging. Image-AI detection combines EXIF heuristics, the open-source Umeyama Lab AI-Image-Detector ensemble, and a manual stratified sample (n=1,000) for calibration.

This methodology is biased toward what our scrapers can see. It under-samples mobile-app inventory (we only capture mobile web), under-samples non-English markets, and has near-zero coverage of in-app native (Facebook, Instagram, TikTok). What we have is high-resolution coverage of native-network desktop and mobile-web inventory in the major English-language tier-1 markets.

If you operate a competing aggregator with different findings, I would love to compare notes — corrections@mediabuyer.site.

Cross-checking against other public sources

A few sanity checks against publicly accessible signals to confirm the directional findings above:

Taboola Trends: trends.taboola.com publishes aggregated creative trend data from their own ad inventory. Their dashboards confirm the headline-length compression and the AI-creative growth trends in our corpus. They don't publish hook-pattern taxonomy at the granularity we did, but the rough direction agrees.

SimilarWeb publisher-traffic data: SimilarWeb publishes audience and traffic-pattern data for the publishers in our corpus. The mobile-vs-desktop traffic mix on the publishers we sample tracks our 71-29 mobile-first creative ratio with reasonable correlation, supporting the format-shift finding.

Google Trends and FTC enforcement frequency: Search-volume signals on terms like "weight loss" or "Medicare AEP" track our quarterly creative-volume cycles on those verticals. FTC enforcement frequency on health-claim and income-claim native ads has trended up over the same period, which is consistent with our compliance-related observations.

Industry-level reporting: DoubleVerify's quarterly reports and IAS's industry reports cover similar time-series at the format and quality level. Our internal patterns are consistent with theirs at the level both can be compared.

These cross-checks aren't proof — they're consistency. Our corpus is one window into a complex ecosystem; multiple windows pointing the same direction increases confidence.

What I would do with this if I were buying ads next week

To make this concrete and operational, the actionable read for an operator buying native ads next week:

Use specificity-bait headlines. Concrete numbers, locations, and demographic anchors. Skip the generic clickbait template: the cost in rejected creatives and platform-policy friction is real.
Produce in mobile-first formats. 1080x1080 and 1080x1350 square/portrait. Landscape only as a tertiary format.
Lean into AI-generated lifestyle imagery, but avoid AI-generated faces of specific personas. The compliance line is approximately there.
Plan vertical rotation by quarter. If you're flexible on offer, follow the seasonal cycle. If you're locked into one vertical, plan around your peak-quarter compliance load.
Refresh creatives every 2-3 weeks. With a 14-day median creative half-life, weekly refreshes retire creatives that still have life in them, and monthly refreshes leave fatigued creatives in rotation.
Tone-match the platform. The calmer, magazine-editorial voice is a growing share of 2026 native creative, and the platforms themselves say it outperforms the aggressive "STOP" voice that dominated a few years ago.
Test against your own creative library before launching new variants. If a variant looks like things that have failed for you before, it will probably fail again.