Link Intelligence Methodology: How Dabudai Extracts and Interprets Links From AI Answers
Dabudai measures AI visibility using the links and placements users can see in AI answers.
We record brand position exactly as displayed in the provider interface. We do not deduplicate links. If the same page appears multiple times, every occurrence is counted.
Share of Voice is position-weighted: higher placement contributes more.
We support three core experiences: ChatGPT, Google AI Overviews, and Google AI Mode.
Customers select country + language, so outputs match the real market.
We extract every visible link occurrence (your pages, other brands, and third-party sources) and analyze it at page level (no domain grouping).
Because AI answers vary, we rely on aggregated data from repeated runs, not one-off checks.
Key definitions (foundation)
Link occurrence: one visible appearance of a clickable link in an AI answer; each appearance is counted, even if repeated.
Brand position: the placement of a brand in a list or structured answer, recorded as shown in the provider interface.
First-party link: a link to your own pages (your website URLs).
Other-brand link: a link to other brands’ pages, including competitors.
Third-party source: an external page URL linked or cited by the provider (tracked at page level, not grouped by domain).
Data fidelity rules (we record what users actually see)
Dabudai is built for accuracy and auditability.
We record outputs as displayed in the provider interface.
We do not alter placements or link occurrences.
Brand position is recorded as shown
We record brand position based on its placement in the AI answer.
We do not re-rank, normalize, or reorder positions.
We do not deduplicate links
We do not deduplicate links inside a single answer.
If the provider shows the same page multiple times, we record every occurrence.
We report raw occurrences so teams can review the same repetition users see.
This can affect page-level stats and Share of Voice.
Share of Voice is position-weighted (consistent weighting)
Share of Voice is position-weighted.
Higher placement in an AI answer contributes more to SOV.
Weighting is computed consistently by our backend across runs (no manual adjustments).
UTM removal (display-only cleaning)
We remove UTM parameters so teams can review clean URLs.
This does not change the destination page the provider linked to.
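For illustration, here is a minimal sketch of display-only UTM cleaning: strip any utm_* query parameters and leave the rest of the URL untouched. This is an illustrative approach, not our production implementation.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def strip_utm(url: str) -> str:
    """Remove utm_* query parameters for readability; the path and all
    other parameters are kept exactly as the provider returned them."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if not k.lower().startswith("utm_")]
    return urlunsplit(parts._replace(query=urlencode(kept)))

# Example (hypothetical URL): the destination page is unchanged,
# only tracking parameters are removed.
print(strip_utm("https://example.com/pricing?utm_source=ai&plan=pro"))
# -> https://example.com/pricing?plan=pro
```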
We do not group by domain
We do not group links by domain.
We keep page-level links exactly as returned in the AI answer.
This preserves page-level winners instead of hiding them inside domain totals.
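A small illustration, with hypothetical counts, of why this matters: the strongest individual page and the strongest domain total can point to different winners.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical page-level occurrence counts (repeats already included).
pages = Counter({
    "https://example.com/pricing": 9,
    "https://example.com/blog/old-post": 1,
    "https://competitor.com/compare": 6,
    "https://competitor.com/pricing": 5,
})

# What domain grouping would report instead.
domains = Counter()
for url, count in pages.items():
    domains[urlsplit(url).netloc] += count

print(pages.most_common(1))    # [('https://example.com/pricing', 9)] -- page-level winner
print(domains.most_common(1))  # [('competitor.com', 11)] -- domain totals tell a different story
```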
Quick rule table
Rule | What we do | Why it matters |
Position as shown | Record brand placement exactly as displayed | Comparable placement over time |
No link deduplication | Count every visible link occurrence | Matches what users see in the UI |
Position-weighted SOV | Higher placement contributes more | Reflects real attention and impact |
Remove UTMs | Clean display URLs only | Faster review without changing meaning |
No domain grouping | Keep page-level links | Page-level insight stays accurate |
Providers and locale controls (where results come from)
Different providers can return different answers and show links differently.
Locale matters too: the same buyer question can produce different links and sources by country and language.
Primary providers
We measure across three core experiences:
ChatGPT
Google AI Overviews
Google AI Mode
Country + language selection (locale)
Customers select a country and language for analysis. We run measurements in that fixed locale so outputs match the target market. For clean comparisons over time, keep locale consistent.
What we extract from each answer (raw fields)
For every run, we store raw output so results are auditable and comparable over time.
We capture the same elements users can see in the provider interface.
Raw fields we capture (as shown)
From each AI answer, we record:
Provider (ChatGPT / Google AI Overviews / Google AI Mode)
Country + language (locale)
Timestamp
Buyer question ID
Full answer text
Position in the answer (brand placement, as displayed)
Every visible link occurrence (each appearance is recorded, no deduplication)
Link title (when shown)
Full URL (cleaned for readability by removing UTM parameters)
Domain (extracted from the full URL)
Visible sources (when displayed by the provider)
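For teams that want to mirror this structure in their own tooling, here is one possible way to represent these fields as a record. The names and types below are illustrative, not our actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class LinkOccurrence:
    position: int              # placement as displayed (1 = highest)
    url: str                   # full URL, UTM parameters removed for readability
    domain: str                # extracted from the full URL
    title: str | None = None   # link title, when shown

@dataclass
class AnswerRecord:
    provider: str              # "ChatGPT" | "Google AI Overviews" | "Google AI Mode"
    country: str
    language: str
    timestamp: datetime
    buyer_question_id: str
    answer_text: str
    brand_position: int | None                  # as displayed; None if the brand is absent
    link_occurrences: list[LinkOccurrence] = field(default_factory=list)
    visible_sources: list[str] = field(default_factory=list)
```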
What we compute on the backend (derived metrics)
We then compute consistent metrics from the raw dataset, including:
Share of Voice (SOV)
Recommendation Rate (Top 5)
Aggregated views that summarize results for customers (across runs, topics, and buyer questions)
This is why we use aggregated reporting instead of one-off answers.
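As one concrete example of a derived metric, here is a plausible reading of Recommendation Rate (Top 5): the share of answers in which the brand appears at position 5 or better. Treat this sketch as an interpretation for illustration, not a published formula.

```python
def recommendation_rate_top5(brand_positions: list[int | None]) -> float:
    """Assumed reading of Recommendation Rate (Top 5): the share of answers
    in which the brand appears at position 5 or better. Positions are taken
    as displayed; None means the brand did not appear in that answer."""
    if not brand_positions:
        return 0.0
    hits = sum(1 for p in brand_positions if p is not None and p <= 5)
    return hits / len(brand_positions)

# Hypothetical data: 10 runs of the same buyer question,
# top-5 placement in 6 of them -> 0.6
print(recommendation_rate_top5([1, 3, None, 2, 6, 4, None, 5, 7, 2]))
```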
Share of Voice methodology (position-weighted, link-based)
Share of Voice (SOV) helps you compare AI visibility across brands.
It is designed to reflect what users are most likely to notice in an AI answer.
What SOV represents in Dabudai
SOV is a position-weighted share based on how brands appear in AI answers.
Higher placement contributes more than lower placement.
We count every visible link occurrence, even if the same page appears multiple times (no deduplication).
Position-weighted SOV (text formula)
Assign a weight to each occurrence by position: w(r) = 1 / r, where r is the position (1 = highest).
Sum weights for a brand b across all its occurrences: W_b = Σ (1 / r_i).
Compute SOV as a share of total weight across all brands: SOV_b = W_b / Σ W_k.
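The text formula translates directly into code. The sketch below uses hypothetical (brand, position) occurrences with made-up brand names; it is a minimal illustration of the 1/r weighting, not our backend code.

```python
from collections import defaultdict

def share_of_voice(occurrences: list[tuple[str, int]]) -> dict[str, float]:
    """Position-weighted SOV from (brand, position) occurrences.
    Every occurrence counts (no deduplication); weight w(r) = 1 / r."""
    weights: dict[str, float] = defaultdict(float)
    for brand, position in occurrences:
        weights[brand] += 1.0 / position
    total = sum(weights.values())
    return {brand: w / total for brand, w in weights.items()}

# Hypothetical occurrences pooled across answers (brand, position as displayed):
sample = [("acme", 1), ("acme", 3), ("rival", 2), ("rival", 2), ("other", 5)]
print(share_of_voice(sample))
# acme: 1 + 1/3 ≈ 1.33, rival: 0.5 + 0.5 = 1.0, other: 0.2
# -> acme ≈ 0.53, rival ≈ 0.39, other ≈ 0.08
```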
Why position affects SOV
Top placements get more attention than lower placements.
Position-weighting reflects this reality.
What influences SOV in your dataset
SOV can change based on:
Higher brand placements across answers
More visible link occurrences to your pages (including repeated appearances)
Stability across repeated runs (larger sample → stronger signal)
Locale differences (country + language)
Third-party source intelligence (external pages that shape answers)
AI answers often reference external pages.
These pages can influence which claims and links appear in recommendations.
Understanding them helps you increase the chance your pages get cited.
What we capture
We capture third-party links as shown in answers.
We keep them at page level (we do not group by domain).
We record every visible occurrence, even if a source page appears multiple times.
How to use third-party sources (step by step)
Find external pages that appear most frequently and consistently across runs (in the same country + language).
Review what claims they support and which brands they reinforce.
Decide one distribution move: where to publish, pitch, or improve listings.
Ship one change and track whether your pages start appearing as sources over time.
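Step 1 can be done with a simple tally: count how often each external page appears in total (frequency) and how many runs it appears in (consistency). The URLs below are hypothetical.

```python
from collections import Counter

# Hypothetical third-party source URLs collected from repeated runs
# in one country + language. Every visible occurrence is kept.
runs = [
    ["https://reviewsite.com/best-crm", "https://news.example.com/crm-roundup"],
    ["https://reviewsite.com/best-crm"],
    ["https://reviewsite.com/best-crm", "https://forum.example.com/thread/123"],
]

total_occurrences = Counter(url for run in runs for url in run)      # frequency
runs_containing = Counter(url for run in runs for url in set(run))   # consistency

for url, n_runs in runs_containing.most_common():
    print(f"{url}: in {n_runs}/{len(runs)} runs, {total_occurrences[url]} occurrences")
```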
Practical examples
If review pages dominate “best” questions, improve your listings and publish one proof page that supports your main claim.
If media articles dominate a topic, pitch a data-backed angle and publish a methodology page that can be cited.
Source actions table
Source page type | What it usually influences | Best action |
Media article | Narrative and credibility | Pitch data + publish a citable methodology/proof page |
Review page | Shortlists and comparisons | Improve listings + add proof and clear “best for” |
Forum thread | Buyer language and objections | Publish FAQs that answer objections with specifics |
Directory page | Category associations | Ensure positioning is accurate and links point to the right page |
Sampling and aggregation (why we use trends, not one-offs)
AI answers can vary even when the prompt stays the same.
That is normal. It is why one-off analysis is not reliable.
Our sampling approach
We run the same buyer questions multiple times per day and aggregate results across the full dataset. We use weekly windows to reduce noise and see stable trends.
In general, a larger sample size produces a more reliable signal.
Sampling table
Sampling choice | Why we do it | What it improves |
Multiple runs per day | AI outputs vary | Reduces one-off bias |
Weekly aggregation | Trends matter more than snapshots | Stronger signal for decisions |
Fixed locale | Results differ by market | Comparable benchmarks |
Stable question set + stable format | Keeps inputs consistent | Clear before/after measurement |
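As a simple illustration of weekly windowing, the sketch below groups hypothetical daily SOV readings by ISO week and averages them. The actual backend may pool raw occurrences across the window rather than averaging daily values; this is only meant to show the idea.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical daily SOV readings for one brand: (date, SOV from that day's runs).
daily_sov = [
    (date(2024, 5, 6), 0.31), (date(2024, 5, 7), 0.27), (date(2024, 5, 9), 0.35),
    (date(2024, 5, 13), 0.40), (date(2024, 5, 15), 0.38),
]

# Group readings into weekly windows keyed by ISO (year, week).
weekly: dict[tuple[int, int], list[float]] = defaultdict(list)
for day, sov in daily_sov:
    iso = day.isocalendar()
    weekly[(iso.year, iso.week)].append(sov)

for (year, week), values in sorted(weekly.items()):
    print(f"{year}-W{week}: mean SOV {mean(values):.2f} over {len(values)} days")
```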
Manual replication checklist (8 steps)
If you want to reproduce Dabudai manually, use this workflow.
It is verifiable, but it is time-consuming.
Take 20 buyer questions that match your ICP.
Choose providers (ChatGPT, Google AI experiences).
Fix country + language for the week.
Use a stable output request format (for example: “Top 5 + links when shown”).
Run every question multiple times per day in each provider.
Record brand positions exactly as shown.
Copy every visible link occurrence (including repeats), across your pages, other brands, and third-party sources.
Repeat for 7 days, then calculate metrics using the full dataset.
FAQ
What counts as a “link” in Dabudai?
A link is a visible, clickable URL shown in the AI answer.
If links are not visible in the interface, link-based outcomes are not observable for that run.
Why don’t you deduplicate links?
Because we want statistics to match what users see in the AI interface.
Each visible occurrence is counted, even if the same page appears multiple times.
Does repeated linking affect Share of Voice?
Yes. SOV is position-weighted.
More occurrences and higher placements increase a brand’s weighted share in the dataset.
Why do you remove UTM parameters?
UTMs make URLs harder to review.
We remove them for readability without changing the destination page.
Why don’t you group by domain?
Grouping by domain can hide page-level differences.
We keep page-level links so you can see which exact pages win visibility.
Why does locale (country + language) matter?
AI answers can differ across markets.
For clean comparisons, keep country + language fixed across your measurement window.
Why can the same prompt return different links?
Providers can vary due to model updates, ranking shifts, and context effects.
That is why we measure with repeated runs and aggregated trends.
What is a “page ranking” in Dabudai?
It is a ranking of pages by how often they appear in AI answers.
It is based on raw link occurrences (including repeats) across runs.
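In code terms, that ranking is just an occurrence tally over the pooled dataset. The URLs below are hypothetical; repeats are counted, not collapsed.

```python
from collections import Counter

# Hypothetical raw link occurrences pooled across runs (repeats kept).
occurrences = [
    "https://example.com/pricing",
    "https://example.com/pricing",
    "https://example.com/integrations",
    "https://competitor.com/compare",
    "https://example.com/pricing",
]

# Rank pages by how often they appear, most frequent first.
page_ranking = Counter(occurrences).most_common()
for rank, (url, count) in enumerate(page_ranking, start=1):
    print(rank, count, url)
```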
How should we use third-party source insights?
Use the list of third-party pages to guide distribution and PR. Track whether your own pages start appearing as cited sources over time.