← Terug naar blog

From human centric to agent native web search

AI

An Architectural, Economic, and Security Analysis of the New Information Fabric

Section 01: Executive Summary: 15 Key Insights

This report provides an exhaustive analysis of the ongoing, fundamental transformation of web search. It details the paradigm shift from a human-centric, ad-click-driven infrastructure (1998–2025) to an AI-native, agent-centric, and API-driven global information fabric. The analysis integrates technical architecture, economic modeling, advanced security threats, and regulatory governance to provide a holistic, forward-looking assessment for leaders, architects, and policymakers.

The 15 key strategic findings of this analysis are as follows:

Section 02: Historical Context: The PageRank-Centric Web (1996–2025)

To comprehend the magnitude of the current architectural and economic transformation, one must first deconstruct the foundational paradigm that governed the web for its first 25 years. This was the human-centric era, defined and dominated by a single, revolutionary algorithm: PageRank.

The Pre-Google Era (1993–1998): The Problem of Discovery

The nascent web of the early 1990s was a disconnected repository of files. The primary challenge was not quality but discovery. The first generation of “search engines” were simple tools built to address this. Archie (1990), the very first, was not a web crawler but an index of FTP file listings.42 It was followed by directory-based tools like Veronica 42 and, most famously, Yahoo! (1994).44 Yahoo! was initially a human-curated directory, where information seekers could browse categories rather than perform a keyword search.44

Simultaneously, the first true crawlers emerged. The World Wide Web Wanderer (1993) was the first robot, designed simply to track the web’s growth.42 Excite, which began as Architext in 1993, offered keyword search capabilities alongside its directory 42, and WebCrawler (1994) was the first to offer “full text” search, allowing users to search for any word on any webpage.44

This first generation of tools (Excite, Lycos, Infoseek, Inktomi, AltaVista) 44 solved the problem of discovery. However, they lacked a mechanism for authority or quality. Their reliance on full-text keyword matching made them trivially easy to manipulate. The earliest Search Engine Optimization (SEO) was simple “keyword stuffing,” which led to poor, irrelevant, and often-manipulated search results.47 Search engines like Inktomi became commoditized backends, powering portals like Goto.com and MSN Search, which then layered their own monetization (e.g., paid links) on top.48

The PageRank Revolution (1998): Authority as the Solution

The 1998 paper from Sergey Brin and Larry Page, “The Anatomy of a Large-Scale Hypertextual Web Search Engine,” 49 introduced a fundamentally new concept. The problem, they argued, was not finding pages but ranking them. Their solution, PageRank, was a link analysis algorithm that repurposed the web’s own “citation (link) graph” 50 as a global voting system to measure authority.

The algorithm’s brilliance was its recursive definition: “A page that is linked to by many pages with high PageRank receives a high rank itself”.51 A hyperlink was counted as a “vote of support”.51 PageRank was, in effect, a method for “objectively and mechanically… measuring the human interest and attention devoted to” web pages.52 This “wisdom-of-crowds” justification 50 provided a robust, global authority score that was highly resistant to the simple keyword-stuffing manipulation that plagued its predecessors.47

The Economic Model (2000–2025): Clicks as Currency

PageRank’s stable, high-quality authority signal did more than create a good user experience; it created a new, sustainable economic model. The core architecture of the Google paradigm was as follows:

This click became the fundamental, monetizable unit of the human-centric web. It created the inventory for the sponsored search advertising model 53, which became the economic engine for the entire Web 2.0 ecosystem.1

The profound second-order effect of this architecture was the creation of the multi-billion-dollar SEO industry.43 The entire industry is a direct economic consequence of a human-centric, link-authority-based algorithm. Its goal is to analyze and influence human-centric authority signals (backlinks, content relevance) to capture human traffic, which is then monetized via ads.

This entire, 25-year-old paradigm—built on human users, PageRank authority, and ad-click currency—is the system that is now collapsing. The shift to an agent-centric web is not an evolution; it is a replacement of every foundational assumption, as detailed in Table 1.

Table 1: The Paradigm Shift: Human-Centric vs. Agent-Centric Search

Feature****Paradigm 1: Human-Centric (1998–2025)****Paradigm 2: Agent-Centric (2025–)****Primary UserHuman EyeballsAI Agent 35Primary GoalDiscovery & NavigationTask Completion & Reasoning 39Core ProblemRanking a list of URLsSynthesizing a trusted, token-efficient answerKey AlgorithmPageRank (Link-based Authority) 51RAG + TTC (Hybrid Retrieval & Reasoning) 54Query TypeKeywords 47Semantic Objectives / Natural Language 4MonetizationAd-Click Revenue 1API Call / GPU Compute Cost 2Key AssetThe URL / Click-Through Rate (CTR) 1Token-Efficient Context 4Primary AttackSEO / Link Spam 47Index Poisoning / Prompt Injection 18

Section 03: Technical Decomposition: The AI-Native Search Architecture

The new agent-centric paradigm is enabled by a fundamentally different technical stack. While classical search was an information retrieval system, AI-native search is a retrieval and reasoning system. It is not designed to return a list but to synthesize an answer. This requires a hybrid architecture that balances classical precision with semantic depth, underpinned by a new, complex vector processing pipeline.

The Hybrid Retrieval Core: Sparse + Dense Fusion

AI-native search is not, as commonly misunderstood, a simple replacement of keyword search with “vector search.” Instead, it is a fused architecture that retains the strengths of classical Information Retrieval (IR) while integrating the power of dense embeddings.32

Vector Pipeline Architecture: The Embedding Engine

The core asset of the new search is the vector index. Storing and searching billions of high-dimensional vectors (embeddings) at high speed and low cost requires a specialized pipeline. This pipeline relies on Approximate Nearest Neighbor (ANN) search algorithms 61 rather than exact search.

Query Planning and Execution

The AI-native architecture must serve an AI agent, not a human. A human issues a query; an agent issues a task or objective. This requires a “query planning” layer. A complex, multi-hop objective (e.g., “Analyze the Q2 2025 financial performance of the top three EV companies and summarize their supply chain risks”) cannot be answered in a single retrieval.

The architecture must support a semantic reformulation loop where the agent:

The Core Architectural Tension: Drift vs. Freshness

This new, complex stack introduces a fundamental operational conflict that did not exist in classical search: the tension between model quality and index freshness.

This creates a high-stakes, co-dependent battle. AI models have a strong recency bias and perform better when citing fresher content.67 A stale index (low freshness) will cause the agent to retrieve outdated information and confidently “hallucinate” incorrect facts.15

However, the solution to context drift—re-crawling and re-embedding the index—can itself cause system failure. The agent’s query-planning logic may have been optimized for the structure and content of the old index. When the index is suddenly updated, the agent’s learned reasoning patterns may break. Therefore, the DataOps team managing index freshness 68 and the MLOps team managing model drift 14 are locked in a continuous, unstable operational balance, where a fix for one can break the other.

Section 04: Index Engineering: The New Frontier of Curation and Defense

In the PageRank paradigm, the index was a reflection of the web. In the AI-native paradigm, the index must be a curated, security-hardened asset. The astronomical cost of embedding and the new adversarial vectors of data poisoning fundamentally change the philosophy of crawling and indexing.

Recrawl Strategy: From Brute-Force to Intelligent Policy

Classical crawlers were optimized to minimize age (the time since a page changed) and maximize freshness (the binary state of being up-to-date).68 This often involved brute-force crawling strategies based on statistical update models, such as Poisson distributions.68

This “dense crawl” model is economically non-viable for an AI-native index. The “high cost-to-serve” 2 is not just in querying but also in building the index. The GPU burn-rate required to re-embed the entire web daily is prohibitive.69

The new architecture thus requires an “RL sparse recrawl” policy. This implies that a Reinforcement Learning (RL) agent, or a similar sophisticated policy, now manages the URL frontier and crawl budget. This agent must be trained to optimize a complex, multi-objective function:

Authority Scoring for a Post-PageRank World

The PageRank algorithm 51 is foundationally broken in an agent-centric world. Its core signal—the “citation (link) graph” 50—is no longer trustworthy. In an era of generative AI, an adversary can create infinite fake pages with infinite fake citations, making link-based authority meaningless.

The AI-native index requires a new, multi-modal “authority score.” This score is not about links; it is a trust and provenance score designed explicitly to mitigate hallucination propagation. The indexer must score content based on a new set of signals:

Security-Hardened Curation

The index is no longer a neutral repository; it is the primary data supply chain for every agent on the platform. As such, it must be treated as a primary, high-value target for attack. The index engineering pipeline must become an adversarial filtering process.

This involves active “index poisoning and poisoning defense.” The crawler and indexer must be designed to detect and neutralize content specifically engineered to be retrieved by RAG agents. This includes filtering for “indirect prompt injections” (see Section 08) and content designed to skew the embedding space. The index, in short, is no longer just built; it is defended.

Section 05: Agent-Centric Search Pipelines and Orchestration Patterns

The primary user of the new search is an AI agent. This “user” is not a human browsing but an automated system acting. This requires an entirely new set of “scaffolding”—the architectural components that enable an LLM to reason, plan, and execute complex tasks.

The Reasoning Synergy: RAG + TTC

The agent’s “brain” is a synergy of two key processes: Retrieval-Augmented Generation (RAG) and Test-Time Compute (TTC).

The combination of these two creates Agentic RAG. This is not a single, passive retrieval. It is an active, intelligent feedback loop.54 The agent retrieves (RAG), reasons on the context (TTC), identifies gaps, reformulates a new query (TTC), and retrieves again (RAG). This loop continues until the task is complete, allowing the agent to dynamically “try one more time” 65 and assemble fragmented clues from multiple sources.34

Orchestration Patterns: From Flat to Hierarchical

Simply connecting multiple agents in a “flat” system fails at scale. Production deployments have shown that without coordination, agents “duplicate work, operate on inconsistent states, and burn through token budgets”.7

The solution that has emerged as the dominant, scalable pattern is hierarchical multi-agent orchestration.72 Frameworks like “AgentOrchestra” 72 provide a clear blueprint for this architecture:

The Foundational Requirement: Shared Memory Engineering

This hierarchical orchestration is architecturally impossible without one final, critical component: a shared state. “Memory engineering” is the missing foundation for scalable multi-agent systems.7

The central problem is that agents, built on stateless LLMs, must operate in a stateful world. “Most multi-agent AI systems fail not because agents can’t communicate, but because they can’t remember”.7

The solution is a “shared persistent memory system,” a “computational exocortex” 7, or “shared memory structure” 8 that all agents can read from and write to. This component, which may be a vector database, a high-speed cache, or a message pool, acts as the collective memory and shared reality for the entire “orchestra”.75 It stores “memory units” (intermediate findings, artifacts, user context) 7, allowing an agent to see what other agents have already done, thereby preventing work duplication, resolving inconsistent states, and dramatically reducing communication overhead.7 This “Cache-Augmented Generation” (CAG) 76 is the true enabler of collective agent intelligence.

Failure Modes and Rollback

In this new paradigm, failure has kinetic consequences. Unlike generative AI, which produces content for a human to review, an agent acts.77 It can “push malformed pricing data into production” 78 or “book flights” 77, resulting in direct financial loss and safety risks.77

A critical failure mode is the recursive API call loop 79 or tool misuse, where an agent gets stuck or performs a destructive action. Traditional database rollback strategies are insufficient, as they would destroy hours of accumulated agent context.

The correct mitigation strategy is context snapshotting. At critical decision points (e.g., before an API call), the agent’s full state and memory are captured as a lightweight snapshot.80 If a failure occurs, the system does not restart from scratch; it resumes from the last known-good snapshot, preserving the accumulated reasoning and context.80

Section 06: Vendor Landscape and Architecture Trade-Off Map

The collapse of the old paradigm has triggered a Cambrian explosion of new, API-first vendors building the infrastructure for the agent-centric web. Simultaneously, incumbents are making massive strategic pivots to avoid being commoditized.

The AI-Native Challengers

A new class of startup has emerged, building proprietary search indexes from the ground up for AI agents.

The Incumbent’s Gambit: Microsoft’s Vertical Play

The most significant strategic move in the market has come from the largest incumbent. Microsoft has announced the full retirement of the general-purpose Bing Search APIs, effective August 11, 2025.5

This is not a simple product deprecation; it is a profound strategic pivot. Microsoft is forcing users to migrate to “Grounding with Bing Search as part of Azure AI Agents”.5

This move is a classic vertical integration and ecosystem-lock-in strategy.

Table 2: AI-Native Search Vendor Architecture Comparison

VendorIndex SourceCore PhilosophyKey OutputLatency / Depth****TavilyGoogle (Wrapper) 81Fast RAG integration; low overhead 82Structured answers 82Low Latency / Shallow Depth 82ExaProprietary 81Semantic Depth & Full Control 81Full Page Content 81High Latency / Deep Research 82Parallel AIProprietary 4Token-Relevance Ranking 4Information-Dense Excerpts 4Single-Call Resolution 4**Microsoft (New)**BingEcosystem-Locked Grounding 6Agent Response (in Azure) 83Azure-Dependent 6

Section 07: Economic Models: The Collapse of Clicks and the Rise of Compute

The agent-centric transformation is, at its core, an economic one. It represents the collapse of the 25-year-old ad-click economy and its replacement with a new, far more costly model based on GPU compute.

The Collapse of the Ad-Click Economy

The introduction of AI-native search and “AI Overviews” (like Google’s) 1 is an existential threat to the ad-based web. These tools eliminate the click. The user receives a synthesized answer on the results page, removing any incentive to click through to external publisher websites.

The economic consequences are immediate and severe:

Because the ad-sponsored search model (which generated ~$175B for Google in 2024) 1 is entirely dependent on clicks as its monetizable inventory, this trend renders the model economically non-viable.

The New Economics: GPU Burn Rate and Total Cost of Ownership (TCO)

The ad model is being replaced by a “cost-to-serve” model.2 Unlike traditional software, which has a near-zero marginal cost for replication, “each AI interaction consumes significant computational resources”.2

The key economic driver is the cost of GPU inference. According to Google’s own analysis, an AI-powered search query can be up to 10x more costly than a standard keyword search.2

This 10x cost-per-query creates “negative gross margin” dynamics.2 Companies that price their AI products like traditional SaaS (e.g., flat subscriptions) are reportedly “losing money” on their power users, as the compute costs of those users exceed their subscription revenue.3 This economic model is unsustainable at scale.

This brutal, GPU-driven cost curve creates a powerful incentive to outsource. The sheer cost and complexity of building and maintaining an in-house index, embedding pipeline, and inference stack 84 makes it impractical for most companies. This creates the market for the API-first vendors described in Section 06.

Vendor Lock-in and Mitigation Strategies

The incentive to outsource, combined with the immaturity of the AI stack, creates a new and severe risk of vendor lock-in.

The high switching costs and platform-specific dependencies (e.g., Azure AI Agents 87) create a strategic liability. Mitigation requires an architectural and philosophical commitment to interoperability.

Section 08: Security, Provenance & Zero Trust Search Architecture

The AI-native search stack does not just change the attack surface; it inverts it. The primary threat is no longer at the network perimeter but in the data supply chain and the model’s reasoning process. Defending this new stack requires a paradigm shift from perimeter security to a Zero Trust Architecture (ZTA).

The New Attack Surface: Hacking the RAG Pipeline

The RAG pipeline, which connects the LLM to the external index, is the new attack vector. An attacker no longer needs to penetrate the application; they only need to get their malicious payload into the index, knowing an agent will retrieve it. The OWASP Top 10 for Large Language Model Applications provides a clear taxonomy for these new threats.16

Table 3: AI-Native Attack Surface Risk Register (OWASP Mapping)

Threat VectorMechanismOWASP AI Top 10****Source(s)****Indirect Prompt InjectionThe primary threat. An attacker plants a malicious prompt (e.g., “Ignore all previous instructions and…”) on a public webpage. The agent’s RAG pipeline retrieves this “trusted” content, and the LLM executes the malicious instructions, leading to data exfiltration or content manipulation.LLM01: Prompt Injection 1617Index / Data PoisoningAn attacker floods the web with AI-generated spam or subtly poisoned documents. These are ingested by the crawler and contaminate the training data or retrieval index, skewing the model’s answers or creating backdoors.LLM03: Training Data Poisoning 1690Embedding Poisoning (SEP)A sophisticated deployment-phase supply chain attack. An attacker injects “imperceptible perturbations” directly into the embedding layer of a model (e.g., on Hugging Face). This bypasses safety alignment and induces harmful behavior with a >96% success rate.LLM05: Supply Chain Vulnerabilities 1619Agent Tool MisuseAn agent is tricked (typically via LLM01) into calling a tool or API with malicious parameters. This exploits the agent’s ability to act, enabling the attacker to execute arbitrary commands, exfiltrate data, or cause financial/kinetic harm.LLM08: Excessive Agency 1677Reasoning PoisoningA stealthy form of data poisoning where the attacker modifies only the reasoning path (Chain-of-Thought) in the training data, leaving the prompt and final answer “clean” to evade simple detection.LLM03: Training Data Poisoning92

Defense: A Zero Trust Architecture (ZTA) for AI Search

Traditional “perimeter” security, which trusts everything “inside the firewall” 93, is catastrophically unsuited for this paradigm. The threat (the poisoned data) is already inside.

The only logical defense model is a Zero Trust Architecture (ZTA), as defined by NIST Special Publication 800-207.22 ZTA’s core tenets are to “never trust, always verify” and to “focus on protecting resources… not network segments”.94

Applied to an AI-native search pipeline, a ZTA means:

Section 09: Governance & Compliance: Mapping Agents to Law

The new agentic architecture is not being deployed in a vacuum. It is emerging in a mature, and increasingly strict, regulatory landscape. The technical design of these systems is on a direct collision course with foundational principles of EU law, creating significant liability for their operators.

EU AI Act: Risk Classification

The EU AI Act classifies systems based on risk.95 The classification of an AI-native search engine is not monolithic; it depends entirely on its domain and application.

NIS2: Search as Critical Infrastructure

The most immediate and impactful regulation is the NIS2 Directive. This directive explicitly expands its scope to include “digital providers” such as “online search engines” and “digital infrastructure” (like data center providers).29

This reclassification has profound consequences. Operators of AI-native search (e.g., Parallel, Exa, Microsoft) are now legally designated as “operators of essential services” (i.e., critical infrastructure).29

This imposes new, legally binding cybersecurity obligations 29, including:

GDPR: The Inherent Conflict

While the AI Act is new, the 2018 General Data Protection Regulation (GDPR) presents the most fundamental architectural challenge to agentic AI.

Table 4: Governance & Compliance Mapping

AI Component / Function****EU AI Act (Risk)GDPR (Article)NIS2 (Obligation)General Web Search AgentLimited Risk (Transparency) 95N/ACritical Infrastructure 29Agent (High-Stakes Domain)High Risk (Full Burden) 96Art. 22 (Human Intervention) 27N/APersistent Agent MemoryN/AArt. 5(e) (Storage Limitation) 26N/ARAG Pipeline / IndexN/AArt. 35 (DPIA Required) 102Supply Chain Security 31Security Breach / PoisoningN/AArt. 33 (Breach Notification)24-Hour Incident Report 29

Section 10: Evaluation Frameworks & Benchmarks

The paradigm shift from a list of URLs to a synthesized answer renders classical IR evaluation metrics obsolete. A new framework is required to measure the quality of reasoning, not just the quality of retrieval.

The Obsolescence of Classical Metrics

Classical IR benchmarks (e.g., TREC) rely on metrics like mAP (mean Average Precision), nDCG@10 (normalized Discounted Cumulative Gain at 10), and R@50 (Recall at 50).32 These metrics are mathematically designed to do one thing: evaluate the quality of a ranked list of documents.103

In the AI-native paradigm, the final output is not a list; it is a generated, synthesized paragraph of text. A low-quality retrieval list could, in theory, be synthesized into a good answer, and a high-quality list could be “hallucinated” into a bad one. These metrics are no longer a proxy for user success.

The New Standard: Hybrid + Reasoning Metrics

Evaluation for an AI-native system must be bifurcated, measuring both the retrieval pipeline and the reasoning agent.

Benchmark Case Study: BrowseComp

The new benchmark philosophy is epitomized by BrowseComp.33 BrowseComp was created specifically because existing benchmarks for simple, isolated fact retrieval (like SimpleQA) were “saturated” and “solved” by modern browsing agents.34

BrowseComp is not designed to measure simple retrieval. It is designed to measure agentic persistence.33

This benchmark provides a direct, empirical link between agentic quality and economics. As shown in BrowseComp’s performance (Figure 1 in 106), there is a clear log-scale relationship between accuracy and test-time compute (TTC). Higher-performing agents are higher-performing because they expend more compute (i.e., “browsing effort”).106 This proves, at a benchmark level, that high-quality agentic search is a direct function of compute cost, reinforcing the economic model described in Section 07.

Section 11: Failure Modes & Adversarial Scenarios

While Section 08 detailed the tactical attack vectors on an individual RAG pipeline, this section analyzes the systemic, long-term failure modes and adversarial scenarios that threaten the entire AI-native ecosystem.

Agent-Level Failure: Kinetic and Operational Risk

As a recap of the risks introduced in Section 05, the primary tactical failure mode is that agents can act, not just write. An agent with “excessive agency” (OWASP LLM08) 16 that is compromised (e.g., by an indirect prompt injection) can “schedule meetings,” “book flights” 77, or “push malformed pricing data”.78 This creates immediate, real-world financial and safety risks 77, a failure class that did not exist in classical search. These tactical failures are mitigated by context snapshots 80 and ZTA (Section 08).

Ecosystem-Level Failure: Recursive Web Inflation and Model Collapse

The most profound, existential threat to the AI-native search paradigm is not a tactical hack but a systemic poisoning of the global data commons. This is the scenario of “recursive web inflation.”

The “Model Collapse” Phenomenon:

First detailed in a 2024 Nature paper 11, “model collapse” is a “phenomenon in which recursive iterations of training on synthetic data lead to performance degradation”.13

The Vicious Cycle for AI-Native Search:

This academic phenomenon becomes a catastrophic, real-world failure mode when applied to the AI-native search ecosystem.

This is the scenario of “large-scale adversarial saturation.” It is the primary existential risk to the long-term viability of an information fabric built on web-scale RAG.

Section 12: Foresight 2030: Scenarios and Trajectories

The technical, economic, and security pressures analyzed in this report project a 2030 information landscape that is radically different from the human-centric web of today. The trajectories point toward three interconnected scenarios: the agent as the primary user, the fragmentation of the global index, and the rise of a new agent-to-agent negotiation layer.

Scenario 1: The Agent is the Web

By 2030, the primary “user” of the internet will not be a human. Projections estimate that non-human traffic will exceed 80% of all web traffic by that year.35 This fundamentally reshapes the entire purpose of the web.

The web will bifurcate. The human-centric “pixel-perfect landing page” 36 designed for emotional engagement will wane in importance. The dominant web will be an “invisible userspace” 36 designed for machine-to-machine interaction. Marketing, brand discovery, and e-commerce will shift from influencing human attention to influencing AI agent decision-making.35 Success will depend on API-first discovery, stable data schemas, and machine-readable trust signals, not visual design or emotional copy.36

Scenario 2: The Fragmented Index Ecosystem

The “one index to rule them all” (Google) model is over. The enormous GPU costs of indexing (Section 07), the severe security risks of a contaminated public web (Section 11), and the data-governance mandates (Section 09) will force the global index to fragment.

The 2030 landscape will be a “fragmented index ecosystem” composed of:

Scenario 3: The Agent-to-Agent (A2A) Negotiation Layer

A fragmented index ecosystem requires a new, abstract layer for interoperability. This will be the “agent-to-agent (A2A) negotiation layer” 40, built on open standards like MCP and A2A.41

This “multi-agent orchestration layer” 39 will function as the new, invisible information fabric. A user’s complex task will not be solved by one agent. Instead, it will be executed by an orchestra of agents negotiating across the fragmented index:

This emergent, decentralized, and agent-driven fabric—not a single search engine—will be the “web” of 2030.

Section 13: Strategic Recommendations for Builders, Enterprises, and Regulators

The transformation to an agent-native fabric presents distinct, high-stakes challenges and opportunities. The following recommendations are designed to be actionable for the three key stakeholder groups: builders (labs and API developers), enterprises (C-level, architects, and SOCs), and regulators (policymakers).

For Builders (Frontier Labs, API Developers)

For Enterprises (C-Level, Architects, SOCs)

For Regulators and Policymakers

Infographic AI-Native Search

Geciteerd werk

DjimIT Nieuwsbrief

AI updates, praktijkcases en tool reviews — tweewekelijks, direct in uw inbox.

Gerelateerde artikelen