
Reframing AI regulation at its core


The architecture of contemporary artificial intelligence governance faces increasing strain, revealing inherent limitations in its foundational assumptions 6. Despite numerous regulatory endeavors aimed at ensuring the ethical and responsible development and deployment of AI, these initiatives frequently remain constrained by a perspective that treats artificial intelligence as a self-contained technological entity. This viewpoint often fails to adequately recognize that AI is the product of intricate sociotechnical processes deeply interwoven with asymmetrical power dynamics. This article posits a fundamental reorientation of AI governance, asserting that data sovereignty should serve as the bedrock for establishing enforceable ethics and genuine legitimacy in this rapidly evolving field 12. This is not merely a critique of existing paradigms; it is a proposition for a comprehensive reconceptualization of how we approach the regulation of artificial intelligence.

I. From model-centricity to input ontology

The prevailing understanding that underpins AI regulation largely assumes that ethical considerations primarily arise and are addressed within the AI model itself 2. Consequently, the focus of regulatory efforts tends to gravitate towards the point at which decisions are made by AI systems. This model-centric epistemology manifests in a preoccupation with concepts such as explainability, fairness auditing of algorithms, and the deployment of transparency dashboards intended to illuminate the inner workings of AI. However, this article challenges this dominant assumption by shifting the analytical lens to the genesis of ethical concerns. It argues that the origins of ethical failures in AI systems are more accurately located within the data pipeline that feeds these models, rather than solely within the model’s parameters or decision-making processes.

The regulatory fallacy of model-centrism

Current regulatory approaches predominantly foreground the AI model as the primary target for intervention and oversight. Within this framework, harms generated by AI are conceptualized as statistical anomalies to be smoothed out through technical adjustments to the model, rather than as systemic reflections of deeper issues embedded in the data on which these models are trained. The result is an oversupply of tools and techniques for explaining the decisions AI systems make, without ever fundamentally questioning the legitimacy or the ethical implications of the data that forms the foundation of those decisions. While model-centric approaches may offer a semblance of accountability at the level of outputs, they rarely scrutinize the underlying biases, representational imbalances, and questions of consent present in the training data 23. The focus on the model thus diverts attention from the more fundamental ethical questions about the data that shapes its behavior 1. This emphasis on model management, noted in data science teams' perspectives on AI governance, reflects the prevailing tendency to treat the model as the central element requiring governance 2. Yet the approach has inherent limits: performance gains achieved solely through model adjustments eventually plateau, and biases and other critical issues in the data itself go overlooked 23. Moreover, AI governance tools cannot ensure full accuracy precisely because of biases in data and algorithms 7.

Data sovereignty as foundational construct

This article calls for a fundamental shift in perspective, a Kuhnian paradigm shift, in how we approach AI governance. This shift entails moving away from a primary focus on downstream accountability, which seeks to guard the decisions made by AI, towards an emphasis on upstream legitimacy, which focuses on governing the inputs that shape AI behavior. The core of this proposed reframing is the principle of data sovereignty. Sovereignty over data—encompassing its origins, the conditions under which it is captured, the modes of consent obtained for its use, and the embedded rights associated with it—must become the epistemic core around which regulatory knowledge and governance frameworks are organized. Data sovereignty, in this context, signifies that data should be governed by the laws and regulations of the jurisdiction where it originates 12. This concept extends beyond mere compliance with legal frameworks to encompass the rights of individuals and communities to control their data, including how it is collected, stored, processed, and used 16. In the realm of AI, embracing data sovereignty as the foundational construct for regulation offers the potential to decentralize innovation, ensuring that AI systems are trained on diverse and representative datasets that reflect the populations they serve 13. This epistemological reframing is not simply a matter of adding nuance to existing approaches; it fundamentally alters the structural assumptions upon which current AI governance regimes are built.

II. Methodological innovation and strategic transgression

A genuine transformation in AI governance necessitates more than incremental updates to existing regulations; it demands a methodological rupture with the prevailing orthodoxies of AI oversight. This article disrupts the conventional approach by demonstrating that the current methodological canon of AI oversight, which includes explainability, risk-tiering, and fairness metrics, rests on an unsustainable distinction between inputs (data) and outputs (model decisions).

Deconstructing the myth of contained ethics

The methodologies currently employed in AI governance often operate under the implicit assumption that ethical considerations can be contained and managed primarily at the level of the AI model’s outputs. However, this perspective overlooks the critical role of the data that serves as the foundation for these models. For instance, while explainability seeks to illuminate how an AI model arrives at a particular decision, it typically does not challenge or even inquire into the provenance or the ethical characteristics of the data upon which that decision is based. Similarly, risk-tiering frameworks often treat potential harms arising from AI as probabilistic inevitabilities to be categorized and managed, rather than as sociotechnical constructs that are deeply rooted in the way data is collected, processed, and used. These methods, while seemingly providing a structured approach to oversight, offer bureaucratic closure rather than genuine epistemic justice. They function as tools for containing perceived risks within the existing paradigm, rather than as instruments for fundamentally challenging and transforming the underlying assumptions and practices that lead to ethical concerns in the first place. The limitations of AI governance tools in ensuring full accuracy due to biases in data and algorithms further underscore the inadequacy of a purely output-focused approach 7. The narrow focus on models in current regulations also neglects the critical role of data in shaping AI capabilities and potential harms 1.

Speculative but actionable governance frameworks

To move beyond the limitations of current methodologies, this article ventures into the speculative realm, proposing governance frameworks that are not only feasible but also have the potential to be truly transformative. These proposals are intended as provocations towards a governance system that is more aligned with the realities of technological power and the fundamental importance of data in shaping AI outcomes.

AI input certification

Drawing inspiration from the well-established practice of nutritional labeling for food products, this framework proposes the implementation of “AI Input Certification”. These certifications would function as informative labels for AI datasets, providing crucial details about their origin, the provenance of consent obtained for their use, and an assessment of their representational balance across different demographic groups. Such a system would aim to bring a level of transparency to the often-opaque data supply chain of AI, enabling developers, regulators, and the public to make more informed decisions about the datasets being used to train AI models. By making the characteristics of training data visible and comparable, AI Input Certification could potentially foster a market demand for ethically sourced and well-governed data 29. Initiatives like the Data Provenance Initiative and the development of Data Provenance Standards by organizations such as the Data & Trust Alliance already represent steps in this direction, aiming to ensure metadata about the sourcing, quality, and permissions of datasets are provided in a consistent manner 31. These efforts underscore the growing recognition of the need for greater transparency in AI training data. Frameworks like IEEE CertifAIEd also aim to provide ethical certification for AI systems, potentially extending to data inputs 35.
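As an illustration only, such a dataset "label" can be sketched as a small schema. The `DatasetLabel` class, its fields, and the skew calculation below are hypothetical constructs, not an existing certification standard:

```python
from dataclasses import dataclass, field


@dataclass
class DatasetLabel:
    """Hypothetical 'nutrition label' for an AI training dataset:
    origin, consent basis, and representational balance made visible."""
    name: str
    origin: str                 # jurisdiction / source of collection
    consent_basis: str          # e.g. "opt-in", "contract", "scraped"
    group_shares: dict[str, float] = field(default_factory=dict)

    def representational_skew(self, population_shares: dict[str, float]) -> dict[str, float]:
        """Per-group difference between the dataset's share and the
        share of that group in the population the system will serve."""
        return {group: self.group_shares.get(group, 0.0) - share
                for group, share in population_shares.items()}


label = DatasetLabel(
    name="speech-corpus-v1",
    origin="EU",
    consent_basis="opt-in",
    group_shares={"group_a": 0.70, "group_b": 0.30},
)
skew = label.representational_skew({"group_a": 0.50, "group_b": 0.50})
# positive skew = over-represented, negative = under-represented
```

A regulator or procurement office could then compare such labels across datasets in the same way nutritional labels are compared across products.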

Self-governing data capsules

Another proposed framework involves the concept of "Self-Governing Data Capsules": portable data units embedded with smart contracts. These smart contracts would specify the allowable uses of the data contained in the capsule, set expiration dates for its use, and define protocols for revoking access rights. This approach envisions data as an active entity, imbued with embedded rights and usage rules that are automatically enforced through smart contracts 3. The idea aligns with broader trends towards user-centric data ownership and control, potentially leveraging blockchain technology to create secure and transparent data assets 41. Smart contracts, self-executing agreements whose terms are written directly into code, offer a mechanism to automate and enforce these data usage rules 3. Implementing such a system nonetheless raises challenges, including securing the capsules, managing data quality and consistency, and defining ownership 42. This framework could empower individuals and communities to maintain control over their data even after it has been shared for purposes such as AI model training.
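A minimal sketch of the idea follows, with the smart-contract enforcement replaced by ordinary Python for illustration; the `DataCapsule` class and its rule set are hypothetical:

```python
from datetime import datetime, timezone


class DataCapsule:
    """Sketch of a self-governing data capsule: the payload is released
    only when the embedded usage rules allow it. In a deployed system
    these checks would run in a smart-contract runtime, not here."""

    def __init__(self, payload, allowed_purposes, expires_at):
        self._payload = payload
        self.allowed_purposes = set(allowed_purposes)
        self.expires_at = expires_at
        self.revoked = False

    def revoke(self):
        """The data subject withdraws access for all future uses."""
        self.revoked = True

    def access(self, purpose, now=None):
        """Release the payload only if every embedded rule is satisfied."""
        now = now or datetime.now(timezone.utc)
        if self.revoked:
            raise PermissionError("access revoked by data subject")
        if now >= self.expires_at:
            raise PermissionError("capsule expired")
        if purpose not in self.allowed_purposes:
            raise PermissionError(f"purpose '{purpose}' not permitted")
        return self._payload


capsule = DataCapsule(
    payload={"heart_rate": [62, 71]},
    allowed_purposes={"medical-research"},
    expires_at=datetime(2099, 1, 1, tzinfo=timezone.utc),
)
capsule.access("medical-research")     # permitted under the embedded rules
capsule.revoke()
# any further capsule.access(...) call now raises PermissionError
```

The design point is that the rules travel with the data rather than living in the consumer's terms of service.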

Real-time consent infrastructure

Recognizing the limitations of static, one-time consent models in the dynamic context of AI development and deployment, this article proposes the creation of a “Real-Time Consent Infrastructure”. Such systems would allow users to dynamically update, revoke, or extend their data sharing rights at any point in time, even after their data has been used for model training 47. This approach acknowledges that consent is not a fixed state but rather an ongoing and evolving relationship between data subjects and data users 53. The development of advanced Consent Management Platforms (CMPs) that incorporate AI capabilities to personalize privacy experiences and adapt to regulatory changes represents a technological foundation for such a system 47. Furthermore, regulatory initiatives like the EU AI Act, which emphasizes informed, explicit, and freely given consent, and legislative proposals such as the AI CONSENT Act in the US, which aims to require companies to obtain consent for using consumer data to train AI systems, highlight the growing importance of robust consent management in the age of AI 57. However, implementing a real-time consent infrastructure faces challenges such as ensuring transparency, managing consent across multiple channels, and adapting to changing regulations 56. Real-time consent infrastructure could significantly enhance user autonomy and build trust in AI systems by providing individuals with continuous control over their personal information.
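The core mechanic, consent consulted at every use rather than fixed once at collection, can be sketched as follows. The `ConsentLedger` class is a hypothetical illustration, not the API of any existing consent management platform:

```python
class ConsentLedger:
    """Minimal sketch of a real-time consent registry: the current
    consent state is checked at every use, and grants can be updated
    or revoked by the data subject at any time."""

    def __init__(self):
        self._grants = {}   # (subject_id, purpose) -> bool
        self.history = []   # append-only audit trail of consent changes

    def grant(self, subject_id, purpose):
        self._grants[(subject_id, purpose)] = True
        self.history.append(("grant", subject_id, purpose))

    def revoke(self, subject_id, purpose):
        self._grants[(subject_id, purpose)] = False
        self.history.append(("revoke", subject_id, purpose))

    def is_permitted(self, subject_id, purpose):
        """Evaluated against the ledger's state *now*, not at collection."""
        return self._grants.get((subject_id, purpose), False)


ledger = ConsentLedger()
ledger.grant("user-42", "model-training")
ledger.is_permitted("user-42", "model-training")   # True at this moment
ledger.revoke("user-42", "model-training")         # takes effect immediately
```

The audit trail matters as much as the current state: it is what turns consent from a checkbox into an inspectable, ongoing relationship.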

These proposed frameworks are not intended as utopian ideals but rather as concrete provocations towards the development of a governance system that is better aligned with the realities of technological power and the fundamental rights of data subjects.

III. Ingestion, not cognition

The most consequential shift proposed by this article is theoretical in nature. It urges a fundamental redefinition of AI ethics, framing it as a problem of ingestion, rather than cognition. The underlying premise of this reconceptualization is that AI models do not spontaneously invent patterns; instead, they ingest, compress, and ultimately regurgitate social logics that are already present within the data they are trained on. Therefore, to regulate AI as if it possesses independent cognitive abilities, rather than as a reflection of the data it consumes, is to fundamentally misunderstand its nature and to pursue ineffective regulatory strategies.

Bias reimagined as upstream negligence

Within this theoretical reframing, the concept of bias in AI is not viewed as an emergent and somewhat unpredictable behavior of the model itself. Instead, bias is understood as a sedimented artifact, a consequence of systemic neglect that occurs upstream in the data pipeline 5. When datasets are collected without obtaining proper consent, when they are stripped of crucial context, or when they are unrepresentative of the populations that will be affected by the AI system, bias is not merely a glitch in the system; it is a design feature, an inherent characteristic baked into the very foundation of the model 66. This perspective aligns with the understanding that bias in AI, especially in machine learning models, often originates from training data that is unrepresentative or incomplete, leading to skewed outputs 66. Furthermore, biases can be introduced at various stages of the AI pipeline, starting with data collection and extending through data labeling and even model training 67. Recognizing bias as a form of upstream negligence underscores the responsibility of all actors involved in the data pipeline to ensure the ethical sourcing, preparation, and curation of data used to train AI models 74.

AI as epistemic mirror

This theoretical shift encourages us to move away from solely interrogating what an AI system is “thinking” or how it is making decisions. Instead, a more critical and revealing question to ask is: who gets to feed it, and under what conditions? The AI model, in this view, is not an autonomous decision-making entity but rather a distorted mirror, reflecting the infrastructural injustices, the economic practices of data extractivism, and the legal permissiveness that characterize the environment from which its training data is drawn 75. This perspective highlights how AI systems can internalize implicit biases from their training data, which often reflects existing societal prejudices 66. The bias observed in AI outputs is not solely an AI problem but a reflection of human biases present in the data 75. Therefore, to understand and address bias in AI, we must critically examine the data itself and the social and political contexts in which it is generated and used.

False dichotomies exposed

This reconceptualization exposes profound contradictions that lie at the heart of modern AI ethics. For instance, the emphasis on explainability seeks to demystify the decisions made by AI systems without necessarily questioning the fundamental right of those decisions to exist, especially if they are based on data that is ethically compromised or inherently biased. Similarly, the focus on compliance often legitimizes the collection of consent at a single point in time, while largely ignoring the ongoing need for continuous data governance and the evolving nature of AI systems and their use cases. These approaches create a false sense of progress by focusing on transparency and adherence to formal requirements without addressing the deeper issues of data legitimacy and the ethical implications of the data itself. In essence, modern AI ethics often seeks clarity in the form of explainability without ensuring the underlying legitimacy of the data that fuels these systems.

IV. Transformative implications and the politics of sovereignty

This article is not concerned with incremental adjustments to the status quo; it advances a constructively insurgent agenda, one that seeks to move beyond mere harm mitigation towards a fundamental redesign of the structural underpinnings of AI governance.

From guardrails to gatekeeping

Current risk-based governance frameworks typically focus on providing “guardrails” for AI systems that are already in deployment, aiming to manage and mitigate potential harms after they have emerged. In contrast, this article argues for a proactive approach that demands we move upstream in the AI development lifecycle—to establish effective “gatekeeping” mechanisms that block the ingestion of unconsented and ethically problematic data before it can even be used to train AI models. This represents a fundamental shift from a reactive stance to a preventative one, encapsulated by the analogy: “stop the arsonist, not just build fire exits”. This approach aligns with the understanding that addressing bias and ensuring ethical considerations are integrated early in the AI development process is crucial 5.
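Such a gate can be sketched as a simple filter applied before any record reaches the training corpus; the `ingestion_gate` function and the consent mapping it consults are hypothetical illustrations:

```python
def ingestion_gate(records, consent, purpose="model-training"):
    """Upstream gate (sketch): a record is admitted to the training
    corpus only if its subject holds an active consent grant for this
    purpose. `consent` maps subject_id -> set of permitted purposes."""
    admitted, blocked = [], []
    for record in records:
        if purpose in consent.get(record["subject_id"], set()):
            admitted.append(record)
        else:
            blocked.append(record)
    return admitted, blocked


# Hypothetical consent state and candidate records:
consent = {"alice": {"model-training"}, "bob": set()}
records = [{"subject_id": "alice", "text": "sample A"},
           {"subject_id": "bob", "text": "sample B"}]
admitted, blocked = ingestion_gate(records, consent)
# alice's record passes the gate; bob's never reaches the model
```

The point of the sketch is where the check sits: before training, at ingestion, rather than in a post-hoc audit of the model's outputs.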

Enforceable digital sovereignty

Envisioning a future where data is truly sovereign requires imagining a world where data itself carries its rights wherever it travels, effectively self-governing through embedded protocols 41. In this vision, consent is no longer a static checkbox marked at a single point in time but rather a continuously negotiable protocol, allowing individuals to maintain ongoing control over their information 53. Furthermore, the right to be forgotten would persist even beyond the training epochs of AI models, ensuring that individuals can effectively withdraw their data and have it removed from AI systems. Such a vision fundamentally transforms data from a passive raw material to an active, right-bearing agent within the digital ecosystem. This aligns with the growing recognition of digital sovereignty as the right to control one’s own data 12.

Democratizing the data economy

Embedding the principle of sovereignty directly at the data layer has profound implications for democratizing the data economy 84. It enables the establishment of community governance structures for culturally sensitive and Indigenous data, empowering these communities to control the collection, use, and interpretation of their information 12. It also fosters greater corporate accountability by enabling lineage audits that can trace the origins and usage of data, as well as mechanisms for retroactive deletion of data when necessary 42. Moreover, it enhances scientific reproducibility by ensuring that the training documentation for AI models includes detailed and traceable information about the data used 31. Perhaps the most radical proposal stemming from this perspective is to fundamentally flip the current value proposition, creating market mechanisms that make ethically sourced and sovereign data more valuable than data that has been scraped without consent or proper governance 86. This could involve introducing pricing mechanisms, taxation policies, or even tokenization strategies that reward organizations for adhering to high standards of data ethics and sovereignty 90.
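A lineage audit of this kind could rest on a tamper-evident chain of provenance entries. The sketch below, with hypothetical `provenance_entry` and `verify_chain` helpers, shows the hash-linking idea; a real system would add signatures and distributed storage:

```python
import hashlib
import json


def provenance_entry(prev_hash, actor, action, dataset_id):
    """One link in a tamper-evident lineage chain: each entry hashes
    its predecessor, so any rewrite of history breaks the chain."""
    body = {"prev": prev_hash, "actor": actor,
            "action": action, "dataset": dataset_id}
    digest = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {**body, "hash": digest}


def verify_chain(chain):
    """Recompute every hash and check each link points at the previous one."""
    prev = None
    for entry in chain:
        body = {k: entry[k] for k in ("prev", "actor", "action", "dataset")}
        expected = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True


chain = [provenance_entry(None, "collector", "collect", "corpus-v1")]
chain.append(provenance_entry(chain[-1]["hash"], "lab", "train", "corpus-v1"))
print(verify_chain(chain))  # True
```

An auditor holding only the chain can then answer who collected, transformed, and trained on a dataset, and detect any retroactive edit to that record.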

V. Toward relational AI

This article not only offers a critique of the prevailing AI governance paradigm but also constructs a compelling counter-narrative. It actively resists the dominant portrayal of AI regulation as a purely technical endeavor focused on a race to keep pace with the ever-increasing capabilities of AI models.

Narrative inversion

This counter-narrative proposes a fundamental inversion of the dominant framing.

This shift in perspective calls for a move away from viewing AI as an abstract, autonomous entity and towards understanding it as a technology that is deeply embedded in social and ethical relationships 13. It emphasizes the importance of building AI systems in a manner that is fundamentally relational, grounded in respect for human rights, cultural values, and individual autonomy, rather than in isolation from these critical considerations.

VI. The road ahead

To truly actualize the reframing of AI governance advanced in this article, an ambitious and cross-sectoral research agenda must be pursued. This agenda requires collaborative efforts across various disciplines to address the complex challenges and opportunities presented by a data sovereignty-centric approach to AI regulation.

Technological research

Several key technological research questions need to be explored to support this shift: how can consent protocols be made machine-enforceable at the scale of modern training corpora, and how can the provenance of training data be verified across the full AI pipeline?

Legal research

Legal scholars and practitioners must grapple with critical questions to translate data sovereignty into effective regulation: how can sovereignty claims be enforced across jurisdictions, and how can rights such as revocation and erasure survive a model's training epochs?

Economic modeling

Economic research is needed to explore sustainable models for a data sovereignty-centric AI ecosystem, including the pricing, taxation, and tokenization mechanisms that could make ethically sourced, sovereign data more valuable than scraped data.

Interdisciplinary ethics

Ethical considerations must be at the forefront of this research agenda, from operationalizing community governance of culturally sensitive and Indigenous data to ensuring that sovereignty mechanisms do not reproduce the extractive dynamics they are meant to replace.

These are not merely theoretical luxuries. They are governance imperatives.

VII. Toward a new social contract

Ultimately, this article’s thesis is not technical—it is philosophical.

Data is not inert. It is political, relational, and alive with rights 16.

To govern AI is not merely to discipline outputs, but to recognize the dignity of inputs. The true subject of AI governance is not the model—it is the human being whose digital traces animate its capabilities 84. This is not a call for more transparency. It is a call for a new social contract—between those who build AI and those whose knowledge, identity, and agency fuel it 86. Until governance begins at the point of ingestion, every ethical safeguard is downstream of the original injustice.
