How AI Systems Can Violate DPDPA, Without You Knowing

India’s Digital Personal Data Protection Act (DPDPA), 2023 is now the law of the land, and most organisations are still scrambling to understand what that means for their AI systems. Here is the unsettling truth: your AI may already be breaking the law, quietly, automatically, and at scale.

This is not about malicious intent. It is about the structural gap between how modern AI systems are designed and what DPDPA demands. Understanding that gap is the first step to closing it.

What DPDPA Actually Demands

Before examining the risks, it helps to be precise about what the Act requires. At its core, DPDPA builds on three pillars:

- Consent-based processing, tied to a specific, stated purpose
- Rights to access, correction, grievance redress, and erasure for every individual
- Strict rules around children’s data, including verifiable parental consent

Most of these requirements sound straightforward when applied to a human analyst. They become dramatically complicated when the actor is an AI system making thousands of decisions per second.

Six Ways AI Systems Silently Violate DPDPA

Training Data That Was Never Yours to Use

Many AI models, including in-house fine-tuned systems, are trained on historical customer data. If that data was collected under terms that predate DPDPA, or was gathered for a different purpose entirely, the training itself constitutes a violation.

The Act does not grandfather in old data. If you are processing personal data today, even inside a model-training pipeline, DPDPA applies. Organisations frequently overlook this because model training feels like an internal technical process; it is not.
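One practical control is to filter training records by consent scope before they ever reach the pipeline. The sketch below is illustrative only: the record fields, purpose names, and cut-over date are assumptions, not terms from the Act, and the real check should be defined with counsel.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical record shape; field and purpose names are illustrative.
@dataclass
class CustomerRecord:
    user_id: str
    consented_purposes: set   # purposes the user agreed to, e.g. {"billing"}
    consent_date: date        # when the current consent was captured

# Placeholder cut-over date; use the date your counsel treats as authoritative.
DPDPA_EFFECTIVE = date(2023, 9, 1)

def eligible_for_training(record: CustomerRecord) -> bool:
    """A record enters the training set only if consent explicitly covers
    model training and was captured under the current notice regime."""
    return (
        "model_training" in record.consented_purposes
        and record.consent_date >= DPDPA_EFFECTIVE
    )

records = [
    CustomerRecord("u1", {"billing", "model_training"}, date(2024, 3, 1)),
    CustomerRecord("u2", {"billing"}, date(2024, 3, 1)),         # no training consent
    CustomerRecord("u3", {"model_training"}, date(2022, 1, 1)),  # pre-DPDPA consent
]
training_set = [r for r in records if eligible_for_training(r)]
# Only u1 passes: it has both training consent and a post-cut-over consent date.
```

The point of the gate is that exclusion is the default: a record without an affirmative, current training consent never leaves the source system.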

Feature Extraction and Shadow Profiling

Modern machine learning systems are voracious. They extract features (behavioural patterns, inferred attributes, predicted propensities) that far exceed what a user consented to share. A customer who signed up for a newsletter may find their browsing behaviour, purchase timing, and location signals feeding a scoring model they have no knowledge of.

This is shadow profiling: building a rich data portrait of an individual using inference rather than direct collection. DPDPA’s data minimisation principle makes this a direct violation, even if no raw personal data leaves your servers.
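One way to operationalise data minimisation here is to tie every derived feature to the consent purpose that justifies it, and compute only what the user's consents cover. The feature and purpose names below are hypothetical; the mapping itself would come from your privacy review.

```python
# Illustrative mapping: each derived feature requires a specific consented
# purpose. Feature and purpose names are hypothetical.
FEATURE_REQUIRED_PURPOSE = {
    "newsletter_topic_affinity": "newsletter",
    "purchase_propensity_score": "marketing_profiling",
    "location_cluster": "location_analytics",
}

def allowed_features(user_consents: set) -> set:
    """Return only the features whose required purpose the user consented to."""
    return {
        feature
        for feature, purpose in FEATURE_REQUIRED_PURPOSE.items()
        if purpose in user_consents
    }

# A newsletter signup grants only the newsletter purpose, so propensity
# scores and location clusters are never computed for this user.
features = allowed_features({"newsletter"})
```

Inverting the default in this way (features are forbidden unless a purpose covers them) is what keeps the profile from silently outgrowing the consent.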

Third-Party Model APIs and Uncontrolled Data Flows

Organisations routinely send personal data (sometimes entire customer records) to third-party AI APIs for processing. Every such transfer is a data-sharing event under DPDPA. It requires a valid basis, and the receiving party must operate as a Data Processor under a binding agreement.

In practice, this is almost never documented correctly. Engineering teams integrate AI APIs to solve business problems fast. Legal and compliance teams find out months later, if at all. By then, personal data has transited multiple jurisdictions, been used to improve the vendor’s model, and cannot be recalled.
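A minimal engineering safeguard is to redact obvious identifiers before any payload leaves for a third-party API. The sketch below is deliberately crude: the two regex patterns are illustrative and nowhere near exhaustive (note that the name "Asha" still leaks), and redaction does not replace a processor agreement or a valid legal basis.

```python
import re

# Illustrative patterns only; a production redactor needs a reviewed PII
# taxonomy, not two regexes.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\b\d{10}\b"),
}

def redact(text: str) -> str:
    """Replace matched identifiers with typed placeholders before the
    payload is sent to any external model API."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text

payload = "Contact Asha at asha@example.com or 9876543210 about her claim."
safe_payload = redact(payload)
# safe_payload: "Contact Asha at <email> or <phone> about her claim."
```

Even a crude gate like this, placed in the single code path through which all outbound API calls flow, makes the data-sharing event visible and auditable instead of scattered across services.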

Automated Decision-Making With No Explainability

DPDPA grants Data Principals the right to know that automated decisions are being made about them and to seek human review or explanation. This directly implicates AI-driven credit scoring, job application screening, insurance underwriting, and content moderation.

Most neural network models, even those your own teams built, cannot produce the kind of plain-language explanation DPDPA envisages. A probability score is not an explanation. A feature importance chart is not accessible to a layperson. When a model denies a loan or flags an account, the organisation must be able to say why, clearly, to the person affected.
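One common pattern is to translate a model's top contributing features into pre-approved, plain-language reason codes before anything reaches the Data Principal. The feature names, scores, and wording below are hypothetical; the mapping would be drafted and approved by compliance, not generated by the model.

```python
# Hypothetical reason-code mapping, drafted and approved by humans.
REASON_CODES = {
    "debt_to_income_ratio": "Your existing repayments are high relative to your income.",
    "recent_missed_payment": "A payment was missed on an account in the last 12 months.",
    "short_credit_history": "Your credit history is shorter than our minimum threshold.",
}

def explain_decision(feature_contributions: dict, top_n: int = 2) -> list:
    """Translate the top-N contributing features into plain-language reasons.
    Features without an approved reason code are skipped, not improvised."""
    ranked = sorted(feature_contributions, key=feature_contributions.get, reverse=True)
    return [REASON_CODES[f] for f in ranked[:top_n] if f in REASON_CODES]

reasons = explain_decision({
    "debt_to_income_ratio": 0.41,
    "short_credit_history": 0.12,
    "recent_missed_payment": 0.33,
})
# Top two contributors are debt_to_income_ratio and recent_missed_payment,
# so two approved sentences are returned.
```

Note the design choice: the system refuses to explain a feature it has no approved wording for, rather than exposing raw importances to the person affected.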

Data Retention Buried in Model Weights

Here is the subtlest risk of all. When a model is trained on personal data, that data does not simply disappear. Research has demonstrated that models can memorise and later regurgitate specific data points (names, addresses, account numbers) that appeared in their training corpus.

DPDPA’s erasure obligations apply to all forms of data retention. If a user exercises their right to erasure and your model has effectively memorised their data, you may be non-compliant even after deleting every database record. This is a frontier problem that most legal teams are not yet equipped to handle.
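A crude but useful control is a regurgitation probe: after erasing a user's records, prompt the model with a prefix of their data and check whether it completes with the deleted value. The sketch below assumes a `generate` text-completion callable and uses a toy stand-in model for demonstration; passing this probe is evidence, not proof, that the data is gone.

```python
def leaks_deleted_value(generate, prompt_prefix: str, deleted_value: str) -> bool:
    """Probe the model with a prefix of the erased record and flag the
    case where it completes with the value that should have been deleted.
    A negative result does NOT prove the model has forgotten the data."""
    completion = generate(prompt_prefix)
    return deleted_value in completion

# Toy stand-in "model" for demonstration: pretend it memorised one record.
def toy_model(prompt: str) -> str:
    memorised = {"Account number for R. Sharma is": " 1234 5678 9012"}
    return memorised.get(prompt, " [no completion]")

flagged = leaks_deleted_value(
    toy_model, "Account number for R. Sharma is", "1234 5678 9012"
)
# flagged is True: the toy model regurgitates the "erased" account number.
```

In practice such probes run as a batch over many prompt variants per erasure request, and a positive hit triggers retraining or an unlearning procedure rather than a quiet log entry.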

Children's Data Processed Without Verification

DPDPA imposes some of its strictest rules around data belonging to children, defined as anyone under the age of eighteen. Organisations must obtain verifiable parental consent and are prohibited from profiling or tracking children.

AI-powered recommendation engines, gaming platforms, and edtech applications routinely process data from users whose age has never been verified. Age-gating through a checkbox does not constitute verifiable consent. If your AI system serves any digital product where children might plausibly be users, this is a live exposure.
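The engineering consequence is that profiling must default to off until a real verification step (not a self-declared checkbox) has established the user's age. A minimal sketch of that gate, with illustrative field names:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class UserAgeStatus:
    self_declared_adult: bool      # the checkbox; NOT sufficient under DPDPA
    verified_age: Optional[int]    # set only by an actual verification flow

def profiling_permitted(status: UserAgeStatus) -> bool:
    """Profiling and tracking stay disabled unless age has been verified
    as 18 or over. The self-declared checkbox is deliberately ignored."""
    return status.verified_age is not None and status.verified_age >= 18

checkbox_only = UserAgeStatus(self_declared_adult=True, verified_age=None)
verified_adult = UserAgeStatus(self_declared_adult=True, verified_age=34)
# checkbox_only is blocked from profiling; verified_adult is not.
```

Because the checkbox field is ignored entirely, a product cannot drift back into age-gating by self-declaration without a deliberate code change.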

Why Organisations Miss These Risks

The violations above rarely result from negligence. They emerge from a structural mismatch between how AI is built and how privacy law is written.

AI development is iterative and fast. Data privacy compliance is deliberate and slow. The teams responsible for each rarely speak the same language. Engineers think in pipelines and model accuracy; compliance officers think in consent flows and data subject rights. Without a shared framework, gaps multiply invisibly.

Add to this the complexity of modern AI stacks (pre-trained base models, third-party APIs, vector databases, and retrieval-augmented systems) and the data lineage question becomes genuinely difficult. Knowing where personal data lives, and in what form, is hard. Knowing when it has been processed in violation of DPDPA is harder still.

What Compliance Actually Looks Like

Achieving DPDPA compliance for AI systems is not a one-time audit. It is an ongoing programme built around four disciplines:

- Data mapping: knowing what personal data your AI systems touch, and where it flows
- Legal-basis management: a documented, valid basis for every processing activity
- Explainability: being able to account for every automated decision in plain language
- Erasure capability: honouring deletion requests fully, including data retained in models

None of this is simple. But the penalties under DPDPA are significant (up to Rs 250 crore per breach) and regulators have made clear that AI-driven processing will receive close scrutiny.

The Bottom Line

DPDPA does not distinguish between a human employee reading a file and an AI model processing the same information at scale. The law sees both as data processing, and demands the same standards of consent, purpose, and accountability.

The organisations that will navigate this well are those that start asking hard questions now: What personal data does our AI touch? On what legal basis? Can we explain every automated decision? Can we erase a person’s data, truly and completely, when they ask?

If the honest answer to any of those questions is “we are not sure,” that is where compliance work begins, and that is where ComplyPlanet can step in as your compliance partner.

Start early and let ComplyPlanet help you build a compliant, secure, and privacy-driven future.