The continued development of AI models and technology depends on the collection and exploitation of information: both ambient non-personal data and personally identifiable information, much of it (often imperfectly) anonymized personal data. Companies use data derived from online profiling and digital fingerprinting, purchase datasets from data brokers and aggregators, and infer new data from vast existing datasets, all of which can be used to feed AI systems. Social media platforms and online services in particular are equipped to collect, infer and generate data about their users.

All of this presents an acute privacy risk to the individual, who is typically under-equipped to protect their data in any meaningful capacity and under-informed about the scope of the collection occurring and where their data is going. The technical nature of the issue also means that data collection as an economic practice, and the motivations behind it, are opaque and somewhat esoteric to the average user, demanding a level of understanding that cannot reasonably be expected of the average citizen even to give informed consent.

Fundamentally, digital advertising is more effective and more profitable the more data is collected, and data collection is easiest and most effective when the user base providing the data is unaware that the collection is happening. Companies are therefore economically incentivized to keep users uninformed and, if obliged to inform them, to conceal, obfuscate or deliberately present the information in a confusing or inconvenient way so that users do not understand it. Consent under the GDPR is not a sufficient mechanism for combatting this information hoarding: the online environment and its various mechanisms of data collection are too diffuse, spread into every aspect of the modern technological space and therefore nigh-unavoidable, and consenting to data collection in such an environment is meaningless.
Users are expected to personally curate the use of their data via the GDPR rights to object, erasure, rectification and restriction. The contention is that these mechanisms rest on an unfair expectation of knowledge on the part of users and fail to hold data processors adequately accountable: processors are only accountable for providing explanations of their actions to users (largely under-informed and under-powered to challenge them), when they should really be accountable for providing explanations to institutions (well-equipped, staffed by trained professionals and able to wield fines). Processors also only explain some aspects of the data collection. A better, accountability-based transparency requirement would oblige them to communicate the reasoning behind design decisions and the choices made by their content moderation systems, and to explain why they choose to distribute information in a given way, because those decisions are presently informed by processed user data and should therefore be considered an extension of “explaining what we do with your data”.
Intended to remedy this, the data minimization principle of the GDPR requires the use and collection of personal data to be adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed. Article 6(1) presents six lawful bases for collecting and processing data, falling under two main categories: either the processing is consented to for one or more specific purposes, or the processing is necessary, whether for the fulfilment of a contract, compliance with a legal obligation, the protection of a vital interest, the performance of a task in the public interest, or simply for the purposes of ‘legitimate interests pursued by the controller’.
Article 6(1)(f) (legitimate interests) in particular, however, opens up unclear avenues for interpretation. Whilst it prohibits data collection without a legitimate interest, determining what counts as ‘legitimate’ is not straightforward: Is the pursuit of a business’s self-set goals a legitimate aim? Can maximising advertising revenue or training proprietary AI models be considered a legitimate and necessary business objective that justifies extensive data collection? To whom are controllers demonstrating these grounds? Ultimately, who decides which set of interests is more legitimate? The processing is subject to a balancing test in which the interests of the controller must not override the interests or fundamental rights and freedoms of the data subject, but there is explicitly no general way to make value judgements across all possible interests, and whilst the Article 29 Working Party and the GDPR recitals provide guidance on which factors to consider, the guidance on assessing proportionality between the two sets of interests is ultimately left to judgement on a case-by-case basis. What makes the definition confusing to implement is that there is no clear example of what an “illegitimate interest” would be. As it stands, the implication is that any course of action is functionally legitimate until challenged, and even then legitimate so long as it does not violate fundamental rights or freedoms. Although this is arguably fair at a conceptual level, the burden of proving one’s fundamental rights to be under threat in response to what are often minor data infractions will inevitably be discouraging, if only because it is inconvenient. Sending complaints and pursuing escalation to a Data Protection Officer is time-consuming, and many situations which ought to be resolved by a simple “unsubscribe” button instead persist by virtue of being too minor or too inconvenient for the individual to pursue, yet collectively they have contributed to a culture of data collection and, subsequently, data hoarding.
“This is a problem in the same way that dropping an empty Coke can on the street is a problem. Your action alone won’t ruin the neighbourhood, but littering laws exist for a simple reason that even a child can understand: when everyone does it, we end up living in trash.” – L. Jannsen, “Illegitimate Interest”
In practice, this means that the ambiguity of ‘legitimate’ ends up undermining the data subject’s curation controls. Take Article 21, the right to object: here, the data subject can object to processing occurring under Article 6(1)(e) or 6(1)(f) (public interest, official authority, or legitimate interest), and the controller is then obliged to demonstrate ‘compelling legitimate grounds’ for processing which override the interests, rights and freedoms of the data subject. So, under this dynamic:
- The data subject is required to contact the controller with their objection
- The controller is allowed to assess their own legitimacy
- Should the controller refuse the objection, the subject can either give up, or escalate to a data protection authority
- The subject is now reliant on the efficacy of their national data protection authority in order to exercise their rights
First, requiring the data subject to contact the controller creates a biased playing field: the user must know, specifically, whether a company is basing its processing on legitimate interest. The data subject, in the average case, is not technologically savvy, is not aware of the details of the data processing they might otherwise object to, and is not familiar with how to exercise their rights. These details are primarily communicated in lengthy terms-of-service agreements and privacy policies which typically run to thousands of words; the average user is unlikely to read in detail what is ostensibly a legal document and in most cases will simply skip the task. The GDPR obliges controllers to communicate their activities to users in a concise, transparent and easily intelligible way, but when a majority of users are electing to avoid reading the agreement, the information cannot be said to be adequately communicated, or at the very least there is likely to be some flaw in its presentation. One might argue that users have a responsibility to inform themselves, that the controller cannot be held responsible for users not reading the terms communicated to them, and that failing to read the agreement amounts to tacit permission. A counterpoint is that users, statistically, dislike data processing activity when they are actually made aware of it, and that controllers can be interpreted under the GDPR as having a responsibility to ensure the information is understood, user apathy notwithstanding. Possible reasons users are not informing themselves include:
- The agreements are not digestible enough: research suggests that offering shorter policies to read (in addition to the long ones) can improve users’ information acquisition and willingness to read.
- Users can be coaxed into willfully avoiding privacy information: with just minor user-interface abstraction, this can be leveraged to psychologically trick users into trading away their data for little benefit.
- Apathetic outlooks: users may not feel there is a viable alternative anyway.
The relative rarity of objections also creates problems. Many companies will simply take the view that the subject’s situation does not change their assessment of their own legitimacy, and indeed there is a fundamental conflict of interest in letting a controller who benefits from the processing judge what counts as a legitimate aim. Data protection authorities are an insufficient fallback in this regard: not only is there a disparity in efficacy between different national authorities, there is a trend towards DPA inefficacy union-wide. In 2022, EU DPAs combined handled 140,106 proceedings but issued only 1,819 fines, a consequence rate of roughly 1.3%. From this we can infer that DPAs are either receiving overwhelming volumes of false or inaccurate reports, or largely failing to administer fines and deter the behaviour. Both interpretations point to flaws in the system.
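As a quick check, the roughly 1.3% figure follows directly from the two numbers cited above:

\[
\frac{1819}{140106} \approx 0.013 \approx 1.3\%
\]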
Ultimately, the starting premise of the objection mechanism is flawed: subjects are not the parties appropriately positioned to challenge controllers on legitimate interest, because subjects are under-informed. Further, subjects are largely handed a positive responsibility to protect their own data interests in an environment that is biased against them doing so, and successfully challenging a company over processing relating to you does not often extend that victory to others; individuals are largely responsible for their own data relationship with the processor. Such a dynamic works only if the entire process of understanding and asserting one’s rights is not made inconvenient for the individual, from the terms-of-service agreements through to the challenge mechanisms themselves, yet processors and data collectors are incentivised to make it inconvenient precisely because doing so benefits their interests.
