Posts

ESRs presenting their research at ACM FAccT

ESRs Maciej Zuziak, Onntje Hinrichs and Aizhan Abdrassulova collaborative work got accepted for this years’ ACM conference on Fairness, Accountability, and Transparency (ACM FAccT) that was held in Chicago from 12-15th of June 2023. Maciej Zuziak presented their article on “Data Collaboratives with the Use of Decentralised Learning – an Interdisciplinary Perspective on Data Governance” during a paper session dedicated to Privacy.

In their collaboration, the ESRs combined their different research interests in law and machine learning, which enabled them to develop an interdisciplinary perspective on challenges posed by different approaches to data governance. The predominant part of the current discussion in EU data policy is centered around the challenging task of creating a data governance framework where data is ‘as open as possible and as closed as necessary’. In their article, the ESRs further elaborate on the concept of data collaboratives powered by decentralised learning techniques as a possible remedy to the shortcomings of existing data governance schemes.

Data collaboratives have been described as new emerging forms of partnerships where privately held data is made accessible for analysis and where collaboration between participants is facilitated to unlock the public good potential of previously siloed data. [1] They thus fit well the EU policy shift that has taken place over the past years. Whilst for decades, the introduction of exclusive property rights was discussed as a potential tool to empower data subjects with regard to ‘their’ data and to facilitate the emergence of data markets, this changed with the 2020 European Data Strategy. Now, the potential of the data economy should be unlocked by facilitating data access and sharing. The authors presented their concept of data collaboratives powered by decentralised learning techniques, which can be used as a tool to reach the goals of the current EU data strategy: facilitate access to data while protecting privacy and intellectual property interests which individuals and companies might have with regard to ‘their’ data. The collaboration between the ESRs thus reflected the idea behind the LeADS project that solutions to existing tensions in the data economy require an interdisciplinary perspective from both law and data science.

The paper can be accessed in the Digital ACM library via this link.

 

[1] 2020. Wanted: Data Stewards – (Re-) Defining the Roles and Responsibilities of Data Stewards for an Age of Data Collaboration

 

Unchaining data portability

The role of data portability in the EU Digital Strategy

The concept of data portability relates to the characteristic of a set of data to be moved to, from and among applications, operating systems, or devices, with minimal friction.

The European legislator individuated  data portability as fundamental means to develop digital policies that benefit both citizens, giving them higher levels of control over their data, and the market, revamping competition thanks to clearer rules and easier mechanisms for data sharing, access and re-use.

In practice, the possibility for end-users and businesses to move to and from digital service providers seamlessly, without losing content or disrupt their services is, at least in theory, the perfect arena to fuel competition and better services.

Yet the realization of such seemingly simple capability of data is hindered by an unparalleled amount of legal, economical, and technological complications. Data portability is in fact hard to regulate. Its complexity is due to its inbred, bi-parted soul: one half being its economic-driven capacity of impacting market competition, the other being its human-rights-driven capacity of enabling people’s informational self-determination. Additionally, the two souls of data portability are inseparable, meaning that it is impossible to regulate only for data protection leaving competition untouched. The “inseparability of souls“ is a characteristic that in itself impacts the regulation of a right to data portability, because of its cascade implications on multiple domains, ranging from technological interoperability and industrial standardization to human and economic rights, to market competition and consumer’s rights, to policy in data sharing and governance.

Historically, the concept of portability stemmed from “number portability” enshrined in art 30 of the Universal Services Directive. Yet ever since, in the legislative action of the EU legislator, the concept of portability has assumed different objects, affected different subjects, required for new technologies, as well as the development of theoretical frameworks for data sharing and governance to keep pace with the evolution of the European digital market.

Objects of portability legislation have so become personal data (GDPR, proposals for the Digital Governance Act (‘DGA’), Digital Markets Act (‘DMA’) and Data Act –even special categories thereof (such as health data in the proposed European Health Data Space Regulation (‘EHDS’)), non-personal data (Free Flow of non-personal data Regulation ‘FFNPD’, Open Data Directive, and again DSA and DMA), but also the services and online content of European users (Content Portability Regulation and Digital Content Directive (‘DCD’)). Self-evidently, such heterogeneous legislative framework, among Directives and Regulations, realizing diverse political strategies and each with their specific objectives, through horizontal and vertical regulation, over the span of 20 years –and importantly: the last twenty years—in the context of the ever changing economics, technologies and societal structures of the digital ecosystem, have created a legislative omnishambles.

Such legislative omnishambles is extremely hard to navigate –even for legal experts—but  does have legal effects and does create rights and obligations for end-users as well as businesses.

Even within GDPR, the text of art. 20 allows for multiple interpretations of rights and obligations. For instance, it is unclear for users and providers what personal data shall be portable, considering “data provided by the data subject” can be that personally generated, or also that observed by the provider, or even inferred. Also unclear is, which types of formats that are structured, machine-readable, and commonly-used are also functional to ensure interoperability, and what is the legal relationship between interoperability and portability. Another fundamental, unanswered question is about where to trace the line of a “portable minimum” for each service so that the ported data are still meaningful to the data subject. There is in fact difference in being able to port single photos v. albums, or entries of a list of contacts v. a social graph of relationships. In these cases, keeping the structures and the collections of the single data entries can sometimes be as vital to the service as their entries themselves. It is in such cases that the law does not clarify what’s the minimum “data unit”.

Snowballing, the portability of single data or collections thereof might create issues with the “rights of others”, such as privacy or intellectual property. As for privacy, data portability requests may encompass personal data of others, e.g. in the case of contact lists, conversations with, or pictures of others, where it will be hard to balance the interests at stake or even find legitimate grounds for processing. Moreover, when the personal data of others is finally in the control of the requesting data subject, they fall under the data subject’s household exemption. Because of this exemption, they get no longer protected under the GDPR, which is a problem in terms of security –and a big one, should the downloaded datasets contain hundreds of contacts of vulnerable subjects or hazardous content, or be sent via insecure means. As for intellectual property, there will be cases where pictures were adjusted with proprietary filters, or collections created by the service provider on the basis of the generated or observed subject’s personal data, or content generated by a “prosumer”. These cases raise questions of legitimacy of the portability requests and need legal and technical answers.

Nowadays, the reality of fact is that, on one side, the majority of people do not know about the existence of their right to data portability, or would anyways not know how to enjoy it. On another side, there are no clear rules for service providers on how to address such requests without infringing somewhere down the line some stakeholders’ legitimate interests, nor how to create portability services that comply with all the potentially applicable rules. And finally, the European and national courts as well as Regulatory Authorities have not yet clarified questions on portability –only the case UK drivers v. Ola decided on a data portability requests, but without solving any of the above; same holds for the guidelines adopted by the WP29/EDPB.

“Politics” of data portability

Data portability is not a necessary function in most information systems. It is instead a function that the architects of an information exchange system may want to, or are obliged to embed —by a regulatory constraint in the case of Art. 20 GDPR. This means that, decisions on the existence and extent of data portability functions are the result of a normative decision of either the developers or, in our case, the legislator. Such decisions relate to how should (what?) data (personal, non-personal, content, etc…) be governed, who should decide what to do with it, and who should be responsible for making that possible. With the capability of enabling or hindering such decisions, data portability is a crucial means within the realm of “politics of data”.

The political goal of the EU legislator is to unchain the power of European data first by destabilising the de facto ownership over data by the –mostly American—information industry, and then using data the “European way”, that is fairly, securely, to the benefit of its people and businesses, and in respect of fundamental rights. To operationalize such “data sovereignty” strategy, the results of a heated political and academic conversation about governance models have rewarded forms of data openness, access, sharing and re-use, which are allegedly better at reaching goals of information privacy, innovation and competition, as opposed to data property, ownership, and other exclusivity models.

Technologies for data portability

From a technological perspective, to realize portability is quite impractical. Data migration and re-adaptation does not happen smoothly and the lack of mandated top-down coordination from the EU is not helping the standardisation process. As for data formats, practical research on portability requests showed that respondents favour some types of data formats depending on their field of service, of which only a few are GDPR compliant according to the interpretation of the U.K. ICO. As for information systems enabling portability, the EU is moving on multiple fronts. First is the upcoming roll-out of the vertical European Data Spaces, with the Health one already at proposal stage. Additionally, the development of personal information management systems (‘PIMS’) is underway, which will allow users to be “holders” of personal information to manage in secure, local or online storage systems, and to share them at will. Reading from the EDPS’ TechDispatch 3/2020,

“PIMS can usually offer personal data and other metadata describing their properties in machine readable formats, as well as programming interfaces (APIs) for data access and processing. This last feature implies the use of standard policies and system protocols. This is an essential element, the lack thereof currently also represents a limit for PIMS adoption.”

Private projects such as Nextcloud, Solid, and MyData have made promising steps toward portability-enabling systems, but have not reached a level of technology readiness to allow for market acceptance and critical adoption. Unsurprisingly, there are open-source, industry-led initiatives, championed by the Data Transfer Project of Google, Meta, Twitter and Apple, which aim to ensure the entering the market of products and services to address the consumers’ requests of downloadable user data in structured, commonly used formats (Google Takeout, etc.), as well as of direct, seamless data portability from one service to another. These projects, however, may encounter legal obstacles in European competition law ex art. 102 TFUE, but also political ones: the consideration of the EC policy agenda regarding data governance altogether excludes that American big-tech players will unilaterally establish the de facto standard data formats and systems for data portability.

 

Conclusions

Under normal circumstances, and considering the results of the EC’s impact assessments, the interplay of the mentioned regulatory efforts should shape a digital market that benefits everyone. The EU has seemingly found a silver bullet that makes market players and consumers happy, both economically and in the respect of the fundamental rights to privacy, intellectual property, and fairness in the distribution of data value.

In reality, nobody seems interested in using such silver bullet. Why is that?

After months of research, my educated guess is that the reasons are to be found in a mixture of the following:

  • From a market perspective, portability has potentially disruptive, market-wide economic effects mostly stacked against big market players. The EU has been extremely careful in (not) imposing rules and technologies for full harmonization, with a hope that multi-stakeholderism could find its ways. Citing Alek Tarkowski from the Open Future Foundation “no one tried hard to make it work, while others tried very hard not to make it work”.
  • From a law and economics perspective, although it is said that portability will benefit users and businesses, there have not been exhaustive and conclusive economic analyses providing evidence of benefits for big tech companies, nor for Small and Medium Enterprises.
  • From a regulatory perspective, the careful, delicate approach of participatory regulation and technological neutrality has been excessively open, creating uncertainties that have benefited the maintenance of the status quo –meaning, the monopolistic control over data of big tech players.
  • From a technological perspective, there remains the need to develop information systems enabling data portability. The problem is that, in privacy engineering, such development starts with the identification of the requirements, both legal and technical, and in such a moment of regulatory turmoil these are hard to identify, let alone systematize, operate, and put into the market.