Posts

ESRs presenting their research at ACM FAccT

ESRs Maciej Zuziak, Onntje Hinrichs and Aizhan Abdrassulova collaborative work got accepted for this years’ ACM conference on Fairness, Accountability, and Transparency (ACM FAccT) that was held in Chicago from 12-15th of June 2023. Maciej Zuziak presented their article on “Data Collaboratives with the Use of Decentralised Learning – an Interdisciplinary Perspective on Data Governance” during a paper session dedicated to Privacy.

In their collaboration, the ESRs combined their different research interests in law and machine learning, which enabled them to develop an interdisciplinary perspective on challenges posed by different approaches to data governance. The predominant part of the current discussion in EU data policy is centered around the challenging task of creating a data governance framework where data is ‘as open as possible and as closed as necessary’. In their article, the ESRs further elaborate on the concept of data collaboratives powered by decentralised learning techniques as a possible remedy to the shortcomings of existing data governance schemes.

Data collaboratives have been described as new emerging forms of partnerships where privately held data is made accessible for analysis and where collaboration between participants is facilitated to unlock the public good potential of previously siloed data. [1] They thus fit well the EU policy shift that has taken place over the past years. Whilst for decades, the introduction of exclusive property rights was discussed as a potential tool to empower data subjects with regard to ‘their’ data and to facilitate the emergence of data markets, this changed with the 2020 European Data Strategy. Now, the potential of the data economy should be unlocked by facilitating data access and sharing. The authors presented their concept of data collaboratives powered by decentralised learning techniques, which can be used as a tool to reach the goals of the current EU data strategy: facilitate access to data while protecting privacy and intellectual property interests which individuals and companies might have with regard to ‘their’ data. The collaboration between the ESRs thus reflected the idea behind the LeADS project that solutions to existing tensions in the data economy require an interdisciplinary perspective from both law and data science.

The paper can be accessed in the Digital ACM library via this link.

 

[1] 2020. Wanted: Data Stewards – (Re-) Defining the Roles and Responsibilities of Data Stewards for an Age of Data Collaboration

 

Public-private data sharing from “dataveillance” to “data relevance”

Data sharing has become a common practice between public and private entities all over the world. The reasons for this are broad and varied, ranging from making more data available for data-rich scientific research to allowing law enforcement agencies to pursue criminal activities with greater precision. While data collection remains a fundamental activity, as it enables an ever-growing amount of data to exist, the sharing of data and its subsequent repurposing can enable further major economic and social value. A single data controller can collect so little information in comparison to the data that can be made available from several third parties.

Regulators have taken notice of this and are planning accordingly to reap the supposed benefits of the data economy by further enabling and pushing for the sharing of data. In this sense, the recent European Strategy for Data puts this practice at its core, envisaging an environment of trusted data-driven innovations fueled by data sharing between digital platforms, governments, and individuals alike.[1]

In the field of law enforcement, the amount of data available also caught the attention of competent authorities a long time ago, as it allows for more ‘smart’ crime prevention yet at the expense of more privacy-invasive practices.[2] In this respect, the increasing amount of available data is highly interesting for the deployment of a forever-expanding surveillance apparatus by public authorities.[3] This has led to the emergence of what has been described as ‘dataveillance’[4] and its considerable expansion in the last decades, rooting itself in our society to become a troublesome practice.[5] In this context, the private sector makes available, either voluntary or not, a considerable portion of their data to law enforcement agencies,[6] with limitations.[7]

Data sharing can also involve access to public sector generated data by private businesses. In this respect, the open data movement has been for years pushing in this direction and, certain cases, triggering legislation that reduces the obstacles to making such data available for re-use by, for example, companies. While it is possible to find certain regulations that either foster or mandate such data sharing practices, all of them must be subject to general applications rules, such as the General Data Protection Regulation (GDPR).

As mentioned above, regulators intend to foster data sharing between private and public sectors. As the recent European Strategy for Data points out, certain kinds of information, such as that generated within smart cities, can provide an interesting field where public-private data sharing would be beneficial to society and individuals.[8] For example, data generated by the financial services industry provides a considerable amount of information, both in quantity and quality.[9] Nevertheless, a single payment can provide a sensitive insight into an individual’s life, from health data -for example from recurring pharmacy expenses- up to religious information -as in the case of monthly contributions to a religious organization-. This could be overcome by sharing certain information about payments in an aggregated manner, for example merely their time and date, which could help in understanding citizens movements in a city and plan city’s policies accordingly to accommodate for citizens’ benefit.[10]

But how can we avoid that these public-private data-sharing activities end up contributing to more ‘dataveillance’? While the GDPR covers a significant amount of data processing activities, we also need to involve other relevant pieces of legislation that contemplate public authorities, particularly law enforcement agencies, such as the Law Enforcement Directive. While the obligations and rights within the relevant legal framework diverse, it is possible to highlight that most of these activities should be conducted following some common principles.

Among these we point out that only accurate and relevant data should be used for a particular and specific purpose. In this respect, we can ask when the data are relevant enough for the intended purposes; in other words, we need to question when we have “good enough data” [11] for the intended public-private data sharing. By doing so, we can assess whether compliance with these rules has been reached. Through this, we can effectively implement the principles of data accuracy and minimization, alongside other applicable and relevant principles.

Understanding how these rules are effectively applied to, and guide, these public-private data sharing practices is crucial as regulators seek to foster them. For example, the European Union is currently working on a proposal for a Data Governance Act, which introduces data sharing services,[12] as well as data altruism.[13] Both of these categories, with their particularities, seek to foster data-sharing activities between private and public entities alike. Data protection watchdogs have raised their concerns regarding the current wording and extent of this proposal.[14] Among these concerns, the lack of clear integration between them and, in particular, the GDPR was highlighted as a troublesome issue.

Public-private data sharing activities are not likely to stop. On the contrary, the current data strategy for the European Union is to further expand the sharing of data in an automated manner using APIs, such as in the case of open finance.[15] The question that remains open on this front is whether these new data governance schemes can make us move from a dataveillance perspective towards a data relevance scenario. Within this context, we intend to explore this broad question in the different crossroads that this topic is present in the LeADS project and seek ideas to tackle the matter in an interdisciplinary manner.

Authors: Prof. dr. Paul de Hert, Prof. dr. Gloria González Fuster, Andrés Chomczyk Penedo

[1] ‘Citizens will trust and embrace data-driven innovations only if they are confident that

any personal data sharing in the EU will be subject to full compliance with the EU’s strict

data protection rules’ (see ‘Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions: A European Strategy for Data’ (European Commission 2020) COM(2020) 66 final.)

[2] David Wright and others, ‘Sorting out Smart Surveillance’ (2010) 26 Computer Law & Security Review 343.

[3] Margaret Hu, ‘Small Data Surveillance v. Big Data Cybersurveillance’ (2015) 42 Pepperdine Law Review 773.

[4] Roger Clarke, ‘Information Technology and Dataveillance’ (1988) 31 Communications of the ACM 498.

[5] Roger Clarke and Graham Greenleaf, ‘Dataveillance Regulation: A Research Framework’ (2017) 25 Journal of Law, Information and Science 104.

[6] David Lyon, Surveillance After Snowden (John Wiley & Sons 2015).

[7] ‘Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions: A European Strategy for Data’ (n 1).

[8] ibid.

[9] V Ferrari, ‘Crosshatching Privacy: Financial Intermediaries’ Data Practices Between Law Enforcement and Data Economy’ (2020) 6 European Data Protection Law Review 522.

[10] Ine van Zeeland and Ruben D’Hauwers, ‘Open Banking Data in Smart Cities’ (VUB Chair Data Protection on the Ground – VUB Smart Cities Chair – imec-SMIT-VUB 2021) Round table report <https://smit.vub.ac.be/wp-content/uploads/2021/02/Report-roundtable-Open-Banking-Smart-Cities_def.pdf> accessed 6 September 2021.

[11] Angela Daly, Monique Mann and S Kate Devitt, Good Data (Institute of Network Cultures 2019).

[12] According to the current wording of the proposal, under this service, we can include: (i) intermediate between data holders and data users for the exchange of data through different means; (ii) intermediate between data subjects and data users for the exchange of data through different means for the purpose of exercising data rights provided for in the GDPR, mainly right to portability; and (iii) provide data cooperatives services, i.e. negotiate on behalf of data subjects and certain data holders terms and conditions for the processing of personal data.

[13] According to the current wording of the proposal, under this term, we are referring to “(…) the consent by data subjects to process personal data pertaining to them, or permissions of other data holders to allow the use of their non-personal data without seeking a reward, for purposes of general interest, such as scientific research purposes or improving public services, such as scientific research purposes or improving public services”.

[14] ‘Joint Opinion 03/2021 on the Proposal for a Regulation of the European Parliament and of the Council on European Data Governance (Data Governance Act)’ (European Data Protection Board – European Data Protection Supervisor 2021) Joint Opinion 03/2021 <https://edpb.europa.eu/sites/edpb/files/files/file1/edpb-edps_joint_opinion_dga_en.pdf> accessed 25 March 2021.

[15] ‘Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions on a Digital Finance Strategy for the EU’ (European Commission 2020) Communication from the Commission (2020) 591 <https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:52020DC0591&from=EN> accessed 1 December 2020.

Data Privacy in the Financial and Industrial Sectors

Nowadays, one of the most important issues for enterprises across the financial services industry is privacy and data protection. Records and in particular financial records are considered sensitive for most of the consumers and good data handling practices are promoted by the respective regulators targeting for increased customer profiling in order to identify any potential opportunities and make a risk management analysis. To this extend, the management of data privacy and data protection is of great importance throughout the customer cycle. For example, there are several use cases in the finance sector that involve sharing of data across different organizations (e.g., sharing of customer data for customer protection or faster KYC, sharing of businesses’ data for improved credit risk assessment, sharing of customer insurance data for faster claims management and more).

To facilitate such cases, several EU funded projects have already discussed the need to reconsider data usage and regulation in order to unlock the value of data while fostering consumer trust and protecting fundamental rights. Permissioned blockchain infrastructure is utilized in order to provide privacy control, auditability, secure data sharing, as well as faster operations. The core of the blockchain infrastructure is enhanced in two directions: (i) Integration of tokenization features and relevant cryptography, as a means of enabling assets trading (e.g., personal data trading) through the platform; and (ii) utilization of Multi-Party Computation (MPC) and Linear Secret Sharing (LSS) algorithms in order to enable querying of encrypted data as a means of offering higher data privacy guarantees. Based on these enhancements the project will enable the implementation disruptive business models for personalization, such as personal data markets.

LeADS builds upon those results and steps forward by setting a more ambitious goal: to experiment, in partnership with businesses and regulators, on a way to pursue not only lawfulness of data mining and AI development, but both the amplest protection for fundamental rights and, simultaneously, the largest possible data exploitation in the digital economy using coexisting characteristics of the data driven financial services, LeADS helps to define: Trust; Involving; Empowering; Sharing. Participation in several of the mentioned projects (e.g. XAI, SoBigData++) and/or close scientific connections with the research teams (e.g. CompuLaw) by several consortium members ensure close collaboration with the named projects. There are great potentials to be found in data science and AI development entailing both great risks in terms of privacy and industrial data protection. Even considering legal novelties like: the Digital Single Market strategy, the GDPR, the Network and Information Security (NIS) directive, the e-privacy directive and the new e- privacy regulation, legal answers are often regarded as inadequate compromises, where individual interests are not really protected. As far it concerns the academic elaboration with the subject, there are several challenges that still need to be addressed in data-driven financial services that LeADS could met regarding the empowerment of individuals (users, clients, stakeholders, etc.) in their data processing, through “data protection rights” or “by design” technologies, like for example blockchain as described before.

The approach LeADS Early-Stage Researchers will be developed in two folds: 1) The study of digital innovation and business models (e.g. multisided markets, freemium) dependent on the collection and use of data in the financial sector. It will also: a) link this analysis to the exploration of online behaviour and reactions of users to different types of recommendations (i.e. personalized recommendations by financial/industrial applications) that generate additional data as well as large network effects; b) assess (efficiency, impact studies) the many specific privacy regulations that apply to online platforms, business models, and behaviours, and 2) Proposal of a user centric data valorisation scheme by analysing user-centric patterns, the project aims to: a) Identify alternative schemes to data concentration, to place the user at the heart of control and economic valorisation of “his” data, whether personal or not (VRM platforms, personal cloud, private open data); b) Assess the economic impact of these new schemes, their efficiency, and the legal dimension at stake in terms of liability and respect of privacy. The project will also suggest new models allowing the user to obtain results regarding the explainability of the algorithms that are being utilized by financial organizations to provide the aforementioned personalized recommendations for their offerings. LeADS research will overcome contrasting views that consider privacy as either a fundamental right or a commodity. It will enable clear distinctions between notions of privacy that relate to data as an asset and those which relate to personal information affecting fundamental rights.

Against this background, LeADS innovative theoretical model, based on new concepts such as “Un- anonymity” and “Data Privaticity”, will be assessed within several legal domains (e.g. consumer sales and financial services, information society contracts, etc.) and in tight connection with actual business practices and models and the software they use. Finally, due to the increasing potential of Artificial Intelligence information processing, a fully renewed approach to data protection and data exploitation is introduced by LeADS by building a new paradigm for information and privacy as a framework that will empower individuals’ awareness in the data economy; wherein data is constantly gathered and processed without awareness, and the potential for discrimination is hidden in the design of the algorithms used. Thus, LeADS will set the theoretical framework and the practical implementation template of financial smart models for co- processing and joint-controlling information, thereby answering the specific need to clarify and operationalize these newly- introduced notions in the GDPR.

The beginning of the LeADS era

On January 1st 2021 LeADS (Legality Attentive Data Scientists) started its journey. A Consortium of 7 prominent European universities and research centres along with 6 important industrial partners and 2 Supervisory Authorities is exploring ways to create a new generation of LEgality Attentive Data Scientists while investigating the interplay between and across many sciences.

LeADS envisages a research and training programme that will blend ground-breaking applied research and pragmatic problem-solving from the involved industries, regulators, and policy makers. The skills produced by LeADS and tested by the ESR will be able to tackle the confusion created by the blurred borders between personal and commercial information and between personality and property rights typical of the big data environment. Both processes constitute a silent revolution—developed by new digital business models, industrial standards, and customs—that is already embedded in soft law instruments (such as stakeholders’ agreements) and emerging in case law and legislation (Regulation EU 2016/679 and the e-privacy directive to begin with), while data scientists are mostly unaware of them. They cut across the emergence of the Digital Transformation, and call for a more comprehensive and innovative regulatory framework. Against this background, LeADS is animated by the idea that in the digital economy data protection holds the keys for both protecting fundamental rights and fostering the kind of competition that will sustain the growth and “completion” of the “Digital Single Market” and the competitive ability of European businesses outside the EU. Under LeADS, the General Data Protection Regulation (GDPR) and other EU rules will dictate the transnational standard for the global data economy while training researchers able to drive the process and set an example

The data economy or better way the data society we increasingly live is our explorative target under many angles (from the technological to the legal and ethics one). This new generation is needed to better answer to the challenges of the data economy and the unfolding of the digital transformation scoping. Our Early Stage Researchers (ESRs) will come from many experiences and backgrounds (law, computer science, economics, statistics, management, engineering, policy studies, and mathematics,..).

ESRs will find an enthusiastic transnational, interdisciplinary team of teams tackling the relevant issues from their many angles. Their research will be supported by these research teams in setting the theoretical framework and the practical implementation template of a common language.

LeADS research plan, although already envisages 15 specific topics to be interdisciplinary investigated, remain open-ended.

It is natural in the fields we have selected for which we identified crossover concepts in need of a common understanding of concepts useful for future researchers, policy makers, software developers, lawyers and market actors.

LeADS research strives to create, share cross disciplinary languages and integrate the respective background domain knowledge of its participants in one shared idiolect that it wants to share with a wider audience.

It is LeADS understanding that regulatory issues in data science and AI development and deployment are often perceived (and sometimes are) hurdles to innovation, markets and above all research. Our unwritten goal is to contribute to turn regulatory and ethical constraints which are needed in opportunities for better developments.

LADS aims at nurturing a data science capable of maintaining its innovative solutions within the borders of law – by design and by default – and of helping expand the legal frontiers in line with innovation needs, preventing the enactments of legal rules technologically unattainable.

By Giovanni Comandé