Posts

Data Privacy in the Financial and Industrial Sectors

Privacy and data protection are now among the most important issues for enterprises across the financial services industry. Most consumers consider records, and financial records in particular, to be sensitive, and regulators promote good data handling practices even as firms seek richer customer profiling to identify opportunities and analyse risk. To this end, managing data privacy and data protection is of great importance throughout the customer lifecycle. For example, several use cases in the finance sector involve sharing data across organizations (e.g., sharing customer data for customer protection or faster Know Your Customer (KYC) checks, sharing businesses’ data for improved credit risk assessment, and sharing customer insurance data for faster claims management).

To facilitate such cases, several EU-funded projects have already discussed the need to reconsider data usage and regulation in order to unlock the value of data while fostering consumer trust and protecting fundamental rights. In these projects, permissioned blockchain infrastructure is used to provide privacy control, auditability, secure data sharing, and faster operations. The core of the blockchain infrastructure is enhanced in two directions: (i) integration of tokenization features and the relevant cryptography, as a means of enabling asset trading (e.g., personal data trading) through the platform; and (ii) use of Multi-Party Computation (MPC) and Linear Secret Sharing (LSS) algorithms to enable querying of encrypted data, as a means of offering stronger data privacy guarantees. These enhancements are intended to enable the implementation of disruptive business models for personalization, such as personal data markets.
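
Direction (ii) can be illustrated with additive secret sharing, one of the simplest linear secret sharing schemes. The sketch below is only an assumption about the kind of computation meant, not the actual protocol of any of these projects: the modulus, the function names (share, reconstruct, secure_sum) and the sample figures are all hypothetical, and a real MPC deployment would also need authenticated channels and protection against dishonest participants.

```python
import secrets

# Assumed public modulus for this illustration: the Mersenne prime 2**61 - 1.
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split a non-negative integer into n additive shares (mod PRIME)."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def reconstruct(shares):
    """Recombine shares; only the complete set reveals the secret."""
    return sum(shares) % PRIME


def secure_sum(private_values):
    """Add up the inputs of several organizations without pooling raw values.

    Each organization splits its value into one share per participant.
    Participant i adds the i-th shares it receives, and only these
    partial sums are combined at the end.
    """
    n = len(private_values)
    all_shares = [share(v, n) for v in private_values]
    partial_sums = [sum(column) % PRIME for column in zip(*all_shares)]
    return reconstruct(partial_sums)


# Hypothetical credit exposures held by three different institutions.
exposures = [120_000, 45_500, 310_250]
total = secure_sum(exposures)
```

Because the scheme is linear, sums (and, more generally, linear queries) can be evaluated directly on the shares; each individual share is a uniformly random number that on its own says nothing about the underlying value.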

LeADS builds upon those results and sets a more ambitious goal: to experiment, in partnership with businesses and regulators, on a way to pursue not only the lawfulness of data mining and AI development, but both the fullest protection of fundamental rights and, simultaneously, the largest possible data exploitation in the digital economy. Drawing on the coexisting characteristics of data-driven financial services, LeADS helps to define four of them: Trust, Involving, Empowering, and Sharing. Several consortium members participate in the mentioned projects (e.g., XAI, SoBigData++) or have close scientific connections with their research teams (e.g., CompuLaw), which ensures close collaboration with those projects. Data science and AI development hold great potential, but they also entail great risks in terms of privacy and industrial data protection. Even considering legal novelties such as the Digital Single Market strategy, the GDPR, the Network and Information Security (NIS) Directive, the e-Privacy Directive, and the new e-Privacy Regulation, legal answers are often regarded as inadequate compromises in which individual interests are not really protected. As far as academic work on the subject is concerned, several challenges in data-driven financial services still need to be addressed, and LeADS could meet them with regard to the empowerment of individuals (users, clients, stakeholders, etc.) in the processing of their data, through “data protection rights” or “by design” technologies such as the blockchain described above.

The approach of the LeADS Early-Stage Researchers will be developed along two lines. 1) The study of digital innovation and business models (e.g., multisided markets, freemium) that depend on the collection and use of data in the financial sector. This line will also: a) link the analysis to the exploration of users’ online behaviour and reactions to different types of recommendations (i.e., personalized recommendations by financial/industrial applications), which generate additional data as well as large network effects; and b) assess, through efficiency and impact studies, the many specific privacy regulations that apply to online platforms, business models, and behaviours. 2) The proposal of a user-centric data valorisation scheme. By analysing user-centric patterns, the project aims to: a) identify alternatives to data concentration that place the user at the heart of the control and economic valorisation of “his” data, whether personal or not (VRM platforms, personal cloud, private open data); and b) assess the economic impact of these new schemes, their efficiency, and the legal dimension at stake in terms of liability and respect for privacy. The project will also suggest new models that allow users to obtain explanations of the algorithms that financial organizations use to provide the aforementioned personalized recommendations for their offerings. LeADS research will overcome contrasting views that consider privacy as either a fundamental right or a commodity. It will enable clear distinctions between notions of privacy that relate to data as an asset and those that relate to personal information affecting fundamental rights.

Against this background, the innovative LeADS theoretical model, based on new concepts such as “Un-anonymity” and “Data Privaticity”, will be assessed within several legal domains (e.g., consumer sales and financial services, information society contracts) and in tight connection with actual business practices, models, and the software they use. Finally, given the increasing potential of Artificial Intelligence for information processing, LeADS introduces a fully renewed approach to data protection and data exploitation: a new paradigm for information and privacy that serves as a framework to raise individuals’ awareness in the data economy, where data is constantly gathered and processed without their awareness and the potential for discrimination is hidden in the design of the algorithms used. Thus, LeADS will set the theoretical framework and the practical implementation template of financial smart models for co-processing and joint-controlling information, thereby answering the specific need to clarify and operationalize these notions newly introduced in the GDPR.

Is blockchain THE reliability solution for big data?

Blockchains have sparked great enthusiasm in the data science community, parts of which believe this technology will be THE solution for data authenticity, data privacy protection, data quality guarantees, smooth data access, and real-time analysis [1], [2]. With data being considered the new digital oil, data science and blockchain seem to be the perfect match [3]: data science allows people and organizations to extract valuable knowledge from huge volumes of structured or unstructured data, while blockchain provides security and reliability for the data being manipulated. But does it sound too good to be true?

 

Blockchain is a way to implement a decentralized repository (a form of Distributed Ledger Technology) managed by a group of participants who do not need to trust each other. A blockchain groups data records into blocks that are cryptographically signed and chained by back-linking each block to its predecessor. Blockchain was initially proposed for cryptocurrencies (e.g., Bitcoin); this first generation of applications is called Blockchain 1.0. Later, smart contracts were introduced, paving the way for decentralized applications, referred to as Blockchain 2.0. Today, Blockchain 3.0 explores a wider spectrum of target applications such as e-health, smart cities, and identity management [4].
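
The back-linking of blocks can be sketched in a few lines. This is a deliberately minimal illustration under stated assumptions, not how any production blockchain is implemented: the block layout and the helper names (block_hash, append_block, is_valid) are hypothetical, the chaining relies on SHA-256 hashes only, and signatures, consensus, and networking are left out.

```python
import hashlib
import json


def block_hash(index, prev_hash, records):
    """Hash a block's contents together with its predecessor's hash."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "records": records},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, records):
    """Add a block that back-links to the current tip of the chain."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append({
        "index": index,
        "prev_hash": prev_hash,
        "records": records,
        "hash": block_hash(index, prev_hash, records),
    })


def is_valid(chain):
    """Recompute every hash and check every back-link."""
    prev_hash = "0" * 64
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["prev_hash"], block["records"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


chain = []
append_block(chain, ["alice pays bob 5"])
append_block(chain, ["bob pays carol 2"])
valid_before = is_valid(chain)

chain[0]["records"] = ["alice pays bob 500"]  # tamper with an early block
valid_after = is_valid(chain)
```

Modifying a record in an early block changes that block's hash, so the stored hash and every later back-link no longer match and the tampering is detected. The check says nothing about whether the recorded data was true in the first place.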

 

Big data is one of the possible Blockchain 3.0 applications. Deepa et al. [5] recently published a survey on the use of blockchain technology for big data, which shows that projects try to apply blockchain-based solutions at different steps of big data processing. These steps include big data acquisition (data collection, data transmission, and data sharing [6]), big data storage (by securing decentralized file systems or by detecting malicious updates in databases [7]), and big data analytics (for machine learning model sharing, decentralized intelligence, and trusted decision-making of machine learning [8]).

 

Although blockchain technology appears to be a good candidate for securing big data, it is not flawless [9] [10] [11], and security threats and vulnerabilities have been identified at each layer of the blockchain stack model [12]. First of all, blockchains depend on the underlying network services, so attacks on routing protocols or on DNS can harm a blockchain network. At the consensus layer, the core component that directly dictates the behavior and performance of the blockchain, the situation is also complex [13]. The classic Proof of Work protocol is far from being a panacea and makes little sense from an environmental point of view [14]. In addition, most miners gather in mining pools to increase their processing capability and thus their chance of adding a new block to the blockchain. At the time of writing, the blockchain.com website estimates that six Bitcoin mining pools (F2Pool, AntPool, Poolin, ViaBTC, Huobi.pool and SlushPool) represent 63% of the hash rate [15]. If they colluded, they could launch a 51% attack and destabilize the whole Bitcoin network [13]. Consequently, more and more consensus algorithms are being studied, proposed, and extended, such as proof of stake, proof of authority, proof of activity, RBFT, and YAC. However, an ideal consensus algorithm is still missing, as almost all algorithms have significant disadvantages in one way or another with respect to security and performance, as concluded in [13]. The Replicated State Machine layer, which is responsible for the interpretation and execution of transactions, can be vulnerable too. Blockchain technology does not guarantee the reliability of the data, only the integrity of the blocks. For instance, Karapapas et al. [16] showed how ransomware can be offered as a service using Ethereum smart contracts. Confidentiality of data is also not always built into the blockchain. Finally, a blockchain is implemented as software running on computers, so attackers can exploit security holes and misconfigurations. For example, white hat hackers found more than 40 bugs in blockchain and cryptocurrency platforms during a one-month bug bounty session in 2019; four of them were buffer overflows that made it possible to inject arbitrary code [17].
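
To see why a combined hash rate above 50% matters, one can use the simplified catch-up model from the Bitcoin white paper, which is not taken from the surveys cited above. If an attacker controls a share q of the hash rate and the honest miners control p = 1 - q, the probability that the attacker ever overtakes a lead of z blocks is 1 when q >= p and (q/p)^z otherwise. The function name and the 10% and six-block figures below are hypothetical; the 63% share is the one quoted in the text.

```python
def catch_up_probability(attacker_share, blocks_behind):
    """Probability that an attacker ever catches up from `blocks_behind` blocks.

    Simplified gambler's-ruin model: with attacker share q and honest
    share p = 1 - q, the result is 1 if q >= p and (q / p) ** z otherwise.
    """
    if not 0.0 <= attacker_share <= 1.0:
        raise ValueError("attacker_share must be between 0 and 1")
    if blocks_behind < 0:
        raise ValueError("blocks_behind must be non-negative")
    honest_share = 1.0 - attacker_share
    if attacker_share >= honest_share:
        return 1.0
    return (attacker_share / honest_share) ** blocks_behind


# Share of the six pools quoted in the text, assuming they collude.
colluding_pools = catch_up_probability(0.63, 6)
# Hypothetical single miner with 10% of the hash rate.
single_small_pool = catch_up_probability(0.10, 6)
```

Under this model the colluding pools are certain to catch up eventually, whereas the hypothetical 10% miner has roughly a two-in-a-million chance of overtaking a six-block lead.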

 

To conclude, blockchain technology offers promising features for big data. However, one should acknowledge its current technical limitations. Legal aspects are another consideration: the European Parliamentary Research Service observed many points of tension between blockchains and the GDPR [18]. Once all these issues have been resolved, then yes, blockchain will be a serious candidate for being the reliability solution for big data.

 

By Romain Laborde

 

References

[1]       “Why Data Scientists Are Falling in Love with Blockchain Tech,” Techopedia.com. https://www.techopedia.com/why-data-scientists-are-falling-in-love-with-blockchain-technology/2/33356 (accessed Apr. 21, 2021).

[2]       I. Rallo, “Six use cases in Blockchain Analysis,” Mar. 15, 2021. https://www.datasciencecentral.com/profiles/blogs/six-use-cases-in-blockchain-analysis (accessed Apr. 21, 2021).

[3]       “What Makes Blockchain and Data Science a Perfect Combination.” https://www.rubiscape.io/blog/focus-on-data-diversity-to-make-your-ai-initiatives-successful-0 (accessed Apr. 21, 2021).

[4]       D. Di Francesco Maesa and P. Mori, “Blockchain 3.0: applications survey,” Journal of Parallel and Distributed Computing, vol. 138, pp. 99–114, Apr. 2020, doi: 10.1016/j.jpdc.2019.12.019.

[5]       N. Deepa et al., “A survey on blockchain for big data: Approaches, opportunities, and future directions,” arXiv preprint arXiv:2009.00858, 2020.

[6]       N. Tariq et al., “The Security of Big Data in Fog-Enabled IoT Applications Including Blockchain: A Survey,” Sensors, vol. 19, no. 8, Art. no. 8, Jan. 2019, doi: 10.3390/s19081788.

[7]       N. Zahed Benisi, M. Aminian, and B. Javadi, “Blockchain-based decentralized storage networks: A survey,” Journal of Network and Computer Applications, vol. 162, p. 102656, Jul. 2020, doi: 10.1016/j.jnca.2020.102656.

[8]       Y. Liu, F. R. Yu, X. Li, H. Ji, and V. C. M. Leung, “Blockchain and Machine Learning for Communications and Networking Systems,” IEEE Communications Surveys & Tutorials, vol. 22, no. 2, pp. 1392–1431, Secondquarter 2020, doi: 10.1109/COMST.2020.2975911.

[9]       X. Li, P. Jiang, T. Chen, X. Luo, and Q. Wen, “A survey on the security of blockchain systems,” Future Generation Computer Systems, vol. 107, pp. 841–853, 2020.

[10]     M. Saad et al., “Exploring the attack surface of blockchain: A comprehensive survey,” IEEE Communications Surveys & Tutorials, vol. 22, no. 3, pp. 1977–2008, 2020.

[11]     Y. Wen, F. Lu, Y. Liu, and X. Huang, “Attacks and countermeasures on blockchains: A survey from layering perspective,” Computer Networks, vol. 191, p. 107978, 2021.

[12]     I. Homoliak, S. Venugopalan, D. Reijsbergen, Q. Hum, R. Schumi, and P. Szalachowski, “The Security Reference Architecture for Blockchains: Toward a Standardized Model for Studying Vulnerabilities, Threats, and Defenses,” IEEE Communications Surveys & Tutorials, vol. 23, no. 1, pp. 341–390, 2020.

[13]     M. Sadek Ferdous, M. Jabed Morshed Chowdhury, M. A. Hoque, and A. Colman, “Blockchain Consensus Algorithms: A Survey,” arXiv e-prints, p. arXiv-2001, 2020.

[14]     “Bitcoin mining in China could soon generate as much carbon emissions as some European countries, study finds,” CNN Business. https://www.cnn.com/2021/04/09/business/bitcoin-mining-emissions/index.html (accessed Apr. 21, 2021).

[15]     “pools,” Blockchain.com. https://www.blockchain.com/charts/pools (accessed May 03, 2021).

[16]     C. Karapapas, I. Pittaras, N. Fotiou, and G. C. Polyzos, “Ransomware as a Service using Smart Contracts and IPFS,” in 2020 IEEE International Conference on Blockchain and Cryptocurrency (ICBC), 2020, pp. 1–5.

[17]     Mix, “Security researchers found over 40 bugs in blockchain platforms in 30 days,” TNW | Hardfork, Mar. 14, 2019. https://thenextweb.com/news/blockchain-cryptocurrency-vulnerability-bug (accessed Apr. 28, 2021).

[18]     M. Finck, “Blockchain and the General Data Protection Regulation: Can distributed ledgers be squared with European data protection law?,” PE 634.445, Jul. 2019. [Online]. Available: https://www.europarl.europa.eu/RegData/etudes/STUD/2019/634445/EPRS_STU(2019)634445_EN.pdf.