Data Technology Trend #1: Trusted

This article is a section of a multi-part series Data Technology Trends (parent article). Refer to previous article, Data Technology — Foundational and next article Data Technology Trend — Strategic.

Ring-fencing Data

For this trend, I would like to pick up where I left off: “Data Governance, Privacy, Security and Protection”. I cannot stress enough how important that is. Along with that, my first trend, “Trusted”, is mainly about Differential Privacy and Authenticated Provenance (data provenance, or lineage).

Trust starts with authentic data, and provenance is the key to it. Even though Blockchain promises a trusted source, one question remains: how do we know that what we are tracking on the blockchain is real to begin with? Though this question remains unanswered, Authenticated / Data Provenance is an emerging technology that is maturing as Blockchain matures.

Trend #1.1: Differential Privacy

Differential privacy matters for digital rights. In today’s world, the fear is data compromise: when we hold data as plain key-value pairs, a compromise hands complete information to the attacker. Differentially private systems are assessed by a single value, represented by the Greek letter epsilon (ε): higher values of epsilon indicate more accurate, less private answers, while lower values indicate more private, less accurate answers. Differential privacy makes individual data points less meaningful on their own while keeping the data useful in aggregate. It does this by adding “noise” to the data.
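For readers who want the role of epsilon stated precisely, the standard textbook definition can be written as below. The symbols M (the randomized mechanism), D and D' (two datasets that differ in one person's record) and S (any set of possible outputs) are notation introduced here for illustration; they do not come from this article.

```latex
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon} \cdot \Pr[\,M(D') \in S\,]
\quad \text{for all neighbouring } D, D' \text{ and all output sets } S
```

When epsilon is close to 0, the two probabilities are almost equal, so the output reveals very little about any single record. As epsilon grows, the outputs are allowed to differ more, which is why larger values mean more accuracy and less privacy.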

There are two ways to achieve Differential Privacy: local privacy and global privacy.

Local privacy: there is no curator / central trust authority. Each data owner adds noise to their own data before sharing it.

Local Privacy — No curator
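As a rough illustration of local privacy, here is a minimal sketch of randomized response, one classic way for each data owner to add noise to a yes/no answer before sharing it. The function names, the 30% example population and the epsilon value are assumptions made for this sketch, not something prescribed by this article.

```python
import math
import random


def randomize(true_bit: bool, epsilon: float, rng: random.Random) -> bool:
    """Report the true answer with probability e^eps / (1 + e^eps), else flip it."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return true_bit if rng.random() < p_truth else not true_bit


def estimate_true_fraction(reports: list, epsilon: float) -> float:
    """Debias the noisy reports to estimate the real fraction of True answers.

    Requires epsilon > 0, otherwise the reports carry no signal at all.
    """
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    # observed = p * f + (1 - p) * (1 - f), solved for the true fraction f
    return (observed - (1.0 - p_truth)) / (2.0 * p_truth - 1.0)


# Hypothetical population: 30% of 10,000 data owners hold the value True.
rng = random.Random(0)
true_bits = [i < 3000 for i in range(10000)]

# Each owner randomizes locally; nobody ever sees the true bits.
reports = [randomize(bit, epsilon=1.0, rng=rng) for bit in true_bits]
estimate = estimate_true_fraction(reports, epsilon=1.0)
print(f"estimated fraction of True answers: {estimate:.3f}")
```

No single report can be trusted, yet the aggregate estimate lands close to the real 30%. That is the sense in which the data has less meaning directly and is still useful.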

Global privacy: there is a trusted curator, who adds noise to the query answers before returning them to the untrusted querier.

Global Privacy with trusted curator
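The global model can be sketched with the Laplace mechanism, a common way for a curator to add noise to a counting query. The `TrustedCurator` class, the sample records and the chosen epsilon are hypothetical and exist only for this illustration.

```python
import random
from typing import Callable


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


class TrustedCurator:
    """Holds the raw records and only ever releases noisy query answers."""

    def __init__(self, records: list, seed: int = 0):
        self._records = records
        self._rng = random.Random(seed)

    def noisy_count(self, predicate: Callable[[dict], bool], epsilon: float) -> float:
        true_count = sum(1 for record in self._records if predicate(record))
        # Adding or removing one record changes a count by at most 1,
        # so the sensitivity is 1 and the noise scale is 1 / epsilon.
        return true_count + laplace_noise(1.0 / epsilon, self._rng)


# Hypothetical dataset: ages 20..69 repeated, so exactly 600 records have age >= 40.
records = [{"age": 20 + (i % 50)} for i in range(1000)]
curator = TrustedCurator(records, seed=42)

# The untrusted querier only ever receives the noisy answer.
answer = curator.noisy_count(lambda r: r["age"] >= 40, epsilon=1.0)
print(f"noisy count: {answer:.1f}")
```

Because the curator sees the raw data, it can add far less noise than the local model needs for the same epsilon. The price is that everyone has to trust the curator.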

Further reading: refer to the source for a very good writeup on Differential Privacy.

Differential privacy is increasingly adopted in machine learning by top firms such as Facebook and Google.

Trend #1.2: Authenticated / Data Provenance (aka lineage)

One of the upcoming trends is “Authenticated Provenance”: how do you know that data is real and valid at the moment it is created? Authenticated provenance is part of “Algorithmic Trust”, and Blockchain helps to track the origin. This is especially useful for niche, costly or unique products, and for extremely sensitive or critical information that flows through a system. Garbage in, garbage out: Data Provenance is the key to understanding the source and authenticity of data.

Data Provenance (aka Data Lineage) is paired with metadata that details the origin of the data.

Source: W3C Provenance.
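To make lineage metadata concrete, here is a minimal sketch of a tamper-evident provenance chain. The field names borrow loosely from the W3C PROV concepts of entity, activity and agent, but the record layout, the helper functions and the sample pipeline are assumptions for illustration, not the W3C data model or any blockchain API.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ProvenanceRecord:
    entity: str     # the data item being described
    activity: str   # what was done to produce it
    agent: str      # who or what performed the activity
    prev_hash: str  # digest of the previous record, "" for the first one

    def digest(self) -> str:
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def append_record(chain: list, entity: str, activity: str, agent: str) -> list:
    """Return a new chain with one more record linked to the current head."""
    prev_hash = chain[-1].digest() if chain else ""
    return chain + [ProvenanceRecord(entity, activity, agent, prev_hash)]


def verify_chain(chain: list, trusted_head: str) -> bool:
    """Check every link, then compare the final digest with a trusted copy."""
    expected = ""
    for record in chain:
        if record.prev_hash != expected:
            return False
        expected = record.digest()
    return expected == trusted_head


# Hypothetical pipeline: raw file -> cleaned file -> report.
chain = []
chain = append_record(chain, "sales.csv", "ingested", "etl-service")
chain = append_record(chain, "sales_clean.parquet", "deduplicated", "etl-service")
chain = append_record(chain, "sales_report", "aggregated", "analyst")
head = chain[-1].digest()  # in practice this would be anchored somewhere trusted

# Rewriting history in the middle of the chain breaks the links after it.
tampered = list(chain)
tampered[1] = ProvenanceRecord("sales_clean.parquet", "deduplicated", "mallory", chain[1].prev_hash)

print(verify_chain(chain, head))
print(verify_chain(tampered, head))
```

The chain only proves that the recorded history was not altered afterwards. It cannot prove that the first record was truthful, which is exactly the open question about what gets tracked on a blockchain to begin with.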

Trend #1.3: AI / Analytics Governance

Artificial Narrow Intelligence is the order of the day, and we are moving towards Artificial General Intelligence. I will review MLOps / AIOps in the Democratization trend.

While we are using machine learning to support human decision making, we also need AI / Analytics Governance. AI Governance is split into two parts:

AI Blackbox Issues:

  • justice and equality
  • use of force
  • safety and certification
  • privacy
  • displacement of labor and taxation
  • information asymmetry
  • finding normative consensus
  • government mismatches

AI as Technology related:

- Social and Legal layer
  o Norms, regulations, legislation
- Ethical layer
  o Criteria, principles
- Technical layer (algorithms and data)
  o Data governance, algorithm accountability, standards

Summary: Unless I can trust my data, how can I use it? What if all your WhatsApp forwards and all your Facebook posts came with two values, “the original creator of the message” and “the message”? Such an immutable data set, carried along wherever the message travels, would make sure that fake messages spread more slowly. Blockchain-like messaging, if you will, with Data Provenance!
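The two-value message idea can be sketched as follows. This is only a thought experiment in code: the `OriginatedMessage` type and its helpers are hypothetical, and the HMAC with a creator key is a stand-in for a real digital signature, since an HMAC can only be checked by someone who also holds the key.

```python
import hashlib
import hmac
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class OriginatedMessage:
    creator: str  # the original creator of the message
    message: str  # the message itself
    tag: str      # authentication tag binding the two together


def _tag(creator: str, message: str, key: bytes) -> str:
    # Length-prefix the creator so the (creator, message) split is unambiguous.
    payload = f"{len(creator)}:{creator}{message}".encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def create_message(creator: str, message: str, creator_key: bytes) -> OriginatedMessage:
    return OriginatedMessage(creator, message, _tag(creator, message, creator_key))


def is_authentic(msg: OriginatedMessage, creator_key: bytes) -> bool:
    expected = _tag(msg.creator, msg.message, creator_key)
    return hmac.compare_digest(msg.tag, expected)


# Hypothetical key and sender, for illustration only.
key = b"demo-key-for-illustration-only"
original = create_message("alice", "The meeting is at 10 am.", key)

forwarded = original  # forwarding carries creator, message and tag unchanged
altered = replace(original, message="The meeting is cancelled.")

print(is_authentic(forwarded, key))
print(is_authentic(altered, key))
```

A forward that keeps both values intact still verifies, while an edited message or a swapped creator name does not. A real system would use public-key signatures so that anyone could verify without holding a secret.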

For other articles refer to luxananda.medium.com.

All the views expressed here are my own and do not represent the views of the firm that I work for. Data | Big Data | Cloud | ML