This article is a part of a multi-part series Data Technology Trends (parent article). Previous part — Data Technology Trend #6: Actionable outcomes and Next part — Data Technology Trend #8: Data Next — part 1.

Trend 7.1: Data & AI marketplaces and exchange platforms

Data can be best monetized if the organization can build Data & AI marketplace and exchange platforms, be it for internal or external use.

1. Strategy: Choosing a plan for the organization to create a Data value strategy

2. Improving data monetization capabilities — Building strong data pipelines, data platforms.

3. Designing extensible, live information solutions

4. Continuously improving and generating value…


In this multi-part AWS series, I intend to cover the general aspects of AWS in simple terms, the business case for cloud, some deep dives where required, migration strategy, AllOps, security by design framework, reference architectures, and/or demo, and more. I am putting up a Lego bricks approach with multiple layers (in conjunction with the OSI / TCP/IP Layer) and will be adding several Reference architectures (for Web, Batch, Mobile, Data Lake, Big Data, Machine Learning, etc) after assorting and categorizing these Lego pieces. …


In God we trust, all others bring data. — W. Edwards Deming.

We are in the information age, where data is abundant. Every organization is generating massive amounts of data and wants to access it easily on demand, preferably from a single place. Getting more value from data quickly and with the highest quality is increasingly becoming a challenge for many organizations, whatever their size.

With tremendous data growth in the organizations,

  • security, privacy, and governance of data with a strong data strategy
  • the data source and authenticity of data
  • ability to gather explainable summary from disparate data systems
  • answering the business concerns of “so what” and providing value…


Modern Cloud Data War — DataBricks Fourth part

Challenge 4: Machine Learning & Analytics

Different types of Machine Learning algorithms run on these massive data sets, from recommendation engines to fraud detection.

ML & outcome Analytics. Image created by the author

DataBricks provides a very good unified stack that enables your organization to run an efficient Lakehouse architecture. Keep all your data in one place and share it from there, with permissions, among different users such as Enterprise Data service users, business users, IT users, Data Scientists, and Enterprise Data Warehouse users.

Solution 1: MLflow Machine Learning

Managed MLflow

The general lifecycle of how Data goes to Machine Learning is depicted in the image below.


Modern Cloud Data platform war — DataBricks Third part

This article is a part of a multi-part series Modern Cloud Data Platform War (parent article). Previous part — Modern Cloud Data Platform War — DataBricks (Part 2) — Data Fluctuations.

Challenge 3: Massive loads of Data sharing

Massive loads of Data Sharing: Another case for the same firm is that it has to share loads of data with other organizations. Say, every month-end it transfers 100 PB of data over.
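To get a feel for what moving 100 PB actually means, here is a rough back-of-the-envelope sketch of the sustained network rate a bulk copy would need. The transfer windows (30, 7, and 1 days), the decimal petabyte, and the function name `required_gbps` are my assumptions for illustration only; they do not come from the scenario itself.

```python
# Back-of-the-envelope: average line rate needed to copy a fixed volume
# within a transfer window. All inputs are illustrative assumptions.

PETABYTE_BYTES = 10**15  # decimal petabyte (assumption; not 2**50)


def required_gbps(volume_pb: float, window_days: float) -> float:
    """Return the average rate in Gbit/s needed to move volume_pb in window_days."""
    total_bits = volume_pb * PETABYTE_BYTES * 8
    window_seconds = window_days * 24 * 60 * 60
    return total_bits / window_seconds / 10**9


if __name__ == "__main__":
    # Hypothetical windows: a whole month, one week, a single day.
    for days in (30, 7, 1):
        print(f"100 PB in {days:>2} day(s): {required_gbps(100, days):,.0f} Gbit/s")
```

Even spread evenly over a full 30-day month, this works out to roughly 309 Gbit/s sustained, which is why physically copying the data every month-end is a challenge in its own right.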

Images by the author

Solution 1: Delta Sharing

In the previous section of the article, we saw that for Company X we could create a Data Lake as step 1 and progress to building Delta Lake, aka the Lakehouse Architecture, in step 2. Now data…


Modern Cloud Data Platform War — DataBricks Second part

This article is a part of a multi-part series Modern Cloud Data Platform War (parent article). Previous part — Modern Cloud Data Platform War — DataBricks (Part 1) — Massive Data Input.

Challenge 2: Data fluctuations

Imagine that data volumes fluctuate over time. Between Jan and Jun this year, the data volume fluctuated between 500 M and 900 M records. With Indonesia Data Architecture relying heavily on the on-premise Data platform, such fluctuations demand scalability (on-premise = vertical scalability, mostly), which warrants provisioning for a minimum of 1200 M records, and the platform should be able to scale readily at any time. …
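The cost of fixed provisioning can be made concrete with a little arithmetic. The sketch below uses the 500 M, 900 M, and 1200 M figures from the paragraph above; treating "utilization" as simply load divided by provisioned capacity is my simplifying assumption, and the function names are hypothetical.

```python
# Fixed provisioning vs. fluctuating load, in millions of records.
# Figures are taken from the text; the utilization model is a simplification.

PROVISIONED_M = 1200
OBSERVED_LOW_M = 500
OBSERVED_HIGH_M = 900


def utilization(load_m: float, provisioned_m: float) -> float:
    """Fraction of the provisioned capacity that the load actually uses."""
    return load_m / provisioned_m


def headroom_over_peak(peak_m: float, provisioned_m: float) -> float:
    """Spare capacity above the observed peak, as a fraction of that peak."""
    return (provisioned_m - peak_m) / peak_m


if __name__ == "__main__":
    print(f"Utilization at low:  {utilization(OBSERVED_LOW_M, PROVISIONED_M):.1%}")
    print(f"Utilization at peak: {utilization(OBSERVED_HIGH_M, PROVISIONED_M):.1%}")
    print(f"Headroom over peak:  {headroom_over_peak(OBSERVED_HIGH_M, PROVISIONED_M):.1%}")
```

Under these assumptions the platform sits at about 42% utilization in the quiet months and 75% at peak, so a sizeable share of the provisioned capacity is idle most of the time.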


Modern Cloud Data platform War — DataBricks First part

This article is a part of a multi-part series Modern Cloud Data Platform War (parent article). Next part — Modern Cloud Data Platform War — DataBricks (Part 2) — Data Fluctuations.

Why do I want to mention DataBricks as the first platform for the Modern Cloud Data Platform?

I genuinely think Delta Lake will be adopted by more and more organizations and that Delta Lake is the future, especially with the Lakehouse architecture, where we can build a unified platform for all of the organization’s Data, Big Data Analytics, and AI workloads.

For every organization whether it is…


Image by the author

What problem does it try to solve?

Azure offers a cloud analytics stack that helps build modern analytics solutions, and within it an integration service that enables data movement and transformation becomes imperative. Azure Data Factory (ADF) is that cloud data integration service. It provides ETL (Extract, Transform, and Load), ELT (Extract, Load, and Transform), and data integration services.
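The difference between ETL and ELT is only the order of the last two steps, which a toy sketch makes visible. Nothing below calls ADF: the "warehouse" is a plain dict, and the record fields and function names are hypothetical, chosen purely for illustration.

```python
# Toy contrast of ETL and ELT ordering. This does not use Azure Data Factory;
# the in-memory "warehouse" and the sample records are made-up assumptions.


def extract():
    """Pretend to pull raw rows from upstream systems (amounts arrive as text)."""
    return [{"system": "A", "amount": "10.5"}, {"system": "B", "amount": "4"}]


def transform(rows):
    """Clean the rows by converting the amount field to a number."""
    return [{**row, "amount": float(row["amount"])} for row in rows]


def etl(warehouse):
    # ETL: transform in the pipeline, then load only the cleaned rows.
    warehouse["reports"] = transform(extract())


def elt(warehouse):
    # ELT: load the raw rows first, then transform inside the target store.
    warehouse["raw"] = extract()
    warehouse["reports"] = transform(warehouse["raw"])
```

Both paths end with the same cleaned rows; the ELT path additionally keeps the raw copy in the target, which is the trade-off to weigh when choosing between them.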

For example, assume that a massive financial institution gathers data from multiple upstream systems and generates consolidated reports sliced and diced on multiple data points that enable them to see if these upstream systems are compliant with the regulatory requirements as per the standards expected. To gain such rich insights from data across…


The Azure Data Platform

No organization that is on a growth path, or that intends to expand its customer base or enter new markets, will restrict its infrastructure and design to a single database option. There are two levels of database selection:

a. The needs assessment — Key questions you should ask before starting the D of the Data part. What is the primary goal? What are the key considerations in selecting your database of choice?

  • Where housed — Cloud-native or on-premise or hybrid or multi-cloud (Poly-cloud)
  • Read / write-heavy? — What are the Throughput needs? — what is the pattern of data…


In the previous sections, we covered the Core Infrastructure for Single Server Deployment. Everything we discussed there is also relevant for the Multi Scalable deployment, along with a couple of additional resources/services.

Image created by the author

As part of Single Server Deployment we covered the below resources:

  1. Virtual Network (VNet)
  2. Subnet
  3. The Link (VNet Peering, VPN Gateway, ExpressRoute, Service Endpoints, and Private Link)
  4. Virtual Machines
  5. Azure VMware Solution
  6. Storage Accounts and
  7. Infrastructure as Code

8. Virtual Machine Scale Sets

Azure Virtual Machine Scale Sets let you create and manage a group of load-balanced virtual machines.

Say your organization has a solution that needs to scale consistently to…

LAKSHMI VENKATESH

Application Development Head | Data Strategy | Big Data | Analytics & BI | Data Governance | Cloud
