In a changing digital world, old business models are being disrupted and the winners will be those who can adapt quickly. We are in the transition from web 2 to web 3. Web 1, 2 or 3, it doesn’t really matter, because it is all data, right? Pay attention, because it is not that simple.
“It doesn’t really matter, because it is all data, right?”
Web 2 is company centric, with a need for data availability, quality and interoperability for monetization. These are topics most companies struggle with. Web 3 is user centric, meaning users own and control (their) data, with the potential to share, collaborate on and monetize that data.
In the last decade, the focus was on web 2. You can recognize this in the high volume of cookies and apps that capture consumer data, the growth of central data platforms that store, harmonize and share data, and the rise of the data scientist function.
How will web 3 differ? Web 3 is powered by a decentralized network of peers that enables data sharing between users and applications. This allows for an easier, more transparent and more secure exchange of data: exactly where web 2 struggles.
This data exchange is gaining traction in the mainstream, supported by the accelerating adoption of blockchain and, on the horizon, the (meta)verse (see also: The Metaverse: new frontier that re-imagines retail & health. And data re-imagines the Metaverse – D8A directors). Decentral data sharing in web 3 allows users to interact directly without a central intermediary company. It also allows for even faster growth of data, which will require adjustments in how data is analysed, as well as an opportunity to accelerate the development of AI solutions. And, importantly, it gives the user control over where and how to share which data with whom. In theory, the ultimate data privacy.
Web 3 does require rethinking data security; good to keep that top of mind.
Blockchain (or smart contracts) is a good example of how data provenance can be controlled. Provenance covers the source of the data, its initial quality and which data processing activities have taken place, and is important for companies and consumers alike. By capturing every step of the way in smart contracts, every step is retraceable. This is relevant for companies, e.g. for fraud detection and prevention, or for building a decentral, large volume of scientific data, one of the key challenges today.
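To make that retraceability concrete, here is a minimal sketch in plain Python. It is not an actual blockchain or smart-contract platform; it only shows how hashing each processing step together with the previous step’s hash makes later tampering detectable. All names are illustrative.

```python
import hashlib
import json
import time

def record_step(chain: list, activity: str, actor: str) -> list:
    """Append a processing step whose hash covers the previous step's hash,
    so any later change to an earlier step breaks the chain."""
    prev_hash = chain[-1]["hash"] if chain else "genesis"
    step = {"activity": activity, "actor": actor,
            "timestamp": time.time(), "prev_hash": prev_hash}
    step["hash"] = hashlib.sha256(
        json.dumps(step, sort_keys=True).encode()).hexdigest()
    chain.append(step)
    return chain

def verify(chain: list) -> bool:
    """Recompute every hash; False means a step was altered after the fact."""
    for i, step in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i else "genesis"
        body = {k: v for k, v in step.items() if k != "hash"}
        recomputed = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        if step["prev_hash"] != expected_prev or step["hash"] != recomputed:
            return False
    return True

provenance = []
record_step(provenance, "captured sensor reading", "device-42")
record_step(provenance, "cleansed outliers", "etl-job-7")
print(verify(provenance))  # True for an untampered chain
```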
And for users, blockchain is relevant to understand how their data is used and whether it is in high demand, thereby creating monetization possibilities for users and increasing the incentive for data sharing. Decentralized data democratises data access, enabling companies AND users to create, deploy and use specific instruments that were previously restricted. Here the shift from company-centric to user-centric becomes clear. Data can become your income.
Whether in web 2 or 3, data is the foundational layer of innovation. Web 3 extends that by building meaningful relationships between companies and users (cutting out the middlemen), improving the quality of data, elevating the possibilities and value of digital transactions, and increasing data privacy.
As a user, what do you need to have in place? Create a thorough understanding of your rights as owner of your own personal data. Share your data where and when you want, but act wisely. Understand which data can have value where, and monetize accordingly. Have your data available in a personal data vault in accordance with Self Sovereign Identity principles. In short, become a data expert!
There has never been a better time to create impact with data & analytics. More and more data is available, computing power is increasing fast and analytical techniques are maturing. Being data driven is the talk of the town, and it is surely part of your organization’s strategy. In the last decade, most companies have invested in data & analytics initiatives to enhance efficiency, increase sales and comply with regulation. Yet these initiatives have not yet resulted in full business value. Organisations are getting ready for the next wave: getting value out of data & analytics products.
1. Climbing fast: the importance of data in value creation
Data is an asset and has (future) economic benefits. During the last years, the volume, complexity and richness of data has grown exponentially — mainly driven by e-commerce and Internet of Things or sensor data — and is expected to continue to do so (see also: McKinsey). In fact, so much of the world is instrumented that it is difficult to actually avoid generating data. We have entered a new era in which the physical world has become increasingly connected to the digital world. Data is generated by everything from cameras and traffic sensors to heart rate monitors, enabling richer insights into human and “thing” behavior.
Add to that the current growth in analytical power (e.g. analytics, machine learning, artificial intelligence). The confluence of data, analytical and computational power today is setting the stage for the next wave of digital disruption and data driven potential for growth.
“… the confluence of data, analytical and computational power today is setting the stage for the next wave of digital disruption …”
This growth has a number of preconditions. Of course, organisations need to recognize that data is an asset. It also requires that the required data is correct, available and (re-)usable. And potential revenue generation needs to be qualified, e.g. through data marketplaces, data-as-a-service integration, digitization of customer interactions, product development, cost reduction, optimizing operations and improving productivity.
In the last ten years data & analytics initiatives within organisations mainly focused on:
→ Controlled data & analytics, e.g. data organisation, data governance, privacy or trusted analytics;
→ Centralized source of available data, e.g. data platform or data engineering;
→ Insights value chain, e.g. deriving insights through use-case based machine learning, process mining or BI self-service by a team of analytics experts.
Not all initiatives bring the desired results. For example, deriving new insights is often considered innovative, but any executive will recognize the sprawl of self-generated BI reports, each claiming their own version of the truth, making it complex and time consuming to turn insights into company steering. And although these initiatives are by-and-large based on business cases such as efficient reporting, compliance with regulations or end-of-life legacy systems, controlling and centralizing data & analytics are in fact hygiene factors. There is also the trend that algorithms seem to be commoditizing, e.g. Google and Amazon provide free access to their analytics libraries. In the end, this trend will turn any insights value chain into a hygiene factor as well.
In any case, the results of most current data & analytics initiatives are not a breakthrough innovation or digital disruption.
2. Approach your data with a product mindset
So, while most data & analytics efforts are — still — performed to facilitate and improve the insights value chain, the real innovation is productization of data & analytics. Organisations need to look beyond their team of skilled data & analytics professionals with governed data sources and the latest analytics tools and technologies if they want to leverage data to improve and increase revenue. To actively contribute to this, organisations should start viewing data & analytics through a product development lens.
This means that we need to transform from data & analytics (point) solutions mainly focused on internal value towards the creation of full-fledged data & analytics products. Productization involves abstracting and standardizing the underlying principles, algorithms and code of successful point solutions until they can be used to solve an array of similar and relevant business problems. In the end, this should lead to a robust portfolio of data & analytics products.
To enable this, organisations need to have the following foundation in place:
Bridge the gap between data & analytics and business – In many organisations, data & analytics and business execution are totally separate. The business lacks understanding of what is possible and therefore asks for everything, without prioritization and without a request funnel. This leads to the development of data & analytics point “solutions” that never reach their full business potential. Move beyond the current hype of ‘data literacy’ and actually involve relevant (business) stakeholders in data & analytics. Embrace change. And be practical: start with data quality, ownership or relevant use-cases to improve daily operations through analytics, BI or robotics. Expand from there and be persistent. Truly and sustainably embedding data & analytics in an organisation is a long-term process.
Anchor data & analytics competences at executive level – Business impact from data-derived insights only happens when data & analytics is implemented deep within and consistently throughout the organization. This requires commitment, ownership, sponsorship and direction of a leader with the authority and sufficient understanding of data & analytics and its potential.
Understand potentials — the value of data & analytics depends on uniqueness and end uses. How to monetize the potential of data & analytics? Its value comes down to how unique the data is, how it will be used, and by whom. Understanding the value is a tricky proposition, particularly since organizations cannot nail down the value until they can clearly specify its uses, either immediate or potential. Data & analytics may yield nothing, or it may yield the key to launching a new product line or making a scientific breakthrough. It might affect only a small percentage of a company’s revenue today, but it can be a key driver of growth in the future. A general rule of thumb is that uniqueness of data increases its value, so find that (hidden) gem. Where possible, join unique sets to further enhance the value and potential of data.

Product development takes an investment in time, people and technology. Set up a — technical — test-and-learn lab environment where pilots and beta-version products can be developed and where the value can be further explored and understood. Include domain experts, data scientists and data experts. Capture client and end-user needs in this lab environment and transform them into solutions and products. Identify quick wins for early adopter clients, to learn how products work in a client environment. Set up cooperation with sales departments and potential partners. Standardise, improve and scale products.
Take sufficient time — be lean where possible – Many organisations have invested significant resources in data & analytics initiatives, e.g. hiring and/or educating data scientists, data lake implementations and data ownership. They are eager to finally monetize the data so that it indeed becomes ‘the new oil’. But if products are unclear or lack market relevance, there is the risk of missing targets and being overtaken by competitors. At the same time, be opportunistic about quick results: perform pilots as much as possible to create an early-adopter client base.
3. The coming wave: data & analytics product opportunities
Potential data driven product opportunities are well researched, identified and described. Think about e.g. IoT-based analytics for leasing companies and car insurers, real-time supply and demand matching for automotive, logistics and smart cities, personalized e-commerce and media, data integration between banks and B2B customers, and data driven life sciences discovery. But besides resolving the above-mentioned foundation, a detailed approach on how to realise these opportunities is less clearly defined. This paragraph contains the 5 main steps that all organisations should follow:
Step 1: Conceptualize the product
Identify a data & analytics product that meets market needs within the lab environment. To identify relevant opportunities, include product experts, business groups (e.g. super users, sales and marketing) and — potential — clients. The process involves product definition and identification of the data required for the product (which should include sourcing data creatively). Organisations with unique internal data have an increased opportunity to create highly valuable products with a good competitive edge. E.g. a bank with an agricultural background can use unique data that is highly sought after by other financial institutes. And a supply chain company can enhance their planning software with integrated robotics to increase efficiency for their clients, reducing churn and enhancing sales opportunities. Take the uniqueness of available data into account early on in the process. Determine the market position and potential business model for the prototype. There are three main prototype categories: a data-as-a-service product, algorithm code performing robotics and analytics & BI, and software code containing interactive insights based on analytics algorithms.
Step 2: Acquire and build
Data as the foundation for new products is traditionally captured internally and externally to support daily operations and reporting & insights. Given the vast amounts of data available from commercial and public sources, extend the purpose of data acquisition towards productization. Acquired data needs to be correct, timely and understandable, with a clear provenance — including restrictions for usage & storage — and in accordance with regulatory compliance such as privacy laws.
To design and build correct algorithms supporting robotics and/or analytics products, an analytics pipeline needs to be established. In this pipeline, the correctness, reusability, bias, quality and provenance of algorithms and the quality of code are managed. Integrating CI/CD (continuous integration / continuous delivery) supports a lean and agile analytics pipeline with fast testing of the prototype’s value. Data and code need to be stored in an agile, scalable and secured environment. And finally, data & analytics products gain value from the context of their use, user interface and/or ease of use. So, incorporate UX design into the product development approach.
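As an illustration, “fast testing” in such a pipeline can be as simple as data quality checks that CI runs on every commit. A minimal pytest-style sketch; the file path and column names are hypothetical.

```python
# test_pipeline.py - a hypothetical data quality gate, run by CI on every commit.
import pandas as pd

def load_training_data() -> pd.DataFrame:
    # Stand-in for the pipeline's real extraction step.
    return pd.read_csv("data/training_set.csv")

def test_no_missing_customer_ids():
    df = load_training_data()
    assert df["customer_id"].notna().all()

def test_churn_score_within_expected_range():
    df = load_training_data()
    assert df["churn_score"].between(0.0, 1.0).all()
```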
Step 3: Refine and validate
Once data is identified and the (algorithm and software) code and user interfaces are designed and built, they need to be enriched, refined and validated.
Step 4: Readiness
Store data in an advanced environment where it can be integrated, queried, processed and searched. This makes data available quickly and reliably enough for data-as-a-service products. Ensure that this is supported by a solid and robust data architecture. Distribution channels for algorithms and software code can be numerous, e.g. a cloud environment where they can integrate with web and mobile solutions, custom builds integrated into client environments, or made-to-measure (API) connections for clients.
Step 5: Market and AI feedback
The competitive nature of the information product space, the availability of new data sources and the demand for timely decision support require an ongoing emphasis on innovation, pricing and monitoring product usage. Adding this step at this stage of the analytics-based data product development process is consistent with the iterative nature of product development in a “lean startup” context. Once again, the evolution of new technologies has provided a mechanism for facilitating feedback and information extraction from the marketplace.
Brief recap: companies are eager to utilize the new data-oil. Not every organisation is able to do that successfully. By taking a comprehensive approach, persevering through sufficient knowledge building on ALL organisational levels and starting small with a step-by-step approach, you can be successful with data products and services.
The concept of synthetic data generation is the following: take an original dataset, based on actual events, and create a new, artificial dataset with similar statistical properties from that original dataset. These similar properties allow for the same statistical conclusions as if the original dataset had been used.
Generating synthetic data increases the amount of data by adding slightly modified copies of already existing data or newly created synthetic data from existing data. It creates new and representative data that can be processed into output that plausibly could have been drawn from the original dataset.
Synthetic data is created through the use of generative models: unsupervised machine learning based on automatically discovering and learning the regularities and patterns of the original data.
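A minimal sketch of that concept, using a Gaussian mixture model from scikit-learn as the generative model. Production synthetisation tools use much richer models (e.g. GANs or copulas); the columns here are invented.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Stand-in for an original dataset of actual events (columns: age, income).
original = np.column_stack([
    rng.normal(45, 12, 5000),        # age
    rng.lognormal(10, 0.4, 5000),    # income
])

# Discover and learn the regularities / patterns of the original data ...
model = GaussianMixture(n_components=5, random_state=0).fit(original)

# ... then sample a new, artificial dataset with similar statistical properties.
synthetic, _ = model.sample(5000)

print(original.mean(axis=0))   # the synthetic means should be close
print(synthetic.mean(axis=0))
```

The sampled records never occurred in reality, yet their means, variances and correlations stay close to the original, which is what allows the same statistical conclusions.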
Why is synthetic data important now?
With the rise of Artificial Intelligence (AI) and Machine Learning, the need for large and rich (test & training) data sets increases rapidly. This is because AI and Machine Learning models are trained with incredible amounts of data, which are often difficult to obtain or generate without synthetic data. Large datasets are in most sectors not yet available at scale; think about health data, autonomous vehicle sensor data, image recognition data and financial services data. By generating synthetic data, more and more data will become available. At the same time, consistency and availability of large data sets are a solid foundation of a mature Development/Test/Acceptance/Production (DTAP) process, which is becoming a standard approach for data products & outputs.
Existing initiatives on federated AI (where data availability is increased by keeping the data at the source; the AI model is sent to the source to run the algorithms there) have proven to be complex due to differences between (the quality of) these data sources. In other words, data synthetization achieves more reliability and consistency than federated AI.
An additional benefit of generating synthetic data is compliance with privacy legislation. Synthesized data is less easily (but not impossibly) traceable to an identified or identifiable person. This increases opportunities to use data, enabling data transfers to cross-border cloud servers, extended data sharing with trusted 3rd parties and selling data to customers & partners.
Synthetisation increases data privacy but is not an assurance of compliance with privacy regulations.
A good synthetisation solution will (a minimal sketch of the first three techniques follows the list):
include multiple data transformation techniques (e.g., data aggregation);
remove potential sensitive data;
include ‘noise’ (randomization to datasets);
perform manual stress-testing.
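A hedged sketch of those first three techniques with pandas and NumPy; the data is a toy example, and real solutions tune the noise scale and aggregation level to the privacy risk.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "postcode": ["1011", "1011", "3011", "3011"],
    "name": ["Anna", "Bram", "Cas", "Dora"],     # directly identifying
    "income": [52_000, 61_000, 48_000, 57_000],
})

df = df.drop(columns=["name"])                              # remove sensitive data
df["income"] = df["income"] + rng.laplace(0, 500, len(df))  # include 'noise'
per_postcode = df.groupby("postcode")["income"].mean()      # aggregate
print(per_postcode)
```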
Companies must realize that even with these techniques, additional measures such as anonymization can still be relevant.
Outliers may be missing: synthetic data mimics real-world data, but it is not an exact replica of it. So synthetic data may not cover some outliers that the original data has. Yet outliers are important for training & test data.
Quality of synthetic data depends on the quality of the data source. This should be taken into account when working with synthetic data.
Although data synthetization is taking center stage in current hype cycles, for most companies it is still in the pioneering phase. This means that at this stage the full effect of unsupervised data generation is unclear. In other words, it is data generated by machine learning for machine learning: a potential double black box. Companies need to build evaluation systems for the quality of synthetic datasets. As the use of synthetic data methods increases, assessment of the quality of their output will be required. A trusted synthetization solution must always include good information on the origin of the set, its potential purposes for usage, its requirements for usage, a data quality indication, a data diversity indication, a description of (potential) bias and risk descriptions including mitigating measures based on a risk evaluation framework.
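As one starting point for such an evaluation system, the distribution of each synthetic column can be tested against the original, e.g. with a two-sample Kolmogorov-Smirnov test from SciPy. A minimal sketch with invented data:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
original = rng.normal(45, 12, 5000)        # a column of the source dataset
synthetic = rng.normal(45.3, 12.5, 5000)   # the generator's output for it

stat, p_value = ks_2samp(original, synthetic)
# A very small p-value flags a column whose synthetic distribution deviates
# noticeably from the original - a signal to inspect the generative model.
print(f"KS statistic {stat:.3f}, p-value {p_value:.3f}")
```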
Synthetic data is a new phenomenon for most digital companies. Understanding the potential and the risks will allow you to keep up with the latest developments and stay ahead of your competition, or even your clients!
“Where is your data stored?” is not asked often enough. An interesting topic, brought up by an owner of a Dutch cloud company during a radio program.
It surprised me. As a Data Professional, I have lost count of the number of times I have asked this question. Why is this still not a common concern?
When I ask the question, the reactions vary. Often the answer is a simple “I don’t know.” When I do find the right specialist to tell me the details, the conversation goes something like this:
“Where is the data stored?” “In the cloud.”
“Yes, but where are those servers located?” “In Europe.”
“Yes, what country in Europe?” “In The Netherlands.”
“Yes, do you know in which city?” “Yes, in Amsterdam.”
The tedious process goes to show that it’s a question that is not asked often enough. But it matters.
It’s interesting: we don’t realise often enough that although data is digital, it always has a physical aspect to it. Much like us, data has to live somewhere. In turn, cyber security is not just about the digital security of who can access your data, but also the physical one. What physical measures are in place to ensure that no one breaches your data centres? Do you know who can access the building where your data lives? However, security is but one concern when it comes to the physical location of data.
Laws and regulations mandate where data may be stored and processed. And processing includes ‘viewing’ data. GDPR, for instance, restricts storing and processing European personal data outside Europe. That means that some international cloud providers are automatically a non-option when they don’t have data centres in Europe. If your data centres are located in a different country, you’re automatically dealing with cross-country transfers. Be vocal about this towards your cloud provider. If you don’t decide where your data is physically stored, they will. Then it’s out of your control, and most likely not in line with legal requirements.
Did you know that there are countries that demand their data is stored and processed locally only?
Location may also influence the reliability of your data. If your data centres are located in an area where power outages are common, you will be dealing with limited availability. Or if it is a politically unstable region, your data centre may be put out of the running altogether. Next to that, the further away the physical location, the higher the risk of limited connectivity. Both the quality of the network and the distance can greatly impact connectivity. All these aspects can influence your ability to deliver the data-driven digital products and services your customers are paying for. In other words, thinking about the location of your data is business critical.
So my takeaway for you: start asking the question. Do you know where your data is stored?
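If your data lives in AWS S3, for instance, a first answer is one API call away. A minimal sketch with boto3; the bucket name is hypothetical and credentials are assumed to be configured.

```python
import boto3

s3 = boto3.client("s3")
# LocationConstraint is None for buckets in the us-east-1 default region.
location = s3.get_bucket_location(Bucket="my-company-data")["LocationConstraint"]
print(f"Data stored in region: {location or 'us-east-1'}")
```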
In the digital world, there are two main flavours: those with extensive data and those that require extensive data.
(In this article, we leave out the data-native (Big Tech) companies.)
Those with extensive data are in fact the (international) corporations with trusted brands, mature system landscapes and long-lasting relationships with customers and partners. They can build upon large quantities of (historical) data, consistently used for existing processes and products. These corporations could do much more with their data, maneuvering (the Gambit) real value out of it.
Most corporations already invested in structural advantages for a competitive data edge: a supporting platform infrastructure, data quality monitoring, established data science teams and a data steward / data scientist attitude. For a maximal return on those investments, companies need to go the extra mile.
A strategy for data
The most common pitfall of a data strategy is that it becomes an overview of big words only, with a (too!) high focus on technology and analytics. Yet technology should be an enabler, and analytics is just a manifestation. Don’t gamble with data (products): a good data strategy starts with a clear vision, related to market, technology and (regulatory) developments. Include a target operating model to achieve the strategy. But most of all, include the value of data. Determine the use-case types that will create the most value. Large corporations have unparalleled knowledge of their industry and markets and are uniquely positioned to oversee this. Of course, there are value-cases for efficiency gains and productivity improvements. But limiting the scope to these obvious values tends to close doors on new opportunities. Companies must have a clear ambition pathway to data-driven revenue. This new revenue can include rewiring customer interaction, creating a completely new product or business and stepping into new markets.
In practice, data driven revenues prove to be more difficult than imagined. The effort of introducing new products in new markets, combined with uncertain results, makes companies hesitant. Without a solid and funded ambition and a defined risk appetite, this can result in only minimal innovations, such as adding data features (apps!). Compared to data-native companies, this minimal innovation sometimes seems small potatoes. A clear data strategy gives companies mature guidance on innovation KPIs, investments, risks and market opportunities. The data strategy will help to build success and develop new services, products and even ventures.
Data equals assets
In general, there are two flavours when it comes to data within companies. Companies have less data than they realize. Or companies have more data than they realize and under-utilize it, due to insufficient awareness of its value. Understanding the value of your data is based on 5 pillars:
Historical data cannot be easily replicated; years of data about customers, production, operations, financial performance, sales, maintenance and IP are enormously valuable. Such historical data is beneficial for increasing operational efficiency, building new data products and growing customer intimacy. Although Big Tech companies have been around for some years already, they cannot compete with dedicated historical data sets. If the (meta)data is of good quality, the value increases even more. Mapping where this data resides gives an up-to-date overview of relevant data throughout the system landscape.
Corporations are highly aware of the relevance of privacy regulations and have adopted data privacy measures and controls into their data operations. This way, the data that is available is for sure in accordance with (global) data privacy legislation.
Being part of a – traditional – chain with external suppliers and receivers (e.g. supplying materials to a manufacturer who sells to a retailer) allows companies to leverage the data into multiple views on e.g. sourcing and warehouse management. Established corporations are uniquely situated to build data-chains. Having a trusted brand creates traction for cooperation and partnerships to capture, integrate, store, refine and offer data & insights to existing and new markets.
“Understanding the value of data requires real entrepreneurship”
Large corporations can enhance existing & new products with data, e.g., through sensor data. Big Tech companies are doing that now mostly for software products. More traditional companies are particularly capable of doing this for hardware products. This way of thinking is still very much underdeveloped, because it is difficult to introduce a new product or, even harder, enter a new market with a new product. Yet it is also the ultimate opportunity! Build data entrepreneurship by starting small while understanding the full potential of data. An example of a small start is identifying whether a data model can be IP – e.g., when it is part of a larger hardware product. In real life, starting small often means focusing on a solution that is close to home, e.g., joining multiple data sets into one and/or building a dashboard, which can be offered to customers as an extended service. These are often chosen for feasibility reasons. From a data product perspective, don’t consider such an approach as starting small; consider it as not even starting. Companies that do not progress beyond these products should at least have a simultaneous experimental track, building and failing new products and services to learn what works and what doesn’t. Understanding the value of data requires entrepreneurship (see also the example of Rolls Royce here).
Large and established corporations are the epitome of entrepreneurship. It is at their very core. Yet often not for data. Data can be so alien to them that experimenting for value is hesitant or not happening. And this is where start-up companies are not lacking. They might not have the large historical data sets, trusted data chains or easy connections with available hardware products. But they do have the entrepreneurial spirit, are highly aware of the value of data, and have the capability to experiment and become successful with new products.
Becoming data entrepreneurial means knowing which data you have, understanding the (potential) value and daring to look beyond the obvious.
All good data starts with business metadata. Business metadata is the information we need to build a dataset. There is someone in the business who approved the collection and processing of the data in the first place. He or she also provides requirements and descriptions of what is needed. The challenge is that this information is often not managed well enough over time, which causes business metadata quality to decrease, and thereby degrades good data. And that affects your AI solutions, BI reporting and integration with platforms.
When we know what business stakeholders want, we can design and implement it in physical form through technical metadata. We can now build the solution, or buy it off the shelf, and map it to the business metadata.
Now that we know what data we need and what it means, and have a place to store and process the data, we can start doing business. Doing business generates operational metadata. Operational metadata is very valuable for monitoring our data processes. We get insights into what data is processed, how often, and at what speed and frequency. This is great input for analysing the performance of our IT landscape and seeing where improvements can be made. Further, we monitor access to systems and data. When we take it a step further, we can even start analysing patterns and possibly spot odd behaviour as signals of threats to our data.
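A minimal sketch of capturing such operational metadata around a data process, in Python; the job and field names are illustrative.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)

def run_with_operational_metadata(job_name, job, records):
    """Wrap a data process and log what was processed, how much and how fast."""
    start = time.perf_counter()
    result = job(records)
    duration = time.perf_counter() - start
    logging.info("job=%s records_in=%d records_out=%d seconds=%.3f",
                 job_name, len(records), len(result), duration)
    return result

# Toy data process: drop empty records.
cleaned = run_with_operational_metadata(
    "nightly-clean", lambda rows: [r for r in rows if r], ["a", "", "b"])
```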
Social Media metadata
Finally, we take social metadata as an inspiration. This is where the value of your data becomes even more tangible. Value is determined by the benefit the user experiences. The way a user uses the data is then an indicator of value. Thus, if we start measuring which data is used often and by many users, that data must be important and valuable. Invest in improving the quality of that data to improve the value created. Behaviour is also a good indicator to measure: how much time is spent on content, and which content is skipped quickly? Apparently that content doesn’t match what the user is looking for.
Governance metadata – All metadata required to correctly control the data, like retention, purpose, classifications and responsibilities: data ownership & responsibilities; data retention; data sensitivity classifications; purpose limitations.
Descriptive metadata – All metadata that helps to understand, use and find the data: business terms, data descriptions, definitions and business tags; data quality and descriptions of (incidental) events to the data; business data models & business lineage.
Administrative metadata – All metadata that allows for tracking authorisations on data: metadata versioning & creation; access requests, approvals & permissions.
Structural metadata – All metadata that relates to the structure of the data itself, required to properly process it: data types; schemas; data models; design lineage.
Preservation metadata – All metadata required for assurance of the storage & integrity of the data: data storage characteristics; technical environment.
Connectivity metadata – All metadata necessary for exchanging data, like APIs and topics: configurations & system names; data scheduling.
Execution metadata – All metadata generated and captured in the execution of data processes: data process statistics (record counts, start & end times, error logs, functions applied); runtime lineage & ETL actions on data.
Monitoring metadata – All metadata that keeps track of data processing performance & reliability: data processing runtime, performance & exceptions; storage usage.
Controlling (logging) metadata – All metadata required for security monitoring & proof of operational compliance: data access & frequency, audit logs; irregular access patterns.
User metadata – All metadata generated by users of data: user provided content; user tags (groups); ratings & reviews.
Behavior metadata – All metadata that can be derived from observing usage: time content viewed; number of users / views / likes / shares.
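Pulling these categories together, one hedged way to make them tangible is a single metadata record per dataset. A minimal Python sketch; the field names are illustrative, not a standard.

```python
from dataclasses import dataclass, field

@dataclass
class DatasetMetadata:
    # Governance metadata
    owner: str
    retention_days: int
    sensitivity: str                                  # e.g. "public", "internal"
    # Descriptive metadata
    description: str = ""
    business_terms: list[str] = field(default_factory=list)
    # Structural metadata
    schema: dict[str, str] = field(default_factory=dict)   # column -> data type
    # Behaviour metadata: usage as a proxy for value
    monthly_views: int = 0

orders = DatasetMetadata(
    owner="sales-ops", retention_days=365, sensitivity="internal",
    schema={"order_id": "str", "amount": "float"}, monthly_views=1200)
print(orders)
```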
Avoid mistakes. How data dependent is the newest digital development?
The Metaverse, which originated in science fiction and is frequently applied to gaming platforms, is trending in e-commerce and increasingly in healthcare.
Simply stated, the Metaverse is a connection between the physical and virtual worlds and is seen as the successor to the mobile internet.
Purchasing a digital product in one ecosystem in the Metaverse (e.g., Facebook) allows you to use it in another (like TikTok). Or buy a physical product which includes a digital twin — a digital representation, as well as a statistic, for your online persona. And vice versa: the digital twin can be used to increase sales at a physical location; if a physical product is not available in a shop, the digital twin can be shown as an example. Want to model how a car would behave under the same conditions as in the physical world you are in right now (weather, population, other vehicles on the road)? Then you can, in the Metaverse. Or just google Fortnite’s Ariana Grande concert in the Metaverse.
Meta-commerce if you like: beyond e-commerce. The Metaverse is also beyond virtual reality. It is a hybrid of VR, AR and mixed reality, and can interact with real life.
So interoperability between eco-systems is key. This is how the timestamps of blockchain can show their worth beyond crypto! The same goes for data. To understand the value of good data, you need to think beyond the structured data that often comes to mind when we talk about data, AI and innovations.
The Metaverse is all about unstructured data, or files: e.g., images, videos, music, SEO. This is often a neglected area when it comes to data quality concepts such as accuracy (high, medium, low quality), timeliness, versioning (which originates from archiving principles and is now becoming directly related to core business & product life cycle management), format and completeness. Each file needs to have sufficient data quality rules, definitions and other metadata to enable the above-mentioned interoperability. It needs to be totally clear whether the digital twin you’re using is from this season or last season, for instance. And don’t forget hygiene factors such as ownership (who owns the digital twin? who can re-sell it?), customization possibilities, portability, sharing agreements, security and most of all privacy. Hot topic, privacy and the Metaverse. Being in accordance with legislation is key in a highly digital world.
From an analytics perspective, the Metaverse is similar to AR & VR: it needs high quality training data. This means data sets need to be accurate and fit for use, with bias removed and good data labeling included — based on standard classifications.
The Metaverse within the healthcare sector seems a logical next move. Here the ownership, portability and privacy are even more significant. Further increasing the value of good data quality, governed by a fitting regime.
In short, the Metaverse is the upcoming opportunity to increase the value of good data. And for businesses to become further data driven.
Enabling or blocking? Sovereignty of personal data.
Within the digital world, individuals are mostly viewed as — potential — consumers (obviously already a high share) or patients (a currently growing share). The data of individuals needs to comply with the regulations of the country or region where the data is collected, i.e., it needs to fit with privacy and security requirements.
Companies are building views on individuals, based on the name, address, email etc. that have been provided through every registration to an online service, as well as online behaviour, e.g., through tracking cookies. These centralised views or centralised identities are stored within silo-based platforms. Neither personal data nor individual behaviour is easily portable. This means that your digital identity exists in many small pieces, with several companies knowing different information about you. It also means that you have to create a unique password for every profile you make, which can be cumbersome, and many tend to use the same password more than once. All of this creates security risks, since your personal data is being stored and managed by many entities and because a password breach might give access to several of your accounts.
An attempt to address these issues is federated identities. Individual identities are managed in a centralized company or government system. The system then distributes the data from the individual to a digital service. Examples where this is in use are banks, insurers, retail and health. A federated identity enables easier digital activities through a single-sign-on solution. However, a federated identity is still silo-based, since it can only be used with web services that accept this solution.
“………That’s right, SSI sets data ownership at the individual level.”
A next generation of identity solutions that is currently being developed and taken into use is self-sovereign identities (SSI). This type of digital identity is a user-centric identity solution that allows you to be in control of your data and only share the strictly relevant information. An example would be a situation where you need to prove that you are of age. With an SSI you can document that you are over 18, without disclosing your exact age. Or document that you have received a specific vaccine, without disclosing information about all the vaccines you have ever gotten or other sensitive health data. Other examples are sharing the fact that you have graduated with your — future — employer, your medical record with a hospital, or your bank account with a store. In your own personal vault if you like (also: a ‘holder’ or ‘wallet’), you own and manage your data. That’s right, SSI sets data ownership at the individual level. Data ownership would resolve a large topic that often proves to be a blocker for companies fulfilling their digital ambitions. From this vault you decide with which companies & organisations you want to share your personal data, defined per specific purpose. For this purpose, personal data needs to be classified (e.g., in accordance with privacy & security regulations) into which data is open for all, which is private and which is secure data. The vault provider needs to have good technical solutions (e.g., with verifiers and encryption), a sufficient governance regime and controls in place to support this.
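A strongly simplified sketch of that selective-disclosure idea: an issuer signs only the minimal claim (“over 18”), so the holder’s vault never has to reveal a birthdate. Real SSI stacks use verifiable credentials with asymmetric signatures or zero-knowledge proofs; the HMAC shared secret here is only to keep the sketch short.

```python
import hashlib
import hmac

ISSUER_KEY = b"issuer-secret"  # sketch only; real SSI uses asymmetric key pairs

def issue_claim(claim: str) -> dict:
    """The issuer signs a minimal claim - not the underlying birthdate."""
    signature = hmac.new(ISSUER_KEY, claim.encode(), hashlib.sha256).hexdigest()
    return {"claim": claim, "signature": signature}

def verify_claim(credential: dict) -> bool:
    """A verifier (e.g. a shop) checks the claim without seeing anything else."""
    expected = hmac.new(ISSUER_KEY, credential["claim"].encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, credential["signature"])

wallet = issue_claim("over_18=true")   # stored in the holder's personal vault
print(verify_claim(wallet))            # True: age proven, birthdate never shared
```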
SSI will mean that individuals need to understand what ownership comprises, what the potential risks are and what good practices for sharing data are. Data literacy should be extended from mostly companies to more individuals. And companies should prevent technical, legal, ethical, fairness and security pitfalls (see also: 10 principles for SSI), e.g., regarding transparency of systems & algorithms as well as data monetization.
What is data monetization? According to McKinsey, it is the process of using data to increase revenue, which the highest performing and fastest growing companies have adopted and made an important part of their strategy. Internal or indirect methods include using data to make measurable business performance improvements and inform decision making. External or direct methods include data sharing to gain beneficial terms or conditions from business partners, information bartering, selling data outright, or offering information products and services (Definition of Data Monetization — IT Glossary | Gartner).
How to deploy a data monetization strategy
Companies that innovate through data monetization recognize this: monetization can be tricky. Get it right, and you have happy customers and users who are willing to pay for your product. But mis-prioritize, and your audience numbers quickly drop, along with your revenue. Building data monetization on the principles of ‘trusted data’ mitigates the risk of mis-prioritisation.
There is no clear-cut answer as to how a data-driven product generates revenue, or when that is appropriate. And there will, of course, be some products that never monetize.
Having a strategy delivers guidance. A data monetization strategy is, simply put, a plan to generate revenue from data-driven products. Like any plan, it guides and brings structure. It is not fixed — it should be flexible enough to develop with the product, the market the product exists in and its users. Goals can and will change over time, and so strategies need to evolve to continuously achieve the goals they’re designed to target. Data products can be based on loyalty & subscription models or a one-time purchase model. It is important to understand at the beginning of the strategy which model(s) the data can leverage, to create focus and scalable results.
Data monetization strategy must be built upon the following pillars:
* Understanding how data can be converted into value (see below) and the associated opportunities and challenges of data-based value creation;
* Strategic insights into improving and preparing data to support monetization;
* Strategic insights in the potential value, markets and ecosystems.
Opportunities for monetization
Data driven business models help to understand how data can uncover new opportunities. This can be focused on value for efficiency (reducing costs and risks), value for legislation (complying with relevant regulations) and value by maximizing profits, by increasing impact on customers, partnerships and markets. This can include embedding data models, metadata and analytics into products and services. Data monetization needs to be scalable, flexible and user friendly, thereby providing advantages for the company and its customers.
Indirect monetization includes data-driven optimization. This involves analyzing your data to gather insights into opportunities to improve business performance. Analytics can identify how best to communicate with customers and understand customer behavior to drive sales. Analytics can also highlight where and how to save costs, avoid risk and streamline operations and supply chains.
“Having a full understanding of monetization possibilities will help to keep an open mind.”
Direct monetization of data is where things get interesting. It involves selling direct access to data, e.g. to third parties or consumers. This can be in raw form (heavily anonymised due to privacy regulations), aggregated, metadata only or transformed into analysis and insights.
Data-as-a-service – This is the most direct data monetization method. Data is sold directly to customers (or shared with partners for mutual benefit). The data is e.g., aggregated and/or anonymised, to be fully in accordance with legislation and to enable trusted data. Buyers mine the data for insights, including combining it with their own data, or use it for AI solutions within software. Ecosystem play is the newest area for Data-as-a-service.
Insights-as-a-service – This applies analytics to (combined) internal and external data. It focuses on the insights created using data, rather than the data itself. The insights are either sold directly or provided as e.g. analytics-enabled apps.
Data ecosystems – This is a more flexible type of data monetization. The data ecosystem provides highly versatile, scalable and shareable data and/or analytics, when needed in real-time. Standardized exchanges and federated data management enable using data from any source and in any format.
Analytics-as-a-service – This is the most advanced and exciting way of monetizing data. Analytics-as-a-service seamlessly integrates features such as dashboards, reports and visualization into new and existing applications. This opens up new revenue streams and provides a powerful competitive advantage.
Having a full understanding of monetization possibilities will help to keep an open mind. Where many companies are focusing on analytics products & services, there are more opportunities! Always stay within legal & ethical boundaries, but explore all opportunity formats to grasp new markets.
How the Legal, Economical, Technology and Scientific reflexes impact data innovations.
Much has been said about proper data ownership. Many companies struggle to achieve good data quality, data monetization and even advanced analytics when ownership is unclear, or not pro-actively practiced. In short: a data owner is accountable for the correct and sufficient availability of good and trusted data. That accountability can only work if it is applied at the right organisational level, i.e. the executive level. Any company that tries to delegate data ownership to the tactical level (e.g. a data team lead) or operational level (e.g., a data steward) will not reach their data driven goals & objectives.
For those organisations without proper data ownership (a clear strategy, defined & monitored KPIs and a supporting governance structure), there usually is only one road ahead: employees throughout the organisation will continuously firefight data issues, while e.g., AI solutions, data innovations and compliance goals are not met.
In environments with highly sensitive organisational politics, substitute solutions for ownership are often sought. You can identify them as the L.E.T.S. reflex:
Solutions with a high focus on Legislation (e.g., privacy, financial or health regulations);
Solutions with a high focus on Economical benefits (e.g., monetizing insights, increasing operational effectiveness through robotics);
Solutions with a high focus on Technology (e.g., a data platform/eco- systems, AI & Machine Learning, datawarehouses, BI tooling);
Solutions with a high focus on Science (e.g., researching data or analytics solutions, whitepapers, subsidized research).
Data driven companies will recognize at least one of these reflexes. Data innovation is complex, due to many factors, which makes owning up to a solution for this complexity a challenge. It is often easier to focus on one or multiple reflexes. Yet none of these reflexes can be seen as a substitute for data ownership.