A data mesh has emerged as a possible solution to the challenges of data access plaguing many large organizations. This approach takes data out of stovepipes and puts it directly in the hands of business users, but in a controlled manner that maintains strong governance.
Done well, a data mesh can speed time to market for data-driven applications and give rise to more powerful and scalable data products. These benefits have strategic implications. But it’s essential to approach the buildout in the right way. Otherwise, well-intentioned programs can collapse under their own weight. A leading life sciences company, for example, was prepared, from a technological standpoint, for the hard work a data mesh would require. But what it was unprepared for—and found far more challenging—was harmonizing data-management practices and building agreement among different business groups on which data products and use cases to centralize. Failing to anticipate these issues forced the project to pause midstream, creating confusion and prompting business users to revert to older and less efficient ways of managing data.
By understanding what domain-based data management is and hewing to a few core precepts, companies can avoid the learning pitfalls others have faced and begin reaping the rewards of a data mesh more quickly.
What exactly is a data mesh?
The term “data mesh” was coined by Zhamak Dehghani in 2019, when she was a principal at Thoughtworks. It caught on as a way of capturing the idea of distributed data access. But interpretations of what that means in practice abound. Is it a new technology, does it make existing data repositories obsolete, or is it a theoretical construct?
McKinsey defines a data mesh as a data-management paradigm that organizes data in domains, treats it as a product, enables self-service access, and supports these activities with federated governance (Exhibit 1). Here is why each of these elements is important.
Domain-based data management allows data to sit anywhere. Business teams own the data and are responsible for its quality, accessibility, and security. Domains are collections of data organized around a particular business purpose, such as marketing, procurement, or a particular customer segment or region. They contain raw data as well as self-contained elements known as data products. These data products bundle data to support different business applications, and they are designed with the internal wiring needed to plug directly into relevant apps or systems. A self-serve data infrastructure underlies the data mesh and acts as a central platform, providing a common place for business users to find and access data, regardless of where it is hosted.
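To make the idea of a common place to find data "regardless of where it is hosted" more concrete, here is a minimal Python sketch. It is illustrative only and does not describe any particular product. The `DataProduct` and `Catalog` classes, their fields, and the example names and locations are all assumptions introduced for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataProduct:
    """A self-contained bundle of data owned by one business domain."""
    name: str
    domain: str    # business purpose, e.g. "marketing" or "procurement"
    owner: str     # business team accountable for quality and access
    location: str  # where the data is hosted; can be any platform


@dataclass
class Catalog:
    """Central self-serve entry point for finding data products."""
    products: dict[str, DataProduct] = field(default_factory=dict)

    def register(self, product: DataProduct) -> None:
        # Product names must be unique so consumers can look them up.
        if product.name in self.products:
            raise ValueError(f"{product.name!r} is already registered")
        self.products[product.name] = product

    def find(self, domain: str) -> list[DataProduct]:
        """Return the products of one domain, sorted by name."""
        matches = [p for p in self.products.values() if p.domain == domain]
        return sorted(matches, key=lambda p: p.name)

    def locate(self, name: str) -> str:
        """Tell a consumer where a product lives without moving the data."""
        return self.products[name].location


# Hypothetical products hosted on different platforms (made-up locations).
catalog = Catalog()
catalog.register(DataProduct(
    "campaign_results", "marketing", "marketing-analytics",
    "s3://example-bucket/marketing/campaigns"))
catalog.register(DataProduct(
    "supplier_spend", "procurement", "procurement-ops",
    "postgres://example-host/procurement/spend"))

marketing_names = [p.name for p in catalog.find("marketing")]
```

In this sketch the catalog stores only descriptions and locations. The data itself stays with the owning domain, which is the point of the virtualized, domain-based arrangement the paragraph above describes.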
Governance is managed in a federated “hub-and-spoke” way. Under this approach, a small central team sets controls, and a supporting data infrastructure enforces them. Standards defined in code enable data product teams within the business to comply with metadata documentation, data classification, and data quality monitoring.
Together, these elements create a self-organizing mesh in which different groups around the business can come together, define their data requirements, agree on how new data is to be shared, and align on the best ways to employ that data.
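The notion of "standards defined in code" can be sketched as a small set of automated checks that a central team writes once and every data product team runs against its own product description. The Python example below is a simplified illustration under stated assumptions: the descriptor fields, the classification labels, and the `check_product` function are hypothetical and are not drawn from any specific governance tool.

```python
# Hypothetical standards set by the central governance team.
REQUIRED_METADATA = ("owner", "description")
ALLOWED_CLASSIFICATIONS = {"public", "internal", "confidential"}


def check_product(descriptor: dict) -> list[str]:
    """Return the list of governance violations for one data product.

    An empty list means the product meets the three standards named in
    the text: metadata documentation, data classification, and data
    quality monitoring.
    """
    violations = []

    # 1. Metadata documentation: required fields must be filled in.
    for key in REQUIRED_METADATA:
        if not descriptor.get(key):
            violations.append(f"missing metadata: {key}")

    # 2. Data classification: must use one of the approved labels.
    if descriptor.get("classification") not in ALLOWED_CLASSIFICATIONS:
        violations.append("classification missing or not allowed")

    # 3. Data quality monitoring: at least one check must be declared.
    if not descriptor.get("quality_checks"):
        violations.append("no data quality checks declared")

    return violations


# Two made-up product descriptors: one compliant, one incomplete.
compliant = {
    "name": "campaign_results",
    "owner": "marketing-analytics",
    "description": "Weekly campaign performance by channel",
    "classification": "internal",
    "quality_checks": ["no_null_campaign_id", "freshness_under_24h"],
}
incomplete = {"name": "supplier_spend", "owner": "procurement-ops"}

compliant_violations = check_product(compliant)
incomplete_violations = check_product(incomplete)
```

Running checks like these automatically, for instance whenever a product is registered, is one way a small central team could set the controls while the supporting infrastructure enforces them.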
Executed well, a data mesh can deliver powerful advantages
Most product and solution breakthroughs occur within the business—and few such breakthroughs can occur today without data. Data meshes allow business users to get their hands on critical information more quickly, delivering the following benefits:
- Speeding time to market for data-analytics applications: Data products can react more responsively to data demand and provide business users with scalable access to high-quality data through the direct exchange between data producers and data consumers.
- Unlocking self-service data access for business users: Domain-based structures reduce dependency on centrally located teams, putting insights within more immediate reach of business users and enabling them to get “skin in the game.” In addition, a high degree of self-service boosts adoption, allowing nontechnical users to feel comfortable engaging with data and using data products to answer business questions and make fact-based decisions.
- Enhancing data IQ: Greater engagement with data builds learning, enabling business users to design increasingly sophisticated applications over time. By shaping the data and assets they use, business users ensure that what’s created is fit for purpose, driving greater return on investment. For example, a large industrial company established self-service dashboards that enabled staff to discover existing data products and build individual reports. Together with a communications campaign, the effort activated 300 new data users.
Prior to implementing a data mesh, a large mining organization had hundreds of siloed operational databases scattered around the world, and developing analytics use cases took months. After shifting to a data mesh, the company cut time spent on data-engineering activities dramatically and developed use cases seven times faster than before while also increasing data stability and reusability.
A data mesh involves the entire business
Obtaining the full benefits of a data mesh requires careful choreography. While domain-based architectures have attracted growing interest, the technological discussion often predominates, overshadowing other critical elements.
Business users, for instance, may recognize that their current data-management systems are problematic but feel it’s better to stick with what is known than undergo the disruption of assuming direct ownership of data domains and products.
Even those eager to get started may not realize how organizational structures need to adapt to enable a steady flow of data products and use cases. For example, it’s not uncommon for organizations setting up a data mesh to discover that needed documentation is missing, taxonomies are incomplete, or new processes need to be created before data can be used. These issues can delay completion unless businesses make provision for them in their resourcing. For nontechnical professionals particularly, the learning curve can be steep and momentum for domain-based data ownership can sputter unless properly supported.
The following practices can help companies mitigate these learning-curve issues and increase the odds of a successful data mesh implementation.
Put the business in the lead
Stewardship of the data mesh implementation must come from the business, supported by executive sponsors and backed by a formal change-management team. Data mesh evangelists within the change team can help business departments analyze their data landscape and define the most valuable data products to share with the organization. Some organizations have found it helpful to position the data mesh as part of a strategic initiative such as a digital transformation. That can help set the context and the case for change. There also needs to be a committed data product owner within the business who is willing to take on the challenge of “selling” data internally to other business users and application teams. In addition, there should be a central data-infrastructure team that can implement “data governance as code” in tools that are not yet fully mature.
Let ROI guide data provisioning
Organizations sometimes get stuck trying to determine whether a centralized or decentralized approach to data management is best, but the answer is that both methods can be effective (Exhibit 2). Companies with a modern IT landscape and well-established local data repositories might get more value from exposing data through virtualized links (while still registering it in a central data marketplace or catalog). By contrast, those that are in the middle of an enterprise resource planning (ERP) transformation or other large IT change might find it better to first move toward a central data platform and create a single logic on core data products.
Occasionally, a fully centralized approach delivers superior ROI, for example, when most data use cases and data products are used globally. Fully decentralized approaches are rare at present, since they require a level of data-management orchestration that large enterprises may feel is currently out of their reach.
In practice, most organizations begin with a mix of centralized and localized data products that reflect their particular business, technology, capabilities, and go-to-market requirements. How hard to lean on centralized versus decentralized structures is often a matter of degree.
Finance, operations, and marketing, for instance, often require niche sets of data and analytics, so a company might choose to localize these functions’ data management. Data assets that span multiple functions can be managed by a centralized group and shared with the relevant functions accordingly.
Start with a few high-value data domains and applications
The data mesh does not need to be constructed in one fell swoop. Many companies attain positive results by taking serial steps. A biotech company began by providing data from an operational data warehouse through a data mesh to feed into operational reporting of its production performance (monitoring production variables). The data product team worked closely with business users to understand their needs, improve data quality and velocity, and standardize data into a harmonized format. Business users were able to explore and develop new applications more quickly at the proof-of-concept stage and then scale them to full production.
Centralized standards for data quality, data architecture, and data sovereignty must also be established and adopted by all data product owners. Some companies that already have centralized standards in place can adjust them to reflect the needs of a decentralized data organization. Others start by defining standards for a data domain, test them for practical applicability, and improve them as needed. Then, they roll the standards out in waves to the rest of the organization along with training and capability building to ensure the governance is consistently applied across the organization.
Identify capability gaps and fill them
Executive and nontechnical business users will all need a basic level of data literacy for data mesh success. Coaching, hackathons, online programs, and analytics academies can all work well. Business teams responsible for managing domains will need more extensive training, which should be ongoing so that users can continually grow their skill sets. Otherwise, companies can end up with a narrow set of data capabilities, enough to get started but not sufficient to create the momentum needed to sustain growth or scale.
Keep the conversation going
In most cases, building a data mesh is a continuing effort rather than a one-time project. Leaders make a point of regularly communicating with the organization, in large-scale town halls and intimate team meetings, on what the company is trying to achieve and what the road map looks like in terms of timing and capability building. They use internal communications to share success stories, acknowledge the individuals involved in the effort, and remain open about the inevitable challenges. Regular dialogue helps to sustain long-term change efforts, keeping the transition alive in people’s minds and reinforcing its steadily accruing benefits.
A data mesh can help close the insights gap and grease the wheels of innovation, allowing companies to better predict the direction of change and proactively respond to it. But bringing a data mesh from concept to reality requires managing it as a business transformation, not a technological one.