Data ecosystems made simple

By Ahmed Abdulla, Ewa Janiszewska-Kiewra, and Jannik Podlesny

Ecosystems—interconnected sets of services in a single integrated experience—have emerged across a range of industries, from financial services to retail to healthcare. Ecosystems are not limited to a single sector; indeed, many transcend multiple sectors.

For traditional incumbents, ecosystems can provide a golden opportunity to increase their influence and fend off potential disruption by faster-moving digital attackers. For example, banks are at risk of losing half of their margins to fintechs, but they have the opportunity to increase margins by a similar amount by orchestrating an ecosystem.

In our experience, many ecosystems focus on the provision of data: exchange, availability, and analysis. Incumbents seeking to excel in these areas must develop the proper data strategy, business model, and architecture.

What is a data ecosystem?

Simply put, a data ecosystem is a platform that combines data from numerous providers and builds value through the usage of processed data. A successful ecosystem balances two priorities:

  • Building economies of scale by attracting participants through lower barriers to entry. In addition, the ecosystem must generate clear customer benefits and dependencies beyond the core product to establish high exit barriers over the long term.
  • Cultivating a collaboration network that motivates a large number of parties with similar interests (such as app developers) to join forces and pursue similar objectives. One of the key benefits of the ecosystem comes from the participation of multiple categories of players (such as app developers and app users).

What are the data-ecosystem archetypes?

As data ecosystems have evolved, five archetypes have emerged. They vary based on the model for data aggregation, the types of services offered, and the engagement methods of other participants in the ecosystem.

  1. Data utilities. By aggregating data sets, data utilities provide value-adding tools and services to other businesses. The category includes credit bureaus, consumer-insights firms, and insurance-claim platforms.
  2. Operations optimization and efficiency centers of excellence. This archetype vertically integrates data within the business and the wider value chain to achieve operational efficiencies. An example is an ecosystem that integrates data from entities across a supply chain to offer greater transparency and management capabilities.
  3. End-to-end cross-sectorial platforms. By integrating multiple partner activities and data, this archetype provides an end-to-end service to the customers or business through a single platform. Car reselling, testing platforms, and partnership networks with a shared loyalty program exemplify this archetype.
  4. Marketplace platforms. These platforms offer products and services as a conduit between suppliers and consumers or businesses. Amazon and Alibaba are leading examples.
  5. B2B infrastructure (platform as a business). This archetype builds a core infrastructure and tech platform on which other companies establish their ecosystem business. Examples of such businesses are data-management platforms and payment-infrastructure providers.

The ingredients for a successful data ecosystem

Data ecosystems have the potential to generate significant value. However, the entry barriers to establishing an ecosystem are typically high, so companies must understand the landscape and potential obstacles. Typically, the hardest pieces to figure out are finding the best business model to generate revenues for the orchestrator and ensuring participation.

If the market already has a large, established player, companies may find it difficult to stake out a position. To choose the right partners, executives need to pinpoint the value they can offer and then select collaborators who complement and support their strategic ambitions. Similarly, companies should look to create a unique value proposition and excellent customer experience to attract both end customers and other collaborators. Working with third parties often requires additional resources, such as negotiating teams supported by legal specialists to negotiate and structure the collaboration with potential partners. Ideally, partnerships should be mutually beneficial arrangements between the ecosystem leader and other participants.

As companies look to enable data pooling and the benefits it can generate, they must be aware of laws regarding competition. Companies that agree to share access to data, technology, and collection methods restrict access for other companies, which could raise anti-competition concerns. Executives must also ensure that they address privacy concerns, which can differ by geography.

Other capabilities and resources are needed to create and build an ecosystem. For example, to find and recruit specialists and tech talent, organizations must create career opportunities and a welcoming environment. Significant investments will also be needed to cover the costs of data-migration projects and ecosystem maintenance.

Ensuring ecosystem participants have access to data

Once a company selects its data-ecosystem archetype, executives should then focus on setting up the right infrastructure to supports its operation. An ecosystem can’t deliver on its promise to participants without ensuring access to data, and that critical element relies on the design of the data architecture. We have identified five questions that incumbents must resolve when setting up their data ecosystem.

How do we exchange data among partners in the ecosystem?

Industry experience shows that standard data-exchange mechanisms among partners, such as cookie handshakes, for example, can be effective. The data exchange typically follows three steps: establishing a secure connection, exchanging data through browsers and clients, and storing results centrally when necessary.

How do we manage identity and access?

Companies can pursue two strategies to select and implement an identity-management system. The more common approach is to centralize identity management through solutions such as Okta, OpenID, or Ping. An emerging approach is to decentralize and federate identity management—for example, by using blockchain ledger mechanisms.

How can we define data domains and storage?

Traditionally, an ecosystem orchestrator would centralize data within each domain. More recent trends in data-asset management favor an open data-mesh architecture (exhibit). Data mesh challenges conventional centralization of data ownership within one party by using existing definitions and domain assets within each party based on each use case or product. Certain use cases may still require centralized domain definitions with central storage. In addition, global data-governance standards must be defined to ensure interoperability of data assets.

Blueprint of IT architecture for data ecosystems

How do we manage access to non-local data assets, and how can we possibly consolidate?

Most use cases can be implemented with periodic data loads through application programming interfaces (APIs). This approach results in a majority of use cases having decentralized data storage. Pursuing this environment requires two enablers: a central API catalog that defines all APIs available to ensure consistency of approach, and strong group governance for data sharing.

How do we scale the ecosystem, given its heterogeneous and loosely coupled nature?

Enabling rapid and decentralized access to data or data outputs is the key to scaling the ecosystem. This objective can be achieved by having robust governance to ensure that all participants of the ecosystem do the following:

  • Make their data assets discoverable, addressable, versioned, and trustworthy in terms of accuracy
  • Use self-describing semantics and open standards for data exchange
  • Support secure exchanges while allowing access at a granular level

The success of a data-ecosystem strategy depends on data availability and digitization, API readiness to enable integration, data privacy and compliance—for example, General Data Protection Regulation (GDPR)—and user access in a distributed setup. This range of attributes requires companies to design their data architecture to check all these boxes.

As incumbents consider establishing data ecosystems, we recommend they develop a road map that specifically addresses the common challenges. They should then look to define their architecture to ensure that the benefits to participants and themselves come to fruition. The good news is that the data-architecture requirements for ecosystems are not complex. The priority components are identity and access management, a minimum set of tools to manage data and analytics, and central data storage.

Ahmed Abdulla is a consultant in McKinsey’s Dubai office, Ewa Janiszewska-Kiewra is manager of data engineering in the Wroclaw office, and Jannik Podlesny is a specialist in the Berlin office.