What do internal functions as diverse as risk assessment, capital-expenditure planning, and workforce planning have in common? Each is fundamentally about understanding demand—making demand forecasting an essential analytical process. Amid rising pressure to increase forecasting accuracy, more companies have come to rely on AI algorithms, which have become increasingly sophisticated in learning from historical patterns.
AI models have clear advantages over traditional spreadsheet-based analytic methods. Applying AI-driven forecasting to supply chain management, for example, can reduce errors by between 20 and 50 percent—and translate into a reduction in lost sales and product unavailability of up to 65 percent. Continuing the virtuous circle, warehousing costs can fall by 5 to 10 percent, and administration costs by 25 to 40 percent. Companies in the telecommunications, electric power, natural gas, and healthcare industries have found that AI forecasting engines can automate up to 50 percent of workforce-management tasks, leading to cost reductions of 10 to 15 percent while gradually improving hiring decisions—and operational resilience (Exhibit 1).
Automated AI-driven forecasting promotes these benefits by consuming real-time data and continuously identifying new patterns. This capacity enables fast, agile actions because the model anticipates demand changes rather than just responding to them. In contrast, traditional approaches to demand forecasting require constant manual updating of data and adjustments to forecast outputs. These interventions are typically time-consuming and do not allow for agile responses to immediate changes in demand patterns.
Yet despite AI’s numerous advantages, organizations have faced challenges that limit its adoption. As of 2021, a solid majority—56 percent—of surveyed organizations reported that they had adopted AI in at least one function. That’s progress compared with the 47 percent figure reported in 2018—but the rate of growth suggests that serious barriers remain, particularly in reaching scale beyond a single function. For many organizations, limited data availability—or limited usefulness of the data that are available—is still a problem.
But these concerns are proving to be less of a disadvantage than leaders may think, thanks to recent advances in AI technologies. While it’s generally true that more data can improve results, the experiences of companies with widely disparate levels of data quality show that most organizations have enough data to derive value from AI-driven forecasting. It’s a matter of building specific and actionable strategies to apply these models even in data-light environments.
Four strategies for data-light environments
From a technical standpoint, companies can use up to four strategies, individually or in combination, to create reliable outputs in data-light environments.
- Choosing the right AI model. The first step is to identify the most appropriate AI algorithm, based on the amount and quality of available data. In many instances, machine-learning (ML) pipelines can test and validate multiple candidate models to find the optimal choice, with minimal human involvement.
- Leveraging data-smoothing and augmentation techniques. These techniques help when a period within a time series is not representative of the rest of the data. For example, sales data during the COVID-19 pandemic have usually shown anomalous trends and seasonality.
- Preparing for prediction uncertainties. Sophisticated scenario-planning tools that let people insert a wide range of parameters can help when forecasting models do not achieve satisfactory accuracy or when only minimal historical data are available.
- Incorporating external data APIs. This option is applicable when external data sources (for example, relating to weather or foot traffic) are necessary to inform the forecast values.
Choosing the right AI model
Although having more historical data generally makes for more-robust forecasting, a call-center example illustrates that forecasting can be effective even when historical information is limited.
The complexity of forecasting at a call center can be daunting because of the variety of customer interactions call centers tend to handle. One utility company’s call center identified 15 types of calls, categorized by the underlying reason—such as for technical support or payment processing. Within each call type, managers tracked three metrics: total volume of calls, average handle time (AHT) for calls directed to internal staff, and a separate “external” AHT metric for calls addressed by vendors. The company aggregated the forecasts at three levels: monthly, daily, and hourly. The combination of call types, metrics, and aggregation levels required 135 independent forecasts.
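The count of 135 follows from multiplying the three dimensions: 15 call types × 3 metrics × 3 aggregation levels. As a minimal sketch, the forecast keys could be enumerated as follows; the label strings are hypothetical placeholders, since the article gives only the counts.

```python
from itertools import product

# Hypothetical labels: the article states only how many of each exist.
call_types = [f"call_type_{i:02d}" for i in range(1, 16)]   # 15 call types
metrics = ["volume", "internal_aht", "external_aht"]        # 3 metrics
levels = ["monthly", "daily", "hourly"]                     # 3 aggregation levels

# One independent forecast per (call type, metric, level) combination.
forecast_keys = list(product(call_types, metrics, levels))

print(len(forecast_keys))  # 15 * 3 * 3 = 135
```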
As one would expect, long-running historical data are not available for all call types or metrics. In such situations, a successful forecast provides reasonable outputs for cases with a low sample size while maximizing the accuracy of outputs for cases with long-term historical data.
Moreover, other factors could further complicate the forecasting process. For example, patterns of seasonality may be very complex, varying on a weekly, monthly, and yearly basis. These patterns might also gradually change over time, owing to different business initiatives. Similarly, observed trends may be inconsistent over time and show multiple change points throughout. The charts in Exhibit 2 illustrate some of these complexities.
The call center automatically applied an ensemble of forecasting models with different strengths to all call types and metrics. The algorithm chose simpler models when the data yielded only a small sample size, and more complex ones when larger sample sizes were available. Simpler models have fewer parameters to tune and therefore require a smaller sample size to train. By contrast, complex models generally perform better when large amounts of data are available, because the numerous parameters in these models require many iterations to train.
Testing a range of models with different complexity levels for every data set improved forecast accuracy by almost 10 percent for volume and by about half that for average handle time. Overall, this forecasting approach reduced costs by about 10 to 15 percent while improving service levels by 5 to 10 percent, particularly by enabling faster transaction times.
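The selection logic described above can be illustrated with a small sketch. The article does not name the models the call center used, so the four candidates, their minimum sample sizes, and the seven-point holdout below are all assumptions chosen for illustration. Each candidate is eligible only when the training data are long enough, and the eligible candidate with the lowest holdout error wins.

```python
from statistics import mean

def naive_forecast(history, horizon):
    """Repeat the last observed value."""
    return [history[-1]] * horizon

def mean_forecast(history, horizon):
    """Repeat the historical average."""
    return [mean(history)] * horizon

def linear_trend_forecast(history, horizon):
    """Fit a least-squares line to the history and extrapolate it."""
    n = len(history)
    x_bar, y_bar = (n - 1) / 2, mean(history)
    sxx = sum((x - x_bar) ** 2 for x in range(n))
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(history))
    slope = sxy / sxx
    return [y_bar + slope * (n + h - x_bar) for h in range(horizon)]

def seasonal_naive_forecast(history, horizon, season=7):
    """Repeat the values observed during the most recent season."""
    return [history[len(history) - season + (h % season)] for h in range(horizon)]

# name -> (model, minimum training points); thresholds are illustrative assumptions.
CANDIDATES = {
    "naive": (naive_forecast, 1),
    "mean": (mean_forecast, 2),
    "linear_trend": (linear_trend_forecast, 8),
    "seasonal_naive": (seasonal_naive_forecast, 14),
}

def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))

def select_model(history, holdout=7):
    """Return the name of the eligible candidate with the lowest holdout error."""
    if len(history) <= holdout:
        return "naive"  # too little data to validate anything more complex
    train, test = history[:-holdout], history[-holdout:]
    scores = {
        name: mae(test, model(train, holdout))
        for name, (model, min_points) in CANDIDATES.items()
        if len(train) >= min_points
    }
    return min(scores, key=scores.get)

# Synthetic data: eight weeks of a repeating weekly pattern, and a three-point series.
weekly_pattern = [100, 120, 130, 125, 140, 60, 50]
long_history = weekly_pattern * 8
short_history = [100, 104, 99]

best_long = select_model(long_history)
best_short = select_model(short_history)
next_week = CANDIDATES[best_long][0](long_history, 7)
print(best_long, best_short)
```

With these synthetic inputs, the long series selects the seasonal model and the short series falls back to the naive one. A production system would likely use rolling-origin validation and richer model families, but the gating by sample size works the same way.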
Leveraging data-smoothing and augmentation techniques
Often, time-series data are influenced by anomalous periods that disrupt overall trend patterns and make it extremely difficult for any AI model to learn and forecast properly. Smoothing is a technique to reduce the significant variation between time steps. It removes noise and creates a more representative data set for models to learn from.
The impact of smoothing becomes more evident when the time-series data are affected by a particular event in the past that is not expected to recur regularly in the future. In the example shown in Exhibit 3, the company’s goal was to forecast sales in its retail stores. Although the drop in sales volume during April and May seemed to have been a one-time event, it significantly affected the machine-learning process. The anomalous period has completely different patterns of seasonality and trend compared with the rest of the time series. But the machine-learning models will not automatically treat this period as anomalous. Instead, they will try to learn from it alongside the rest of the time series as they generalize the overall patterns. In this example, the anomalous period confused the model, and it was unable to learn the intrinsic seasonality patterns as expected.
Exhibit 3 shows the forecast output before and after smoothing the highlighted period. Smoothing helped the model to better learn the weekly seasonality, resulting in a more accurate forecast.
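The article does not specify which smoothing method was applied in Exhibit 3. One simple possibility, sketched below under that caveat, is to replace each value in the flagged window with the median of values at the same point in the weekly cycle outside the window. The function name, the window indices, and the synthetic sales data are all hypothetical.

```python
from statistics import median

def smooth_anomalous_window(series, start, end, season=7):
    """Replace series[start:end] with the median of same-phase values outside the window."""
    smoothed = list(series)
    for i in range(start, end):
        peers = [
            value
            for j, value in enumerate(series)
            if j % season == i % season and not (start <= j < end)
        ]
        if peers:  # leave the value untouched when there is nothing to borrow from
            smoothed[i] = median(peers)
    return smoothed

# Synthetic daily sales: six weeks of a weekly pattern, with weeks 3 and 4 depressed.
weekly_pattern = [100, 120, 130, 125, 140, 60, 50]
clean = weekly_pattern * 6
observed = [v * 0.3 if 14 <= i < 28 else v for i, v in enumerate(clean)]

smoothed = smooth_anomalous_window(observed, start=14, end=28)
print(smoothed[14:21])
```

Because the replacement values come only from comparable days outside the window, the weekly seasonality is preserved for the model to learn, while the one-time drop is removed. Identifying the window itself (here hard-coded) remains a judgment call for the analyst.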
Preparing for prediction uncertainties
Relying on statistical forecasts alone may not achieve the required business insight. This is especially true for long-term forecasting because unexpected events that affect trends and seasonality make it harder to learn from historical patterns. Given the intrinsic uncertainty of forecasting analysis in such cases, it is valuable to use what-if scenarios.
What-if scenarios are particularly important when data samples are too small to support high-confidence forecasts. They are also useful for addressing the high uncertainty of forecasts for periods far in the future.
The utility used this methodology both for long-term workforce planning and for specifying required head count over the next year. The workforce was spread across a region and included more than 5,000 technicians working on sites. To estimate the required head count, the company designed a statistical forecast model that uses three years of monthly demand data (a total of 36 data points) to forecast the next 12 months.
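To make the data constraint concrete, here is a minimal sketch of a forecast built from 36 monthly points. It is not the utility's model, which the article does not describe in detail. It assumes a simple approach: take the most recent year's seasonal shape and scale it by the average year-over-year growth in annual totals. The demand figures are synthetic.

```python
from statistics import mean

def forecast_next_year(monthly, season=12):
    """Scale the latest year's monthly profile by average year-over-year growth."""
    if len(monthly) < 2 * season or len(monthly) % season != 0:
        raise ValueError("need at least two complete years of monthly data")
    years = [monthly[i:i + season] for i in range(0, len(monthly), season)]
    totals = [sum(year) for year in years]
    # Average ratio between consecutive annual totals captures the trend.
    growth = mean(totals[k + 1] / totals[k] for k in range(len(totals) - 1))
    # The latest year's month-by-month shape captures the seasonality.
    return [value * growth for value in years[-1]]

# Synthetic demand: a seasonal base year growing 10 percent per year for three years.
base_year = [900, 850, 950, 1000, 1100, 1200, 1300, 1250, 1100, 1000, 950, 900]
demand = [v * 1.1 ** year for year in range(3) for v in base_year]  # 36 points

next_12_months = forecast_next_year(demand)
print(len(demand), len(next_12_months))
```

With only three annual totals, the growth estimate rests on two ratios, which illustrates why a statistical forecast of this kind benefits from scenario analysis.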
This forecasting model seeks to capture the year-over-year trends and seasonality. However, the model does not enable planning for unexpected events in the future. To address the uncertainties, specialists designed the user interface (UI) shown in Exhibit 4, which enables users to change specific parameters and create scenarios.
Two factors proved critical in making this tool successful:
- Define the critical parameters that could potentially affect the target variables. Businesses that use the tool should be involved in designing the critical parameters. It is important to define reasonable ranges for these parameters to avoid creating outputs that are unrealistic or would lead to bad business decisions. In the utility’s long-range head count–planning tool, for example, the critical input for demand is expected volume of work over the next year, together with estimates for job duration. From the supply standpoint, important inputs relate to expected overtime allowance, which is derived from company policies and expected absenteeism. The tool enabled the users to see how changing these parameters influenced the overall required head count, which forms part of the basis for hiring decisions.
- Design an interactive UI. Although some base scenarios can be designed in advance, it is critical to create an interactive tool for users. Enabling users to define new scenarios is a powerful way to account for any unforeseen trends in forecasting, such as the impact of COVID-19 on demand patterns: overlaying human intelligence and expert opinions helps address issues that may not arise in historical data.
For example, in the UI shown in Exhibit 4, business users can change the total number of installation or repair jobs by moving sliders for a given month. Users can also adjust the expected delivery date for different products. In addition to changing demand-related parameters, users can adjust supply-side parameters, such as expected lengths of shifts and permissible overtime hours. Having a web-based UI not only provides a better user experience and minimizes the chance of making errors but also enforces a consistent approach to scenario planning across different teams, creating a single source of truth. And the resulting transparency sheds light on the logic behind business decisions.
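The calculation behind such a tool can be sketched as a small function with bounded inputs. Everything here is an illustrative assumption rather than the utility's actual logic: the parameter names, the allowed ranges, and the formula (demand hours divided by effective hours per technician). The sketch shows the two ideas from the list above: critical parameters with reasonable ranges, and scenarios created by changing one input at a time.

```python
import math
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Scenario:
    jobs_per_month: float            # demand: expected volume of work
    hours_per_job: float             # demand: estimated job duration
    shift_hours_per_month: float     # supply: regular hours per technician
    overtime_hours_per_month: float  # supply: permitted overtime per technician
    absenteeism_rate: float          # supply: fraction of hours lost to absence

# Hypothetical allowed ranges, meant to block unrealistic scenarios.
BOUNDS = {
    "jobs_per_month": (0, 100_000),
    "hours_per_job": (0.25, 40),
    "shift_hours_per_month": (80, 200),
    "overtime_hours_per_month": (0, 40),
    "absenteeism_rate": (0.0, 0.3),
}

def validate(scenario):
    """Raise ValueError if any parameter falls outside its allowed range."""
    for name, (low, high) in BOUNDS.items():
        value = getattr(scenario, name)
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")

def required_head_count(scenario):
    """Technicians needed to cover the scenario's monthly demand hours."""
    validate(scenario)
    demand_hours = scenario.jobs_per_month * scenario.hours_per_job
    hours_per_technician = (
        scenario.shift_hours_per_month + scenario.overtime_hours_per_month
    ) * (1 - scenario.absenteeism_rate)
    return math.ceil(demand_hours / hours_per_technician)

# Made-up base case, plus a what-if with 10 percent more jobs.
base = Scenario(20_000, 4, 160, 10, 0.05)
more_jobs = replace(base, jobs_per_month=22_000)

print(required_head_count(base))       # 496
print(required_head_count(more_jobs))  # 545
```

In an interactive tool, each slider would map to one field of the scenario, and the range check would run before any output is shown.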
What-if scenario tools are particularly valuable when demand and supply patterns are volatile and multiple new business initiatives arise in close succession. In such cases, AI-driven forecasting models that are heavily dependent on historical data fall short. But what-if scenario tools have been shown to improve the delivery rate of capital projects by 10 to 15 percent, while applying a consistent and transparent scenario-planning process across different business units and teams has increased workforce flexibility by about 20 percent.
Incorporating external data APIs
External data can come from a variety of sources and content types, including social-media activity, web-scraped content, financial transactions, weather forecasts, mobile-device location data, and satellite images. Incorporating these data sets can significantly improve forecast accuracy, especially in data-light environments. These sources are an excellent option for supplying the inputs that AI-driven models require to create reasonable outputs. The market for external data is expected to have a CAGR of 58 percent, reflecting the increasing popularity of these data sources and the significant expansion in the types of external data available.
Providers of external data often offer their services through APIs, making it convenient for users to access data and integrate the data into their current processes. In our utility example, because weather data have an important impact on demand forecasts for maintenance of equipment and infrastructure, the utility used a weather API to forecast the demand for maintenance work required at specific locations.
Exhibit 5 illustrates the forecast outputs for maintenance work in a single region, with and without weather data. The chart shows that the number of maintenance work tickets increases in February. However, this is not reflected in the original forecast (which does not include the weather data), because the same period last year did not experience a similar anomaly that the model could learn from. Providing the model with weather information allows it to learn that certain temperatures and humidity levels are associated with higher levels of maintenance work. The model applies this insight to successfully predict a higher level of maintenance work for February.
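The effect described above can be reproduced in miniature. The sketch below is an assumption-laden illustration, not the utility's model: it uses a single weather feature (degrees below a hypothetical 18°C base), fits a one-variable least-squares line to synthetic ticket counts, and compares the result with a history-only baseline. In practice, the forecast temperatures would come from a weather provider's API; here they are hard-coded to keep the example self-contained.

```python
from statistics import mean

def cold_degrees(temp_c, base=18.0):
    """Degrees below a base temperature (the base is an illustrative assumption)."""
    return max(0.0, base - temp_c)

def fit_line(x, y):
    """Least-squares intercept and slope for a single explanatory variable."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Synthetic history: weekly average temperature and maintenance tickets.
history_temp_c = [12.0, 9.0, 15.0, 4.0, 7.0, 18.0, 10.0, 2.0]
history_tickets = [58, 67, 49, 82, 73, 40, 64, 88]

intercept, slope = fit_line(
    [cold_degrees(t) for t in history_temp_c], history_tickets
)

# Stand-in for temperatures that would be retrieved from a weather API.
forecast_temp_c = [-6.0, -4.0]

baseline_forecast = [mean(history_tickets)] * len(forecast_temp_c)  # ignores weather
weather_forecast = [intercept + slope * cold_degrees(t) for t in forecast_temp_c]

print(baseline_forecast, weather_forecast)
```

The history-only baseline cannot anticipate a cold spell it has never seen, while the weather-aware line predicts more tickets for colder weeks. Note that the forecast temperatures here fall outside the observed range, so a real model would need to be checked for how well it extrapolates.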
The need for accurate forecasting in operations seems here to stay. After applying AI-driven forecasting across a multitude of industries with different data landscapes, companies have seen a tremendous performance improvement in their models over traditional and spreadsheet-based forecasting. By leveraging AI techniques that are suitable for data-light environments, companies can improve their operations significantly.