Rapid throughput improvement at mature semiconductor fabs

A shortage of automotive chips has shut down OEM production lines. How can fabs rapidly increase wafer output as vehicle demand surges in the wake of the COVID-19 pandemic?

When COVID-19 began to spread, the automotive sector felt some of the earliest and most severe economic shocks. Demand for new cars fell by as much as 80 percent in some regions during the early days of the pandemic, and over 90 percent of automotive factories in China, Europe, and North America closed at the height of the crisis.

Fewer than two years later, with vaccines available and economies gaining strength, the situation has reversed. In fact, automotive sales bounced back more quickly than expected, leaving OEMs struggling to meet demand. The main constraint isn’t a lack of capacity but rather a shortage of the sophisticated semiconductors that enable advanced driver-assistance systems and connected-car features, such as online navigation and infotainment. The situation is so dire that some automakers have stopped manufacturing certain cars or slowed production lines. Other OEMs are just building the shells of cars and waiting to add the systems requiring semiconductors at a later date. And still others are shipping vehicles with features removed, going against the long-held adage that you can’t deliver 99 percent of a car.

Automakers naturally boosted their chip orders when vehicle sales began to surge, but fabs book their production lines months to years in advance. By the time OEMs came calling, fabs had already reallocated most of their capacity to businesses in other sectors, such as consumer electronics, and an immediate shift to new orders was impossible. Even when more capacity becomes available, OEMs are still likely to find themselves in a holding pattern, since the complex setup and manufacturing process for a new batch takes six to nine months.

With robust automotive demand, fabs have a big incentive to reduce their lead times. But that might be difficult for the mature fabs that manufacture six- and eight-inch wafers, which are primarily used in various automotive electronic control units, because they tend to be older and less efficient. Increased automation and expanding production through new facilities and tool setup can help shorten production timelines, but they are challenging and time consuming to implement, even in modern facilities. To achieve more rapid productivity gains, mature fabs can follow a more stringent, comprehensive approach that focuses on four critical activities: improving overall equipment efficiency (OEE), proactively managing labor, derisking infrastructure, and dynamically managing performance. Some of these techniques are well known in other industries—and even within some segments of the semiconductor industry—but they are uncommon at mature fabs.

The growing demand–supply imbalance

By 2019, the automotive sector accounted for almost 10 percent of total semiconductor sales and generated about $41 billion in revenue. Since the semiconductor industry had come to rely on OEMs for a robust and steady revenue stream, it quickly felt the impact of lower car sales during the pandemic.

When automotive demand fell, OEMs and semiconductor distributors held off on new orders because they are cautious by nature and only make purchases when inventory runs low. This hesitancy may be prudent in ordinary times, but it has left many OEMs unprepared to meet the rapid and unprecedented surge in vehicle demand from its recent lows. Although some mature fabs may attempt to alleviate shortages by increasing production capacity, ramp-up requires six months or longer. They could attempt to improve efficiency to capture some earlier gains, but most past efforts encountered unexpected obstacles. Based on our experience working with multiple fabs, these obstacles stem from four types of issues:

  • Missed opportunities to increase capacity. Mature fabs often lack a complete understanding of their full capacity because their data are based on old recipes or tool capabilities. Some, for instance, may underestimate the true amount of tool uptime that’s possible, often because they factor in excessive buffers. Others have difficulty calculating capacity because they use outdated information in their computations, don’t factor in product-mix changes, or don’t optimize layer allocation to tools. These issues may be so complex and confounding that fabs cannot identify their additional capacity, even when they are looking for strategies to increase it.
  • Labor management. Labor deployment often isn’t tracked and measured consistently or correctly, resulting in suboptimal allocation by tool groups, an underutilized workforce, and a poor understanding of the full-time-equivalent hours needed for ramp-up. Furthermore, traditional labor deployment may not be optimal for changes that come with new tools, resulting in non-value-added time, for instance, when a floor layout requires employees to walk to other bays to work on a new piece of equipment. Another common challenge relates to flexibility, since operators who aren’t properly cross-trained on multiple tools can’t be reallocated to different stations as bottlenecks shift and staffing needs arise. Any inefficiencies can be compounded if morale is low and the workforce is slow to adapt to change, making it harder to course correct quickly.
  • Reliance on older facilities, systems, and tools. At most mature fabs, the same machines and equipment have been in operation for more than 20 years, which leads to a greater risk of frequent failures and production losses. That’s especially true if fabs seldom assess machine performance and overlook early warning signs. Many mature fabs also lack the required redundancy because cost-reduction measures historically delayed investment in critical systems that should have scaled up based on output and technology changes. Adding to the infrastructure issues, mature fabs often have subpar spare-parts management arrangements and service contracts with suppliers, which can result in more downtime. For instance, contracts might not specify that compensation will be partly based on machine uptime.
  • The need for more proactive performance management. Performance management is one of the largest drivers of output variation across shifts. Targets often aren’t established at the fab floor level or, at an even more detailed level, for each shift and tool group. On-floor leaders may lack training on the critical management skills required to have effective daily performance dialogues, adjust priorities, and track deployment, shift start and end times, and breaks. In addition, difficult pass downs from shift to shift, redundant quality checks, and unnecessary manual data recording contribute to inconsistent performance. Finally, floor leaders may lack information about critical goals, making it difficult for them to establish priorities.

A new approach to increasing efficiency

The challenges that have interfered with past efforts to improve fab efficiency still exist today, but mature fabs can eliminate many of these issues by focusing on four tasks.

Improving overall equipment efficiency and removing tool bottlenecks

Most mature fabs know that they have performance issues but need more information about the root causes. To develop solutions that address their specific problems, they must go beyond generalizations and determine how and where bottlenecks occur at the tool level. What’s more, they must consider how these bottlenecks may vary when the product mix changes, since problems that bedevil one manufacturing process may be absent in another.

Getting detailed insights about bottlenecks will require some mundane but critical groundwork: listing all tools, as well as the products, recipe steps, and wafer outputs (current and planned) for each one. Engineering and planning teams should be able to provide most of this information, which fabs can use to determine the actual and required capacity. When estimating raw processing time, fabs should supplement engineering data and recipes with detailed floor-level observations. Doing so may reveal why there are variations between tools and manufacturing steps. For instance, fabs may discover that a one-off tool setting wasn’t changed back, causing inefficiencies, or that the oxide layer on wafers varies in thickness and uniformity.

Fabs should then examine benchmarks for OEE that show the maximum capacity for each tool using a two-pillar approach that compares theoretical maximum capacity with actual output (Exhibit 1). To identify the specific factors that contribute to suboptimal capacity, fab managers can examine data logs, interview key staff, and conduct floor observations. This research will help them quantify downtime, determine equipment speed, and surface other issues, such as inefficient changeover times. In one case, a tool-level analysis revealed that a fab’s wafer output was 43 percent below its true capacity.

The two-pillar approach to overall equipment efficiency provides a detailed view of each loss category by tool set.
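
As a rough illustration of comparing theoretical maximum capacity with actual output, the sketch below splits one tool's capacity gap into loss categories. It is a minimal sketch under stated assumptions, not the method behind Exhibit 1: the `ToolLog` fields, the three loss buckets, and every number are hypothetical, chosen only so that the total gap echoes the 43 percent case described above.

```python
from dataclasses import dataclass


@dataclass
class ToolLog:
    """Hypothetical one-week log for a single tool (all fields are assumptions)."""
    name: str
    scheduled_hours: float        # hours the tool was scheduled to run
    downtime_hours: float         # hours lost to breakdowns and maintenance
    changeover_hours: float       # hours lost to setups and recipe changes
    ideal_wafers_per_hour: float  # rate implied by raw processing time
    wafers_out: float             # wafers actually produced


def capacity_gap(log: ToolLog) -> dict:
    """Compare theoretical capacity with actual output and attribute the gap."""
    theoretical = log.scheduled_hours * log.ideal_wafers_per_hour
    run_hours = log.scheduled_hours - log.downtime_hours - log.changeover_hours
    downtime_loss = log.downtime_hours * log.ideal_wafers_per_hour
    changeover_loss = log.changeover_hours * log.ideal_wafers_per_hour
    # Whatever is still missing while the tool was running is attributed
    # to reduced speed and other unexplained losses.
    speed_and_other_loss = run_hours * log.ideal_wafers_per_hour - log.wafers_out
    return {
        "theoretical": theoretical,
        "actual": log.wafers_out,
        "downtime_loss": downtime_loss,
        "changeover_loss": changeover_loss,
        "speed_and_other_loss": speed_and_other_loss,
        "gap_pct": 100 * (theoretical - log.wafers_out) / theoretical,
    }


# Made-up example: a 168-hour week on one etch tool.
example = ToolLog("etch-01", scheduled_hours=168, downtime_hours=30,
                  changeover_hours=12, ideal_wafers_per_hour=25,
                  wafers_out=2394)
report = capacity_gap(example)
for key, value in report.items():
    print(f"{key}: {value}")
```

Running the same calculation per tool set, with loss buckets that match the fab's own logs, would show which category dominates the gap for each tool.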

With the problems identified, companies will be able to launch improvement initiatives. Some of the most important ones relate to single-minute exchange of die (SMED) principles, which call for equipment changeover steps to occur when machines are running, whenever possible.

SMED principles, which are common at leading-edge fabs, also stipulate that any remaining process steps should be streamlined. For example, one fab that investigated strategies for reducing preventive-maintenance (PM) time found that its fastening process was excessively long because employees were given incorrect tools for installing shield clips and repairing bent shields. The company also lacked a platform for lid shield changes, complicating removal and installation, and stored hex tools haphazardly, making them difficult to find. Meanwhile, recurrent flow sensor errors and programmable-logic-controller errors resulted in frequent shutdowns and restarts. When the fab addressed these issues, as well as other problems that surfaced, it was able to reduce total PM downtime by six hours (23 percent).
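
The two SMED ideas described above (move steps to when the machine is running, then streamline what remains) can be sketched as a simple tally of the steps that still require the tool to be stopped. This is a minimal sketch: the step names and durations are hypothetical, picked only so that the totals echo the six-hour, roughly 23 percent reduction in the example, and they are not the fab's actual PM breakdown.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    hours: float
    needs_tool_stopped: bool  # True = "internal" step in SMED terms


def downtime_hours(steps):
    """Only steps that require the tool to be stopped count as downtime."""
    return sum(s.hours for s in steps if s.needs_tool_stopped)


# Hypothetical PM procedure before improvement.
before = [
    Step("stage parts and hex tools", 2.0, True),
    Step("remove shields", 6.0, True),
    Step("install shields and clips", 8.0, True),
    Step("requalify tool", 10.0, True),
]

# After: staging is done while the tool is still running, and the
# remaining shield work is shortened (for example, with correct tools).
after = [
    Step("stage parts and hex tools", 2.0, False),
    Step("remove shields", 5.0, True),
    Step("install shields and clips", 5.0, True),
    Step("requalify tool", 10.0, True),
]

saved = downtime_hours(before) - downtime_hours(after)
saved_pct = 100 * saved / downtime_hours(before)
print(f"PM downtime: {downtime_hours(before)} h -> {downtime_hours(after)} h "
      f"(saved {saved} h, {saved_pct:.0f} percent)")
```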

Proactively managing labor

Fabs have traditionally changed their production systems in response to problems rather than trying to anticipate them. They have also tended to prioritize cost reduction over increasing efficiency. With the urgent need for automotive chips, it’s time for a more proactive, systematic approach that focuses on maximizing capacity rather than controlling short-term expenditures.

As a first step, fabs should model the end-to-end production flow and note the technical constraints of each step. They can then determine the best staffing and production-sequencing choices. In some cases, they may need to reconsider batch sizes and sequences each week.
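
A very small version of such a model is sketched below. It assumes, purely for illustration, that each step's hourly rate is capped both by its tools and by the number of operators assigned, and that the line runs at the rate of its slowest step. The step names, capacities, and the brute-force search are assumptions, not a description of any fab's actual model.

```python
from itertools import product

# Hypothetical steps: (tool-limited wafers/hour, wafers/hour one operator supports)
STEPS = {
    "litho": (40.0, 20.0),
    "etch": (30.0, 10.0),
    "implant": (50.0, 25.0),
}


def line_rate(staffing):
    """Line output is set by the slowest step (the bottleneck)."""
    return min(
        min(tool_cap, staffing[step] * per_operator)
        for step, (tool_cap, per_operator) in STEPS.items()
    )


def best_staffing(total_operators):
    """Try every split of operators (at least one per step) and keep the best.

    Returns None if there are fewer operators than steps.
    """
    names = list(STEPS)
    best = None
    for counts in product(range(1, total_operators + 1), repeat=len(names)):
        if sum(counts) != total_operators:
            continue
        staffing = dict(zip(names, counts))
        rate = line_rate(staffing)
        if best is None or rate > best[1]:
            best = (staffing, rate)
    return best


even_split = {"litho": 2, "etch": 2, "implant": 2}
print("even split:", line_rate(even_split), "wafers/hour")
print("best split:", best_staffing(6))
```

With these made-up numbers, moving one operator from implant to etch relieves the etch bottleneck. A real model would add constraints such as batch sizes, sequencing, and cross-training.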

Using a model, one company shifted both production steps and staffing before testing the new arrangement in a ten-day proof-of-concept trial. Some simple adjustments, such as reallocating work among employees and arranging for lunchtime coverage, increased output by 30 percent, raised productivity by 10 percent, and improved etch-tool utilization. Quality and safety were unchanged.

Another company that examined its workflow focused on improving operator touch time—the number of hours employees spend engaged with equipment (as opposed to extraneous activities, such as waiting for a product to arrive on the line and walking from one part of a fab to another). The company observed all technicians in a bay to understand how operators spent their day. After analyzing the data, it determined that touch time took up only 31 percent of operators’ time. To improve productivity, the company rebalanced the workload and shifted some employees to another bay. It also made some other changes, such as providing break coverage. With these changes, touch time rose to 46 percent.
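
The touch-time share in an example like this can be computed from a simple observation log. The sketch below assumes a hypothetical list of `(activity, minutes)` records and a hypothetical set of activities that count as touch time; the figures are invented to reproduce a 31 percent share and are not the company's data.

```python
from collections import defaultdict

# Assumption: which observed activities count as "touch time".
TOUCH_ACTIVITIES = {"tool_operation", "load_unload"}


def touch_time_share(observations):
    """Return the fraction of observed minutes spent engaged with equipment."""
    minutes = defaultdict(float)
    for activity, mins in observations:
        minutes[activity] += mins
    total = sum(minutes.values())
    touch = sum(m for a, m in minutes.items() if a in TOUCH_ACTIVITIES)
    return touch / total


# Hypothetical observations for one operator (minutes per activity).
observations = [
    ("tool_operation", 20),
    ("waiting_for_product", 35),
    ("load_unload", 11),
    ("walking", 19),
    ("other", 15),
]

share = touch_time_share(observations)
print(f"touch time: {share:.0%}")
```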

Derisking infrastructure through predictive maintenance

In mature facilities, machine failures and other infrastructure issues are so routine that fabs may have difficulty determining where PM is most warranted or what machines should be first in line for repairs. When setting priorities, fabs can minimize risks and reduce the impact of infrastructure failures through a structured approach that identifies the most critical tools. For instance, machines that have high usage rates, low capacity, and no backups should often be given top consideration. A similar process works for systems, such as those for electrical distribution and fume extraction. In this case, top priority goes to the systems for which a stoppage would have immediate and severe consequences for production, especially if their likelihood of failure is high.
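
One simple way to turn this kind of prioritization into a ranking is to score each asset on likelihood of failure and production impact, with a penalty for missing backups. The sketch below is an assumption-laden illustration: the 1-to-5 scales, the doubling for assets without a backup, and the example assets and scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    failure_likelihood: int  # 1 (rare) to 5 (frequent), hypothetical scale
    production_impact: int   # 1 (minor) to 5 (immediate, severe stoppage)
    has_backup: bool


def risk_score(asset):
    """Likelihood times impact, doubled when there is no redundancy.

    The doubling factor is an illustrative assumption, not a standard.
    """
    score = asset.failure_likelihood * asset.production_impact
    return score if asset.has_backup else score * 2


def prioritize(assets):
    """Highest-risk assets first."""
    return sorted(assets, key=risk_score, reverse=True)


# Hypothetical assets and scores.
assets = [
    Asset("office_pump", failure_likelihood=2, production_impact=1, has_backup=True),
    Asset("chiller", failure_likelihood=4, production_impact=5, has_backup=False),
    Asset("cooling_coil", failure_likelihood=2, production_impact=3, has_backup=True),
    Asset("cooling_tower", failure_likelihood=3, production_impact=5, has_backup=False),
]

for asset in prioritize(assets):
    print(asset.name, risk_score(asset))
```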

When analyzing systems, fabs should look at all subsystems independently. One company that wanted to improve its process-cooling-water systems first examined about 17 subsystems separately (Exhibit 2). For everything from office pumps to cooling coils, it held workshops, interviewed vendors, and conducted expert inspections. The company also identified possible failure modes, such as oil condensation, inadequate chiller maintenance, and compressor-pump failure, for each subsystem.

Assessment of subsystems can help pinpoint problems.

These assessments revealed that four subsystems needed critical attention. Both Fab A and Fab B chillers had no backups in summer and had failed previously. Significant downtime was inevitable after such failures because contractors typically require 24 to 48 hours for repairs. The Fab A and Fab B cooling towers had similar issues. For both subsystems, the most common failure, fan breakdown, decreased cooling-system performance and had an immediate impact on production.

After identifying the main issues, the company developed mitigation plans that described steps for reducing risk in all four subsystems, such as investigating pipework to understand why the Fab A chillers had recently failed and working with vendors to identify critical spare parts that needed to be kept in stock. In addition to discussing what actions were required, the mitigation plans specified responsibilities for different tasks and set timelines for completion.

By moving to a strong predictive-maintenance routine rather than fighting crises haphazardly, companies may be able to reduce downtime by 30 to 50 percent. The downtime that does occur will be planned rather than unexpected, so fabs will have a chance to deploy workers elsewhere. Part shortages will also be less common, since fabs will have a good idea of what they need for scheduled maintenance.

Implementing dynamic performance management

At many mature fabs, performance management is haphazard and stuck in the past. To improve, they need a new approach that has clear shop floor metrics and targets. Some, for instance, might quantify how much output drops during shifts and breaks and then set targets for improvement. Performance dashboards—now in use in many modern facilities—can help mature fabs identify the root cause of any problems. Other important elements of dynamic performance management include the following:

  • A clear issue-resolution process. Fabs should create job aids, such as templates and checklists, to facilitate different processes and improve shift handover. There should also be a clear process for escalating concerns.
  • Training. Floor leaders may be accustomed to a more informal performance dialogue process and require training on metrics and targets. Ideally, fabs will conduct trial runs for any new processes and solicit operator feedback about what works and what needs to change. They can then make changes before scale-up.
  • Regular performance dialogues. Operators and supervisors should hold dialogues in front of a whiteboard at the end of each shift for about five to ten minutes. They can have a standard agenda and record any issues that surface for later discussion. Leaders should remember that providing positive feedback is just as important as reviewing problems during performance dialogues.
  • Automated dashboards and action tracking. Fabs may initially use manual dashboards, but they can eventually automate them to save time and increase transparency. Ideally, the dashboards should include visuals that make it easy to see and interpret critical data, which should be updated frequently and come straight from the lines.
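
The idea of quantifying how much output drops around breaks and shift handovers, mentioned before the list above, can be sketched by comparing flagged hours against a baseline taken from the remaining hours. The hourly figures, the flagged hours, and the use of the median as a baseline are all assumptions made for illustration.

```python
from statistics import median


def output_drops(hourly_output, flagged_hours):
    """Percent drop in each flagged hour versus the median of unflagged hours."""
    baseline = median(
        wafers for hour, wafers in hourly_output.items()
        if hour not in flagged_hours
    )
    return {
        hour: round(100 * (baseline - hourly_output[hour]) / baseline, 1)
        for hour in sorted(flagged_hours)
    }


# Hypothetical wafers moved per hour of an eight-hour shift.
hourly_output = {0: 100, 1: 102, 2: 98, 3: 60, 4: 101, 5: 99, 6: 100, 7: 70}
# Assumption: hour 3 contains the meal break, hour 7 the shift handover.
flagged = {3, 7}

drops = output_drops(hourly_output, flagged)
print(drops)  # percent drop per flagged hour
```

Drops measured this way could serve as the starting point for shift-level improvement targets and as one of the metrics shown on a dashboard.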

At fabs where employees are accustomed to long-standing processes, the greatest implementation obstacle may involve mindsets and behaviors. Historically, given the cost-reduction focus of mature fabs, throughput-improvement practices have received little priority, and most leaders are tempted to ignore them. For best results, and to sustain progress, leaders should address the mindset issue head-on through clear communication about the organization’s near-term priorities.


As vehicles become increasingly sophisticated, electrified, and autonomous, their semiconductor content will increase to even higher levels. That could intensify the current shortage of automotive chips, resulting in further production slowdowns at OEMs. Mature fabs can’t address the severe shortage of automotive semiconductors overnight, but they can take steps now to begin alleviating the problem.
