Amazon SageMaker helps equipment engineers prevent ‘Die Defects’ by predicting blade wear-out

Written by Pim van Eersel | Oct 9, 2024 1:34:06 PM

The challenge

Product yield is key in the semiconductor industry. Finding and understanding the root cause of defects is therefore essential: once the root cause is known, it can be predicted and future defects can be prevented.

One step in the production of ICs is the so-called “wafer dicing”. In this step, wafers are cut (diced) into many pieces, each containing a circuit. Each of these pieces is called a die.

One method for dicing is “blade dicing” which, due to its mature technology and low cost compared to other methods, is still widely used. The downside is that blade dicing induces mechanical stress, which leads to chipping, especially when the blade is becoming dull.

Even a few nanometers of damage to a die is already a problem, and the unaided human eye cannot resolve features much smaller than about 50 µm, so operators cannot see when a blade is wearing out. Predicting when a blade will be worn out, and replacing it before that happens, therefore helps engineers prevent lost production time, reduce cost and improve quality. Because wafer dicing comes after 5+ months of wafer production time, preventing defects caused by dicing saves a lot of cost.

One of our customers challenged us to significantly reduce chipping caused by blade wear-out. Starting from a 4% defect ratio, we set ourselves the goal of reducing it by at least 50%.

The solution

Our solution is a machine learning (ML) model to predict blade wear-out. It consists of two parts: a “data producer” part and a “data consumer” part.

The data producer delivers trusted data sets (data products), which can then be consumed (queried, analyzed, modeled) in an analytics workspace by, for example, equipment engineers.

The first part contains the equipment interfaces, the data pipeline and the data products. We used EMR Serverless for its on-demand scaling; Glue, Python and Spark for integration, pre-processing and transformation of data; and DynamoDB and S3 buckets for storage.

The second part contains the analytics workspace, with Amazon SageMaker Studio to run the machine learning models. We used Amazon Athena to analyze data directly in S3. Microsoft Power BI and Amazon QuickSight are used to visualize trends and predictions, which in the end lead to triggers for engineers to replace a blade or adjust machine configurations.
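To make the step from prediction to trigger more concrete, here is a minimal sketch in Python. It is not the model we run in SageMaker. It is a deliberately simple stand-in that fits a straight line through a wear indicator and estimates how many cuts remain before a limit is reached. The names `WearSample`, `chipping_um`, `limit_um` and `safety_margin_cuts` are hypothetical and chosen for illustration only, as is the assumption that wear grows roughly linearly with the number of cuts.

```python
from dataclasses import dataclass


@dataclass
class WearSample:
    cuts: int           # cuts made with this blade so far (hypothetical feature)
    chipping_um: float  # measured chipping width in micrometers (hypothetical metric)


def fit_line(samples):
    """Least-squares fit of chipping_um = slope * cuts + intercept."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = sum(s.cuts for s in samples) / n
    mean_y = sum(s.chipping_um for s in samples) / n
    sxx = sum((s.cuts - mean_x) ** 2 for s in samples)
    if sxx == 0:
        raise ValueError("all samples have the same cut count")
    sxy = sum((s.cuts - mean_x) * (s.chipping_um - mean_y) for s in samples)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def cuts_until_limit(samples, limit_um):
    """Estimate the remaining cuts before the fitted trend reaches limit_um.

    Returns None when the trend is flat or decreasing, because no crossing
    point can be extrapolated in that case.
    """
    slope, intercept = fit_line(samples)
    if slope <= 0:
        return None
    cuts_at_limit = (limit_um - intercept) / slope
    latest = max(s.cuts for s in samples)
    return max(0.0, cuts_at_limit - latest)


def should_replace(samples, limit_um, safety_margin_cuts):
    """Trigger a replacement once the estimated remaining cuts fall within the margin."""
    remaining = cuts_until_limit(samples, limit_um)
    return remaining is not None and remaining <= safety_margin_cuts


if __name__ == "__main__":
    history = [WearSample(0, 2.0), WearSample(1000, 3.0), WearSample(2000, 4.0)]
    print("estimated cuts remaining:", cuts_until_limit(history, limit_um=5.0))
    print("replace now:", should_replace(history, limit_um=5.0, safety_margin_cuts=1500))
```

A real model would use many more inputs than a single wear indicator, but the decision at the end has the same shape: compare the predicted remaining life with a safety margin and alert the engineer before the blade reaches its limit.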

The value

The ML model is fed with equipment data such as dicing metrics, machine configurations, specific settings, and product IDs. The longer the model runs and the more data it can analyze, the better the quality of predictions becomes.

After a few months, the defect ratio from blade wear-out was already cut in half. It later reached 0.9%, significantly better than our already ambitious target. The specific 300mm wafer contains 300 to 400 dies, each valued at 30 to 50 euro. By reducing the defect ratio from blade wear-out from 4% to 0.9%, a reduction of 78%, our solution saves over 400 euro per wafer on average. Even without considering the 5+ months of wafer production time that is no longer wasted, the ROI was already very interesting for the customer.
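The savings figure can be reproduced with a short calculation. The sketch below assumes the midpoints of the ranges above (350 dies per wafer, 40 euro per die) and assumes that every avoided defective die is worth its full value. Both are simplifications made for illustration, and the function names are our own.

```python
def savings_per_wafer(dies_per_wafer, value_per_die_eur, defect_before, defect_after):
    """Value of the dies per wafer that are no longer lost to blade wear-out."""
    avoided_defect_share = defect_before - defect_after
    return dies_per_wafer * value_per_die_eur * avoided_defect_share


def relative_reduction(defect_before, defect_after):
    """Fraction by which the defect ratio dropped."""
    return (defect_before - defect_after) / defect_before


if __name__ == "__main__":
    before, after = 0.04, 0.009  # 4% and 0.9% defect ratio

    # Midpoint assumption: 350 dies at 40 euro each.
    mid = savings_per_wafer(350, 40.0, before, after)
    # Bounds from the low and high ends of both ranges.
    low = savings_per_wafer(300, 30.0, before, after)
    high = savings_per_wafer(400, 50.0, before, after)

    print(f"savings per wafer (midpoint): {mid:.0f} euro")
    print(f"savings per wafer (range): {low:.0f} to {high:.0f} euro")
    print(f"defect ratio reduction: {relative_reduction(before, after):.1%}")
```

With the midpoint assumption this comes to roughly 434 euro per wafer, and the drop from 4% to 0.9% is a relative reduction of 77.5%, which rounds to the 78% quoted above.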
