MLOps

Assembly Line

Unveiling Databricks power in analyzing electrical grid assets using computer vision

📅 Date:

🔖 Topics: Machine Vision, MLOps

🏭 Vertical: Utility

🏢 Organizations: Databricks


Data is ingested from an EPRI dataset consisting of images of distribution assets along with labels for each object. These are ingested into Delta tables and transformed through the medallion architecture in order to produce a dataset that is ready for model training.
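
The medallion flow described above (raw "bronze" records refined into validated "silver" and training-ready "gold" data) can be sketched in plain Python. This is only an illustration of the layering idea, not the accelerator's code: the record fields, the validation rules, and the function names `to_silver` and `to_gold` are assumptions, and a real pipeline would operate on Delta tables rather than in-memory lists.

```python
# Hypothetical raw records: one row per labeled object, as they might land in a bronze table.
BRONZE = [
    {"image_id": "img_001", "label": "Transformer ", "bbox": [10, 20, 110, 220]},
    {"image_id": "img_001", "label": "insulator", "bbox": [5, 5, 40, 60]},
    {"image_id": "img_002", "label": None, "bbox": [0, 0, 10, 10]},           # missing label
    {"image_id": "img_003", "label": "insulator", "bbox": [50, 50, 40, 90]},  # x_max < x_min
]


def to_silver(rows):
    """Clean and validate: drop unlabeled rows and malformed boxes, normalize labels."""
    silver = []
    for row in rows:
        if not row["label"]:
            continue
        x_min, y_min, x_max, y_max = row["bbox"]
        if x_max <= x_min or y_max <= y_min:
            continue
        silver.append({**row, "label": row["label"].strip().lower()})
    return silver


def to_gold(rows):
    """Group into one training example per image, holding all of its labeled objects."""
    gold = {}
    for row in rows:
        gold.setdefault(row["image_id"], []).append((row["label"], row["bbox"]))
    return gold


silver = to_silver(BRONZE)
gold = to_gold(silver)
```

Each layer only reads from the one before it, so a change to a validation rule can be replayed from the bronze data without re-ingesting the source images.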

After data loading has been completed, the training can begin. In the age of GenAI, there is a scarcity of large GPUs, leaving only smaller ones, which can significantly increase training and experimentation times. To combat this, Databricks allows you to run distributed GPU training using features like PytorchDistributor. This accelerator takes advantage of this to train our model on a cluster of commodity GPUs, which brings the training time down almost linearly with the number of GPUs.
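
The reason splitting a batch across several GPUs can scale almost linearly is that, in data-parallel training, each worker computes gradients on its own shard and the averaged result matches the full-batch gradient. The sketch below shows that principle in plain Python on a toy one-parameter model. It is not the Databricks API; the function names and the toy data are assumptions made for illustration, and the averaging line stands in for the all-reduce that a real distributed framework performs.

```python
def gradient(w, batch):
    """Gradient of mean squared error for the model y = w * x over one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def data_parallel_step(w, data, num_workers, lr=0.01):
    """One SGD step in which each worker handles an equal-sized shard of the batch."""
    shard_size = len(data) // num_workers
    shards = [data[i * shard_size:(i + 1) * shard_size] for i in range(num_workers)]
    local_grads = [gradient(w, shard) for shard in shards]  # parallel on real hardware
    avg_grad = sum(local_grads) / num_workers               # stands in for an all-reduce
    return w - lr * avg_grad


data = [(float(x), 3.0 * x) for x in range(1, 9)]  # 8 samples of y = 3x
w_single = data_parallel_step(0.0, data, num_workers=1)
w_multi = data_parallel_step(0.0, data, num_workers=4)
```

With equal-sized shards, `w_single` and `w_multi` agree up to floating-point rounding, while each of the four workers processes only a quarter of the samples. Communication overhead for the gradient exchange is what keeps real-world speedups just below linear.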

Read more at Databricks Blog

Accelerating industrialization of Machine Learning at BMW Group using the Machine Learning Operations (MLOps) solution

📅 Date:

✍️ Authors: Marc Neumann, Aubrey Oosthuizen

🔖 Topics: MLOps, Data Architecture

🏢 Organizations: BMW, AWS


The BMW Group’s Cloud Data Hub (CDH) manages company-wide data and data solutions on AWS. The CDH provides BMW Analysts and Data Scientists with access to data that helps drive business value through Data Analytics and Machine Learning (ML). The BMW Group’s MLOps solution includes (1) a reference architecture, (2) reusable Infrastructure as Code (IaC) modules that use Amazon SageMaker and Analytics services, (3) ML workflows using AWS Step Functions, and (4) a deployable MLOps template that covers the ML lifecycle from data ingestion to inference.

Read more at AWS Blog

🧠📹 What Sets Toshiba’s Ceramic Balls Apart? The AI Quality Inspection System

📅 Date:

🔖 Topics: MLOps, Bearing, Corrosion, Quality Assurance, Machine Vision

🏢 Organizations: Toshiba


Bearings cannot be easily replaced once a vehicle is assembled. In the U.S., bearings used in EVs are expected to be of high enough quality to withstand long distances. One issue that can occur with EVs, however, is the “electric corrosion” of the bearings that mount the various vital parts of the vehicle onto the motor—a serious issue, as it can lead to the breakdown of the vehicle. High-performance bearings would drive the widespread use of EVs, and contribute to the push towards carbon neutrality. The electrical corrosion phenomenon had hampered these efforts, but not anymore—therein lies the beauty of Toshiba’s ceramic balls.

“Our ceramic balls go through slight changes about every year and a half due to changes in material and other factors. To keep up the accuracy of the quality inspections, we have to continually update the AI system itself. The MLOps system automates that process,” says Kobatake.

“We’ve been able to dramatically reduce the time spent on these inspections. Ceramic balls are expensive compared to their metal counterparts. They have so many different strengths, and yet they haven’t been able to replace the metal ones precisely because of this particular issue. If we’re able to reduce the cost through AI quality inspection, we’ll be able to lower the price of the products themselves,” says Yamada.

Read more at Toshiba Clip

How Corning Built End-to-end ML on Databricks Lakehouse Platform

📅 Date:

✍️ Author: Denis Kamotsky

🔖 Topics: MLOps, Quality Assurance, Data Architecture, Cloud-to-Edge Deployment

🏢 Organizations: Corning, Databricks, AWS


Specifically for quality inspection, we take high-resolution images to look for irregularities in the cells, which can be predictive of leaks and defective parts. The challenge, however, is the prevalence of false positives due to the debris in the manufacturing environment showing up in pictures.

To address this, we manually brush and blow the filters before imaging. We discovered that by notifying operators of which specific parts to clean, we could significantly reduce the total time required for the process, and machine learning came in handy. We used ML to predict whether a filter is clean or dirty based on low-resolution images taken while the operator is setting up the filter inside the imaging device. Based on the prediction, the operator would get the signal to clean the part or not, thus reducing false positives on the final high-res images, helping us move faster through the production process and providing high-quality filters.
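
As a rough illustration of the gating step described above (not Corning's implementation), the model's output for each low-resolution image can be turned into a per-filter operator signal. The threshold value, the filter identifiers, and the function name `cleaning_signal` are hypothetical.

```python
def cleaning_signal(p_dirty, threshold=0.5):
    """Map the model's predicted probability that a filter is dirty to an operator instruction."""
    return "clean" if p_dirty >= threshold else "proceed"


# Hypothetical model outputs for three filters being set up in the imaging device.
predictions = {"filter_A": 0.91, "filter_B": 0.12, "filter_C": 0.55}
signals = {filter_id: cleaning_signal(p) for filter_id, p in predictions.items()}
```

Lowering the threshold sends more filters to cleaning and reduces debris-related false positives on the high-resolution images, at the cost of operator time.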

Read more at Databricks Blog

Industrial DataOps: The data backbone of digital twins

📅 Date:

✍️ Author: Fredrik Holm

🔖 Topics: Digital Twin, MLOps

🏢 Organizations: Cognite


What is needed is not a single digital twin that perfectly encapsulates all aspects of the physical reality it mirrors, but rather an evolving set of “digital siblings.” Each sibling shares a lot of the same DNA (data, tools, and practices) but is built for a specific purpose, can evolve on its own, and provides value in isolation.

The data backbone to power digital twins needs to be governed in efficient ways to avoid the master data management challenges of the past, including tracking data lineage, managing access rights, and monitoring data quality, to mention a few examples. The governance structure has to focus on creating data products that may be used, reused, and collaborated on in efficient and cross-disciplinary ways. The data products have to be easily composable and be constructed the way humans think about data: as a graph in which physical equipment is interconnected both physically and logically. Through this representation, select parts of the graph may be used to populate the different digital twins in a consistent and coherent way.
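
The graph idea can be made concrete with a small sketch: equipment is stored once, edges are tagged as physical or logical, and each twin is populated from the part of the graph it needs. The asset names, edge kinds, and the `subgraph` helper below are hypothetical and are not taken from any Cognite product.

```python
from collections import deque

# Hypothetical asset graph: each edge is tagged as a physical or logical relationship.
EDGES = [
    ("pump_1", "valve_1", "physical"),
    ("valve_1", "tank_1", "physical"),
    ("pump_1", "control_loop_A", "logical"),
    ("tank_1", "control_loop_A", "logical"),
    ("compressor_1", "control_loop_B", "logical"),
]


def subgraph(start, kinds):
    """Collect every node reachable from start over edges whose kind is in kinds."""
    adjacency = {}
    for a, b, kind in EDGES:
        if kind in kinds:
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


piping_twin = subgraph("pump_1", {"physical"})   # equipment connected by piping
control_twin = subgraph("pump_1", {"logical"})   # equipment sharing a control loop
```

Both twins draw on the same underlying records for `pump_1` and `tank_1`, which is what keeps them consistent with each other as the source data changes.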

Read more at Cognite Blog

Using MLflow to deploy Graph Neural Networks for Monitoring Supply Chain Risk

📅 Date:

🔖 Topics: Graph Neural Network, MLOps

🏢 Organizations: Databricks


We live in an ever more interconnected world, and nowhere is this more evident than in modern supply chains. Due to the global macroeconomic environment and globalisation, modern supply chains have become intricately linked and woven together. Companies worldwide rely on one another to keep their production lines flowing and to act ethically (e.g., complying with laws such as the Modern Slavery Act). From a modelling perspective, the procurement relationships between firms in this global network form an intricate, dynamic, and complex network spanning the globe.

Lastly, it was mentioned earlier that GNNs are a framework for defining deep learning algorithms over graph-structured data. For this blog, we will utilise a specific architecture of GNNs called GraphSAGE. This algorithm does not require all nodes to be present during training, is able to generalise to new nodes efficiently, and can scale to billions of nodes. Earlier methods in the literature were transductive, meaning that the algorithms learned an embedding for each node directly. This was useful for static graphs, but the algorithms had to be re-run after graph updates such as new nodes. Unlike those methods, GraphSAGE is an inductive framework which learns how to aggregate information from neighborhood nodes; i.e., it learns functions for generating embeddings, rather than learning embeddings directly. Therefore, GraphSAGE ensures that we can seamlessly integrate new supply chain relationships retrieved from upstream processes without triggering costly retraining routines.
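
The inductive property can be illustrated with a minimal, dependency-free sketch of a single GraphSAGE-style layer using a mean aggregator. It is a simplification made for illustration: the weight matrix `W` is hand-picked rather than trained, the firm names and feature values are hypothetical, and the neighbor sampling and embedding normalization used by the full algorithm are omitted.

```python
def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def sage_layer(node, features, neighbors, weights):
    """One GraphSAGE-style layer: ReLU(W * concat(h_v, mean of neighbor features))."""
    combined = features[node] + mean([features[n] for n in neighbors[node]])
    return [max(0.0, sum(w * x for w, x in zip(row, combined))) for row in weights]


# Hypothetical "learned" weights: 2 outputs, 4 inputs (2 self + 2 neighbor features).
W = [[0.5, 0.0, 0.5, 0.0],
     [0.0, 1.0, 0.0, -1.0]]

features = {"supplier_a": [1.0, 0.0], "supplier_b": [0.0, 2.0]}
neighbors = {"supplier_a": ["supplier_b"], "supplier_b": ["supplier_a"]}

# A new firm arrives after training: only its features and edges are added; W is reused.
features["new_firm"] = [2.0, 1.0]
neighbors["new_firm"] = ["supplier_a", "supplier_b"]
new_embedding = sage_layer("new_firm", features, neighbors, W)
```

Because the layer is a function of a node's own features and its neighbors' features, `new_firm` gets an embedding from the existing weights alone, whereas a transductive method would need to be re-run to learn a new per-node vector.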

Read more at Ajmal Aziz on Medium

Making The Most Of Data Lakes

📅 Date:

✍️ Author: Anne Meixner

🔖 Topics: MLOps

🏭 Vertical: Semiconductor

🏢 Organizations: PDF Solutions, Synopsys


Data management and data analysis necessitate understanding the data storage and data compute options to design an optimal solution. This is made more difficult by the sheer volume of data generated by the design and manufacturing of semiconductor devices. There are more sensors being added into equipment, more complex heterogeneous chip architectures, and increased demands for reliability, which in turn increase the amount of simulation, inspection, metrology, and test data being generated.

Connecting different data sources is extremely valuable. It allows feed-forward decisions on manufacturing processes (package type, skipping burn-in), and feedback in order to trace causes of excursions (yield, quality, and customer returns).

“An understanding of the semiconductor manufacturing process and relationships throughout are essential for some applications,” said Jeff David, vice president of AI solutions at PDF Solutions. “For example, how can I use wafer equipment history and tool sensor data to predict the failure propensity of a chip at final test? How does time delay between process and test steps determine what data is useful in finding a root cause of a failure mode? What failure modes are predictable with which datasets? How do preceding process steps affect the data collected at a given process step?”

Read more at Semiconductor Engineering

MakinaRocks, unveiling the AI/ML modeling tool “Link”

📅 Date:

🔖 Topics: MLOps

🏢 Organizations: MakinaRocks


Link is an extension for JupyterLab, an interactive development interface for notebooks, code, and data, that lets users easily create readable pipelines for AI and ML modeling. Link maintains the usability of JupyterLab that data scientists rely on while removing technological hurdles related to Kubernetes, a portable, open-source platform for managing workloads and services. As a result, users can create pipelines for MLOps environments with ease, even without a working knowledge of Kubernetes.

Read more at MakinaRocks News