The Problem of Industry 4.0: Data! – Part 1

Industry 4.0 is still a hot topic, even more than ten years after the term was coined. Unfortunately, very often I find it to be much more hype than content. The examples where it actually worked well are few and far between, and the examples where not much was hyped as groundbreaking are way too frequent. In my view, a large problem of Industry 4.0 is the data, especially the data structure and the problems with analyzing the data. Hence, (yet another) short series of posts warning about the difficulties of Industry 4.0 with a focus on the data.

Industry 4.0 – A Recap

Industry 4.0 is the fourth industrial revolution, after the first (the big one) with mechanization and steam power, the second with mass production and electricity, and the third with computers. The idea is to use computers in networks to leverage information for better speed, cost, quality, and safety (with a lot of buzzwords like cyber-physical systems, internet of things, big data, artificial intelligence, digital twins, smart factories, and many more). Depending on who you ask, the definition of Industry 4.0 may be quite different.

Industry 4.0

I have written quite a few posts with a critical look at Industry 4.0, trying to see What Works, What Doesn’t, Comparing It to Lean, and analyzing the data to see if it is a revolution (spoiler: no, or at least not yet), and I even had a series of posts for our Van of Nerds tour looking at the state of Industry 4.0 in Germany. Overall, I usually find much more hype and far fewer examples that actually work.

There seem to be a number of different causes for these many failures. Often, I find that Industry 4.0 is done not to solve a problem, but merely to do Industry 4.0. Lean improvements should always start with a problem to solve or improve, and doing something merely for the sake of doing something is unlikely to improve anything. Another problem is that the complexity of Industry 4.0 is often underestimated, and due to the fundamental differences between factories with different products, it is hard to scale the system. You cannot just copy the software from a factory making cars to another factory making ice cream. The exception here is logistics, where the problems of moving things from A to B are similar in many factories. And another issue is the complexity of the data… which is the focus of this blog post.

On Data Complexity

Industry 4.0 is very loosely defined. Pretty much anything with computers related to manufacturing could be seen as Industry 4.0 (although according to the diagram above, only computers would be Industry 3.0). The scale of the complexity also varies widely. I advocate keeping the problems small and manageable, because this will greatly simplify things and make a successful implementation much more likely. Smaller implementations are, for example, installing a new robot or even a cobot (collaborative robot – although this again would be Industry 3.0), or using AI to analyze camera pictures to detect errors or to understand the inventory, or using RFID chips to track the inventory. The more complex the problem, the more challenging the implementation. Depending on the complexity, the following points on data may apply to a varying degree.

Yet if you listen to (some) consultants, the holy grail of Industry 4.0 seems to be everything connected to everything else (internet of things, cyber-physical systems, etc.). Everything collects data and sends it around on the cloud. This would be most impressive… if it worked, which it usually does not. The closest I have seen are Amazon fulfillment centers (see my series on Amazon Fulfillment Centers). But I know of many more attempts that failed, usually because the people in charge underestimated the complexity. These “everything with everything else” attempts often have the largest issues with handling the data, and they get hit in full force by the effects outlined below. Hence again my advice: Keep it simple!

Merging the Data

The first step is often to get the data together in a common database. This is often easier said than done. You may have many different machines by different makers, which may have different data structures. Somehow you have to get all of these data into a joint system. Imagine a professor (like me) giving homework and then receiving Word documents, PDFs, Excel files, and PowerPoints. While it is not a problem for grading, it would be a problem to put them all together in a single file. It is often similar in Industry 4.0.

Even if it is the same data structure, the details of the format may still be different. If I give my students homework and tell them to deliver it in Excel, it may still come in many different layouts. One student has column A as the time, another one as an index. Putting it together in one file is still a lot of work. In Industry 4.0 it is even more complex, as the data is often more complicated than a simple Excel file. There may be lots of unintended consequences.
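To make the "column A is the time for one, an index for another" problem concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the two machines, their field names (`time`, `temp_c`, `idx`, `epoch_s`, `temperature_f`), the units, and the values are all invented. It only shows the general idea of mapping each source onto one common schema before merging.

```python
from datetime import datetime, timezone

# Hypothetical exports from two machines by different makers.
# Field names, units, and values are invented for illustration.
machine_a_rows = [
    {"time": "2024-03-01T08:00:00+00:00", "temp_c": 71.5},
    {"time": "2024-03-01T08:01:00+00:00", "temp_c": 72.0},
]
machine_b_rows = [
    {"idx": 0, "epoch_s": 1709280030, "temperature_f": 160.7},
    {"idx": 1, "epoch_s": 1709280090, "temperature_f": 161.6},
]


def normalize_a(row):
    """Machine A: ISO timestamp string, temperature already in Celsius."""
    return {
        "machine": "A",
        "timestamp": datetime.fromisoformat(row["time"]),
        "temp_c": row["temp_c"],
    }


def normalize_b(row):
    """Machine B: Unix epoch seconds, temperature in Fahrenheit.

    The row index is dropped because it has no meaning in the merged data.
    """
    return {
        "machine": "B",
        "timestamp": datetime.fromtimestamp(row["epoch_s"], tz=timezone.utc),
        "temp_c": round((row["temperature_f"] - 32) * 5 / 9, 2),
    }


def merge(rows_a, rows_b):
    """Bring both sources into the common schema and sort by time."""
    merged = [normalize_a(r) for r in rows_a] + [normalize_b(r) for r in rows_b]
    return sorted(merged, key=lambda r: r["timestamp"])


merged = merge(machine_a_rows, machine_b_rows)
for record in merged:
    print(record["timestamp"].isoformat(), record["machine"], record["temp_c"])
```

Even in this toy case, every source needs its own hand-written converter, and somebody has to know that machine B reports Fahrenheit. With dozens of machine types, this is where the work piles up.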

As an example, think of the last time your company updated its ERP system. Going from one software to an upgraded software from the same maker should be no problem, right? You merely run the software update. If you have ever done that, then I am sure you winced. There are a myriad of things that can go wrong, and such “simple” ERP software upgrades are never simple, requiring extensive testing and trial runs to at least reduce the chances of a total company meltdown due to an ERP software problem.

You may also consider whether you want to merge the raw data or to get pre-processed data. I often prefer to have all the raw data, but this is probably more a personal preference, and especially for larger systems the raw data can become quite big.
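The trade-off between raw and pre-processed data can be sketched in a few lines of Python. This is only an illustration under assumptions: the data is synthetic (one invented reading per second with a single spike), and `summarize_per_minute` is a hypothetical helper, not part of any real system.

```python
from collections import defaultdict
from statistics import mean


def summarize_per_minute(readings):
    """Reduce (second, value) readings to one summary record per minute."""
    buckets = defaultdict(list)
    for second, value in readings:
        buckets[second // 60].append(value)
    return {
        minute: {
            "count": len(values),
            "mean": mean(values),
            "min": min(values),
            "max": max(values),
        }
        for minute, values in sorted(buckets.items())
    }


# Synthetic raw data: one reading per second for three minutes,
# with a single invented spike at second 90.
raw = [(second, 20.0) for second in range(180)]
raw[90] = (90, 95.0)

summary = summarize_per_minute(raw)
print(len(raw), "raw readings reduced to", len(summary), "summary records")
print(summary[1])
```

The summary is much smaller and still shows that something happened in the second minute, but it no longer tells you exactly when the spike occurred or how long it lasted. If you only keep the pre-processed data, that detail is gone for good.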

Overall, merging the data is a mess, and this does not even touch the potential problem of the ownership of the data and the willingness of one company to share the data with another company. For example, many modern machines can collect processing data, which is then sent to the maker to be analyzed for predictive maintenance and other service problems. However, many companies (especially automotive) turn this data stream off right away, since they do not want the machine tool maker to know what and when they produce.

In my next post I will talk about cleaning up the data and subsequent steps. Until then, stay tuned, and go out and organize your industry!
