Tackling your legacy data issues is the first and key step on the path to digital transformation.

Updated: Mar 19, 2021

Clients approach us weekly seeking to embark on a digital transformation program, and for many excellent reasons. Delivering predictive maintenance, improving operational efficiency, minimising unplanned downtime and implementing a centralised, portfolio-wide management solution are just some of them, and digital transformation can and will address these issues. However, it is apparent that most enterprises aren’t properly (or digitally) prepared to adopt these trends. The reason? Today’s pace of business, and the disorderly data that’s needed to make sense of it all.

In the past, IT environments were simpler and more accessible for humans. But with the advent of cloud, containers, multi-modal delivery and other new technologies resulting in inordinately massive and complex environments, IT is being forced to move at machine speed, rendering manual processes too slow and inefficient.

To keep up with the rapid pace and scale of today’s digital environments, enterprises are turning to processes powered by machine learning (ML) and artificial intelligence (AI). Unfortunately, ML-based algorithms and AI-based automation, key elements in unlocking digital transformation, are easier said than done. The underlying reason is that ML-based algorithms, by themselves, aren’t sophisticated enough to deal with today’s ephemeral, containerized, cloud-based world. ML needs to evolve into AI, and to do that, it needs cleaner, actionable data to automate processes.

However, attaining high-quality data presents its own unique challenges, and enterprises that do not have the right strategy in place will encounter cascading problems when trying to implement digital transformation initiatives in the future.

“Imagine this: Your predictive algorithms just picked up an anomaly from a vibration sensor on a compressor. It may fail, and you’d better fix it quickly. No problem, because you’ve been alerted early and mobilize your maintenance team to deal with it. Except...what parts do they need to fix it? What linked equipment do they need to shut down? They’re missing the context to effectively deal with the problem. Manually sifting through hundreds of scanned documents to gather that context will take time you don’t have.

“If the hierarchy is not accurate it is impossible to gain the reliability, maintainability and required traceability of the assets. How can Planners do their job if they are not aware of the plant’s assets and in particular how the assets relate to other assets?” - Paul Langan

If you don’t get that contextual information down, your sensors provide little value. All the investment you poured into your PdM initiative just went down the drain, and you’re back to square one.”

Next, realize that training AI/ML on historical data is not enough. It needs to ingest real-time data to respond to and automate processes. Real-time data is the fuel that allows the ML algorithms to learn and adapt to new situations and environments. Unfortunately, real-time data presents its own set of challenges, too. The four Vs - volume, velocity, variety and veracity - of data can be overwhelming and expensive to manage.

Recently we were approached by a major operations company in the energy space with a request for a strategy to deliver predictive maintenance. The company has global assets and was seeking a solution to improve operational maintenance efficiency, maximise uptime, minimise unplanned downtime and ensure that the parent asset register was up to date and accurate. When we challenged them on the number of legacy documents (not pages) available for verification and data extraction, their initial estimate was one million digital documents covering seven assets, including process manuals, P&IDs, isometrics, equipment manufacturing documents and so on.

On review, the number turned out to be closer to ten million, which brings us to the first “V”: enormous data volumes. Historically, putting a team together to review, log, verify and tag all this information to deal with the last “V”, veracity, would be nigh on impossible. Assume one person could deal with two thousand documents a day: tagging, revision checking, extracting some form of rudimentary context and manually logging the information. A team of five would then take one thousand working days to complete the first pass. That is without any additional QC and without covering any new data captured during those one thousand days (i.e. the second “V”, velocity), and it assumes that the five people have the skills to understand all of the document and technical drawing types they have to deal with (i.e. the third “V”, variety). Ultimately it is a brave and costly option, and a never-ending one.
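The manual-review estimate above is simple arithmetic, and it can be worth rerunning with your own numbers. The sketch below is only an illustration: the function name and parameters are our own, and the inputs are the assumptions stated in the text (ten million documents, two thousand documents per person per day, a team of five).

```python
def first_pass_working_days(total_documents: int,
                            docs_per_person_per_day: int,
                            team_size: int) -> int:
    """Working days for a team to review every document once, rounded up.

    Illustrative helper only: it ignores QC, rework and any new
    documents that arrive while the review is under way.
    """
    daily_throughput = docs_per_person_per_day * team_size
    # Integer ceiling division, so a partial day counts as a full day.
    return -(-total_documents // daily_throughput)


# Figures taken from the example in the text.
days = first_pass_working_days(
    total_documents=10_000_000,
    docs_per_person_per_day=2_000,
    team_size=5,
)
print(days)  # 1000
```

Doubling the team only halves the figure, and none of this accounts for documents created while the review is in progress, which is why the manual route never really ends.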

Using the latest AI options, we are able to deal with this sort of volume in three to four months, including domain-expert QC through a “human in the loop”, and to deliver a fully integrated and tagged asset register that can be the foundation for the next steps in the digital transformation process.

There are game-changing opportunities available to companies that embrace digital technology, but any enterprise that seeks to leverage AI must first develop a robust data strategy that prioritizes quality and contextualized training data.

Make no mistake about it - good, actionable data is the difference between having AI/ML engines work smarter or making your IT staff work harder.
