Data: The Ultimate Prize or a Starting Point?

Imagine for a moment: it’s the mid-1800s. You have just spotted some black, viscous liquid seeping out of the ground. It doesn’t look impressive, but when you strike a match and light it, it burns with a warm white flame. You immediately know what you have just discovered: black gold.

Let’s say you’re a cheeky entrepreneur; what would you do? Let’s look at two scenarios.

The first scenario is that you draw the liquid out by the barrel load and sell it as something that can be used as an alternative to firewood. It is the most obvious answer.

In the second scenario, however, you wonder whether the liquid can be used for something greater than household fuel. You want to explore other uses for the black liquid that could perhaps change the world for the better.

So, you find ways to refine the liquid into better products that can be used for different purposes. One of them could be refining it to replace animal fat in oil lamps, which would revolutionize household lighting in pre-electricity days. Another, more advanced use is to refine it further so it can replace coal to power engines as a more efficient and safer fuel. You find ways to create fibers that can replace traditional cotton or linen in textile manufacturing. You even find a way to derive a highly malleable material, plastic, that can be used in anything from aircraft interiors to food storage containers or the buttons on your shirt.

The point of this short illustration is simple: discovering oil means nothing unless you explore, experiment, and learn how to engineer it in better, more efficient ways.

The same is true for data.

The amount of data being captured is staggering, with global estimates now measured in zettabytes (one zettabyte equals a trillion gigabytes). Businesses collect vast amounts of it from users through various channels, and they are clamoring to collect even more so they can derive insights about their customers and produce better products and services for them.

Data, however, is a mixed bag. Valuable insights reside alongside irrelevant or even misleading information. The core problem lies in the very nature of data. It can be messy and biased. “Crappy data in, crappy insights out” becomes a critical truth.

Just as transforming oil into usable products requires significant engineering, data demands a similar approach. Businesses that deal with data in such vast quantities today need a multi-disciplinary team that combines expertise in statistics, mathematics, and data science with business acumen. The team must continuously refine its capabilities to keep the data clean and unbiased.


The rise of Large Language Models (LLMs) might seem to diminish the role of data engineers. However, LLMs are only as effective as the data they’re trained on. In this sense, data today resembles plastic: a potential resource that can become a pollutant if not managed effectively. Biased data leads to biased AI outputs, and the effect can cascade, because biased data used in one system can propagate through interconnected systems. The real value lies not in collecting data but in the engineering expertise that transforms it into a powerful asset. In short, organizations must avoid becoming “data junkyards”: organizations that collect data without the capability to translate it into value.

In the age of information overload, a critical shift in perspective is needed. The true goal is not collecting the data but empowering individuals and organizations to make better, faster, and more impactful decisions using data.

So, how can businesses transform their data to fuel their decisions?

Analytical rigor remains paramount. While advancements like prompt engineering, knowledge graphs, and question networks are valuable tools, the ability to ask the right question remains a human strength. It’s a skill honed over time and a mark of true intelligence.

Just like an oil baron in the 19th century found ways to refine crude oil further and further to make it more efficient, data engineers of the 21st century must find ways to make their data more efficient. Great engineers find ways to reduce the cost and the time of refining data into a meaningful and compelling product that can drive decisions.

The key question in the mid-1800s was: “How can we build a more efficient refinery?” The refinery was the key to unlocking value. In the same light, today’s key question is: “How do I build a better data exploration ecosystem?”

By engineering data effectively, organizations lower the cost per question in both time and resources. The ability to answer more questions, in turn, translates to greater preparedness for uncertainty. In this environment, optionality becomes the key to success, and generating options demands a constant flow of questions. Ultimately, it comes down to how clear the business is about its goals and objectives. Clarity is crucial for asking the right questions. Without it, organizations can stall.

The true potential of oil wasn’t realized until it was refined and utilized in new ways. The true value of data lies in its ability to inform better decision-making. By embracing a culture of curiosity and employing the right tools to unlock its potential, organizations can transform data from a raw resource into the fuel that propels them toward a future of innovation and success. The future belongs to those who can not only collect data but leverage its power to ask the right questions and make the best choices.

About the Authors:

Manaswitha Rao is a Business Unit Head who partners with clients in the retail and energy sectors. Todd Wandtke is the Head of Marketing and Customer Success.

