Implementing Machine Learning in the Enterprise | Q&A with Ranjan Bhattacharya & Ed Lyons

This Q&A accompanies EQengineered’s Data Engineering playlist on YouTube.

1.    What is ML and why does an enterprise need it? 

Traditional analytics starts with a known formula and applies it to your data to get answers. Machine learning applies when you don’t have the formula you’d like but do have plenty of relevant data: it works out the formula from that data through many experiments.
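To make the contrast concrete, here is a minimal Python sketch. It is illustrative only: the function names, the advertising-spend figures, and the choice of a straight-line fit are all assumptions that do not come from the interview. Ordinary least squares stands in for the “many experiments”; real ML systems usually search for the formula iteratively and over far more data.

```python
# Traditional analytics: the formula is known in advance and applied to data.
def revenue(units_sold, unit_price):
    return units_sold * unit_price


# Machine learning: the formula is unknown, so it is estimated from examples.
def fit_line(xs, ys):
    """Estimate slope and intercept of y = slope * x + intercept by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical history: advertising spend (x) and units sold (y).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
units = [12.0, 14.0, 16.0, 18.0, 20.0]

slope, intercept = fit_line(ad_spend, units)

# Use the learned formula to predict units sold at a spend level not seen before.
predicted_units = slope * 6.0 + intercept
```

In this toy data the relationship is exactly linear, so the fitted line recovers it. With real data the learned formula is only an approximation and needs validation on data held back from the fit.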

Traditional analytics can answer questions like “What happened?”; for example, in the retail sector, which products are popular in certain regions. Machine-learning-based analytics can answer questions like “What can happen next?” For example, if I open a new store, which products are likely to sell best, based on its similarity to existing stores? For this analysis, an ML-based engine may also use external factors such as the weather or demographics.
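The new-store question can be sketched as a nearest-neighbor estimate. This is a hedged illustration, not a method described in the interview: the store names, the feature values (average temperature and median income), the sales figures, and the `forecast_sales` helper are all hypothetical.

```python
import math

# Hypothetical existing stores: external factors plus weekly unit sales per product.
# Features are (average temperature in Celsius, median income in thousands).
stores = {
    "north": {"features": (5.0, 60.0), "sales": {"coats": 90, "sandals": 10}},
    "coast": {"features": (25.0, 55.0), "sales": {"coats": 10, "sandals": 80}},
    "south": {"features": (28.0, 45.0), "sales": {"coats": 5, "sandals": 95}},
}


def forecast_sales(new_features, stores, k=2):
    """Average the sales of the k existing stores most similar to the new one."""
    nearest = sorted(
        stores.values(),
        key=lambda s: math.dist(new_features, s["features"]),
    )[:k]
    products = nearest[0]["sales"].keys()
    return {p: sum(s["sales"][p] for s in nearest) / len(nearest) for p in products}


# A planned store in a warm, mid-income area.
forecast = forecast_sales((26.0, 50.0), stores)

# Rank products by expected sales, highest first.
ranked = sorted(forecast, key=forecast.get, reverse=True)
```

The sketch compares raw feature values, which only works here because the two features have similar ranges. A real model would scale the features and be tested against stores whose sales are already known.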

Analytics and BI represent the foundational, traditional way to develop insights, reports, and dashboards. These tools have served their customers well for some time. However, as competitive pressure on organizations grows, executives increasingly need to ask more complex and challenging questions that these traditional tools cannot answer. Advanced analytics incorporating data science and ML can create a foundation for better decision making: it can extract insights from all kinds of data and allow users without advanced skills to interact with data and insights easily.

According to a 2021 AI survey from PwC, 86% of respondents say that AI will be a “mainstream technology” at their company in a couple of years.

As with any other investment, whether in new technologies, assets, or businesses, there is no guarantee that every AI investment will pay off. But without these investments, no organization can lay the foundations for future benefits like revenue growth, better decision making, and improved customer experience.

2.    How should companies go about implementing ML?

The key is to find the right business problems that traditional analytics is not addressing, and to ensure there is data from which to glean insights. Companies can often accelerate these initiatives by engaging external consulting services, which can help validate the problem and get the data ready for ML.

The Gartner data maturity curve is often used to describe an organization’s data and analytics maturity journey. The curve shows the stages of maturity in that journey and the capabilities the organization can achieve at each stage.

There is doubtless a considerable amount of work required to become a fully data-driven organization and it can appear overwhelming at the beginning.

The good news is that an organization does not need to reach a more mature stage of this journey before it can benefit from some of the more advanced analytics tools, such as those using ML. It is possible, and in fact advisable, for organizations to start exploring ML techniques and tools at even earlier stages of their journey. Some areas may benefit from these tools even if the rest of the organization is not ready.

To do this, organizations should approach the challenge in an agile manner: identify the critical areas that can benefit from AI, build iteratively, and learn from the exercise, instead of trying a big-bang approach.

The implementation can also be accelerated with help from capable consultancy services.

3.    What kinds of technologies are needed for data and ML work?

Traditional data and analytics (D&A), which is the foundation of ML, is based on on-premises data warehouse implementations. Although this foundation has served well for the past two decades, it is difficult to scale to handle the volume, variety, and velocity of a modern enterprise’s data.

In contrast, cloud-based D&A products offer more value and capabilities through new services, along with the simplicity and agility needed for data modernization. Cloud solutions can also handle the demands of new types of analytics, such as streaming analytics, specialized data stores, and more self-service-friendly tools to support end-to-end deployment.

Most importantly, almost any organization, irrespective of its size and budget, can stand up a cloud-based D&A platform without a lot of up-front investment. The cost of getting value from data and ML approaches is far lower than it once was, thanks to the efficiencies of scale in cloud-based offerings.

4.    What goes wrong in data and ML solutions? How can these risks be mitigated?

Major risks to data and ML solutions include lack of sponsorship, data in poor condition, and an inability to manage the data even when it is in good condition. Big-bang, large-firm approaches to ML, which cost a lot and offer no near-term ROI, should also be avoided. A catalyst-style approach (short duration, high impact) works better and helps create a data-driven culture.

Machine-learning-based decision making can introduce additional financial, operational, and reputational risks. To address such risks, companies should implement appropriate governance and controls that cover every stage of an AI-based analytics lifecycle. They should create frameworks and toolkits to continually assess current and planned AI models, making sure they are not only explainable and robust, but also fair and ethical.

Companies should also evaluate the skills their workforce needs and provide training to support an AI-based culture.

A short, focused data readiness catalyst engagement, typically 6 to 12 weeks long, can help organizations identify data readiness gaps and provide a roadmap for moving up the maturity ladder.

Mark Hewitt