“AI First” – going beyond first steps

Materially beneficial corporate deployments of AI are beginning to proliferate. Nonetheless, the AI activities of many organisations still amount to a small number of isolated pilots dotted around the organisation, conceived on an ad hoc basis. Any organisation that does not yet have a clear strategy for AI – and, given the fast-paced nature of progress in this domain, the means to regularly revisit it – may be running material business risks as other industry players move forward. However, whilst individual AI solutions can be transformative within the scope of their application, adopting AI is not yet for everyone an undertaking of pervasive, front-to-back change along the lines of, say, a digital transformation for a high street retailer. Developing an AI strategy requires an exercise of careful discrimination – acknowledging the present limitations of AI as well as its strengths to identify where one can vs. cannot (or should not) exploit it.

This article is about the “what” of an AI strategy rather than the equally important “how”. We look at the various business areas where AI solutions are having an impact, try to characterise the boundary of their applicability, and discuss some of the areas of research that may shift that boundary to bring more of corporate business and operating models within the scope of AI solutions.

I. Scoping today’s opportunity space

First some terminology. We prefer “machine learning” over “AI” for being less loaded with singularity overtones. Better still would be the simple term “machine prediction” since in truth what the machine learning community anthropomorphically calls “learning” generally means not much more than fitting a model to data. This is an important point for executives to understand: the diverse, oft-times remarkable (almost magical) achievements of machine learning (e.g., in image captioning, real-time language translation, poker playing, art and music generation, etc.) are all founded on the basic principle of probabilistic pattern recognition. No more, no less. Appreciating this helps in understanding where machine learning can and cannot yet be used.
Researchers are busy extending the performance and scope of individual machine learning techniques. For example, natural language processing is now at or close to human level performance for language translation across a range of language pairs, but generating a goal-directed end-to-end conversation with a human is for any but the simplest dialogue scenarios still very much a work in progress. Whilst it is important to keep abreast of these specific developments (the Electronic Frontier Foundation tracks progress against a wide range of benchmark tests), abstracting to higher level considerations can be useful.

Figure 1: Scoping the applicability of machine learning today

Figure 1 provides one such abstraction and considers two factors in particular: the cost of making a mistaken decision vs. how different future experience is likely to be from past experience. Rules-based systems operate in the darkest shaded region – where the future is expected to be so similar to the past that simple rules will hold true for most circumstances. Machine learning takes over where the rigidity of these rules breaks down, but can struggle if the future task environment differs too markedly from that upon which the algorithm has been trained. The vertical axis is important as well. It’s not the end of the world if peanuts are mistakenly ordered when a customer has requested cheese through a voice-controlled grocery app (unless they have a nut allergy). Mistakenly telling someone that they do not have a life-threatening medical condition when in fact they do is more material. As we move up the vertical axis, machine learning deployments must rely more and more on the ability to exception manage – typically to default to human judgement when the machine declares failure (with a suitably thresholded definition of “failure”). If exception management is critical, but not feasible (e.g., in some real-time, customer-facing, high risk decision scenarios) then deploying a machine learning solution may not be possible.
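The exception-management idea described above can be sketched in a few lines of Python. This is a minimal illustration only: the `route` function, the `Decision` record, the probabilities and the threshold values are all hypothetical, chosen to mirror the grocery and medical examples rather than taken from any real deployment.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    label: str         # the model's most probable answer
    confidence: float  # probability the model assigns to that answer
    handled_by: str    # "machine" if accepted, "human" if escalated


def route(probabilities: dict[str, float], threshold: float) -> Decision:
    """Accept the top prediction only if its probability meets the threshold;
    otherwise declare 'failure' and default to human judgement."""
    label, confidence = max(probabilities.items(), key=lambda kv: kv[1])
    handler = "machine" if confidence >= threshold else "human"
    return Decision(label, confidence, handler)


# Low cost of error (grocery order): a modest threshold is acceptable.
grocery = route({"cheese": 0.62, "peanuts": 0.38}, threshold=0.60)

# High cost of error (medical all-clear): a far more demanding threshold.
diagnosis = route({"no condition": 0.93, "condition": 0.07}, threshold=0.99)

print(grocery.handled_by)    # machine
print(diagnosis.handled_by)  # human
```

Moving up the vertical axis of Figure 1 corresponds, in this sketch, to raising the threshold, which sends a larger share of decisions to people. Where no human can be put in the loop in time, the escalation branch has nowhere to go.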


II. Identifying specific opportunities – the need for systematism

This scope of applicability is now so broad that corporates are exploiting machine learning in five different ways (see Figure 2):

Figure 2: Five areas where corporates are exploiting machine learning

1. Enhanced “core business” prediction
Predictive analytics has been used for years to support many aspects of business decision-making – particularly in Marketing and Risk management. Machine learning’s ability to accommodate vastly greater numbers of predictive variables and variable relationships, across structured and unstructured data (e.g., text, voice) alike, and to dynamically improve predictive power as new data is received means both that historical models may need upgrading and that predictive analytics is being applied in new business domains (e.g., evaluating the level of people risk faced by an organisation as a function of a multitude of internal behavioural practices, evaluating the level of compliance risk posed by specific client interactions, etc.). The intelligent enablement of (even formerly non-IP addressable) physical devices and equipment is bringing active end-to-end flow optimisation, fault prediction and root cause analysis to great swathes of heavy and light industry operations that hitherto have been operations management blind-spots. Business functions that historically may have had little ex-ante predictive analytics deployed for management purposes are now firmly within scope (e.g., Supply chain management, Finance and risk operations, Health and safety, etc.).

2. Automation
At least three types of automation solutions are being deployed, described below in loosely decreasing order of maturity:

  • Extractive. The automated extraction of structured information from unstructured sources (e.g., liability provisions extracted from supplier contracts in the context of M&A due diligence, automated searches for adverse media reports assessing new client risk, etc.)
  • Orchestrative. Automating processes or activities where simple rules-based approaches (which underpin most RPA solutions) break down. An example is automatically predicting the most appropriate clauses for a legal contract from a range of possibilities based on the requirements of the contract – determined by complex combinations of legal jurisdiction, level of desired risk, nature of the counterparties, and a range of other parameters
  • Generative. Generative models are being used to automatically “blank page” generate passable e-commerce product descriptions. They offer tremendous potential to excel in environments where creativity and stylisation meet structured constraint. Architecture is one example where generative design is beginning to have an impact. There will be many more.

3. New customer propositions
Machine learning is driving a wide range of proposition innovations. Some are redefining the parity requirements to compete. B2B examples include precision agriculture solutions to optimise the cost and effectiveness of pesticides (see, for example, John Deere’s acquisition of Blue River), intelligent receivables matching solutions for corporate banking clients (see, for example, Bank of America’s Intelligent Receivables solution), and data and technology providers building platforms for clients to make use of and/or develop bespoke machine learning solutions on top of their core service of data management and provisioning (for example, SAP’s Leonardo offering). B2C examples include Google’s Pixel ear-buds permitting near real-time face-to-face language translation and the Amazon Go machine vision-enabled no checkout shopping experience.

4. Commercialisation
Given the now much extended scope of applicability of machine learning, organisations are sometimes blissfully unaware of the value of the data they possess. Investors in early stage machine learning start-ups routinely value the ownership of or privileged access to training datasets much more highly than they do the start-up’s machine learning algorithms. Opportunities may exist to build new businesses in their own right founded on machine-learning driven competitive advantage (e.g., offering machine-learning enhanced in-house operations capabilities on an outsourced basis to third parties). New techniques such as representation learning are making it more straightforward to integrate a corporate’s data asset with that of relevant third parties to support predictive performance in new non-core (but potentially highly commercialisable) areas. As an absolute minimum, businesses must start to give consideration to IP ownership of third party (e.g., vendor) models trained on their data.

5. Disruptive models
The first four opportunity areas largely take the existing business model as a given and look for point improvement opportunities. Machine learning has the potential, however, to radically revise pre-existing business models and/or cost to compete. Ocado’s swarm robot warehouse automation approach – now fully implemented – is arguably one such example. Not only is order picking fully automated, but it is done in a fashion which both minimises the number of different types of components involved (so reducing the cost of maintenance and repair) and maximises cost efficiency by making full use of the 3D space of a warehouse and making as variable as possible the amount of that volume required at any given time. This and other innovations across its end-to-end order-to-fulfilment chain, which collectively are presumably producing market-leading cost and quality performance, are enabling Ocado to become the e-commerce fulfilment provider to other retailers in other countries. Another disruptive example is Premonition in legal analytics, which provides commissioning clients with performance prediction for specific lawyers in front of specific judges on specific case types – thereby bringing transparency to what has until now been a performance-opaque industry where brand has often had great weight.

As the examples given above demonstrate, the range of potential applications of machine learning is extremely broad. This raises the importance of having a systematic and comprehensive approach to identifying them. Structured reviews of the organisation’s operating model (see Figure 3), financial model, customer value proposition, etc. – undertaken by mixed teams of business practitioners and machine learning scientists, and employing approaches such as process walks, customer journey reviews, P&L driver deep-dives, etc. – are proving successful.

Figure 3: Systematically identifying machine learning opportunities


III. Evaluating feasibility – not-to-be-ignored practical considerations

How should any given opportunity be evaluated? Below are some of the criteria that we have found important. We ignore for now (somewhat heretically) the critical requirement of training data – revisited later.

  • Exception processing. As described above, if exception management is critical but not feasible then deploying a machine learning solution may not be possible.
  • Explicability. There has been much commentary about the difficulty of explaining the decision-making of deep learning models. It is worth noting that there are models (e.g., random forests) that better lend themselves to explicability and in many cases perform just as well as (and sometimes better than) deep learning models. The practical preferences of the machine learning community itself are also tending towards facilitating better explicability (e.g., through making use of newer regularisation approaches that explicitly weight fewer rather than more variables). If deep learning is still preferred, approaches exist that permit a degree of interpretability of a given decision by representing complex models with simpler models in the region of the specific decision criteria. Nonetheless, explicability – and the degree to which a lack of it can be mitigated – may be a key criterion where models must operate in highly regulated environments.
  • Data protection regulation. Opportunities may need to be evaluated, and potentially re-specified, for compliance with relevant data protection regulation. For example, GDPR describes prohibited circumstances and requirements for, amongst other things, automated decisioning and profiling.
  • Legal and reputational risk. Any biases in the training data will migrate and manifest themselves in the model. If the training data captures socio-economic characteristics and trends which, although real, are ethically unacceptable, the machine learning model will capture, reproduce and even reinforce them – potentially exposing the corporate to legal and reputational risk.
  • Make vs. buy. Vendor solutions may be favoured over in-house builds if organisational data science / machine learning maturity is low. The machine learning vendor space is, however, still nascent and oft-times fragmented. Determining whether there are robust vendor solutions that will work in a given organisational context is key. Properly measuring prediction accuracy, evaluating the robustness of the underlying algorithmic approach to a range of realistic future use cases, and determining the degree of correspondence between the vendor data model and that of the organisation are examples of evaluation criteria that can be important.
  • Organisational change. Determining whether the organisation can accommodate the requisite business change (e.g., addressing potential HR considerations related to role redefinition; ensuring adoption; unifying pet projects and approaches across different divisions, etc.) is for some a critical criterion and often overlooked.

IV. Machine learning today vs. tomorrow

The vast array of different machine learning techniques has justifiably been described as a “zoo”. Research is similarly proceeding in a multitude of directions, all of which can be fascinating to the enthusiast. Nonetheless, two seemingly benign areas of development are most increasing the scope of business applicability of machine learning in the short-term:

1. Growing adoption of Bayesian learning approaches
As described, machine learning algorithms don’t always get it right and in many deployment scenarios don’t have to – provided they are able to give a probabilistic measure of their confidence in their output so that exception management can be appealed to. Classical models (representing the vast majority of historical corporate deployments of predictive analytics) can’t do this.
Furthermore, classical models typically only provide the most likely (maximum likelihood) answer. This is not good enough in many deployment scenarios. Setting optimal levels of supermarket inventory requires an ability to trade off the costs of perishability against the opportunity cost of a missed sale. When the utility of getting a decision right is different from the cost of getting it wrong, classical models can be inadequate.
The growing sophistication and adoption of Bayesian methods – which yield probability distributions over possible outputs to address these classical limitations – is proving key to wider corporate deployment.
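The supermarket inventory example can be made concrete with a short sketch. It uses the textbook "newsvendor" rule, which is our choice of illustration rather than a method named in this article: stock up to the demand quantile given by margin / (margin + loss on an unsold unit). The normal demand distribution and all numbers are hypothetical, standing in for whatever predictive distribution a model would supply.

```python
from statistics import NormalDist


def optimal_stock(demand: NormalDist, margin_per_sale: float,
                  loss_per_unsold: float) -> float:
    """Newsvendor rule: stock to the quantile of predicted demand at which the
    expected cost of one more missed sale equals that of one more wasted unit."""
    critical_ratio = margin_per_sale / (margin_per_sale + loss_per_unsold)
    return demand.inv_cdf(critical_ratio)


# Hypothetical predictive distribution for tomorrow's demand (units).
demand = NormalDist(mu=100, sigma=20)

# A point forecast alone would suggest stocking the most likely value.
point_forecast = demand.mean

# Highly perishable item: waste costs three times the margin on a sale.
perishable = optimal_stock(demand, margin_per_sale=1.0, loss_per_unsold=3.0)

# Long-life item: a missed sale costs three times the loss on an unsold unit.
durable = optimal_stock(demand, margin_per_sale=3.0, loss_per_unsold=1.0)

print(round(point_forecast), round(perishable, 1), round(durable, 1))
```

The same forecast leads to different stocking decisions once the asymmetry of costs is taken into account, and neither decision can be derived from the single most likely value. It is the full distribution over outcomes that makes the trade-off computable.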

2. Reducing the volume of training data required
A variety of methods are effectively reducing the volume of labelled training data required. The ability to train models in simulations of real-world environments has become a critical requirement for models operating at the extreme end of complexity (e.g., for training self-driving cars, anomaly detection using “digital twins” of IoT-enabled physical assets, DeepMind’s optimisation of Google data centre energy consumption, etc.). As the ability to simulate real-world environments improves (perhaps eventually to realistically simulate real human to human interactions) so too will our ability to train models with ever more modest demands on training data.
An added benefit of the Bayesian approach described above is that it has an in-built ability to incorporate existing domain knowledge into models, which can result in improved performance and shorter training times whenever such knowledge exists. Another important development is in transfer learning, which takes several guises. Simultaneously training models on a range of tasks can result in faster training on any one of them. Transferring pre-trained models from one related environment to another can require considerably less fine-tuning to the new environment than training from scratch.
These developments and others are alleviating the burden of accessing high volumes of training data. Nonetheless, the most valuable experience a corporate can have in this respect is simply that of having already implemented a trained model and having captured the benefits – thereby making it easier to jump the belief-hurdle required to invest in building further training data assets in advance of benefits realisation.
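The point above about Bayesian methods incorporating existing domain knowledge can be illustrated with the simplest possible case: estimating a fault rate from a handful of observations. The sketch below uses a standard Beta-Binomial update. The function name, the prior parameters and the observation counts are hypothetical and chosen only for illustration.

```python
def posterior_mean(prior_a: float, prior_b: float,
                   faults: int, non_faults: int) -> float:
    """Mean of the Beta posterior for a fault rate, given a Beta(prior_a, prior_b)
    prior and observed counts of faults and non-faults."""
    a = prior_a + faults
    b = prior_b + non_faults
    return a / (a + b)


# Only ten observations are available: 3 faults, 7 non-faults.
faults, non_faults = 3, 7

# No domain knowledge: a flat Beta(1, 1) prior.
flat = posterior_mean(1, 1, faults, non_faults)

# Engineers believe the fault rate is around 5%, with the weight of
# roughly 40 past observations: a Beta(2, 38) prior.
informed = posterior_mean(2, 38, faults, non_faults)

print(round(flat, 3), round(informed, 3))  # 0.333 0.1
```

With so little data, the flat prior is pulled all the way to the small-sample frequency, whereas the informed prior tempers it with what is already known. As more observations arrive, the two estimates converge, which is why prior knowledge matters most precisely where training data is scarce.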

In summary, corporates in all industries must now craft an AI Strategy. In doing so, a judicious approach will yield the highest return with the fewest adverse surprises. In fact, a perfect illustration of the limitations of AI today is, appositely, the nature of the exercise itself of identifying and assessing potential AI opportunities. Doing so requires informed and structured imaginative thinking along with the application of experience-based judgement – both of which will be the preserve of humans for some time to come.



Tariq Khatri, Co-Founder and Managing Director of machinable



This article was first published in D/sruption Magazine.