Brave new world, Part 2: Defensible enterprise strategy for the use of AI and automation

May 2 2019
by Paige Bartley, Jeremy Korn


In Part 1 of this series, we discussed the regulatory and legal landscape around automated decision-making, which currently serves as a set of proxy rules for the use of machine learning (ML) and AI technologies. What we know now leaves some ambiguity, and automation that has a significant effect on living individuals, which is what many regulations such as GDPR restrict, will likely land many businesses in court over the coming years. Organizations that want to avoid both the wrath of regulatory bodies and litigation need to create defensible strategies around the use of these technologies.

Despite the risks, automation will be critical to maintaining competitive advantage in a world where data is increasingly leveraged to drive every strategic business decision. Organizations that are developing an ML, AI or automation strategy today must be aware of the current requirements and anticipate how new requirements may develop as they move forward. Under these circumstances, defensibility and documentation of organizational actions and technology are critical. In this second half of the report, we outline basic best practices for implementing automation-related technologies, helping organizations identify the technology tools and strategy needed to innovate and move forward while minimizing risk.

The 451 Take

Organizations cannot simply shy away from automated technologies such as ML and AI if they expect to maintain competitive advantage. They need a cautious, but systematic, approach to the implementation of automation. The distinction between consumer-facing applications of automation and internal enterprise use of automation is a key factor. The enterprise inherently assumes more regulatory and legal risk when implementing AI and ML that has a direct material impact on consumers or data subjects. On the flip side, these same technologies will be critical to internally managing data and ensuring compliance at scale. In this sense, a two-pronged approach is warranted: aggressive use of automation in internal data management initiatives, and more conservative use when it has direct effects on customers or data subjects. Defensibility via documentation and transparency of enterprise initiatives around consumer-facing (or data-subject-facing) use cases will be necessary in these higher-risk scenarios. To serve this need, many tools are emerging to support explainability and fair use of ML and AI technology. Automation in enterprise platforms and products also continues to evolve to serve escalating data management needs.

Enterprise automation strategy: a two-pronged approach

Avoiding the use of AI and ML is not an option for an organization that wants to remain competitively viable, but there is strong incentive to implement these technologies in scenarios that represent the least risk and highest potential gain. Because existing regulations generally focus on restricting automated decisions that have significant effects on living individuals, the enterprise can then focus on use cases where this is not the case, minimizing risk.

Broadly speaking, automated decision strategy can be thought of as having two tracks, 'internal' and 'external,' with the latter being riskier.

The distinction isn't perfect, since employees are internal to the company while still often having similar privacy and data protection rights to external consumers. But if we think of consumers as the largest cohort being protected by data privacy and data protection regulations, then the internal/external distinction makes it simple to categorize many common use cases for AI and ML. Chatbots on a website? External. Automated enterprise cyber-threat detection and remediation? Internal.

External use cases for automated decision-making are largely what regulations are looking to restrict, as they have the potential to negatively impact consumers (and other living individuals). However, it will be the internal use cases that have immense potential in the evolving regulatory climate, and may well be necessary for ensuring compliance with the regulations themselves. In data management in particular, there is a trend toward embedded product functionality driven by ML and AI that ultimately helps the organization find, sort and govern data at modern scale. Automated data detection, automated policy execution for data, automated masking, and automated tagging all help control sensitive/personal data. Simply fulfilling a data subject access request, such as for GDPR or CCPA, likely requires a certain degree of automation to execute perfectly.
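As a rough illustration of the automated detection, tagging and masking described above, the following is a minimal sketch rather than a description of any vendor's product. The two patterns and the function names tag_record and mask_record are hypothetical, and real-world detection of personal data requires far broader coverage than two regular expressions.

```python
import re

# Hypothetical patterns for illustration only; production detection of
# personal data needs many more data types, formats and validation steps.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def tag_record(text):
    """Return the sorted names of the personal-data types found in the text."""
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(text))


def mask_record(text):
    """Replace each detected value with a placeholder naming its type."""
    for name, pattern in PATTERNS.items():
        text = pattern.sub(f"[{name.upper()}]", text)
    return text


record = "Contact Jane at jane.doe@example.com or 555-123-4567."
print(tag_record(record))   # tags that a policy engine could act on
print(mask_record(record))  # masked copy that is safer to share internally
```

In this sketch the tags could drive policy execution (for example, routing tagged records to restricted storage), while the masked copy supports analytics without exposing the underlying values.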

Crafting defensible practices

As a rule of thumb, if automated decision-making (or AI, or ML) is used in a way that increases the likelihood of compliance with a regulation, then there is probably an implied regulatory green light to use it. Because of this, data management use cases are low-hanging fruit: they can not only increase the likelihood of compliance, but also make it easier to ultimately gain insight from data within the organization. As regulations evolve, we will likely see a lot of innovation in ML and AI uses within data management, and most data management platforms have already begun to build in automated functionality. Using automation to increase control over the data held by the enterprise is highly defensible, and can be explained to regulators as a strategy for achieving compliance.

This doesn't mean that the external automated use cases discussed earlier are out of the question. Competitively speaking, it will be important to find consumer- and customer-facing uses of automated decisions and technology that are appropriate under the various regulatory requirements. But it does mean that in planning to do this, thorough risk assessments must be conducted, legal advice must be sought, and decisions and rationale must be completely documented along the way. Compliance with requirements means little if it can't be clearly demonstrated to regulators, and thorough documentation allows organizations to show good faith in their actions, whether in the likely case that perfection is not achieved or in the less likely situation that litigation occurs.

So, while regulations currently have little to say specifically about AI or ML, and provide only general rules around automated decision-making, there is a strong theme of transparency and explainability emerging. Transparency of human process, if models are being built, is one aspect. Explainability of the algorithms and automated decisions themselves is another.

An evolving technology market to meet these needs

There is a robust market for ML- and AI-related technology evolving, and many of these offerings are pointedly addressing the concerns and needs driven by the evolving regulatory landscape. While technology tools are never the complete 'solution' to achieving compliance (people and process need to be addressed as well), they can augment enterprise efforts toward a more defensible approach. In most cases, these products use automation to better control data and its access, as well as provide mechanisms for explainability.

A number of smaller companies are eager to address the issue of explainability. For example, Waterloo, Canada-based startup DarwinAI uses a proprietary technique called Generative Synthesis that employs AI itself to build better deep neural networks. A byproduct of this method is explainability – to make a network more efficient, you must first understand how it operates. DarwinAI's Explainability Module presents users with a layer-by-layer schematic of their neural network so they can see the impact each part of the network is having on a given decision.

On the other end of the spectrum, the large providers of AI and ML services recognize that explainability is a growing concern among customers, and thus represents both an important business opportunity and an existential threat to their offerings. For example, Google lists 'responsibility' as one of the four cores of its cloud AI offerings, with 'transparency' contributing to this goal. Although the company does not currently offer any explainability features, its TensorFlow Extended (TFX) platform allows users to evaluate model performance for instances of bias or decay.

Both incumbent vendors and newer startups are addressing the need for model explainability, which is driven by regulatory requirements as well as by the internal enterprise need to scale and see repeatable, consistent results with models. IBM's OpenScale offering was born out of IBM Research, and offers tools designed for business users to identify bias and isolate variables, adjusting for model fairness when necessary for legal or regulatory requirements. Fiddler.ai is a much newer and smaller vendor in the space (about 10 employees). It offers a platform-agnostic approach, and touts an explainable AI engine that allows non-technical users to drill down and examine the effects of variables on a model, with the option to generate reports that support explainability.
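To make the idea of examining the effects of variables on a model concrete, the sketch below resets one variable at a time to a baseline value and records how the model's output changes. This is a generic, simplified technique and is not a description of how OpenScale, Fiddler.ai or any other product works; the scoring function, its weights, the variable names and the baseline values are all hypothetical.

```python
def score(applicant):
    """Hypothetical scoring model with made-up weights, for illustration only."""
    return (
        0.5 * applicant["income"] / 1000
        - 2.0 * applicant["missed_payments"]
        + 0.1 * applicant["account_age_months"]
    )


def variable_effects(model, instance, baseline):
    """Change in the model's output attributable to each variable.

    Each variable is reset to its baseline value in turn while the others
    are held fixed; the effect is the original score minus the altered score.
    """
    full_score = model(instance)
    effects = {}
    for name in instance:
        altered = dict(instance)
        altered[name] = baseline[name]
        effects[name] = full_score - model(altered)
    return effects


# Hypothetical applicant and reference ("baseline") profile.
applicant = {"income": 40000, "missed_payments": 3, "account_age_months": 24}
baseline = {"income": 50000, "missed_payments": 0, "account_age_months": 60}

effects = variable_effects(score, applicant, baseline)
for name, effect in sorted(effects.items(), key=lambda item: item[1]):
    print(f"{name}: {effect:+.1f}")
```

A per-variable breakdown of this kind is the sort of output that could be attached to a report supporting a specific automated decision. One-at-a-time comparisons ignore interactions between variables, so they should be read as a first approximation.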

Another approach in this arena is governance of the end-to-end model development and deployment workflow, ensuring that the right people have access to the right data and are using it for the right purposes based on granular usage policies. Immuta, a 451 Research Firestarter awardee for Q1 2019, is a contender in this space. It allows organizations to fine-tune access control and data usage policies that support governance efforts in the model-building process, tapping into an approach that 451 Research has previously defined as DataOps or, more specifically, MLOps.

Where the market is headed

As organizations look to operationalize their ML and AI efforts and deploy models at scale for mission-critical purposes and decisions, explainability will only continue to become more important. This is not just a regulatory compliance need, but also a requirement for creating repeatable results and workflows.

In 451 Research's Artificial Intelligence/Machine Learning 2018 H2 survey, these needs were reflected in the proportion of respondents that reported 'deploying results in operational systems' as their primary barrier to using ML within their organization. While not the overall top challenge facing organizations, the 13% of enterprise respondents reporting this as their main barrier to machine-learning adoption demonstrates that ML results and subsequent automated decisions must be highly trusted to be used in operational scenarios. Explainability of models will be a major part of that trust.

Figure 1
Barriers to using machine learning within an organization

Given these enterprise needs, we can expect the ecosystem of AI and ML tools on the market to continue to add more mechanisms for explainability of models and controls for bias. Fundamentally speaking, a fair model that minimizes bias is typically at odds with a perfectly accurate model. Because models are trained on human-derived data sets, they tend to reflect the inherent human bias in those data sets, often surfacing bias that wasn't detectable in the original training set. As our regulations and case law become more sophisticated in creating rules for fair use of these technologies, software providers will need to include more mechanisms for tuning models based on careful adjustment of variables.
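The tension between fairness and accuracy can be shown with a toy calculation. The sketch below assumes one particular fairness definition, the gap in positive-decision rates between groups (often called demographic parity), which is only one of several definitions in use. The data is invented for illustration, and the function names are hypothetical.

```python
from collections import defaultdict


def selection_rates(predictions, groups):
    """Share of positive (1) predictions for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for prediction, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += prediction
    return {group: positives[group] / totals[group] for group in totals}


def parity_gap(predictions, groups):
    """Difference between the highest and lowest group selection rates."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


def accuracy(predictions, labels):
    """Fraction of predictions that match the historical labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Invented data: historical labels favor group "a" over group "b".
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
labels = [1, 1, 1, 0, 1, 0, 0, 0]

# A model that reproduces the historical labels exactly.
faithful = list(labels)
# A model adjusted so that both groups receive positive decisions equally often.
adjusted = [1, 1, 0, 0, 1, 1, 0, 0]

for name, predictions in [("faithful", faithful), ("adjusted", adjusted)]:
    print(name, accuracy(predictions, labels), parity_gap(predictions, groups))
```

In this toy case the model that matches the historical labels is perfectly accurate but carries the full gap between groups, while the adjusted model closes the gap at the cost of accuracy measured against those same labels.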

The ability to tune models to minimize bias and ensure explainability will be table stakes for the operationalization of ML and AI within organizations. Regulatory compliance and legal defensibility are just two aspects. The ability to scale out ML and AI efforts also depends on knowing how models operate, and having transparency into processes so they can be repeated. As the market for ML and AI tooling evolves, we can expect to see more embedded features that support this level of understanding.