Policy Brief
Measuring the Success of AI Strategies. Using Indicators and Benchmarks to Successfully Steer the Implementation of the Strategy
Authors
Dr. Stefan Heumann with Nicolas Zahn
Published by
Interface
September 26, 2018
Artificial Intelligence (AI) is considered to be the key technology of the 21st century. Many countries have adopted national AI strategies in order to take advantage of the opportunities of AI and to address important challenges. The German government has followed suit and formally adopted and published its own national strategy in November 2018.¹ Like many other AI strategies, the German strategy takes a comprehensive approach, covering the implications of AI for research, transfer between research and business development, employment, education, and regulation, to name just a few of the most important issue areas. However, the strategy has been criticized for not defining clear and measurable objectives. The lack of concrete goals and clear indicators of success is symptomatic of many strategy papers and announcements in German digital policy. Clear goal definitions are missing, as are policies to monitor progress and measure success. Politicians and citizens are therefore often left wondering what precisely we are trying to achieve, when it will be achieved, and whether we are making progress. The government’s AI strategy provides the opportunity to do better this time. The advantages are clear. Defining indicators requires the development of cross-departmental goals, which lays the foundation for tracking progress and thus creates the conditions for an effective implementation of the strategy.
A broad debate about AI is urgently needed, due to the broad and imprecise use of the term in public discourse. Even in political discourse, it is generally unclear what is meant by “AI”. When defining certain goals, for example the increase of AI-associated professorships in German higher education, it is important to understand when a professor’s work can be characterized as rooted in AI and when this is not the case. The same holds for research funding or increasing the number of AI-driven startups. However, it is not only the clear definition of goals that necessarily leads to a deeper engagement with the concept of AI and the technologies that this term covers. The question of how to define achievement indicators and measure progress is also important. Discussing the definition of AI also puts us in a position to better understand what a strong AI ecosystem is and how we can empirically measure its current state as well as track its further development.
Such an examination of goals, benchmarks and indicators must always be critical in nature. Meeting benchmarks and scoring high on certain indicators should never be an end in itself. We should rather continuously question whether the indicators really measure what we want measured, and whether potential flaws in our indicators, data sources or analysis could be distorting the picture. For some questions, it may be difficult or even impossible to verify progress and goal attainment through easily observable indicators. Flaws in data sources and the limits of our analytical methods must be recognised and openly discussed – a discussion that we seek to stimulate through this paper. That said, these limits are no justification for avoiding such a debate. The benefits clearly outweigh the problems. Benchmarks and indicators enable agile political governance that is based on the definition and measurement of progress and success.
We distinguish between policy measures and success criteria, which we treat as input indicators and output indicators respectively. Before one can develop these, the overarching goals need to be defined. One such goal could, for example, be the establishment of an AI ecosystem. To derive input and output indicators from that goal, one must clearly define what an AI ecosystem really is, what its distinguishing dimensions are, and how one might foster them and measure their development.
| | Input Indicators | Output Indicators |
|---|---|---|
| Quantitative | Amount of funding | Number of AI Patents |
| Qualitative | Agile research funding | AI Quality Standards |

Table 1: Indicator matrix with examples, source: Stiftung Neue Verantwortung
In the context of an AI strategy, input indicators are therefore all policy measures to strengthen the AI ecosystem. One can then differentiate between quantitative and qualitative input indicators. Quantitative, and therefore easily observable, input indicators include, for example, the budget that should be allocated to research funding or new investment funds. A qualitative input indicator would be a political measure, for example new regulations for the allocation of research funding that reduce expense and bureaucracy and boost competition. Output indicators relate to the achievement of goals. The core issue is whether the measures will lead to the result that has been established as the goal. However, output indicators do not have to correspond one-to-one to input indicators. One can also use them to assess different dimensions of an AI ecosystem, even those not directly addressed by policy, such as the number of startups that are founded. One can again distinguish here between quantitative and qualitative dimensions.
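The two-by-two classification in Table 1 can also be written down as a small data structure, which may help when collecting candidate indicators for discussion. The following Python sketch is purely illustrative: the names `Role`, `Kind`, `Indicator` and `build_matrix` are our own assumptions and do not come from any strategy document or index; only the four example indicators are taken from Table 1.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    INPUT = "input"    # policy measures meant to strengthen the ecosystem
    OUTPUT = "output"  # evidence of whether goals are being achieved


class Kind(Enum):
    QUANTITATIVE = "quantitative"
    QUALITATIVE = "qualitative"


@dataclass(frozen=True)
class Indicator:
    """A single indicator with its position in the matrix (hypothetical type)."""
    name: str
    role: Role
    kind: Kind


# The four example indicators from Table 1.
INDICATORS = [
    Indicator("Amount of funding", Role.INPUT, Kind.QUANTITATIVE),
    Indicator("Number of AI patents", Role.OUTPUT, Kind.QUANTITATIVE),
    Indicator("Agile research funding", Role.INPUT, Kind.QUALITATIVE),
    Indicator("AI quality standards", Role.OUTPUT, Kind.QUALITATIVE),
]


def build_matrix(indicators):
    """Group indicator names by their (kind, role) cell of the 2x2 matrix."""
    matrix = {(kind, role): [] for kind in Kind for role in Role}
    for indicator in indicators:
        matrix[(indicator.kind, indicator.role)].append(indicator.name)
    return matrix


if __name__ == "__main__":
    for (kind, role), names in build_matrix(INDICATORS).items():
        print(f"{kind.value} / {role.value}: {', '.join(names)}")
```

Because every cell is created up front, an empty cell stays visible as an empty list, which makes it easy to spot dimensions for which no indicator has been proposed yet.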
With this paper, we would like to stimulate a discussion about input and output indicators related to national AI strategies. To generate ideas for the development of such indicators, we examined whether and how already-published AI strategies define their goals and measures to validate the achievement of those goals. The national AI strategies provide some good approaches and ideas, but lack an in-depth and systematic engagement with indicators and benchmarks. In a further chapter, we examine the methodologies of existing AI indices. In both cases, we were concerned with working through core questions and providing an initial overview. We would like to caution the reader that this is not a comprehensive study. But we hope to stimulate further discussion and research with this paper.
The majority of indices and reports we examined suffer from significant methodological weaknesses. The reports have generally been received uncritically by the media and the public. Therefore, we also want to spark a critical debate around AI reports and benchmarks. That said, we do not wish to call into question the general importance and utility of these reports. So, in the third chapter, we set out our own ideas for the development of an empirical foundation for an AI strategy. Our approach hinges on a dynamic interaction with AI trend monitoring, as a means of providing the basis for continuing engagement with, and further development of, indicators. With this paper, we hope to contribute to the discussion about how to define goals for national AI strategies and about how to measure them.