The pitfalls of impression-based technology policy analysis

In late 2022, OpenAI released a Transformer-based large language model (LLM) called ‘ChatGPT’. Against the expectations of OpenAI’s own staff, ChatGPT became the fastest-growing web-based app in history at the time, reaching 100 million active users in two months (a record since surpassed by Meta’s Threads). The first public impressions of ChatGPT carried both sublime qualities and ominous portents. In February 2023, Henry Kissinger, Eric Schmidt, and Daniel Huttenlocher wrote that generative artificial intelligence (AI) heralds an intellectual revolution akin to the one initiated by the printing press, this time consolidating and “distilling” the storehouse of human knowledge. In March 2023, Eliezer Yudkowsky, foreseeing extinction-level risks, implored the world’s governments and militaries to halt the AI project and “be prepared to destroy a rogue data center by airstrikes.”

These first impressions represent two ends of a spectrum, but the reasoning that occupies the space between them is common in technology policy analysis: personal impressions of generative AI seep into the background assumptions from which policy analyses are conducted. When assumptions of fundamental importance go unchallenged, it is all too easy to fall into the trap of extrapolating from current technological conditions to future technological marvels. Technology policy analysts of all stripes are doing excellent work, but it is time to identify the gaps in our reasoning and aim higher, individually and collectively.

An example illustrates the general trend. Paul Scharre of the Center for a New American Security, in his book ‘Four Battlegrounds’ – which contains a wealth of insights overall – sketches a future for AI in which ‘building bigger, more diverse datasets results in more robust models. Multimodal datasets can help build models that can associate concepts represented in multiple formats, such as text, images, video and audio.’ This expectation rests on the idea that scaling AI systems (increasing their internal capacity and training datasets) will yield new capabilities, with approving reference to Richard Sutton’s famous argument in “The Bitter Lesson” about the benefits of such techniques.

Not long after, Microsoft researchers helped set the tone for a wave of overly optimistic claims about the future of LLMs with their provocatively titled “Sparks of Artificial General Intelligence” paper on GPT-4. It is not hard to see how one’s personal impression of GPT-4 could produce the feeling that “we’re on the cusp of something big here.” Yet that is no justification for allowing the assumptions bound up with this sentiment to permeate one’s analyses.

Extensive research points to the limitations of LLMs and other Transformer-based systems. Hallucinations (authoritative-sounding but factually incorrect statements) continue to plague LLMs, with some researchers arguing that they are innate features of the technology. According to a recent study, voters who use chatbots for basic information about the 2024 election could easily be misinformed by hallucinated polling places and other false or outdated information. Other research shows that LLMs’ ability to form abstractions and generalize from them lags behind that of humans; the reasoning ability of multimodal systems is a similar story. OpenAI’s latest development – the text-to-video generator “Sora” – while remarkable in its realism, invents objects and people from scratch and does not obey the laws of physics.

So much for the idea that new modalities such as images and video would lead to the reliable, robust and explainable AI systems we desire.

None of this is to say that the technology world produces nothing but hype. Carnegie’s Matt O’Shaughnessy rightly notes that talk of “superintelligence” is likely to have a negative impact on policymaking, given the fundamental limitations of machine learning. Furthermore, the Biden administration’s sweeping October 2023 executive order on AI, while dramatically invoking the Defense Production Act to authorize monitoring of certain computationally powerful AI systems, was more measured in tone than one might expect.

Yet the problem identified here is not necessarily one of hype. Hype is one result of getting stuck in analytical frameworks whose assumptions are too easily left unexamined in favor of quick publication and individual or organizational self-promotion. Lest we mistakenly believe this is just a curious LLM-specific trend, the disappointing performance of AI-enabled and autonomous drones on the battlefield in Ukraine should raise eyebrows about the supposed pace of fundamental breakthroughs anticipated in 2023. Moreover, nuance is easier to find in the field of quantum information science, yet little individual or collective reflection seems to follow as quantum computing, the field’s crown jewel, sees its prospects scaled back.

Nevertheless, generative AI today is starting to look like a parody of Mao’s Continuous Revolution: the transformation of this technology into a human-like ‘general’ intelligence, or some other marvel of technological imagination, is always one model upgrade away and cannot be allowed to succumb to challenges from regulators or popular movements.

The conclusion is that policy analysts make choices when assessing technology. Choosing certain assumptions over others gives the analyst a particular set of possible policy options at the expense of others. That individuals form first impressions of new technologies is inevitable and can be a source of healthy diversity of opinion. The problem for policy analysis arises when practitioners fail to pour their first (or second, or third) impressions into a shared crucible that exposes unstable ideas to fierce intellectual criticism, guiding them toward an articulation of specific policy challenges and solutions without unnecessarily foreclosing other options.

Policy analysis is generally a mixture of ingredients drawn from industry, domestic politics and international affairs. Even the recognition that a policy challenge exists stems not from the new technology itself but from an intuitive connection between a society’s needs and values and the expected or actual impact of developments within its borders or abroad. That intuition – we all have it – should be the focus of our honest and shared inquiry.

Vincent J. Carchidi is a non-resident scholar at the Middle East Institute’s Strategic Technologies and Cyber Security Program. He is also a member of Foreign Policy for America’s NextGen Initiative 2024 Cohort.
