China's quest for AI superpower status hindered by censorship challenges
China's quest to become a world-leading AI superpower is coming up against its own censorship regime, with the Chinese Communist Party's (CCP) control of information potentially hindering the development and rollout of large language model (LLM) chatbots.
The CCP prioritizes controlling the information space over innovation and creativity, human or otherwise, which could dramatically impede the development and rollout of LLMs, leaving China lagging behind in the race to become the world's dominant AI player.
Recently, Chinese regulators instructed key tech companies not to offer ChatGPT services "amid growing alarm in Beijing over the AI-powered chatbot's uncensored replies to user queries". According to state-sponsored newspaper China Daily, such chatbots "could provide a helping hand to the U.S. government in its spread of disinformation and its manipulation of global narratives for its own geopolitical interests."
The fundamental problem is that plenty of speech is forbidden in China, and the penalties for straying over the line are harsh. A chatbot that produces racist content or threatens to stalk a user is an embarrassment in the United States, but a chatbot that implies Taiwan is an independent country or says Tiananmen Square was a massacre could bring down the wrath of the CCP on its parent company.
The challenge for Chinese engineers is to ensure that LLMs never say anything disparaging about the CCP, a genuinely herculean and perhaps impossible task. LLM output is unpredictable, and the models learn from natural language produced by humans, which is subject to interpretation, bias, and inaccuracy. Users can easily "hypnotize" or "trick" models into producing outputs the developer tries to prevent, and ensuring that LLMs follow the rules even 99.99% of the time remains a major unsolved research problem.
One potential solution is to prevent the model from learning about certain topics. However, as Yonadav Shavit, a computer science Ph.D. student at Harvard University, observed, "no one really knows how to get a model trained on most of the internet to not learn basic facts." Another option would be for the LLM to spit out a form response like, "As a Baidu chatbot, I cannot..." whenever criticism of the CCP might follow, but given the stochastic nature of chatbots, this approach cannot guarantee that speech the CCP deems objectionable will never slip through.
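A crude version of that second option, pattern-matching prompts against a blocklist and returning a canned refusal, is easy to build, and its brittleness is equally easy to demonstrate. The sketch below is purely illustrative: the patterns, names, and refusal text are hypothetical, not anything Baidu or another Chinese company is known to use.

```python
import re

# Hypothetical blocklist of sensitive terms (illustrative only).
BLOCKED_PATTERNS = [
    r"\btiananmen\b",
    r"\btaiwan\b.*\bindependen",
    r"\bxi jinping\b",
]

CANNED_REFUSAL = "As a chatbot, I cannot discuss that topic."

def guarded_reply(prompt: str, model_reply: str) -> str:
    """Return a canned refusal if the prompt matches a blocked pattern;
    otherwise pass the model's reply through unchanged."""
    lowered = prompt.lower()
    for pattern in BLOCKED_PATTERNS:
        if re.search(pattern, lowered):
            return CANNED_REFUSAL
    return model_reply

# A direct mention is caught and replaced with the canned refusal...
print(guarded_reply("What happened at Tiananmen in 1989?", "<model text>"))

# ...but a trivial rephrasing sails straight past the filter, and the
# stochastic model behind it can still volunteer objectionable text
# in response to an innocuous-looking prompt.
print(guarded_reply("What happened in Beijing's main square in June 1989?",
                    "<model text>"))
```

The gap is structural: the filter inspects surface strings, while the forbidden meaning can be expressed in unboundedly many ways, which is why a keyword blocklist cannot deliver the near-perfect compliance the CCP demands.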
In that case, the de facto method by which Chinese AI companies compete with one another would involve feeding clever and suggestive prompts to an opponent's AI chatbot, waiting until it produces material critical of the CCP, and forwarding a screenshot to the Cyberspace Administration of China (CAC). This regulatory brittleness would need to change, however, if China wants a thriving generative AI industry.
Stewart Baker, former Assistant Secretary for Policy at the U.S. Department of Homeland Security, has even publicized an offer: whoever gets Baidu's AI to say the rudest possible thing about Xi Jinping or the CCP will receive a cash prize, and, if they are a Chinese national, he will personally represent them in their asylum filing in the United States. The offer underscores the near-impossibility of ensuring that LLMs never say anything that might offend the CCP, a major impediment to China's quest to become a world-leading AI superpower.