Yuh-Jye Lee (李育杰), the initiator of Taiwan’s Trustworthy AI Dialogue Engine, or TAIDE, hadn’t heard the term “sovereign AI” until Nvidia’s Jensen Huang (黃仁勳) mentioned it at the World Governments Summit in Dubai in February. At that point, Lee had been building TAIDE, Taiwan’s government-funded large language model trained on Taiwanese data, for over a year.
Huang said building an indigenous large language model is a national imperative: “It codifies your culture, your society’s intelligence, your common sense, your history — you own your own data.” Lee, who was initially inspired to create TAIDE following the release of ChatGPT in November 2022, agrees. Lee observed that ChatGPT gave very “Chinese” responses to prompts asked in Mandarin, reflecting the fact that it was trained on Chinese data. To prevent the “invasion” of Chinese or Chinese-trained artificial intelligence, Lee decided that building a culturally and linguistically specific large language model for Taiwan was a national imperative. TAIDE is now widely considered to be Taiwan’s version of sovereign AI.
Labeling an AI model as “sovereign” certainly makes it very marketable in the national security realm. But it also invites criticism about the government’s role in the development and governance of the technology. While the scholars interviewed for this article varied in terms of their skepticism and optimism regarding sovereign AI, they generally agree that Taiwan needs to improve its AI policy and governance. If the government waits five years, a decade or longer to start creating policy around generative AI, “that will be the end game for the Republic of China or Taiwan as a country and as a culture,” said Ju-Chun Ko (葛如鈞), a Chinese Nationalist Party (KMT) legislator and an adjunct assistant professor at National Taiwan University’s Institute of Networking and Multimedia.
Alice Yang (楊長蓉), an assistant research fellow at the Institute for National Defense and Security Research who specializes in cybersecurity and data privacy, is suspicious of AI projects that receive their funding from the government: “For the government, it’s actually more important to have … control over the people” than it is to encourage freedom of thought and innovation. She argues that while the concept of “sovereignty” in the digital space might be especially attractive to authoritarian countries like China and Russia, in reality, “the idea is … very appealing to most governments.” Yang highlighted the U.K.’s Online Safety Act, which took inspiration from the Chinese model’s content monitoring requirements, private communication surveillance and enhanced government oversight. On the innovation front, Yang worries that sovereign AI could lead to government monopolization and control of open-source resources, which would put smaller companies that don’t have the resources to meet regulatory demands at a disadvantage.
Scholars interviewed for this article agreed that to promote effective AI governance in Taiwan, more attention needs to be paid to transparency, privacy and cybersecurity. According to Chen-Yi Tu (杜貞儀), an assistant research fellow at the Institute for National Defense and Security Research who specializes in internet governance, the most important component of transparency is clear disclosure of the data that developers use to train a model. (TAIDE is built using Meta Llama 2 and 3 models as its foundational layers, then fine-tuned using licensed, open Taiwanese data. Even though the foundation of TAIDE is not trained on Taiwanese data, the model is optimized to prioritize Taiwanese data during fine-tuning.) Since the Taiwanese data used to fine-tune TAIDE is publicly available, the model is able to provide a certain degree of transparency. Tu said it is also important to inform users about the potential risks and biases embedded in the model. Beyond data transparency, Ko said that civil society actors, such as g0v or the Open Culture Foundation, should be able to audit Taiwan’s sovereign AI to make sure the government isn’t using it unethically.
Privacy and cybersecurity considerations are also crucial when it comes to building sovereign AI because the government has access to so much personal information. According to Yang, Taiwan’s privacy law is “far from adequate,” and the Taiwanese government has a history of over-collecting personal information and sometimes misusing it. This was especially apparent during COVID-19. Taiwan’s privacy law needs to be updated to address the newer features of the digital world such as cookies and GPS location, said Yang. The government needs to reorganize oversight of data protection within the bureaucracy to clarify roles and responsibilities, said Tu, adding that privacy reforms are a necessary prerequisite for the incorporation of personal information into TAIDE. Relatedly, Taiwan needs to continue to bolster its cybersecurity, as the country is disproportionately targeted in cyberspace, receiving as many as 30 million attacks per month.
There are also technical obstacles to developing Taiwan’s sovereign AI. According to Lee, creating sovereign AI requires large amounts of AI talent, computational power and data. TAIDE was released on Hugging Face as a relatively small model, about 7 to 8 billion parameters, because the research team didn’t have enough legal data (i.e., data that doesn’t violate copyright laws) to make the model any larger. For reference, GPT-3 has 175 billion parameters. While smaller models can be more efficient and cost-effective, as they use less computational power and memory, they also tend to have a narrower range of knowledge and less depth of understanding. The lack of digitized, open-source Taiwanese data is not the only constraint: training large language models is also incredibly expensive, because it requires advanced GPUs, and Nvidia’s latest AI chip costs $30,000 per unit. Lee pointed out that although Taiwan’s TSMC produces over 90% of the world’s most advanced semiconductor chips, the government can’t afford to purchase very many itself.
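To give a rough sense of the memory gap between a model of TAIDE’s size and one of GPT-3’s size, the short sketch below estimates how much memory the weights alone would occupy. It rests on assumptions that do not come from the interviews: each parameter is stored as a 16-bit number (2 bytes), TAIDE is rounded to 8 billion parameters, and the memory needed for activations, caches and training is ignored. The function name is illustrative only.

```python
# Back-of-the-envelope estimate of weight storage for two model sizes.
# Assumption: 16-bit (2-byte) weights; real deployments may use other precisions.
BYTES_PER_PARAM_FP16 = 2


def weight_memory_gb(num_params: float, bytes_per_param: int = BYTES_PER_PARAM_FP16) -> float:
    """Approximate memory for the model weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Parameter counts as cited in the article (TAIDE rounded up to 8 billion).
models = {
    "TAIDE (about 8 billion parameters)": 8e9,
    "GPT-3 (175 billion parameters)": 175e9,
}

for name, params in models.items():
    print(f"{name}: roughly {weight_memory_gb(params):.0f} GB of weights")
```

Under these assumptions the smaller model’s weights come to about 16 GB, against about 350 GB for the larger one, which helps explain why a model of TAIDE’s size can run on far more modest hardware.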
Despite these limitations, Lee has many ideas for how TAIDE could be used to improve Taiwan’s society. For example, Lee and his team have integrated Hoklo and Hakka into the model and plan to add more of Taiwan’s native languages. Lee wants to employ TAIDE as a medical consultant in hospitals, where elderly patients might not be able to speak Mandarin and need instantaneous translation into their native language. Lee also believes TAIDE can enhance Taiwan’s economic security: because businesses can run it on local servers, they can integrate generative AI into their workflows without sending their information to the cloud. Lee and his team are also working on G-TAIDE, which will help the government with daily operations. Finally, Lee thinks generative AI might be the answer to Taiwan’s demographic decline and labor shortage, as it could perform tasks that Taiwan no longer has the workforce for.
Ultimately, despite the many potential pitfalls of AI governance, Ko believes that building sovereign AI is key to Taiwan’s strength. As the world becomes increasingly digitized, the most powerful countries will be those that have created a robust national identity in cyberspace. And in Taiwan’s particular political context, given domestic and international division on what exactly “sovereignty” means for the island, Ko thinks the digitalization of Taiwan’s national consciousness could help forge an identity for Taiwan that transcends competing interpretations of what Taiwan is.







