Over the last three years, AI has worked its way into everyday life. AI-generated images are everywhere, chatbots are replacing customer service staff for mundane questions, and content farms on social media are producing content at a record pace. For a while, there was even an AI bot working the drive-thru at my local fast-food taco restaurant. After a couple of months, the AI drive-thru attendant was removed; I’ll leave you to assume why.

In translation, the introduction of AI has thrown everything off balance. Organizations with no regard for quality, understandability, or readability turn to neural machine translation (NMT) for quick and cheap compliance. Why pay hundreds or thousands of dollars to a language company when you can get translations from ChatGPT or any other large language model (LLM) service for free, or at most for a nominal fee?

There are two technological reasons to get your translations from a language service company rather than straight from AI. There are also legal concerns about your content being disclosed and used for LLM training, and concerns about whether your message survives: the quality and understandability of the translations themselves.

From a technology perspective, the first and foremost reason to avoid raw AI translations is hallucination. An LLM hallucinates when it generates output that is incorrect, nonsensical, or factually inaccurate and presents it as true. Hallucinations occur, and are amplified, largely because of the second issue.

The second issue is the nature of training these models. To build them, AI companies consume enormous amounts of data: they scour the internet for any and all content, and the resulting model serves it back to the requester as a “matter of fact”. Before about 2022, the internet was a treasure trove of essentially human-generated content. That data was well suited to teaching these models how humans describe the world and what information could be more or less trusted. Since 2022, however, the waters have been muddied: AI-generated content now sits alongside genuine human-generated content, and undesirable traits such as hallucinations are replicated and magnified when new models are trained on it.

It can best be compared to genetics. If a carrier of a recessive trait has a child with someone who is not a carrier, the chance of that trait manifesting in the child is effectively zero. If the same person has a child with another carrier, the chance rises to one in four. In the same way, if an LLM is trained on content containing errors or AI hallucinations, those errors seed more errors and runaway ensues: a technological re-creation of the Spanish Habsburg line.

This phenomenon is called model collapse, and it will continue unless previously AI-generated content is identified and expunged from training data. Organizations that hold snapshots of the internet from before 2022 have a distinct advantage, because they can train later models on pre-2022 content untainted by AI-generated material.

For translation, the upshot is that raw machine output cannot be taken on trust. Only professional translators, masters of both languages and native speakers of the target language at that, can review machine-translated content and determine whether the target readers will understand it.
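For readers who like to see the mechanism, the feedback loop behind model collapse can be sketched in a few lines of Python. This is a toy illustration under heavy assumptions, not how any real LLM is trained: the "model" is just a bell curve, each generation is fitted only to a small sample produced by the previous generation, and the function name and parameters are made up for this example. It shows a loss of variety from generation to generation, which is one symptom of model collapse, not hallucination itself.

```python
import random
import statistics


def simulate_collapse(generations=300, sample_size=10, seed=0):
    """Toy model: each generation learns only from the previous one's output.

    Returns the spread (standard deviation) of the model at every
    generation, starting with the original "human-written" data.
    """
    rng = random.Random(seed)
    mean, spread = 0.0, 1.0  # generation 0: the original, human-made distribution
    history = [spread]
    for _ in range(generations):
        # The current model "writes" a small batch of content...
        samples = [rng.gauss(mean, spread) for _ in range(sample_size)]
        # ...and the next model is fitted to that batch and nothing else.
        mean = statistics.fmean(samples)
        spread = statistics.stdev(samples)
        history.append(spread)
    return history


history = simulate_collapse()
print(f"spread at generation 0: {history[0]:.3f}")
print(f"spread at generation {len(history) - 1}: {history[-1]:.3g}")
```

In runs like this the spread typically shrinks toward zero: small sampling errors compound because no fresh human-made data ever re-enters the loop. Mixing original data back in at each generation is the toy equivalent of training on a clean pre-2022 snapshot.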

From a legal perspective, giving content to an AI service may mean handing over that IP to be used in training the LLM, especially if you’re using a free or nominal-fee version of the application. If you’re asking your Language Service Provider (LSP) to use AI translation engines, and you care about the confidentiality of your content, even temporarily, it’s important to know which engines they’re using and what terms the engine’s provider sets on using your content to train future engines. If you are dealing with Personally Identifiable Information (PII) or protected health information and are considering AI translation, I would ask you to reconsider: with health information specifically, sending it to an AI service without the proper safeguards and agreements in place could be a breach of HIPAA regulations.

From a professional standpoint, we’re seeing clients come around. Client contacts have told me on calls that they had been using Google Translate or ChatGPT, but their end users came back saying that the translations are poor at best; at worst, they are flat-out wrong or say the opposite of what was intended.

There’s another reason still to limit AI use: the environment. The data centers behind these services, spread around the world, consume vast amounts of electricity and water. Electricity consumption wouldn’t be so bad if all of it came from carbon-neutral sources. However, a large share of data centers are located in the US, where electricity is still mostly generated by burning fossil fuels. Globally, data centers use roughly 400 terawatt-hours of electricity per year and, by some estimates, account for about 3% of global greenhouse gas emissions (surpassing the aviation industry). A single large data center can use 3-5 million gallons of water per day for cooling.

–for other language professionals–

About a year ago, a colleague asked me what I thought the impact of AI was going to be on the industry. I responded, conservatively, that these revolutions in technology come in cycles. We saw it around the turn of the century with CAT tools, about 10 years later with commercially viable hybrid and first-generation neural machine translation (NMT), and now with more developed NMT. Initially, two things happen. Some users are quick to adopt and slow to adapt, taking up the new technology without considering the repercussions. Some translators go apocalyptic: “the profession is dying out, our jobs are obsolete, and we need to pivot.” Next, our profession learns to adapt to the technology. Instead of being a foe, AI becomes a tool that translators use to assist in the translation process. Finally, the industry stabilizes as it comes to terms with co-existing with the technology.