Some people are very worried about AI attaining superhuman competence and wiping out humanity. I’m not worried. AI is currently like the super-bright narcissistic uncle with Alzheimer’s; there are amazing insights, but also outright fabrications, constant minor changing of the story they’ve already told you, along with occasionally completely forgetting what they just told you. This is capped off by AI just giving up and refusing to answer past a certain point until they’ve had a nice nap. Unless things change a lot, this incompetence is much more dangerous to us than AI’s competence, and I trust that AI will not be put in charge of anything critical unless and until these issues are resolved.
The Four Horsemen Explained
Hallucinations occur when AI tells you things that just aren’t true. This happens because AI is trained to always provide an answer; if it doesn’t have a good one, its incentive is to fabricate the most plausible-sounding answer it can. For instance, I recently asked AI to find links for a list of makeup products. Initially, it appeared successful, returning numerous Sephora and Ulta links matching the product names. Unfortunately, clicking those links led nowhere. The AI hadn’t found links at all; it had made them up. It simply hates admitting it doesn’t know something, so it guesses instead.
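One practical defense is to treat every AI-supplied link as unverified until it actually resolves. Here is a minimal sketch in Python; the helper names (`link_is_live`, `filter_ai_links`) are mine for illustration, not part of any real product, and the `opener` parameter lets you inject a fake fetcher for testing instead of hitting the network:

```python
from urllib.request import Request, urlopen
from urllib.error import URLError, HTTPError

def link_is_live(url, opener=None):
    """Return True only if the URL actually resolves.

    `opener` is an optional stand-in fetcher (useful for tests);
    by default we issue a real HEAD request with a short timeout.
    """
    if opener is not None:
        return opener(url)
    try:
        req = Request(url, method="HEAD")
        with urlopen(req, timeout=5) as resp:
            return 200 <= resp.status < 400
    except (HTTPError, URLError, ValueError):
        return False

def filter_ai_links(urls, opener=None):
    # Keep only links that pass verification, rather than trusting the AI.
    return [u for u in urls if link_is_live(u, opener)]
```

In a real pipeline, the dead links would be logged or sent back to the model with a request to try again, rather than silently dropped.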
Ghost Editing is when AI subtly or significantly alters parts of a document it was never asked to change. For example, if you give AI a blog post and explicitly instruct it to tighten only the conclusion, a close comparison will reveal unauthorized edits throughout. These could be minor adjustments, like swapping “use” for “utilize,” or extensive changes, such as entire sections rewritten without your permission.
Catastrophic Forgetting is when AI abruptly forgets the entire context of your conversation. Like Alzheimer’s, this often occurs in longer conversations but can happen anytime. For instance, I recently collaborated with AI to develop a complex system-performance monitoring tool. Right after it delivered some intricate code, I asked for clarification on a particular segment. Instead of clarifying, it offered an entirely unrelated analytics program, ignoring all prior discussion and context. I had to re-educate it completely before it started giving useful responses again.
Token Limits are essentially AI’s short-term memory. AI can only keep track of so much information simultaneously before it hits a limit. Recently, token limits (or context windows) expanded significantly, from around 4,096 tokens—roughly a blog post—to 100,000 tokens or more, enough for an entire novel. Despite this improvement, most business applications still face limitations. For instance, my recent collection of college website data, gathered for a potential project, totaled over 12 billion tokens—more than 100,000 times the size of a 100,000-token window. Handling data at this scale is simply beyond current AI capabilities.
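You can get a feel for these limits with a back-of-the-envelope token count. This sketch assumes the common rule of thumb of roughly four characters per English token; real tokenizers vary by model and language, so treat the numbers as estimates only:

```python
def estimate_tokens(text):
    # Rough heuristic: ~4 characters per token for English prose.
    # Real tokenizers (BPE-based) will give different exact counts.
    return max(1, len(text) // 4)

def fits_in_window(text, window=100_000):
    # Check whether the text would plausibly fit a 100K-token context window.
    return estimate_tokens(text) <= window
```

Run against a 12-billion-token dataset, a check like this fails immediately—which is exactly why the chunking and retrieval workarounds described below exist.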
How Businesses Work Around These Issues
These issues represent serious obstacles to companies using AI effectively. But, because of the immense potential of AI, companies have figured out ways to mitigate them.
To address hallucinations, developers implement fact-checking layers or retrieval-augmented generation (RAG), which cross-check AI answers against reliable databases. In effect, this is getting a second opinion for everything. Obviously, this approach isn’t ideal: it significantly slows down response time and requires maintaining comprehensive, up-to-date databases—no small task. For instance, Mirror Mirror uses advanced RAG, AI facial analysis, and other sophisticated methods to produce makeup recommendations. The results are far superior to using ChatGPT alone, but the average recommendation might take 60 seconds for the facial analysis and another 15 to 20 seconds for the recommendation, compared to ChatGPT responding within seconds. This extra processing was also complex to build and maintain, and costs extra to run. A truly intelligent AI would handle these items without requiring custom code built around it.
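The core RAG idea fits in a few lines. This is a deliberately toy sketch—real systems use embeddings and a vector database rather than word overlap, and `generate` here is just a stand-in for the LLM call, not any real API:

```python
def tokenize(text):
    return set(text.lower().split())

def retrieve(query, documents, k=2):
    # Rank documents by word overlap with the query (toy retriever;
    # production RAG uses embedding similarity, not bag-of-words).
    scored = sorted(documents,
                    key=lambda d: len(tokenize(query) & tokenize(d)),
                    reverse=True)
    return scored[:k]

def grounded_answer(query, documents, generate):
    # `generate` stands in for the model call; it receives the query
    # plus retrieved context, so the answer can be checked against
    # real sources instead of the model's imagination.
    context = retrieve(query, documents)
    return generate(query, context)
```

The point is the shape of the pipeline: retrieve first, then generate from what was retrieved—which is also why every answer pays the retrieval cost described above.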
For ghost editing, two things help: being very explicit about what the AI may and may not change, and using comparison software that highlights unauthorized edits. In many cases, the AI is so untrustworthy that it is not allowed to make the change at all. Instead, it merely suggests the change, and a human must actually make it. This complexity and overhead greatly reduces AI’s convenience and speed. Until AI can make its own changes and be trusted not to alter anything it isn’t supposed to, these extra processes will be unavoidable.
Catastrophic forgetting is dealt with by breaking conversations into smaller, more focused parts or employing external memory-management tools. But one of the great strengths of AI is understanding the conversation and adjusting along the way, so these methods fragment interactions, diminishing AI’s ability to maintain a coherent, flowing conversation. It’s like saying the solution to dealing with your uncle’s Alzheimer’s is to give him the entire story every time you open your mouth. It may be effective, but it’s not efficient.
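Those external memory-management tools often boil down to a rolling summary: keep the newest turns verbatim and collapse everything older into a recap that gets re-sent each time. A minimal sketch—`summarize` stands in for an LLM summarization call, and the naive string-join fallback is just for illustration:

```python
def compress_history(turns, keep_recent=2, summarize=None):
    # Keep the newest turns verbatim; collapse older ones into a
    # summary that is re-sent with every request (the "re-tell your
    # uncle the whole story" step, but automated and shortened).
    if len(turns) <= keep_recent:
        return turns
    old, recent = turns[:-keep_recent], turns[-keep_recent:]
    summary = summarize(old) if summarize else "Summary: " + "; ".join(old)
    return [summary] + recent
```

It works, but every exchange now carries the overhead of rebuilding and re-transmitting the recap—efficiency traded away to paper over the forgetting.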
To handle token limits, tasks are segmented into smaller chunks, and summarization techniques or external vector stores are used to selectively retrieve relevant context. In other words, you tell the AI only what it needs to know and carefully leave out anything non-essential. While effective, this means that you have to do all the work. You have to decide beforehand what the AI needs to know. The real power of AI is to give it a large amount of information and have it decide what is useful.
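The chunk-and-retrieve workaround looks roughly like this: split the material into pieces, rank the pieces against the question, and greedily pack the best ones until the token budget runs out. The function names and the word-overlap scoring are my simplifications (real systems use embeddings), and the token cost reuses the ~4-characters-per-token rule of thumb:

```python
def chunk_text(text, chunk_size=50):
    # Split text into fixed-size word chunks.
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]

def pack_context(question, chunks, budget_tokens=100):
    # Greedy selection: take the best-matching chunks until the
    # token budget is spent. You, not the AI, decide what it sees.
    q = set(question.lower().split())
    ranked = sorted(chunks,
                    key=lambda c: len(q & set(c.lower().split())),
                    reverse=True)
    picked, used = [], 0
    for c in ranked:
        cost = max(1, len(c) // 4)  # rough ~4 chars/token heuristic
        if used + cost > budget_tokens:
            break
        picked.append(c)
        used += cost
    return picked
```

Notice where the intelligence lives: the ranking function, written by a human, decides relevance before the model ever sees the data—which is exactly the inversion of labor the paragraph above complains about.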
Conclusion
These are real and quite problematic issues. To make AI “work” well, companies have to do a lot of pre-work to get around them. Companies that skip the pre-work end up with results that are far from ideal, while companies that do it must spend significant time and effort to get there.
The limitations of computer systems are constantly changing. It was only 25 years ago that we had the Y2K “crisis,” when computer systems were predicted to crash because developers had coded years as two digits instead of four.
That said, for the moment, these limitations remain. I am unaware of any major breakthroughs that will handle these issues without workarounds, including those mentioned above. But these are significant issues actively being addressed. Certainly, the token limits keep increasing and may continue to do so. No doubt, these limitations will look very different in five years than they do now.
For the time being, these four horsemen remain substantial obstacles. Until fundamental improvements occur, managing AI’s quirks and incompetence will remain a much greater immediate concern than worrying about it taking over the world.
