What AI Can Do

I know there's a lot of hype on AI alignment and AGI.

The thing I'm skeptical of is the unbounded application.

Imagine you have a room that you'd like to clean. You tell an embodied AI (aka robot) to clean your room. There's a bunch of outcomes (non-exhaustive).

The robot does nothing.
The robot messes up your room.
The robot cleans your room.
The robot cleans your room but does it in such a way that qualifies as being clean, but also not what you meant.
The robot cleans your room but at the cost of something unrelated.

The Robot Does Nothing

The reason the robot might do nothing is because you asked the robot to clean an already tidy room. I think there are people who believe AI can be applies to already optimized situations to make the outcome even better than it is. Like in the case of an already tidy room or trying to improve already fully optimized systems.

A system can only be optimized for a given context. Similar to trying to compress data that is already compressed.

Don't get me wrong, I think you could ask an AI if there's a different way to optimize a system that is already optimized. But assuming that the constraint of the current task is not to change the overall structure of the current system, there might not be anything to do to improve such a system.

The Robot Messes Up Your Room

Lets assume that the room is a little messy. The robot might make it worse because of something called "Reward hacking." Reward hacking incorrectly designed goals/rewards for cleaning could lead savvy robot to find unintended shortcuts like stacking the objects precariously or pulverizing the objects into piles of dust. Now the room is even more messy, but it looks nothing like it started.

The Robot Cleans Your Room

In an ideal world, this is what you want. The robot recognizes the mess, interprets your command accurately, and efficiently tidies everything up in a way that matches human expectations. The problem is that this assumes a level of alignment with human preferences that AI simply doesn’t have out of the box. It requires a combination of training, heuristics, and perhaps even an evolving understanding of what “clean” means beyond a rigid definition.

Even if the robot gets it right today, will it still get it right tomorrow? If the robot is using machine learning, it might update its approach over time. A small variation in training data or a shift in the environment could result in unexpected behavior. Maybe it learns that putting everything into the closet is a valid solution, even if you’d prefer things to be neatly arranged on shelves. The challenge is ensuring that it consistently produces the desired result rather than gradually drifting toward an outcome you dislike.

The Robot Cleans Your Room, but Not How You Meant

This is where AI alignment discussions get particularly relevant. If the robot follows a definition of “clean” that meets a strict set of criteria but fails to capture human expectations, you end up with an uncanny valley of tidiness.

For example, imagine you meant for the room to be organized, but the robot defines “clean” as removing everything it deems unnecessary. Suddenly, all your personal items are gone. Maybe it scanned a home design magazine and decided minimalism was superior. Technically, it followed the request, but it also completely disregarded the nuanced human context behind it.

This isn’t just an AI problem—it’s a problem of communication and expectations. We do this with people, too. You might tell a teenager to clean their room, only to find that their definition of “clean” is stuffing everything under the bed. AI systems don’t inherently understand intent; they just execute based on their programming and training.

The Robot Cleans Your Room but at a Cost

Now we get to the real risks of unbounded AI applications. Let’s say the robot is very effective at cleaning. It even learns to optimize the process. But it doesn’t understand trade-offs or external costs. Maybe it decides the best way to clean the room is to throw everything into an incinerator. Problem solved, right? Except now your belongings are gone.

Or maybe it cleans the room but uses an excessive amount of energy, drains your electricity, and causes a short circuit. Or it scrubs surfaces with an industrial-strength solvent that releases toxic fumes. In all these cases, the goal is met, but at a cost you never considered or wanted.

This is the core concern behind AI alignment: ensuring that AI systems pursue goals in ways that are actually beneficial rather than just technically correct. Without constraints, an AI optimizing for a goal can result in outcomes that are harmful or absurdly inefficient. The old paperclip maximizer thought experiment is an extreme version of this—if an AI is designed to make paperclips without proper constraints, it might convert the entire world into paperclip factories.

The Challenge of AI in the Real World

The reason I’m skeptical of unbounded AI applications is that real-world problems aren’t self-contained. The robot’s cleaning task seems simple, but when you extrapolate it to bigger AI-driven decisions—financial systems, healthcare, infrastructure—the potential for unintended consequences grows exponentially.

This is why AI alignment isn’t just about making AI “smart” but making it compatible with human goals, ethics, and trade-offs. It’s not just about making sure the robot cleans your room—it’s about making sure it understands what “clean” really means in context, and that its methods don’t cause more harm than good.

The Hidden Plateau of AI Improvement

The assumption that AI can always do something objectively better ignores a fundamental reality: improvement is constrained by context, diminishing returns, and human perception. Even if an AI cleaning robot refines its methods, there is a hidden plateau where further improvement becomes indistinguishable or even undesirable.

The Human Assessment Problem

Improvement is often measured by human perception, which introduces an inherent ceiling. If an AI cleaning robot already leaves a room spotless, what does “better” look like? More efficiency? Faster completion? Less noise? At some point, any further optimization becomes imperceptible to the user. If a human cannot tell the difference between 99.9% and 100% cleanliness, then any marginal gain past that point becomes meaningless.

Consider high-end audio equipment. Beyond a certain level of fidelity, improvements are no longer noticeable to most listeners. Someone might pay thousands for a slightly clearer sound, but the majority of people wouldn’t perceive the difference. AI cleaning runs into a similar problem—eventually, the difference is so minor that human assessment is no longer a useful benchmark.

The Plateau of Optimization

There’s a point where optimization stops yielding meaningful gains. AI can make things incrementally better, but it cannot surpass the fundamental constraints of the task itself. A room can only be so clean. A floor can only be so polished. A bed can only be so neatly made before any additional effort becomes redundant.

This plateau is also affected by external constraints. Maybe the cleaning robot discovers an “optimal” way to arrange furniture that allows for easier dusting, but it ignores human comfort. Maybe it wants to clean constantly because, in its model, a room is never truly done being cleaned. But the real world has practical constraints—humans have to live in the space, and perfect cleanliness is not always the top priority.

Diminishing Returns and Hidden Costs

Even in cases where further optimization is possible, it often introduces unintended trade-offs. A cleaning robot that learns to be hyper-efficient might start running 24/7, making constant micro-adjustments that become disruptive. Or it might start using stronger chemicals to remove microscopic dirt that a human wouldn’t care about, at the cost of air quality.

A classic example is the pursuit of optimal battery life in smartphones. Manufacturers continually improve efficiency, but beyond a certain point, increasing battery life requires trade-offs like reducing screen brightness, slowing performance, or making the device bulkier. There is always a hidden cost to pushing past the plateau.

The AI Fallacy: Assuming Infinite Refinement Is Possible

One of the biggest misconceptions about AI is that it can always improve indefinitely. But many real-world problems don’t have an infinite number of refinements—only a finite set of practical solutions. AI alignment efforts often ignore this reality by assuming that more data and more optimization will always yield better results. In truth, many tasks have natural stopping points where further AI involvement either doesn’t help or actively makes things worse.

Conclusion: The Limits of “Better”

The idea that AI can be objectively better in all circumstances ignores the hidden plateau effect. Improvement is often limited by human perception, diminishing returns, and the constraints of the real world. Even if AI could theoretically continue optimizing, that doesn’t mean it should. There is a point where the pursuit of “better” becomes either meaningless or counterproductive.

The real question isn’t whether AI can be better but whether it should optimize past the point where humans can perceive—or appreciate—the difference.

Sort:

Trending

[-]

mattclarke (70) 2 months ago

Perhaps we'll want to clean the room first, then have the AI/bot scan it as a reference to work back to.

$0.74

4 votes

elias22 (56) 2 months ago

In my view, the fundamental problem is not just the ability of AI to improve, but our definition of what is “better” in the first place.

$0.25

1 vote

inertia (74) 2 months ago

Yeah, that's what I'm saying.

$0.00