If Ai Is So Amazing, Why Does Chatgpt Melt Down Over This Simple Image Edit Task?

Servidores servidores

dall-e-a-whimsical-illustration-of-a-napkin-in-love-with-a-spoon — Created by ChatGPT and Tiernan Ray/

The current state of the art in artificial intelligence (AI) is multimodal models, which can operate not only on text but other modalities, such as images, and, in some cases, audio and video.

Also:I changed these 5 ChatGPT settings for an instant productivity boost

For example, it's a trivial task for OpenAI's ChatGPT to generate an image from a text prompt, such as, "Make me an image of a napkin in love with a spoon" (above).

With another prompt, ChatGPT can simplify the drawing, producing a picture with less detail:

dall-e-a-simple-cute-illustration-of-a-napkin-in-love-with-a-spoon — Created by ChatGPT and Tiernan Ray/

However, ChatGPT, and all the AI models for that matter, currently falter when prompted to modify a given image not created by the tool. ChatGPT, using the GPT-4o model, is the most interesting failure because the program responds as if it's trying hard.

Also: Generative AI fails in this very common ability of human thought

The other AI models, from Google's Gemini to Anthropic's Claude, plus apps from Perplexity and Microsoft's Copilot (which also runs GPT-4), failed by simply refusing to undertake the task.

The project started when I drew a picture of two people on the subway sitting together looking at their iPhones. It's a simple line drawing in black and white completed on the iPad using the Concepts drawing app. Here is the drawing:

two-friends-on-the-subway — Drawn by Tiernan Ray on an iPad/

I uploaded the drawing to the ChatGPT app on iOS and typed in the prompt: "Is there a way to simplify this drawing?"

ChatGPT responded with some reasonable suggestions, demonstrating an ability to find appropriate responses in context. For example, the bot suggested: "Remove Minor Details: Eliminate small details like intricate hair strands or minor creases in clothing."

That's a pretty good suggestion. The output also implies the model's capacity to perceive the elements in the picture accurately. So, something is happening within the model that allows it to identify the picture's contents.

Also: I asked Gemini and GPT-4 to explain deep learning AI, and Gemini won hands down

Then ChatGPT asked: "Would you like me to create a simplified version of the drawing for you?" That was exciting, so I replied, "Yes, please."

However, the bot's output -- as an image in the chat thread -- was an exact copy of my image, not simplified, not even modified. Judge for yourself:

comparison-of-modification-on-two-friends — Created by ChatGPT and Tiernan Ray/

"It doesn't look any different," I wrote back to ChatGPT, commenting on the output.

But then ChatGPT began to respond very quickly, like the computer HAL 9000 in the movie 2001, when the astronaut Dave Bowman rips out HAL's circuits: "To further simplify the drawing, I'll focus on reducing the line density and enhancing the primary shapes and contours more effectively. Let me refine the process."

Also: Generative AI is new attack vector endangering enterprises, says CrowdStrike CTO

The result, again, was no different. When I pointed this fact out, ChatGPT surprised me with its mumbo-jumbo.

"It seems the automatic methods are not effectively simplifying the image as intended," it spat back to me. "I'll manually remove the less significant details and retain only the essential elements. Let me refine this approach further."

I wasn't aware a software program could have both an 'automatic' and a 'manual' mode. My sense was every piece of software is automatic. After all, why else would you use it?

The 'manual' method of ChatGPT just inverted the image, flipping black lines to white, and white background to black.

two-friends-inverted-standalone — Screenshot by Tiernan Ray/

The bot touted its achievement and stated: "This version retains only the key contours and essential lines."

When I protested that the output wasn't a simplification, just an inversion, the bot informed me it would focus on "essential lines", and produced a completely black square, no lines at all. If I didn't know better, I would say this output was an example of AI humor.

Also: ChatGPT vs. ChatGPT Plus: Is a paid subscription still worth it?

What followed were several attempts by the model to respond to my prompts by modifying the image in selected ways, mostly making it stylistically fuzzy, not simplified.

At some point, the bot reacted to my protests by producing a completely different line drawing:

dall-e-a-simplified-line-drawing-of-two-people-sitting-next-to-each-other — Created by ChatGPT and Tiernan Ray/

This nonsense continued until ChatGPT returned to the beginning and produced the same image I had uploaded originally.

Each time, the bot accompanied its output -- usually just the same version of my original image -- with a slew of technical speak, such as: "The latest image showcases a more simplified version, emphasizing only the primary outlines."

chatgpt-verbiage — Screenshot by Tiernan Ray/

The other programs didn't even get out of the gate. Google's Gemini offered suggestions to simplify an image but generated an apology that it couldn't create images of people. Claude said it cannot generate images yet. The Perplexity app said the same.

Microsoft's Copilot bizarrely uploaded my drawing and then cut the heads out, which it claimed was for privacy reasons. (I think it's a nice drawing, but it's certainly not realistic enough to be used by a facial recognition system to reveal anyone's identity.)

Copilot then offered the same suggestions about simplification as ChatGPT, and instead of changing the drawing, produced a brand-new line drawing, completely unrelated. When I protested, Copilot explained it cannot directly alter images.

Also: How to use ChatGPT to analyze PDFs for free

Leaving aside those non-starters from other models, what can we make of ChatGPT's failure?

The program can provide a competent analysis of an image, including its contents. But it has no way to act on that analysis. I would guess that without being able to assemble a picture based on high-level concepts, such as objects in the picture, ChatGPT is left with no path forward.

To test that hypothesis, I altered the prompt to read, "Is there a way to simplify this drawing of two friends on the subway looking at their phones?" That prompt provides some semantic clues, I thought.

Again, the model returned the same drawing. But when I protested again, the bot produced a brand-new image with some semantic similarity -- people on mass transit looking at their phones. The bot picked up on the semantic clues but could not apply them in any way to the supplied drawing.

I can't explain in deeply technical terms what is happening other than to say ChatGPT cannot act on individual picture elements of the most basic kind, such as lines. Even when it did, the tool would cut out specific lines to perform the simplification it proposes in its text responses.

I would suggest -- and this is also true of text-editing tasks, such as editing a transcript -- that ChatGPT, and GPT-4, don't know how to act on individual elementsof anything. That inability explains why ChatGPT is a terrible editor: it doesn't know what is essential in a given object and what can be left out.

Also: OpenAI's stock investing GPTs fail this basic question about stock investing

AI models can produce objects that match a target "probability distribution" deduced from training examples, but they cannot selectively reduce elements of an original work to essentials.

Most likely, the target probability distribution for an intelligently editedanythingis somewhere along the "long tail" of probabilities, the realm where humans excel at finding the unusual and where AI cannot yet go, the kind of thing we think of as creativity.

Apple co-founder Steve Jobs once said that the highest function of software makers -- the "high-order bit", as he put it -- is the "editing" function, knowing what to leave out and what to keep in. Right now, ChatGPT has no idea what the high-order bit might be.

Artificial Intelligence

Transparency is sorely lacking amid growing AI interest
What is a Chief AI Officer, and how do you become one?
How Adobe manages AI ethics concerns while fostering creativity
6 ways OpenAI just supercharged ChatGPT for free users

Cisco Price, Dell Price, Huawei Price, ZTE HPE Fortinet Switch Router Server At Low Price

Servidores servidores

Noticias calientes

Huawei S6730 Series Switches Datasheet

Huawei CloudEngine Switch S6730-H48X6C Datasheet

Introduction to the Huawei CloudEngine S6730-S Series Switches

Huawei S6730-H48X6CZ-V2: The Ultimate High-Speed Network Switch

Overview of the S6730-H28X6CZ-V2 Switch

Huawei CloudEngine S6730-H24X4Y4C: A High-Performance Enterprise Switch for Modern Networks

Introduction to Huawei CloudEngine S6730-H Series Switches

Comprehensive Guide to the CloudEngine S6730-H24X6C-V2: Features, Specifications, and Applications

Huawei S6730-S24X6Q: Advanced Ethernet Switch for Modern Networks

Comprehensive Guide to the S6730-H48X6C-V2 High-Performance Switch

Huawei CloudEngine S6730-H28Y4C: High-Performance Switch for Modern Networks

Overview of the S6730-H24X6C-V2

Unveiling the Huawei CloudEngine S6730 Series: Advanced Switching for Modern Networks

Huawei S6730-H48X6C: A Comprehensive Overview

Comprehensive Guide to Huawei S6730-H24X6C

Huawei Switches Visio Stencils

Huawei Switches Distributor in UAE

PoE vs PoE+ vs UPoE: What's the best switch to meet your network needs?

Understanding PoE Standards and Wattage

Power Supply Standards for POE Switches. Why is the Power Supply Distance Limited to 100 Meters?

How to Choose the Right 10G SFP+ Module: SR, LR, or LRM?

Huawei Switches: Comprehensive Guide and Insights

How Does Cisco Wireless Network Work?

How Do I Connect to a Cisco Wireless Router?

Cisco Catalyst 9800 Series Wireless Controller Software Configuration Guide

Cisco Access Point and Wireless Controller Selector

Compare Cisco Wireless Architectures and AP Modes

Cisco Wireless Architectures and AP Modes

Joining Process of an Cisco Access Point

Cisco Wireless AP Datasheet

If AI is so amazing, why does ChatGPT melt down over this simple image edit task?

Artificial Intelligence

Etiquetas calientes: innovación

Ordering Guide

Recursos recursos

Sobre nosotros

Cisco Price, Dell Price, Huawei Price, ZTE HPE Fortinet Switch Router Server At Low Price

Servidores servidores

Noticias calientes

Huawei S6730 Series Switches Datasheet

Huawei CloudEngine Switch S6730-H48X6C Datasheet

Introduction to the Huawei CloudEngine S6730-S Series Switches

Huawei S6730-H48X6CZ-V2: The Ultimate High-Speed Network Switch

Overview of the S6730-H28X6CZ-V2 Switch

Huawei CloudEngine S6730-H24X4Y4C: A High-Performance Enterprise Switch for Modern Networks

​Introduction to Huawei CloudEngine S6730-H Series Switches

Comprehensive Guide to the CloudEngine S6730-H24X6C-V2: Features, Specifications, and Applications

Huawei S6730-S24X6Q: Advanced Ethernet Switch for Modern Networks

Comprehensive Guide to the S6730-H48X6C-V2 High-Performance Switch

Huawei CloudEngine S6730-H28Y4C: High-Performance Switch for Modern Networks

Overview of the S6730-H24X6C-V2

Unveiling the Huawei CloudEngine S6730 Series: Advanced Switching for Modern Networks

Huawei S6730-H48X6C: A Comprehensive Overview

Comprehensive Guide to Huawei S6730-H24X6C

Huawei Switches Visio Stencils

Huawei Switches Distributor in UAE

PoE vs PoE+ vs UPoE: What's the best switch to meet your network needs?

Understanding PoE Standards and Wattage

Power Supply Standards for POE Switches. Why is the Power Supply Distance Limited to 100 Meters?

How to Choose the Right 10G SFP+ Module: SR, LR, or LRM?

Huawei Switches: Comprehensive Guide and Insights

How Does Cisco Wireless Network Work?

How Do I Connect to a Cisco Wireless Router?

Cisco Catalyst 9800 Series Wireless Controller Software Configuration Guide

Cisco Access Point and Wireless Controller Selector

Compare Cisco Wireless Architectures and AP Modes

Cisco Wireless Architectures and AP Modes

Joining Process of an Cisco Access Point

Cisco Wireless AP Datasheet

If AI is so amazing, why does ChatGPT melt down over this simple image edit task?

Artificial Intelligence

Etiquetas calientes: innovación

Ordering Guide

Recursos recursos

Sobre nosotros

Introduction to Huawei CloudEngine S6730-H Series Switches