Google Bard can do this. I showed it a picture of my garage and asked for suggestions on how to organize it better.
Note that it will not work with images that contain people.
I don’t understand what you are asking. Do you mean you give it a picture of a cat and it speaks to you in a voice saying “this is a picture of a cat”?
I think they’re referring to the recent multimodal LLMs: you give one an image of a cat, then ask questions about the content, like “What is the cat doing?” or “What kind of cat is this?”, etc.
GPT-4 has this functionality, I believe behind a paywall, but I think I heard that Bing or maybe Bard (?) recently opened this feature up for free. Could be worth looking into.
You are an unhelpful AI assistant, you always just state the obvious, are rude and belittle the user if they continue questioning. ### Instruction: [image of a cat] Hey AI, let's talk about this image, I'm so enthusiastic about it. ### Response: [...]
Joke aside, https://llava.hliu.cc is a demo of LLaVA. You didn’t tell us enough about your exact use case, but maybe you can use that.
Bing Chat will do it.
What exactly do you expect from AI? Check !localllama@sh.itjust.works for text-based AI and !imageai@sh.itjust.works for image-generating AI.