ChatGPT provides false information about people, and OpenAI can’t correct it

gedaliyah@lemmy.world · 9 months ago

GenderNeutralBro@lemmy.sdf.org · edit-2 9 months ago

To clarify, I mean to say that users should not consider it an information repository, because it does not function as one, by design. Whether it should be classified as such under the law is another matter, one on which I do not have enough knowledge to comment. I do think OpenAI is presenting ChatGPT inappropriately, and I hope they will be held accountable for that.

I’m sure in the future we will see true databases built on the same technology (and they will be awesome, if implemented properly). But that’s not what ChatGPT is (or, as far as I know, any other existing LLM-based application). Any information it is able to “recall” is almost a coincidence of how it was trained. You can sort of think of it like lossy compression. The LLM gets all of its information from its training set, but it is not designed to retain any specific information from the training set in full. In cases where it does, that usually means one of two things:

The information appeared many times in the training set, enough to prevent it from being washed out.
The model is far bigger than it should be, and is overfitted to its training data.
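
To make those two cases concrete, here is a rough toy sketch in Python. It is only an analogy, not how an LLM actually works internally: the "model" is just a table with a fixed number of slots, and the names `train`, `recall`, and `capacity`, along with the example facts, are made up for illustration.

```python
from collections import Counter

def train(corpus, capacity):
    """Toy 'training': count how often each (subject, fact) pair appears,
    then keep only the `capacity` most frequent pairs. Dropping the rest
    is the lossy step in this analogy."""
    counts = Counter(corpus)
    return dict(pair for pair, _ in counts.most_common(capacity))

def recall(model, subject):
    """Return the retained fact for `subject`, or None if it was washed out.
    (A real LLM would not return None; it would generate a plausible guess.)"""
    return model.get(subject)

# Hypothetical training set: two facts repeated many times, one seen once.
corpus = (
    [("Paris", "capital of France")] * 50
    + [("Water", "boils at 100 C")] * 30
    + [("Jane Doe", "born in 1987")] * 1
)

small = train(corpus, capacity=2)  # limited capacity: rare facts are lost
big = train(corpus, capacity=3)    # enough room to memorize everything

print(recall(small, "Paris"))     # frequent fact survives
print(recall(small, "Jane Doe"))  # rare fact is gone
print(recall(big, "Jane Doe"))    # oversized model retains it verbatim
```

With `capacity=2`, only the frequently repeated facts survive, which corresponds to the first case. With `capacity=3`, the table is large enough to keep the one-off fact too, which loosely corresponds to an oversized, overfitted model.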