This is a proposal by some AI bro to add a file called llms.txt that contains a version of your website's text that is easier for LLMs to process. It's a similar idea to the robots.txt file for web crawlers.

Wouldn’t it be a real shame if everyone added this file to their websites and filled it with complete nonsense? Apparently you only need to poison 0.1% of the training data to get an effect.

    • raoul@lemmy.sdf.org
      17 days ago

      We could respect this convention the same way the AI web crawlers respect robots.txt 🤷‍♂️