• Mahlzeit@feddit.de
    English
    arrow-up
    41
    arrow-down
    1
    ·
    1 year ago

    That ought to satisfy all those who wanted “consent” for training data.

    • Esqplorer@lemmy.zip
      English
      arrow-up
      18
      ·
      1 year ago

      I wonder how they worked around user violations of copyright… Imagine all the content uploaded to Instagram/Facebook that the poster didn’t create but simply uploaded their download/screenshot.

      • Mahlzeit@feddit.de
        English
        arrow-up
        5
        arrow-down
        7
        ·
        1 year ago

        That shouldn’t be an issue. If you look at an unauthorized image copy, you’re not usually on the hook (unless you are intentionally pirating). It’s unlikely that they needed to get explicit “consent” (i.e., license the images) in the first place.

        • GiveMemes@jlai.lu
          English
          arrow-up
          8
          arrow-down
          1
          ·
          1 year ago

          Yeah, but is it the same thing for a human to view data and for an AI model to be trained on it? Not in my opinion, as an AI doesn’t understand the concept of intellectual property and just spits out the most likely next word, whereas a person can recognize when they are copying something.

          • Mahlzeit@feddit.de
            English
            arrow-up
            3
            arrow-down
            4
            ·
            1 year ago

            I understand. The idea would be to hold AI makers liable for contributory infringement, reminiscent of the Betamax case.

            I don’t think that would work in court. The argument is much weaker here than in the Betamax case, and even then it didn’t convince. But yes, it’s prudent to get the explicit permission, just in case of a case.

            • GiveMemes@jlai.lu
              English
              arrow-up
              4
              ·
              edit-2
              1 year ago

              Doesn’t really seem similar to me at all. One is a thing that’s actively making new content. The other is a machine whose purpose is time-shifting broadcast content that’s already been paid for.

              It’s reminiscent as far as personal AI models on individual machines go, but completely different when it comes to corporate and monetized use.

              Like, if somebody sold you an AI box that you had to train yourself, that would be reminiscent of the Betamax case.

              • Mahlzeit@feddit.de
                English
                arrow-up
                1
                arrow-down
                1
                ·
                1 year ago

                Yes, if it’s new content, it’s obviously no copy; so no copyvio (unless derivative, like fan fiction, etc.). I was thinking of memorized training data being regurgitated.

                • GiveMemes@jlai.lu
                  English
                  arrow-up
                  4
                  arrow-down
                  2
                  ·
                  edit-2
                  1 year ago

                  Yeah, I just think that ingesting a bunch of novels and rearranging their contents into a new piece of work (for example) is still copyright infringement. It doesn’t need to be the Lord of the Rings or Star Wars word for word to get copyright stricken. Similar to how in the music sphere it doesn’t need to be the same exact melody.

                  Edit: Glad you downvoted instead of responding. Really shows the strength of your argument…

                  • Mahlzeit@feddit.de
                    English
                    arrow-up
                    4
                    arrow-down
                    2
                    ·
                    1 year ago

                    I didn’t downvote you. (Just gave you an upvote, though.) You’re reasonable and polite, so a downvote would be very inappropriate. Sorry for that.

                    Music is having ongoing problems with copyright litigation, like Ed Sheeran most recently. From what I have read, it’s blamed on juries without the necessary musical background. As far as I know, higher courts usually strike down these cases, as with Sheeran. Hip hop was neutered, in a blow to (African-)American culture. While it was obviously wrong not to find for fair use in that case, samples are copies.

                    It’s not so bad outside of music. You can write books on “how to write a bestseller”, or “how to draw comics” without needing permission. Of course, you would study many novels and images to get material. The purpose of books is that we learn from them. That we go on to use this to make our own thing is intended (in the US).

                    What you’re proposing there would be a great change to copyright law and probably disastrous. Even if one could limit the immediate effect to new technologies, it would severely limit authors in adopting these technologies.