Writers and AI

AI is coming, regardless. And while AI applications will have a strong impact on manufacturing and production, they’re also going to affect so-called white-collar clerical and lower-level data-management work, as well as routine computer coding.

Authors and artists won’t be exempt, either. Artists who do illustrations for books and other publications are already complaining, and at least some publications are refusing to use artwork that is solely or partly AI-generated.

As for authors, a number of lawsuits have been brought in California and New York courts against various AI companies for copyright infringement because the companies employed unauthorized copying of authors’ works to train their generative AI models.

The discussions around this appear muddled, at least to me. Intelligence has to “learn” in some fashion. I’d read more than a thousand SF books and all the stories in ANALOG (including some from the Astounding Science-Fiction era) for fifteen years before I ever thought about writing a story, let alone a novel. So have a great many other authors.

The problem I have with the way the AI companies approached this was that while I had to pay (or occasionally borrow from the local library) to read and learn, these companies used pirated copies and paid no one. Some have attempted to claim “fair use,” which is absurd, given that “fair use” case law doesn’t allow use of extended prose in any form.

So why shouldn’t the AI companies pay royalties to authors whose works aren’t in the public domain? Shakespeare doesn’t need the royalties, but living, breathing, and working authors need and deserve them.

Since none of these legal suits have yet come to trial (so far as I can tell), we’ll see what the courts have to say.

Of course, those lawsuits don’t address the fact that dialogue from movies and TV shows has been used by companies such as Apple and Anthropic to train AI systems.

I could be underestimating the potential of generative AI, but I doubt that it will ever produce truly great prose or poetry, or even well-written mid-list fiction. I have no doubt, however, that, in time, it will be able to churn out serviceable methodical fiction with little uniqueness.

As in many fields, we’ll have to see, but in the meantime, the AI companies have no business pirating current authors’ works in an effort to eventually replace them.

9 thoughts on “Writers and AI”

  1. KevinJ says:

    I’m very sad to bring this up, I just saw it today:

    “Judge backs AI firm over use of copyrighted books”
    -https://www.bbc.com/news/articles/c77vr00enzyo

    1. AndersK says:

      Actually, the article is pretty fair – the AI companies are free to use the books they bought, but not the pirated books.

      I’d personally prefer a different license model – read: one that pays authors more – than just having to pay for one book to use it forever as part of AI learning, but drawing a line at “you do have to pay for at least ONE copy” is somewhat understandable.

      We’ll see if the AI companies believe that using the output of their programs is fair use for other AI companies, but hypocrisy has ever been a part of humanity, so I’m not expecting that they’d grin and bear it.

  2. RJL says:

    My experience of several areas where I believe AI is heavily involved (Google and YouTube search, and the online versions of the New York Times, the San Mateo Daily Journal, and the AP website, and probably others I don’t recall right now) has been increasingly negative. AI as used by these publishers seems to dig an increasingly deep, narrow rut, limiting the offerings and topics presented to a viewer. Articles and other media listed repeat. In the case of news media, the count of unique items for a new day seems to hang at maybe 15, not counting sports, which I don’t read. Items start to repeat about 10 to 12 down the list. For YouTube and Google search, I’ve taken to not saving _any_ history and clearing their browser data in an effort to reboot their search. It seems to help, but the same limiting behavior soon develops again.

    In light of this experience, I don’t think AI bodes well for anybody depending on good research, much less anybody trying to explore and grow their perspective.

    Perhaps different goals and management would yield better results. Oh, one other area of fail: I read the enthusiastic cheers of a software developer. He had spent time building a small application the old way and then a little later connected with an AI system and “asked” it to build the same system. Looking at the two sets of code, it appeared the AI produced the preliminary work but not a finished application. The preliminary code looked to me to be something any developer who wanted to understand their own code should write themselves. It was not completely trivial, but it was basic, and to build on that base or to see where more or different work was needed, one would benefit hugely from laying out the base oneself. I did that work moderately well for 10 years for customers in the financial and railroad industries…

    And, last I heard, NOBODY can any longer determine just what algorithms, criteria, and processes the AI uses or is governed by. It is so complex that it has become a black box. Now just what kind of power and agency do you feel good about handing over to a black box?

  3. Daze says:

    I suspect that the LLM track for AI is played out. The current versions have scraped the entire web – read “pretty much everything ever written in English”, given the well-documented use of almost all copyrighted ebooks – and there’s nothing significant left to add. From here, it will increasingly be trawling new AI-generated material, and thus just cementing its biases and hallucinations.

    As the New Scientist reported a week or so ago, the LLMs can’t truly distinguish the sentence “this scan shows no evidence of cancer” from “this scan shows evidence of cancer”, the two sentences being 29/31 the same, and the LLM not actually knowing what “no” or “not” means, just the frequency of them being the next word in a sentence.

    My wife got an AI generated report on the “plagiarism count” in a yet-to-be-published scientific article. Half of the quotes she was accused of plagiarising were from articles by herself. Most of the other half had highlighted sentence fragments like “across all age groups … there is some evidence”, which didn’t exactly need acknowledging as sourced from anywhere. None of the instances were real. Clearly no human had looked at the report to check whether it made sense.

    1. Postagoras says:

      Daze makes a great point that the LLMs have already scraped pretty much everything, so there’s not much more input for them.

      Mr. Modesitt makes the great point that he acquired additional “training” by reading many novels. But obviously he didn’t have to read every book, play, poem, scientific paper, newspaper article, and PR garbage to be able to write.

      However, the criticism about the “fair use” defense is a bit off. The defense of the LLM companies is that they are creatively transforming the scraped text. This is the defense that allows Weird Al to write new lyrics and sing them over the existing song.

      1. Ronrythm says:

        Just so you know, Al always asks permission and doesn’t use the music if turned down. That’s why there are no Weird Al versions of Prince songs. He doesn’t just do what’s legal, he does what’s right.

      2. Mayhem says:

        Also, creatively transformative or not, they should be paying royalties to the rights holders of the works being transformed. After all, their entire concept is a derivative work, derived from all the training material.
        Unfortunately since they don’t actually know how their LLM models came up with the results, they argue that they have no duty of care to do that. My response is that they should then have to pay royalties to every rights holder of ingested material for every query, since they have no way to prove it *wasn’t* used either.
