Resolution to Kindle 2’s Text-to-Speech Issue Benefits All Involved

The paperless office is yet to arrive, but that hasn’t stopped companies from introducing innovative ways of consuming written materials. One such attempt is Amazon.com’s e-book reader, called ‘Kindle’. When it was first released in late 2007, I looked at how purchasers of content for the device may be limited in their ability to re-sell their copy of the text. Version 2 is now out, and its new text-to-speech feature creates additional legal concerns. Although the issue has now been resolved, insights can be gleaned from looking at the legal claims made, and how they were resolved without resorting to litigation.

Authors Guild Executive Director Paul Aiken claims the feature is legally suspect: “They don't have the right to read a book out loud, …That's an audio right, which is derivative under copyright law.” In an interview on gadget blog Engadget, Mr. Aiken provides further details of his concerns:

Well, the legal objections fall in a couple categories. One is the basic copyright objection which I know has been bandied about a lot online, and that objection comes in two parts. There's the unauthorized reproduction of the work which is one claim under copyright law -- for that there has to be fixation of the copy and there's a legal question as to whether or not there's adequate fixation in the Kindle. The second claim is that text-to-speech creates a derivative work, and under most theories of copyright law, there doesn't have to be fixation for there to be a derivative work created.

The general reaction in various blogs has been that both these arguments are tenuous. I will take a look at each of the arguments.

Fixation

As Mr. Aiken states, for federal American copyright protection to attach to a work, the work needs to be “fixed in a tangible medium of expression” (Copyright Act, 17 U.S.C. §102(a) [Copyright Act]). While fixation in an electronic medium clearly gives rise to copyright protection (as is the case for the text on the Kindle or a discrete audiobook recording), the difficulty here is that there is no separate fixation of the audio apart from the text. This is because the text-to-speech functionality translates the text into audio on the fly.

Thus, it was suggested that the Authors Guild was claiming legal rights over the reading aloud of books generally, a claim which has been met with strong resistance. The most extreme response is that the act of parents reading to their children would be made illegal if such a right were recognized.

Mr. Aiken, however, is quick to point out that this is not what he meant. Further on in the Engadget interview, when asked “what's the difference between the Kindle and the average person either reading or performing the book?”, he answers:

As we see it, the difference is, the machine playing an audio version that the publisher often does not have the right to sell. They do not have the multimedia or audio right that goes with the electronic book. As a matter of contract, we have a problem and as a matter of copyright, we have a problem.

From the perspective of fixation, he is arguing that because the device enables the text to be both read and heard, it serves as a “hybrid ebook / audiobook”. Seemingly, he takes the position that the fixation of the digital file will be sufficient for both visual and audio rights.

Derivative Work

With regard to the second of Mr. Aiken’s arguments, he is claiming that even if there isn’t sufficient fixation to constitute another copy, the reading aloud of the text constitutes a derivative work. Under §106(2) of the Copyright Act, the copyright owner has the exclusive right to prepare “derivative works”, which §101 defines as including “any … form in which a work may be recast, transformed, or adapted”. (For example, the Harry Potter movies would be derivative works of the Harry Potter novels.)

While the threshold of creativity required to “recast, transform, or adapt” a work is low, a mechanical reading will likely not meet even this low bar. This is because such a reading contains very little transformative effort. There are no additions to or subtractions from the text, and no editorial liberties taken. Even within the confines of the text, there is no creative effort put into the way the text is read. Unlike a human actor who reads an audiobook, there is no emphasis or inflection in the reading, and no “emotional flavors and subtle real-life cues” (as this article puts it). As such, it seems difficult to argue that a derivative work has been produced through the text-to-speech function.

Amazon Backs Down

Despite the seeming weakness of the Authors Guild’s legal position, Amazon decided this was not a battle it wanted to fight. While still using strong language to emphasize the legality of the text-to-speech functionality, it nevertheless decided to give authors the right to switch the text-to-speech functionality on or off for any works going out to a Kindle.

Implications

In the short term, this will mean that buyers of the Kindle 2 will not have the text-to-speech functionality that was originally promised. Considering Amazon's willingness to drop this feature as an automatic inclusion, it seems likely that the functionality was not too integral to the product.

Nevertheless, text-to-speech software has been around a long time (video link, beginning at 2:52), and the fact that the Authors Guild is raising the issue now seems to indicate that the Kindle 2 achieves a type of technological convergence not seen before. That is, the Kindle 2 is likely the first piece of technology that could potentially threaten the existing audiobook market.

On a greater scale, this series of events reflects the multi-faceted considerations that copyright law is intended to take into account. On the one hand, there is the interest of the authors in not having their legal rights taken away from them. But at the same time, the technological innovation undertaken by Amazon to ease the delivery and consumption of written materials should also be encouraged. Moreover, the benefit of increased accessibility for consumers is a further factor to be taken into account.

In this case, a judicial determination of the proper weight to be afforded to the different considerations has been avoided. This seems to be the best way to resolve the inevitable legal tensions that will arise as technology continues to progress. Instead of resorting to inflammatory rhetoric about the rights of each side, there is likely a common middle ground that could be found. Indeed, when asked in the Engadget interview why they would want to fight technology, Mr. Aiken responded,

We don't want to fight it, we want it to be licensed. We think the lesson from the music industry is to make stuff available at reasonable price, and that's what we want to do. We want to enable this market, but we don't want Amazon to take control of it by default. We think it's something that is rightfully the rights holders' to license.

Ultimately, the end result is that the parties wishing to have this feature for the text will have the opportunity to purchase it, whereas those who don’t ever want to listen to a text-to-speech rendition of their purchase won’t need to pay for it. This is seemingly an amicable solution that takes into account the needs of all those involved without resorting to a lengthy and costly legal battle.