Citation, Appropriation, and Fair Use: News Genius Picks Up Again Where Failures Left Off

As an old man of the Internet, I've seen several waves of "scribble on top of other people's pages" plug-ins and web sites. Anyone remember Third Voice? It was a browser plug-in that first appeared in 1999 and let you annotate other people's sites. It failed. When it shut down in 2001, Wired wrote this about it:

But the seemingly innocuous "sticky notes" gained enemies quicker than users. Launching a grassroots campaign called Say No to TV, some 400 independent Web hosts banded together to gag Third Voice, which they likened to "Web graffiti." Nearly two years later, it would appear that the group got what it wanted. On Monday, Third Voice posted a short message on its site, notifying users that the service had been discontinued.

Spinspotter started up around 2008. It was a toolbar plug-in that let users annotate news articles and press releases (or, really, any web page) to mark up spin. It was an attempt to crowdsource spin and remove it. (I was on its journalist advisory committee. I quite liked the founders.) By 2009, they pivoted to a new model, as GigaOm reported:

…the company switched strategies because [founder John Atcheson said] “we found it hard to get people to mark spin with the quality level necessary, and (b) we saw a much bigger opportunity elsewhere for the technology we’d developed.”

There have been several other similar efforts between 1999 and the present that I don't believe reached even this level of attention. The latest entrant is News Genius, part of the Genius network.

From the day’s biggest stories, to the latest travesties of the fashion world, there’s something for you to dive into. And if you don’t see a headline you like on this page, we invite you to start a conversation literally anywhere else on the web.

My translation from marketing speak is, "We want this to be the equivalent of what Talk Soup once was and Joan Rivers' red carpet commentary crossed with a show like Crossfire, but we're throwing spaghetti at the wall because we have no idea what will stick."

Site Cite

The genius of the original Genius (once called Rap Genius) is that it was planned around a corpus that people know: All the lyrics of every song ever written. It's an extension of handwritten and typewritten projects I used to see as a kid (some of which migrated online), where someone would take a song like "American Pie" and try to decipher the easter eggs, clues, and deeper meanings. Genius performs that task at scale, and sometimes gets rather Talmudic in how deep people go.

Some (many?) songwriters and musical artists embraced Genius, because it provides a platform for deeper analysis. A friend's father, an English professor, assembled and edited a collection of critical essays about a well-known British author a number of years ago. He sent her a copy of the book, and her reply to him, as best as I can remember it, was, "An author's work is most likely to survive their lifetime when it is taken seriously enough to examine this deeply."

Because Genius expanded into literary works and other genres, its extension into news doesn't seem that peculiar. Except when you examine their marketing phrase carefully: literally anywhere else on the web. While they focus on high-level partnerships and the ability for sites to incorporate Genius Web Annotator into their site, that's not necessary. They generally promote this on their main about page, too: "you can annotate most pages without downloading or embedding anything."

Any web page can be annotated and viewed with that annotation at the News Genius site. In an era full of sites that allow griefers and trolls to flourish and plan coordinated attacks (sections of reddit, 4chan, and many lesser-known sites), why anyone would want to resuscitate this notion of arbitrary annotation, I don't know. Then again, it was just yesterday that Microsoft had to abandon its machine-learning Twitter bot after it turned into an anti-Semitic, misogynistic, hate-spewing account within 24 hours; the company had apparently not considered what happens when you put no thought whatsoever into how the unfettered Internet might behave. The best headline on the affair, from the UK's Telegraph:

Microsoft deletes 'teen girl' AI after it became a Hitler-loving sex robot within 24 hours

There's a world of difference between specific corpuses—all lyrics, all books, all news articles—and overlaying annotation on the entire rest of the Internet. Those corpuses are typically all copyrighted (unless old enough to be out of copyright or dedicated to the public domain), and, more importantly, are defended typically by firms that have the resources to employ lawyers who can mediate and negotiate over the use of the source material. (For instance, the Hamilton web site links directly to the Genius annotations, and Lin-Manuel Miranda participates in the markup.)

With that in mind, there's also a huge difference between what I'd separate into citation and appropriation:

  • Citation is the reference to something else, whether it's lyrics, a blog post, someone's public tweet, or a published article online or only in print. Because the commentary in citation links to the original or, at most, reproduces a small part of it, there's a clarity of it having a freestanding purpose and nature.
  • Appropriation reforms someone else's work into a product on which you build. It typically involves either outright violation of copyright or invocation of the fair-use doctrine. Reproduction of the whole in much the same form as the original (or a translation, such as audio into words that are identical) forms the basis.

In American law, fair use provides some guidance on commentary. It's perfectly legal—though you may have to defend it in court—to reproduce 100% of an originally copyrighted item if the case for critique can be made well. 

The original Rap Genius involved appropriation: lyrics were reproduced literally in order to comment on them. However, Genius recognized that its fair-use basis might not be enough, and put a licensing arrangement in place just three years ago. It's not that it was indefensible, but rather that fair use is always tried in court, and it can take years and piles of money without any concrete assurance of winning. (The "Happy Birthday" lawsuit that was just concluded is a famous example, but there are many much quieter ones.)

The Genius Web Annotator is a hybrid of citation and appropriation that neither respects the source's owner nor offers any mechanism to opt out or block it. The site retrieves the original page through a proxy server and then rewrites it with added JavaScript, which lets it overlay its commentary tool. I wrote the company earlier in the week through its general feedback form asking how to opt my sites out. I've received no response so far.
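To make the proxy approach described above concrete, here is a minimal sketch of what "rewriting a page with added JavaScript" can look like. This is not Genius's code, which I haven't seen. The function name inject_overlay and the script URL are hypothetical, and a real proxy would also have to fetch the page and rewrite its links, which this sketch skips.

```python
import re
from html import escape


def inject_overlay(page_html: str, script_url: str) -> str:
    """Return a copy of page_html with a <script> tag for an overlay added.

    Hypothetical illustration only: the tag is placed just before the first
    closing </body> tag, or appended at the end if the page has none.
    """
    # Escape the URL so it is safe inside a double-quoted attribute.
    tag = f'<script src="{escape(script_url, quote=True)}"></script>'
    match = re.search(r"</body\s*>", page_html, flags=re.IGNORECASE)
    if match is None:
        return page_html + tag
    return page_html[:match.start()] + tag + page_html[match.start():]


if __name__ == "__main__":
    original = "<html><body><p>My blog post</p></body></html>"
    # The domain below is a placeholder, not a real annotation service.
    print(inject_overlay(original, "https://annotator.example/overlay.js"))
```

The point of the sketch is that the reader receives the author's entire page from someone else's server, with someone else's code added to it.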

This annotation is probably transformative under the definition of the fair-use doctrine, but it needs the inclusion of the whole to make sense. Genius is offering "criticism, comment, news reporting, teaching, scholarship, and research," which is considered in fair use, and citation or clippings from the original coupled with commentary wouldn't have the same value to Genius. However, reproducing an entire work (such as a complete song, article, or blog entry) often doesn't play out well in court.

Web search engines long ago mostly agreed to honor an opt-out signal, robots.txt, and have used that as a way to knock down legal challenges and ethical ones. If you don't want to have your pages "spidered" (retrieved, indexed, and included in the corpus), the robots.txt file lets you mark pages or an entire site off-limits to every spider or to specific ones.
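For readers who haven't seen the opt-out signal in practice, here is a small sketch using Python's standard urllib.robotparser module. The robots.txt contents and the user-agent name ExampleAnnotatorBot are made up for illustration; as far as I can tell, Genius checks no such file.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt: one named bot is barred from the whole site,
# and every other bot is barred only from /private/.
ROBOTS_TXT = """\
User-agent: ExampleAnnotatorBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    post = "https://example.com/blog/my-post"
    print(is_allowed(ROBOTS_TXT, "ExampleAnnotatorBot", post))
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", post))
```

A well-behaved crawler runs a check like this before it retrieves a page. The file is only a request, which is why it matters that search engines agreed to honor it.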

Genius offers no such tool, nor any opt-out mechanism I can see. They view their annotator as an extension of commentary. In a tweet yesterday, the company's account wrote: "But your blog is public! People can comment on Twitter, Fb etc; Genius is in its simplest form a more efficient tool for this."

Except all those cases that Genius cites are citations; in none of them is appropriation involved.

Sucker Punch Down

Into this environment, my friend Ella Dawson got sucker-punched by News Genius' approach and its editor. Ella is a social-media manager by day, and an advocate for the de-stigmatization of sexually transmitted infections (STIs). Shaming people, mostly women, for STIs is another tool of culture to put every consequence of sexual interaction on women, even when a man is involved in the act. Ella has herpes, isn't shy about it, and is my hero for using her voice to take the heat on the path to helping others.

She writes today about how someone took offense at a post she wrote about journalists referring to people with STIs as "sufferers." (As a survivor of both cancer and a heart blockage, I agree with her. I was a person with cancer; suffering implies a host of not-necessarily-accurate associations, and can be used to condescend or remove the agency of the person afflicted with a disease. With a chronic condition that has few or no side effects with proper care, "sufferer" seems like a huge overstatement, too.)

After some back and forth, Ella blocked this person, who then took to News Genius to annotate the article, noting it may be "punching down a little bit here." The editor of News Genius joined in with snarky and hostile comments. Although Ella had blocked both individuals on Twitter, they linked to her tweets, which is potentially a violation of Twitter's terms of service, and certainly indicates a violation of agency when no political figure or other newsworthy person is involved.

As with many Internet tools created without any forethought about abuse, opting out, and reporting and resolving issues, Genius seems malicious in absence rather than in intention. As Ella wrote:

You can hate-read my content all you want—I know that is a risk of being a person who says things on the Internet. But when you create a tool that pastes commentary directly on top of my work without letting me opt-in and without providing a way for people to turn off the annotation on their pages, you are being irresponsible. You are ignoring the potential your tool has to be abused, and you are not anticipating the real harm your tool can do.

Contrast this with Medium's approach to annotation on Medium's site. Essay authors can receive public or private notes, and choose which to make public and which to keep private or delete. Commentary on a post, called "responses," is presented at the end like comments, but each response is a full-fledged Medium post. (Last year, Medium extended to everyone the ability to keep responses from appearing linked to a post, which was previously limited to certain outlets or required an email request; responses can still be made, they just don't appear at the end of the referenced post.)

It's not a question of attempting to deny people all fora in which to critique her work. It's not a question of whether what she wrote is public or not. My friend Anil Dash linked me yesterday to a very smart essay he wrote in 2014, "What is Public?", in which he examines what that word means in the social-networking era:

Public is not simply defined. Public is not just what can be viewed by others, but a fragile set of social conventions about what behaviors are acceptable and appropriate. There are people determined to profit from expanding and redefining what’s public, working to treat nearly everything we say or do as a public work they can exploit. They may succeed before we even put up a fight.

This no opt-in, no opt-out appropriation by Genius is a taking. It attempts to put a beachhead on one's own words and create a new kind of public work that's beyond the reach of the creator to affect. This is distinctly different than social-media, blog, or other commentary. One would never argue reddit has no right to have comment threads that link to Ella's, my, or anyone's work. (Some companies have tried to make legal cases that every inbound link to a publicly available web page does require an explicit license, but let's ignore that illogic.)

Rather, the distinction between citation and appropriation is not a fine one at all.

How will this play out? Genius will likely continue to put out mealy-mouthed statements until something happens that's too egregious to ignore, or they're sued by an individual or organization that has the wherewithal and interest to pursue it for long enough and with enough money that it prompts Genius to change the parameters of what it does.

Like Third Voice, Spinspotter, and other failed efforts, I expect Genius' "annotate everything" effort to fail as an extension of its main focus, which is valuable and defensible.

See the addenda in the follow-up to this post.

Update! Genius is adding a way to report abuse and Hypothesis is adding nuance and thoughtfulness to redesigning aspects of being able to annotate everything without consent. We'll see how it plays out, but it's a relatively quick response to some aspects of the critique. 


Note: I originally said Genius was using a frame-based approach that let the user's browser load the content and then let Genius overlay it. In fact, as Kevin Marks pointed out to me, it's using a proxy server and serving the contents from its own servers, which is substantially more problematic.