The Independent newspaper had an interesting piece today about a pair of researchers (Penguin UK editor Jodie Archer and Matthew Jockers, associate professor of English at the University of Nebraska-Lincoln) who have spent the last five years using computer algorithms to analyze 20,000 books, looking for the patterns that make bestsellers stand out. The result is a system they claim can predict bestselling books with 86% accuracy, which is pretty good.
Naturally, they’re releasing what they learned in their upcoming book The Bestseller Code (and are probably marketing their software to major publishers as we speak). However, they did release a few interesting tidbits from their research:
“Novels with high or low emotions tend to have a stronger chance of hitting the [bestseller] lists and staying on them.
A couple of pointers from the findings: real people are more appealing to readers than fictional beings, so stay away from dwarves, unicorns, and elves as main protagonists. Those characters who appeal the most are also more likely to “grab”, “think” and “ask”.
The words “need”, “want” and “do” are twice as likely to appear in bestsellers, while the word “okay” appears three times as much. Words like “love” and “miss” appear more often in successful books, apparently appearing three times for every two in lesser-selling books.”
So, basically, people like reading about other people they can relate to, and readers want stories where the main characters are active and pursuing goals (especially relationships). Now, the word “okay” is an interesting bit, and my interpretation of it is that readers like books written in colloquial, easy-to-understand language. It may also be a side effect of most bestsellers being modern thrillers and romance novels, where “okay” turns up in modern dialog a lot.
It will be interesting to see the results of this research, and how far it can go. Of course, the publishers would eventually like to have machines churning out their bestsellers like widgets, but I doubt that will happen anytime soon. It could also mean that a bunch of books which don’t fit the formula will never get the chance to reach a wider audience, because 86% is not 100%, and many good books could fall through the cracks if publishers start using this to cut costs and be lazy.