Success in Books: A Big Data Approach to Bestsellers
Reading remains the preferred leisure activity for most individuals, continuing to offer a unique path to knowledge and learning. As such, books remain an important cultural product, consumed widely. Yet, while over 3 million books are published each year, very few are read widely and fewer than 500 make it to the New York Times bestseller lists. And once there, only a handful of authors can command the lists for more than a few weeks. Here we bring a big data approach to book success by investigating the properties and sales trajectories of bestsellers. We find that book sales follow seasonal patterns, with more books sold during holidays, and that even among bestsellers, fiction books sell more copies than nonfiction books. General fiction and biographies make the list more often than any other genre, and the higher a book's initial place in the rankings, the longer it stays on the list. Looking at patterns characterizing authors, we find that fiction writers are more productive than nonfiction writers, commonly achieving bestseller status with multiple books. Additionally, there is no gender disparity among bestselling fiction authors, but in nonfiction, most bestsellers are written by male authors. Finally, we find that there is a universal pattern to book sales. Using this universality, we introduce a statistical model to explain the time evolution of sales. This model not only reproduces the entire sales trajectory of a book but also predicts the total number of copies it will sell in its lifetime, based on its early sales numbers. The analysis of bestseller characteristics and the discovery of the universal nature of sales patterns, together with its driving forces, are crucial for our understanding of the book industry and, more generally, of how we as a society interact with cultural products.