This report results from a study of eight international projects that have uncovered previously unimagined correlations between social and historical phenomena through computational analysis of large, complex data sets.
How many lifetimes? This question often arose when the authors of this report pondered the extraordinary scale and complexity of research conducted in the Digging into Data Challenge program. Analyzing and extrapolating patterns of meaning from tens of thousands of audio files; nearly 200,000 trial transcripts; millions of spoken words, recorded over many years; and hundreds of thousands of primary and secondary texts in ancient languages would, if undertaken using printed resources and analog materials, have required the lifetimes of generations of scholars.
Because the resources in question were digital, the time required for analysis and discovery was compressed into months, not decades. By choosing to work with very large quantities of digital data and to use the assistance of machines, the Digging into Data Challenge investigators have demarcated a new era, one with the promise of revelatory explorations of our cultural heritage that will lead us to new insights and knowledge, and to a more nuanced and expansive understanding of the human condition.