Exploring the Leuven Database of Ancient Books: I. Prelude

Background and Disclaimer

The Leuven Database of Ancient Books (LDAB) is a searchable database of metadata on Greek, Latin, Coptic, Demotic, Syriac and other literary texts. It first came online in 1998 and has been widely used by New Testament scholars. I learned of its existence only a few weeks ago, while reading a scholarly work on the earliest Christian manuscripts.

My interest in LDAB was piqued immediately, for several reasons. First, readers of this blog would know that I’m very interested in classical Greco-Roman and Christian literature. The LDAB opens up the whole wide classical world that I have not yet explored. Second, apart from reading classical texts, I’m also curious about the people behind the texts. The physical characteristics of manuscripts provide many valuable clues to the lives of their users. In the style of Sherlock Holmes, we can draw inferences about the ancients from the artefacts they left behind.

In my own field of scientific research, it is almost obligatory that public databases be versioned and downloadable, in order to facilitate reproducibility and collaboration. So I’m a little disappointed and surprised that LDAB is not downloadable, though it is accessible through a web interface. A web interface is poorly suited to large-scale customized queries and visualization.

I decided to reverse engineer LDAB, just because I have the power, by downloading all 16298 [16559 as of Feb. 14, 2020] pages from their site, extracting the records from the HTML pages, and re-building a database from them. This was done programmatically over the past weekend, and I’m excited to document and share my findings. Needless to say, this is not a rigorous scholarly endeavour. My purpose, apart from satisfying personal curiosity, is to demonstrate, as a proof of concept, how a valuable scholarly resource like the Leuven Database of Ancient Books can be used to inform the general public about ancient manuscripts and the ancient world.
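For readers curious about what "extracting the records from the HTML pages and re-building a database" can look like, here is a minimal sketch using only the Python standard library. It is not the code used for this series. The page layout (one record per page, stored as label/value table rows), the field names and the `fields` table are all assumptions made up for illustration, and the real LDAB markup will differ.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical page layout: one record per page, as label/value table rows.
# The field names below are invented; they are not taken from LDAB.
SAMPLE_PAGE = """
<html><body><table>
<tr><td>Language</td><td>Greek</td></tr>
<tr><td>Material</td><td>papyrus</td></tr>
<tr><td>Bookform</td><td>codex</td></tr>
</table></body></html>
"""


class RecordParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []        # finished rows, each a list of cell strings
        self._row = None      # row currently being read, or None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._in_cell and self._row:
            self._row[-1] += data


def extract_record(html):
    """Turn one saved page into a {label: value} dictionary."""
    parser = RecordParser()
    parser.feed(html)
    parser.close()
    return {r[0].strip(): r[1].strip() for r in parser.rows if len(r) == 2}


def build_database(pages):
    """Load {record_id: html} into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fields (record_id INTEGER, label TEXT, value TEXT)"
    )
    for record_id, html in pages.items():
        record = extract_record(html)
        conn.executemany(
            "INSERT INTO fields VALUES (?, ?, ?)",
            [(record_id, label, value) for label, value in record.items()],
        )
    conn.commit()
    return conn


conn = build_database({1: SAMPLE_PAGE})
print(conn.execute(
    "SELECT value FROM fields WHERE label = 'Material'"
).fetchall())
```

Storing each record as (id, label, value) rows is just one convenient choice when the set of fields is not known in advance; once the fields are understood, a conventional one-column-per-field table makes queries simpler.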

Information technology can definitely advance NT studies, as it has many other fields. Software can analyze large amounts of textual data and find meaningful patterns in a relatively short time, a task that is daunting, if not impossible, for humans.

I’m very grateful to Prof. Willy Clarysse at KU Leuven, the founder of LDAB, for graciously answering my questions about LDAB, generously sharing his knowledge of papyrology, and providing valuable feedback on this blog series.

Historical Overview from LDAB

Ancient Manuscripts by Religion and Century

A picture is truly worth a thousand words, and the figure above is almost self-explanatory. Ancient manuscripts dated to before the 3rd century are predominantly Greco-Roman literature (in yellow), with a significant portion of Jewish texts (in dark orange) between the 2nd century BC and the 1st century AD. The earliest Christian manuscripts (in blue) are dated to the 2nd century, and their number increases almost linearly through the 8th century. The first Islamic text (in red) is dated to the 7th century.
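The counts behind a chart like this boil down to a tally of records per (century, religion) pair. A minimal sketch of that tally follows. The records are invented for illustration and are not LDAB data, and the sketch assumes each record has already been assigned a single signed century (negative for BC), which glosses over manuscripts whose dating spans more than one century.

```python
from collections import Counter

# Made-up records for illustration only; they are not LDAB data.
# Centuries are signed: -2 means 2nd century BC, 2 means 2nd century AD.
records = [
    {"century": -2, "religion": "Jewish"},
    {"century": -2, "religion": "Greco-Roman"},
    {"century": 2, "religion": "Christian"},
    {"century": 2, "religion": "Greco-Roman"},
    {"century": 2, "religion": "Greco-Roman"},
]


def tally(records):
    """Count records for each (century, religion) pair."""
    return Counter((r["century"], r["religion"]) for r in records)


def label(century):
    """Format a signed century for display, e.g. -2 becomes '2 BC'."""
    return f"{abs(century)} {'BC' if century < 0 else 'AD'}"


counts = tally(records)
# Sorting the keys puts BC centuries first, then AD, in chronological order.
for (century, religion), n in sorted(counts.items()):
    print(f"{label(century):>5}  {religion:<12} {n}")
```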

For comparison and quality control, I’ve included a graph comparing the results from the reverse-engineered database (NEMO) with those from the original LDAB. Without applying any filter, both return the same values. However, the NEMO figure above has additional detailed info on each religion, which is missing from the LDAB graph (live version here).

LDAB vs NEMO Comparison
