Imagine spending months carefully scanning century-old manuscripts, uploading every image to your institution’s repository, and breathing a sigh of relief: the collection is “preserved.” Then, a few years later, a platform migration scrambles the order of hundreds of images, backup files go missing, and metadata built with care gets lost in the move.
That is not a hypothetical. It is what happened to the J.P. Morgan Coptic Codices at The Catholic University of America’s Semitics/ICOR Library, and it illustrates one of the most persistent myths in the academic world: that uploading equals preserving.
The myth of “It’s online, so it’s safe”
Nowadays, people who rely on institutional repositories often assume that once a collection is digitized and accessible, its future is secure. And who can blame them? Platforms like JSTOR have become synonymous with permanence. But digitization, the act of converting physical materials into digital files, is only the first step in a much longer, more demanding process.
Digital preservation scholar Paul Conway (2010) draws a clear line between the two: digitization creates a new digital product from a physical object, while digital preservation is the entire “suite of tools, standards, and policies” that protects it for future use. One is a moment; the other is an ongoing commitment.

Madonna and Child, Bybliothecae Pierpont Morgan Codices Coptici Photographice Expressi, vol. 13.1 (manuscript M574: The Book of Hermeneiai with Various Hymns), p. 4. Semitics/ICOR Library, Catholic University of America
Ancient manuscripts, modern problems
In 1910, a group of Coptic manuscripts was discovered near Hamuli, Egypt, at the ruins of the Monastery of Saint Michael. Dating from the ninth to the tenth century, they are among the oldest dated Coptic manuscripts ever found — and their miniatures and bindings represent the earliest known examples of Christian book art in Egypt. The collection is especially rich in biblical texts, most written in the Sahidic dialect of Upper Egypt.
In 1911, Belle Greene, first director of the J.P. Morgan Library, oversaw negotiations to acquire the collection on behalf of J. Pierpont Morgan. She then opened correspondence with Msgr. Henry Hyvernat, a Coptologist and professor at The Catholic University of America, who spent the next 25 years shepherding the manuscripts through restoration, collation, and scholarly documentation. Hyvernat (1912) described them as “the most complete, and from the point of view of ancient Christian art and literature, the most valuable collection of Coptic manuscripts as yet known.”
Morgan, Greene, and Hyvernat agreed on an ambitious plan: restore and rebind every manuscript, collate the full collection, and produce a 56-volume facsimile edition to be distributed to prominent libraries worldwide. By 1922, twelve sets had been printed. Two were reserved — one for the Morgan Library and one for The Catholic University of America — and by 1925, all sets had reached their destinations.
In 2019, the Semitics/ICOR Library digitized the collection and uploaded it to Islandora, CUA’s institutional repository at the time. The materials were now accessible online. The goal, as Dr. Monica Blanchard, curator of the Semitics/ICOR Library, put it, was straightforward but significant: the digitization “provides scholarly access, it gives scholars a close look at how the manuscripts looked at the moment of being photographed, and it preserves them for future use.”
In 2025, CUA migrated its repository to JSTOR Forum (later upgraded to JSTOR Digital Stewardship Services). The transition introduced three compounding problems. First, the volumes themselves were no longer in proper order: without standardized numbering with leading zeros, the system sorted the volume labels as text, producing “Vol. 1, 10, 12, 2” rather than numerical order. Second, and more disruptively, the images within each individual record had also been scrambled: pages that should follow one another in manuscript order were now mixed up, breaking the coherence of every volume. Third, and most concerning, no complete backup of the file set was initially available. In its absence, correcting the records would have required manually reorganizing hundreds of images across dozens of volumes.
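The first of these problems is easy to reproduce. The short Python sketch below is only an illustration, not the procedure used at CUA and not anything from JSTOR’s tooling: the labels are the four from the example above, and the helper names `natural_key` and `zero_pad` are hypothetical. It shows why a plain text sort gives “1, 10, 12, 2” and two common ways around it.

```python
import re

def natural_key(label: str) -> list:
    """Split a label into text and number chunks so numbers compare by value."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", label)]

def zero_pad(label: str, width: int = 2) -> str:
    """Pad the first run of digits in a label, e.g. 'Vol. 2' -> 'Vol. 02'."""
    return re.sub(r"\d+", lambda m: m.group().zfill(width), label, count=1)

# Hypothetical labels, taken from the example in the text.
labels = ["Vol. 2", "Vol. 12", "Vol. 1", "Vol. 10"]

plain = sorted(labels)                       # character-by-character text sort
natural = sorted(labels, key=natural_key)    # numbers compared as numbers
padded = sorted(zero_pad(label) for label in labels)  # text sort after padding

print(plain)
print(natural)
print(padded)
```

Zero-padding changes the labels themselves, so the order survives on any platform that sorts as text. A natural-sort key only helps where you control the sorting code, which is rarely the case inside a hosted repository.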
Curation in action: what recovery actually looks like

What happened at CUA is a known risk in digital curation. The Digital Curation Centre’s lifecycle model maps every stage a collection must pass through: from creation and ingestion, through storage and access, to ongoing transformation and reuse. It exists to show that preservation is not a destination but a cycle of continuous, active work. When a collection moves from one platform to another, what each of those stages produced (the files, their order, their metadata) can shift, and information can be lost in translation. Digital curation scholar Arjun Sabharwal (2024) argues that migration is one of the most fragile points in the entire curation lifecycle, precisely because it always transforms the collection, adding some elements and removing others. For collections with rich, interdependent metadata, even a well-intentioned upgrade can scramble years of careful work.
Fixing the Coptic Codices collection required going back to basics. After a thorough search (reaching across different parts of the library), a complete set of backup files was finally located. With a full set in hand, the decision was made to start fresh: rebuild every record from scratch rather than try to patch a broken structure.
I started by tracking down accurate metadata for each volume and was able to find Msgr. Hyvernat’s original documentation, held at the Morgan Library & Museum in New York. Because his notes were in paragraph form, AI tools were used to structure them into a format compatible with JSTOR’s metadata fields. Researchers in archival science have noted that AI-assisted structuring can surface connections between documents that human catalogers might otherwise miss, but they also stress that every output needs careful human review (Toth et al., 2025). In this case, AI accelerated the process, but curatorial judgment remained essential.
After two weeks of work, the collection was reorganized: volumes in proper order, images correctly sequenced, metadata accurately describing each record rather than the collection in aggregate.
What this means for your campus community
Preservation Week is a good moment to ask a question that rarely comes up in conversations about open access or digital scholarship: who is doing the ongoing work of keeping these collections coherent, accurate, and usable?
Faculty who cite digital collections in their research, students who rely on them for thesis work, and administrators who point to them as evidence of institutional impact — all of them depend on a layer of continuous curation that is largely invisible. File organization, metadata remediation, storage management, migration planning: none of this is glamorous, and very little of it happens automatically.
The Coptic Codices case is a reminder that digital collections are not infrastructure. They are living intellectual resources that require active stewardship. Uploading is where the work begins, but not where it ends.
Footnote
Sebastian Vinueza is a graduate student in the Department of Information Sciences and the Graduate Librarian Pre-Professional (GLP) at the Semitics/ICOR Library. This post draws on both his hands-on curatorial work with the J.P. Morgan Coptic Codices and his research conducted as part of LSC 648: Digital Curation. He thanks Dr. Monica Blanchard, Curator of the Semitics/ICOR Library, for her generous guidance throughout this project.
Bibliography
Conway, P. (2010). Preservation in the Age of Google: Digitization, Digital Preservation, and Dilemmas. The Library Quarterly, 80(1), 61–79. https://doi.org/10.1086/648463
Hyvernat, H. (1912). The J. P. Morgan Collection of Coptic Manuscripts. Journal of Biblical Literature, 31(1), 54–57. http://www.jstor.org/stable/3259990
Sabharwal, A. (2024, April 26). Curation-Migration Cycle: A Preservation-Centered Framework for Institutional Repository Migration [Presentation]. The Southern Miss Institutional Repository Conference. https://aquila.usm.edu/smirc/2024/1/8/
Toth, G. M., Albrecht, R., & Pruski, C. (2025). Explainable AI, LLM, and digitized archival cultural heritage: A case study of the Grand Ducal Archive of the Medici. AI & SOCIETY, 40(6), 4561–4573. https://doi.org/10.1007/s00146-025-02238-5