IU's HathiTrust Works with KU on Text Analysis Project

A new text analysis project at Indiana University, which just won a $500,000 grant from the Andrew W. Mellon Foundation, will focus initially on a collection of Black fiction compiled at the University of Kansas. The award will allow experts in IU's HathiTrust Digital Library to create reusable worksets and research models for analyzing digital collections. The purpose of "Scholar-Curated Worksets for Analysis, Reuse & Dissemination" (SCWAReD, pronounced "squared") is to develop new methods for working with digital collections that emphasize content tied to "historically under-resourced and marginalized textual communities," as the campus explained in a press release.

The broad mission of HathiTrust is to provide tools and services that support computational research on a growing collection of digital texts. SCWAReD will combine human expertise with advanced technologies to identify, recover and curate texts by writers "hidden among vast digital library collections."

The Illinois team will be led by Stephen Downie, co-director of HathiTrust and associate dean for research in the School of Information Sciences at the University of Illinois, Urbana-Champaign.

The first model to be produced by SCWAReD will involve a joint enterprise with the University of Kansas' Project on the History of Black Writing (HBW), begun in 1983 by Maryemma Graham, a professor of English. After compiling a dedicated archive of Black fiction, HBW created the Black Book Interactive Project (BBIP) at KU to increase the visibility of and research on Black-authored materials. The BBIP team will work with HathiTrust to produce a workset on the HBW corpus. Results will include an analysis of the texts, generation of "derived data," documentation and a project whitepaper. Graham will also play a role in selecting three other scholar-curated collections, chosen competitively, to be funded under SCWAReD.

"This partnership allows us to realize the original intent of what many call the 'digital turn': an ability to share knowledge more broadly and to advance scholarship through collaborative opportunities enhanced by technology," Graham noted. "Unfortunately, the legacy of racialized practices has followed us and made too much of our knowledge invisible. While it might not be possible to start on a level playing field, we can work together to develop a model for building more inclusive databases and content-specific worksets that derive from them. This is an unusual, but much needed partnership that can be replicated across the digital landscape: We both bring something to the table, we both care about research, and we both care about what a rigorous investigation into a more diverse knowledge network can tell us."

About the Author

Dian Schaffhauser is a former senior contributing editor for 1105 Media's education publications THE Journal, Campus Technology and Spaces4Learning.
