ODH Lightning Rounds 2019


Hey everybody, I’m Braxton Boren from American
University right here in Washington, D.C. My project is titled, “Hearing Bach’s Music
as Bach Heard it.” I’m very proud that I avoided using a Bach pun in the title, but it could have been Bach to the Future or Bach in the Day if you prefer. J.S. Bach, he was an important composer 1685-1750. He’s very influential for later western composers,
but he basically stayed in the same small area of Germany his entire life. And the last 27 years of his life he spent
in one city, primarily composing for the Thomaskirche, a single church in Leipzig, Germany. The church was drastically altered by the Lutheran Reformation in 1541, which reduced the reverberation time and improved the clarity. Using computational simulation techniques, we can now recreate the acoustic soundscape
of the church as it existed in Bach’s time, that is, in the 1700s, and before the Reformation,
in the 1500s. So this is a key issue in history that we
have visual representation from thousands of years ago in Paleolithic caves, right? We have pictures that people have drawn, but
we don’t have any sounds until the advent of audio recording in the 1800s. So traditionally, historical methods have been limited in how much sound we can recreate, but better computational simulation techniques are changing that. So I was actually just in Leipzig, and essentially the way this works is we take acoustic measurements of the present-day church. I just did that, I just got back, I’m quite jet-lagged. Then, we’re able to calibrate a computer model
based on those measurements in the actual space, that’s happening in the next month
or so. After that, then we work with historians to
alter that computer model to account for these historical changes, to account for the 1500s
and the 1700s, so we have different models we can listen to. Then we’re going to record an entire Bach
Cantata with instrumentalists playing in the virtual church, that is, listening to themselves
at the church in different points in time. So we have the actual feedback effects of
how the space affects the performers themselves, and we simulate that with a technique called convolution to add that reverberation onto those different recordings, so you can hear what Bach’s music sounded like in his own time, and what it would have sounded like had the church never been changed, had it stayed a Catholic church essentially, as it would have sounded in the 1500s, and sort of see whether his music fits as well in a different space rather than the space he was actually composing for. And finally, we are working with a graphic design professor to create a web-based application for listening at different points in both space and time.
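To make the convolution step concrete, here is a minimal sketch in Python, assuming a mono anechoic recording and a mono simulated impulse response; the file names and the SciPy-based approach are illustrative, not the project’s actual pipeline.

```python
# Auralization by convolution: a dry (anechoic) recording convolved with a
# simulated room impulse response yields the recording "as heard" in that room.
import numpy as np
from scipy.io import wavfile
from scipy.signal import fftconvolve

fs_dry, dry = wavfile.read("cantata_dry.wav")          # placeholder: anechoic performance (mono)
fs_ir, ir = wavfile.read("thomaskirche_1700s_ir.wav")  # placeholder: simulated impulse response (mono)
assert fs_dry == fs_ir, "resample so both signals share one sample rate"

dry = dry.astype(np.float64)
ir = ir.astype(np.float64)

wet = fftconvolve(dry, ir)          # adds the church's reverberation to the dry recording
wet /= np.max(np.abs(wet))          # normalize to avoid clipping
wavfile.write("cantata_in_1700s_church.wav", fs_dry, (wet * 32767).astype(np.int16))
```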
Listeners can then move themselves around and listen from different positions in the church, from where different groups would have sat: where the men sat, where the women sat, where the children sat. You can also move yourself through time. We were actually in contact with the Bach
Museum in Leipzig, and they’re very interested in this as well so I’m hoping that we’ll be able to have a sort of a steady rendering of it that will be there at the Bach Museum
in Leipzig as well, and I think I’m under time so I’m just going to finish right there. Thank you. I want to thank NEH for their hospitality,
especially for my visual accommodation. If you have a similar issue, please know they
are very, very hospitable and welcoming to people with that kind of challenge. I also want to thank you all for your inspiring presence. I’m humbled. I’d like to also thank the ancestors of the
Piscataway and Conoy Nations whose land we are on today. I was given permission to share the image
here of Panchay Tall Man, a member of the Board of Advisors for the digital project
for which I’m Project Director, Mapping Indigenous American Cultures and Living Histories. This man, a medicine man for the Murdoch people,
stands here at a site that is not only a national park, but a sacred site. It was the contrast between his knowingness and the lack of knowledge of the public, and the future k-12 teachers that I teach as well
as my interaction with a digital mapmaker that led to this project. Mapping Indigenous American Cultures is an
open-access collaborative digital resource intended as a national prototype, and it will
feature the Federated Graton Rancheria, supported by its chairman as well as other nations,
and it will, I hope, provide a platform for the public face of other indigenous nations. The focus in this project has become teaching
children and tribal children specifically, but it is important to everyone working on this resource that it be straightforward and accessible, and so it’ll be accessible to those who are visually impaired as well. Something to think about with your digital projects. So, while we’re working with our indigenous partners gathering stories, images, and other information, we’ve assembled a portion of the resource. The final version will not be a western cartographic map, but there will be one there. Here you can see features of publicly significant sites in a tab marked Resources that contains links to research, tribal, and educational resources. We also have language here, the Myaamia Tribe included. We’re proud that Routledge recently accepted for publication a text that accompanies and partners with this map. I invite you, if you have a personal collection, to share it with your local tribe. You’ll enjoy the community that that makes
possible for you, and I thank you all. Hi, I’m Dave Hochfelder from University at
Albany SUNY. Our project Picturing Urban Renewal seeks
to make three interventions into the public and academic understanding of urban renewal in the United States. First is that smaller places matter and matter
greatly. Urban renewal affected smaller places much
more dramatically than larger cities. Our second intervention is that place matters. Related to the theme of small cities, this is a national story, a federal program, but it played out locally under very different local circumstances, and this is key to historians’ and the public’s understanding of urban renewal. The third intervention we seek to make is
to make a strong case for including the visual record. We argue that you cannot understand urban renewal in the U.S. without understanding what was lost, and what was planned, and what was eventually built. We’re integrating two projects. We have two NEH grants. One a Digital Projects for the Public grant
that is a deep social history of urban renewal in the city of Albany, New York. I encourage you to Google 98 Acres in Albany and you’ll get to our WordPress blog where you’ll see some of our preliminary results. And then this project, Picturing Urban Renewal, will include Albany as one of 4 cities that we’re looking at in New York State including Kingston, Newburgh, and the Stuyvesant Town project in Manhattan. This is what we have the Digital Humanities
Advancement Grant to further. This project is both publicly engaged, we
consider ourselves public historians first and foremost, but it is also scholarly facing
as well. To that end, we seek to use a variety of visualization
techniques and tools including timelines, look-arounds, magic lenses, a whole range
of tools. Here you see a mockup of what the timeline tool will look like that integrates movement through time and the visual record and space. This is several points in time and representative of the 4 sites that we’re looking at. So again, I encourage you to go to 98acresinalbany.wordpress.com for our Albany work and to stay tuned for more from Picturing Urban Renewal. Thank you! Hi, I’m Peter Logan from Temple University. This is Jane Greenberg from Drexel University. How did knowledge change in the 19th century? That’s the question that we’re asking with
this project, and we’re narrowing it down a little bit by focusing on the English-speaking world instead of the world as a whole. We’re also using historical editions of the
Encyclopedia Britannica as a proxy for all knowledge at the time. It was, after all, the reference work of record at the time and its authors were the best-known scholars of their day. On the other hand, its articles are obviously,
clearly biased, deeply biased with Victorian prejudices, but that was how knowledge as
such was constituted in the 19th century, that’s what counted. And what we want to do is use computational analysis to take a look at how and when concepts became classed as knowledge, when they disappeared as knowledge, and that way help us to better understand how the social construction of
knowledge changes over time. We narrowed the scope of the project to four major editions. Excellent page images already exist, but the textual data was too poor to actually use, so we developed a semi-automated workflow to recognize the text that’s giving us an accuracy rate of 99.5% or better at this point. All told, we’re using this process to generate text for 81 volumes averaging 1,000 pages each, and we have completed about 25% of the work so far. We use Python to convert the text into standards-compliant TEI with one entry per file. When complete, the data set will include 100,000 entries totaling 100,000,000 words, and it’ll be freely available in several online repositories. We’ll mine the data to get a sense about changes over time in linguistic density and concept migration.
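As an illustration of the Python-to-TEI step just described, here is a hedged sketch that wraps one corrected entry as a standalone, minimally valid TEI file; the entry fields, element choices, and file name are placeholders rather than the project’s actual schema.

```python
# Wrap a single OCR-corrected encyclopedia entry as its own TEI XML file.
from lxml import etree

TEI_NS = "http://www.tei-c.org/ns/1.0"

def tag(name):
    return "{%s}%s" % (TEI_NS, name)

def entry_to_tei(headword, edition, text):
    tei = etree.Element(tag("TEI"), nsmap={None: TEI_NS})
    file_desc = etree.SubElement(etree.SubElement(tei, tag("teiHeader")), tag("fileDesc"))
    title_stmt = etree.SubElement(file_desc, tag("titleStmt"))
    etree.SubElement(title_stmt, tag("title")).text = (
        "%s (Encyclopedia Britannica, %s edition)" % (headword, edition))
    # publicationStmt and sourceDesc are required by the TEI schema.
    etree.SubElement(etree.SubElement(file_desc, tag("publicationStmt")), tag("p")).text = "Unpublished draft."
    etree.SubElement(etree.SubElement(file_desc, tag("sourceDesc")), tag("p")).text = "Corrected OCR of page images."
    body = etree.SubElement(etree.SubElement(tei, tag("text")), tag("body"))
    entry = etree.SubElement(body, tag("div"), type="entry", n=headword)
    etree.SubElement(entry, tag("p")).text = text
    return etree.tostring(tei, pretty_print=True, xml_declaration=True, encoding="UTF-8")

# Placeholder headword and text for illustration only.
with open("ABACUS_7th.xml", "wb") as f:
    f.write(entry_to_tei("ABACUS", "7th", "ABACUS, an instrument for performing arithmetical computations..."))
```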
But much of what we want to do is a comparative analysis, and all of that depends on generating good metadata for these files, so I’m going to ask Jane to talk to you about that. Alright, we’re working with a tool called
HIVE, which stands for Helping Interdisciplinary Vocabulary Engineering. It’s a natural language processing extraction tool. Actually, it was originally funded by IMLS. What it does, is it extracts terms and then
matches them against controlled vocabulary, so we’re working with the Library of Congress subject headings. We are now working on measuring the sizes
of the entries and determining what is the best number of controlled concepts that should
be selected for the size of the entry and then those are being automatically ported
into TEI records, so thank you. Thank you.
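HIVE’s own interface is not shown here; as a hedged illustration of the general idea, this sketch extracts candidate terms from an entry, matches them against a tiny made-up stand-in for the Library of Congress Subject Headings, and scales the number of kept concepts to the entry’s length.

```python
# Illustrative only: not HIVE's API. Match candidate terms from an entry
# against a controlled vocabulary, keeping more concepts for longer entries.
import re
from collections import Counter
from difflib import get_close_matches

# A made-up, tiny stand-in for the Library of Congress Subject Headings.
CONTROLLED_VOCAB = ["Astronomy", "Botany", "Chemistry", "Electricity",
                    "Geology", "Navigation", "Political economy"]

def suggest_concepts(entry_text, words_per_concept=200, max_concepts=10):
    tokens = re.findall(r"[A-Za-z]+", entry_text.lower())
    counts = Counter(t for t in tokens if len(t) > 3)
    # Longer entries get more controlled concepts, up to a cap.
    n_concepts = min(max_concepts, max(1, len(tokens) // words_per_concept))
    matched = []
    for term, _ in counts.most_common():
        hit = get_close_matches(term.title(), CONTROLLED_VOCAB, n=1, cutoff=0.8)
        if hit and hit[0] not in matched:
            matched.append(hit[0])
        if len(matched) == n_concepts:
            break
    return matched

print(suggest_concepts("The science of astronomy treats of the heavenly bodies... " * 20))
```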
Hello, my name is Scott Robertson. I’m from the Georgia Institute of Technology, and our project is The Digital Drawer: A Crowd-Sourced, Curated, Digital Archive Preserving Memory and History. My collaborators include folks from Emory University, the Emory Center for Digital Scholarship, and a fairly new non-profit called The Historic Rural Churches of Georgia.
for this project. So, HRCGA has been for the past two or three years, hand-collecting a lot of materials including oral histories, historic photographs,
scanned records, all about historic rural churches in Georgia, so churches that are
over 100 years old in rural counties. And they are collecting them, curating them, it’s a two-person job at the moment, and we have successfully landed this project to essentially build them a real digital archive on top of Omeka S, which we’ve decided to leverage rather than building this from the ground up ourselves. And, importantly, to build a public-facing content ingest engine, a web and/or mobile app on top of Omeka, which is designed specifically
for a target demographic, people interested in this project. HRCGA has been fortunate to have a fairly
large social media presence. They have 40,000 to 50,000 active subscribers who’ve been so far generating content for this project and following them on social
media, and we have some pretty good demographics. These are older adults, over 65, primarily
women, living in rural counties, not particularly technically savvy and possibly with disabilities so we want to design a tool specifically for this population, and that doesn’t just mean
being accessible. It means paying particular attention to the
usability of our tool set, so we have engaged in a fairly detailed and ongoing participatory design process, where we really seek to engage our stakeholders, our users, and other stakeholders in the design process, so they become designers with us, and that can involve, and has involved, everything from focus groups to in-depth prototyping interviews with individual users, kind of
think aloud sessions, and we’ve come up with a good set of findings and initial user interface and usability, user experience design for creating this ingest tool. Some of the findings are: they’re all familiar
with common tools like Facebook and Ancestry, but they have deficits, they have confusing
problems, so, okay, onward! Read the right paper eventually. Hi, I’m Taylor Arnold from the University
of Richmond, and I’m Lauren Tilton, also at the University of Richmond, and we’re going to talk a little bit about the Distant Viewing project. So, our question is how might we analyze images at large scale? How might we develop computer vision algorithms, specifically machine learning techniques, that are informed by and for the Humanities? And how might a humanities scholar and a statistician, along with a series of collaborators, open up new ways to study images as well as increase access to our moving image archives? That is the goal behind the Distant Viewing toolkit, and the work that we are doing at the Distant Viewing Lab at Richmond. So what is the toolkit? We’re building an open source software
library that is able to ingest still and moving images and output structured metadata that captures time-coded shot breaks, scene breaks, dominant colors, faces, mood, and anything we can think of to make it work, and we’re developing it out in the open so you can go
to our website and actually see it in development as we commit things every day. And as I was saying, this is a pipeline of annotators that can analyze different features and objects within moving images or still images.
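As a hedged illustration of what one such annotator might do, here is a minimal dominant-color extractor for a single frame; it is not the Distant Viewing Toolkit’s own API, and the frame file name is a placeholder.

```python
# Illustrative "dominant color" annotator for one video frame or still image.
import numpy as np
from PIL import Image
from sklearn.cluster import KMeans

def dominant_colors(image_path, n_colors=5):
    img = Image.open(image_path).convert("RGB").resize((160, 90))
    pixels = np.asarray(img).reshape(-1, 3)
    km = KMeans(n_clusters=n_colors, n_init=10, random_state=0).fit(pixels)
    # Order colors by how much of the frame they cover.
    counts = np.bincount(km.labels_, minlength=n_colors)
    order = np.argsort(counts)[::-1]
    return [{"rgb": km.cluster_centers_[i].round().astype(int).tolist(),
             "share": float(counts[i] / counts.sum())} for i in order]

print(dominant_colors("frame_0001.png"))  # placeholder frame extracted from a video
```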
And, um, one of the things that this is a part of, in terms of media studies and film studies, is thinking about how formal analysis
can open up questions about culture and the messages that images and visual culture send to people, as well as thinking about how we can actually use this method to open up semantic metadata about moving image archives. Hundreds of thousands of hours sit in our
archives, and in collaboration with people like the Media Ecology project, and Mark Williams, we are looking at how we can actually use computer vision techniques to generate that semantic metadata to increase discovery and access. So, we have these dual pieces. How do we open up new questions and lines of inquiry specifically towards media studies, cultural studies, and film studies, as well
as how we think about this tool set being used across archives in other institutions
alongside of that. So, part of this then is the algorithmic approach to developing and structuring metadata. So, what does that look like? This is a fun example. But we are actually then using these computer
vision techniques, training them with historical commitments and ethical commitments along
the way, things we are not outside of. Thinking about how we might be able to do
this at large scale. One season of a TV show for example is almost a terabyte of data. So, we’re not only thinking about this for
sitcoms, but early cinema, public television, and other collections that we can open up
in new ways. So, we’re also partnering with American Archives of Public Broadcasting, and other places. So, we would love to collaborate and make
this toolkit work with your collections and think about questions we haven’t even thought
about yet, thank you. Pleasure meeting everyone today. My name is Mo from University of South Carolina, from Mass Communication Department, and here’s my collaborator Chin-Tser from Computer Science Department, and our team includes linguistic scholars as well. So, the point of the cartoon here is, what
happens when we spread words? When we talk about gossip, scandals, and rumors? I’ll answer myself because of the time limit. The content changes, right? Information changes. Sometimes the tone of the information changes, sometimes the nuances. Sometimes it’s just the structure, the structure
of the sentences change, right? So those changes are something that we’re
going to look at. But, previously it’s really hard to capture
those changes, right? Because those conversations take place in
informal, interpersonal contexts, right? But now we have social media conversations, which we call digital trace data, a digital footprint, so we can document and then observe those changes over time, over sharing and spreading processes. And we chose the topic because there are a lot of concerns about misinformation, so one of the topics we chose is fake news. Rumors generally get deformed as they travel. So our first step is to map out the evolution tree analysis. It is based on textual string distance and time-stamped data, mapping the evolution tree to find out which tweet is the original content and which are its children versions. So, this is one of the examples we chose, one of the fake news stories around the 2016 election, that Trump was born in Pakistan. So, this is the evolution tree: the node at the top represents the original tweet, and then the children, the different nodes, represent the different versions of those tweets. We haven’t looked at those individual trees qualitatively, so we don’t know what’s going on for sure, but it could be just a retweet, or it could be some linguistic changes, or it could be some authentication or de-authentication of the fake news. Somebody could argue against the fake news, or somebody could express some skepticism about it. So, we’re going to look at how that fake news could evolve over time over those sharing processes. That’s what we’re going to do. Thank you.
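A simplified sketch of the evolution-tree idea: attach each tweet to the most textually similar earlier tweet, so the root is the original claim and the children are its altered versions. The sample tweets and the similarity measure are invented for illustration, not the project’s actual data or method.

```python
# Build a simple "evolution tree" from time-stamped tweets using string similarity.
from difflib import SequenceMatcher

tweets = [  # (timestamp, text), already sorted by time; invented examples
    (1, "BREAKING: Trump was born in Pakistan"),
    (2, "BREAKING: Trump was born in Pakistan, sources say"),
    (3, "No, Trump was NOT born in Pakistan, this is fake"),
    (4, "sources say Trump was born in Pakistan??"),
]

def similarity(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def build_tree(tweets):
    parents = {}  # child index -> parent index
    for i in range(1, len(tweets)):
        # Link each tweet to the most similar tweet that appeared earlier in time.
        best = max(range(i), key=lambda j: similarity(tweets[i][1], tweets[j][1]))
        parents[i] = best
    return parents

for child, parent in build_tree(tweets).items():
    print(f"tweet {tweets[child][0]} derives from tweet {tweets[parent][0]}")
```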
Hello, thank you all, and thank you to the NEH. Thank you! My name is Renee Gondek and this is Ethan Gruber, and we are part of Kerameikos.org, which is a linked open data project that aims to standardize the ontologies for studying Greek vases as well as disseminating that information. Why Greek pottery? Pottery in general
is virtually indestructible and can be found at most archaeological sites. It is also small and portable, so you can find Greek pots at many museums around the world. For this particular vase here, you can see there’s a caption. You can find a lot of information related
to this type of object. Number one, the region: Attic, referring to the region around Athens, essentially Athenian. Then the type of technique, like red-figure, which
can give us chronological as well as geographical parameters. Calyx crater, which refers to the type of
vessel shape, so it gives us information related to daily life practices or rituals. Painters or potters, like Euphronios, give us visual information related to mythological subjects, like the death of Sarpedon you see here, or even daily-life and religious practices. And there is the provenance, when it is known. Sadly, a lot of vases do not have that information, but when provenance is known we can learn a lot about trade practices as well as information related to archaeological sites. This particular vase was actually found in
an Etruscan tomb in Italy, like other Athenian 6th and later 5th century B.C. vases. So how might one actually find more information
about this particular vase, especially if you are using a different language with a
different script like Russian or Japanese? So, the artist of this particular vase is
Euphronios, and Euphronios is a Greek person, and his name is written in Greek, in Greek
script, but in English literature his name is transliterated into the Latin alphabet, and regardless of the language, whether French or German, there may be slight variations in spelling across different fields of research. It has certainly been transliterated into Arabic, and into Cyrillic for Russian
scholarship, but fundamentally we all agree that this artist is the same person regardless of the name of that individual in a particular language. And so, what we seek to do is follow linked open data techniques for establishing a permanent identifier on the web, a URI, for that entity. An important aspect is to link that person to other people, which will facilitate aggregation of different museum collections. Thanks.
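A hedged sketch of the linked open data idea: one URI for the painter Euphronios, with labels in several scripts and a link to the same person in another authority file. The URIs are illustrative placeholders, not Kerameikos.org’s actual identifiers.

```python
# A tiny RDF graph giving one entity a stable URI, multilingual labels,
# and a cross-link to another authority file.
from rdflib import Graph, Namespace, URIRef, Literal
from rdflib.namespace import SKOS, RDF

g = Graph()
EX = Namespace("http://example.org/id/")  # placeholder namespace, not a real authority

euphronios = EX.euphronios
g.add((euphronios, RDF.type, SKOS.Concept))
g.add((euphronios, SKOS.prefLabel, Literal("Euphronios", lang="en")))
g.add((euphronios, SKOS.altLabel, Literal("Εὐφρόνιος", lang="grc")))
g.add((euphronios, SKOS.altLabel, Literal("Евфроний", lang="ru")))
# Linking to the same person in another (hypothetical) authority file is what
# lets aggregators join museum collections together.
g.add((euphronios, SKOS.exactMatch, URIRef("http://example.org/other-authority/euphronios")))

print(g.serialize(format="turtle"))
```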
Hi, I’m Della Pollock. I’m the Executive Director of The Marian Cheek Jackson Center for Saving and Making History in Chapel Hill, North Carolina. We’re a small non-profit dedicated to honoring, renewing, and building community in the historic black neighborhoods in Chapel Hill, neighborhoods that are collectively known as Northside. Northside, similar to many, many communities around public universities, many of them represented here, was settled as a labor enclave for the University of North Carolina. Its residents built the stone walls that surround the campus and hauled water from the iconic Old Well to the students in the dorms. And, for instance, through the 1970s they comprised almost 100% of its food service corps. Today, their descendants, fourth, fifth, sixth
generation residents are at risk of losing their homes to skyrocketing property taxes,
and patterns of predatory investment that result in patterns like this, which just show
the change from 2000 to 2011. At the Jackson Center, we are actually cutting land loss, and building affordability through a major land-banking initiative, through tax
mitigation, critical home repair programs that keep the people safe in their homes,
but none of this works without history. As one of our neighbors said, “History is
the spiritual glue of this community.” So the question for us becomes, not only how do we control the dirt, but how do we renew the glue? How do we mobilize the past in order to preserve the future of Northside? Our project is the Northside Digital Commons. It will model one way that communities like
Northside can pursue a community-driven digital historiography to continue to connect, grow, and prosper. It will transform the digital repository of
over 150 oral histories that the Jackson Center currently holds, and over 1500 images into
a vibrant public square. Led by a community review board, it will replenish the loss of shared public place with virtual space and renew multi-generational continuity through a youth-ready design that is ready to be implemented in our schools program, which extends across three districts. It hails unique and largely unheard histories
of civil rights and labor leadership. It will connect neighbors who are both local
and displaced, but continue to identify as neighbors. It emphasizes a user methodology consistent with goals of community, self-determination, and sustainability, and it supports and shares out the vision, the values, and the exceptional strengths that continue to define Northside community. Thank you. Hi, my name is Scott Graham from the University of Texas, at Austin. This is Dave Clark from the University of Wisconsin, Milwaukee. The primary goal of the Transparency to Visibility project is to create an integrated methodology and toolkit that will allow humanistic researchers with no prior coding skills and no interest in developing them, the ability to transform natural language into relational network diagrams. As somebody who is interested in science, technology, and medicine studies, we are currently building this on the backbone of a data set of 34 million biomedical research articles indexed with PubMed and their conflicts of interest disclosure statements at the footer of the article, you know, these that say, “the researchers got so and so money from Merck or Pfizer, etc.” What we’re doing is creating this toolkit that will allow us to read all of these conflict of interest statements and then render them as large scale network diagrams. And ultimately we hope to be able to build this out as something that other people can use as Scott suggested, so part of our method needs to be developing tool sets and integrating them together in ways that will work not just for this data set, but for additional data sets in the future with other people we’re working with. So every time we find ourselves like going down the road of hardcoding something that’s particular to our solution, we have to take a step back and look at ways that are more open and more global in the approach. This is a good data set, however, for this kind of project because one of the things that’s true about these 34 million data pieces is that they are frustratingly inconsistent across the many, many medical journals. So you’ll get author names where it’s just the initials, or the initial and then the last name. You’ll get the company name in like three or four different ways. And this is a problem that’s going to occur in any kind of humanistic text-driven data set. And so we’re really ramping up some good solutions to standardizing that data set in a way that will really allow us to build out the relationships in a very clear, cut fashion. The current implementation of this puts the data into an R Shiny format where users can manipulate the data, apply different visualization algorithms and filter based on content areas of interest. So, the current set up actually allows users to search via PubMed for their medical areas of interest or disease areas of interest and then automatically filters the data set so that you can see this is the representation of the funding networks that surround Leukemia at the time of this data snapshot. Again, the end result of course is to take it away from this very biomedical, focused application to a toolkit that can be applied to all sorts of different humanistic research endeavors. Thank you. The mission of National Breath of Life is to work with endangered language communities to build capacity around the archival sources for community-directed revitalization efforts. This is accomplished through a series of workshops that provide community researchers with the necessary tools and training for archive-based revitalization efforts. 
Since 2011, National Breath of Life has served 117 indigenous researchers representing 55 different language communities. National Breath of Life has been funded in part by the National Science Foundation, Documenting Endangered Languages Program, and more recently with support from National Endowment for the Humanities. Dormant languages that have lost their speakers, as was the case for my language, Myaamia, require rigorous linguistic analysis of archival documentation for the reconstruction and revitalization. At the center of our current project is the Indigenous Languages Digital Archive, or ILDA. ILDA is specially designed software that originated from an early 2012 NEH-supported project designed to address the critical archival needs for the Myaamia language revitizialtion efforts. Due to the software’s success within the Myaamia effort, ILDA has undergone continued modifications so that it may be shared more broadly through National Breath of Life. ILDA is the only available software that allows for the organization, storage, and retrieval of digital surrogates of linguistic archival materials, and directly links independent data derived from linguistic analysis, to the original manuscript pages. So why invest in revitalization of small minority languages? Linguistic diversity correlates with a diversity of knowledge systems, histories of human survival, and aspects of human cognition. The vitality and use of a native language within a community has been shown to have statistically valid correlates with community health and well-being. Our understanding of the relationship between language vitality, cultural stability, and communal health is evolving, and there is little research in this specific area. Minority language revitalization is forcing us to examine more closely, and to rethink more broadly the importance of this work for not just the communities who seek to preserve their cultural heritage, but in preserving diverse knowledge systems, that are most efficiently and effectively expressed through these languages. I’m Anne Knowles. This is Anika Walke. Jewish ghettos have been understood for decades as crucially important places in the history of the Holocaust. Since the fall of the Soviet Union, and the opening of archives across Eastern Europe, scholars have been able to document hundreds, even thousands, of ghettos, beyond the large, relatively familiar urban ghettos, such as Warsaw and Krakov in Poland, or Vilnius in Lithuania. These two maps reflect the emergence of new scholarship about ghettos in the East. The black and white map on your right shows the approximately 1,100 ghettos included in the U.S. Holocaust Memorial Museum’s Encyclopedia of ghettos in German-occupied Eastern Europe, the main source for our project. We are extracting information from the encyclopedia entries, into related tables in a historical GIS. Scholars have found many significant differences among ghettos, depending on dozens of factors, for instance, the number of Jews living in a given region, changing Nazi policies, the arrival of German forces, proximity to the front, local administrations, or the whims of military and civil officials, the nature of the built landscape, incidence of disease, or armed resistance, and many more. Our project therefore has three broad goals. It’ll be the first comprehensive comparative analysis of ghettos in space and time. 
By capturing many of the key features of ghettos, the historical GIS will support iterative, exploratory mapping within and between regions, reveal patterns of forced movement, labor, mass murder, and other key events that together constituted the Holocaust in this part of Europe. Secondly, our place-based model will situate victim experience in relation to perpetrator actions. These two groups of historical actors have generally been studied somewhat separately, and Holocaust scholars have been seeking ways to integrate them for a long time. Our project will use geography to achieve this integrated history of the Holocaust in Eastern Europe. Place is also very prominent in the hundreds of survivor interviews that form our second major source. Some of them can be mapped quite precisely, including ghettos, camps, cities, and towns. To link these places to the historical GIS of ghettos, we will use geoparsing, and a Holocaust gazetteer of coordinate locations. Other places that matter to individuals, however, cannot be mapped in conventional terms, but they too belong in a full geography of ghettoization. We will use close reading to identify and study these places of Holocaust experience. Our final goal is to develop new modes of visualization that will bring into registration the many kinds of place, experience, and events related to the Holocaust, and help us gain a full understanding of the genocide. Thank you. Good afternoon. My name is André DeTienne, and I am a director at the First Edition project at Indiana University, Indianapolis. NEH-funded project is code named STEP, for Scholarly Text Editing Platform. STEP is to be a cloud-based online production solution available to any scholarly edition project for the documentary or critical. It aims to reproduce online the entire workflow of a scholarly edition from initial transcription to final layout. This will make it far easier for lead editors to distribute all kinds of tasks to collaborators anywhere in the world. The first slide demonstrates STEP’s comprehensive workflow, one that can be adapted to the needs of all sorts of editions. The central design principle rests on the desk metaphor, a series of places customized to fit the professional needs of the specialized occupants. Each desk allows documents to be created or received from some other source to be encoded, corrected, or edited, to undergo round of proofreadings and corrections. Some desks have access to a customized TEI XML encoding interface that facilitates the internal tagging of text as much as possible. Take for example a transcription desk, where everything begins and provides transcribers with a TEI XML compliant interface that specializes in the act of manuscript transcription. STEP also comes with an additional powerful tool to make the work of transcribers easier with stand-alone open source software called STEP transcriptor and descriptor. Other desks include the editing desk which allows editors to emend off of your text and have the system create TEI compliant list of emendations. There are also the annotations desk that assist and monitor the work of scholarly annotation to the text, the apparatus desk that consolidates all of the components of the textual apparatus, the front-matter desk that consolidates all the documents that precedes the authors’ body text, such as a table of contents, preface, and introduction. 
The back matter desk, that consolidates everything that follows the body text such as annotations, apparatus and index, and the layout desk that consolidates all the type-setting work. The second slide shows you one of the prototype interfaces in the desk flow monitor, which lets members of an editorial team to survey the stages of progress undergone by any document within any given desk, and figure out what is the next stage of work for any document and to whom it is to be assigned. The last slide shows one of the many interfaces in the marvelous software called STEP TEI Header Maker. TEI refers to Text Encoding Initiative. Every TEI compliant document must start with a TEI header, a structured XML repository of text-related metadata. A TEI header can be very complex. The TEI Header Maker helps editors create such headers without technical anxiety. Thank you all. Hi. I’m Amir Zeldes from Georgetown University, and I want to talk about Coptic, which is the language of Hellenistic Egypt in the first millennium, and also the human language with the longest continuous documentation on earth, together with its ancestor Ancient Egyptian, but sadly for many years, chronically understudied relative to its importance. So, for example, if you expect you can probably read The Odyssey in Greek tomorrow, you decide you want to read it, you probably all expect you can find it with clickable translations, and analysis, and everything, and you would be right. You wouldn’t be right for a lot of Coptic classical works despite the importance of Coptic for the development of Christianity, Monasticism, and generally the ancient world. So my colleagues and I in a series of projects that were largely funded by the NEH, also the German DFG and some other foundations have taken it upon ourselves to try to improve the situation and put as much Coptic online as we can. And through these previous projects we’ve arrived at a kind of critical mass of what I’d call gold annotated data, so data sets that we’ve curated by hand, which form what we think of as a standard representation for the kind of Coptic that we want to make searchable online. Primarily for the Sahidic dialect. For those classicists amongst you, that’s a bit like Attic Greek. So in this current project phase we’re looking to scale the project in multiple directions, one of them is up, so having more text, mainly through the use of automatic tools. One of them is across, to handle more heterogenous data then we’ve done before, and one of them is out, which is to connect with other projects that are currently emerging simultaneously, especially in areas where we have less expertise such as geographical resources. The current phase of the project is focusing on machine learning tools for Coptic thanks to the existence of the data that we developed
in previous phases. The goal is to have any kind of Coptic in no matter what standard come in, and this uniform representation that we’ve developed to come out, so that users can expect similar or comparable results for data coming from different resources. This is actually quite difficult because different scholarly editions have used different standards even for just separating what constitutes a word in Coptic, or different orthography. And currently we’re working on a pilot to homogenize all of our biblical data coming from existing projects. This summer we’ll work on other legacy digital data that was developed again in differing heterogenous standards. And next summer we’re looking to include OCR input as well. And all of this means developing tools that can kind of squint at the data and say, “Well, this looks a bit like the following,” and then use our gold standard representation to create our kind of data, the data that will be uniform for users. For those of you who are not super interested in Coptic, I’d like to say that our tools are very flexible and also open source. If you have languages with similar challenges, for example, we were very pleasantly surprised in a paper this fall when we discovered that our system actually had the best performance documented for segmenting Hebrew, so it seems that the tools are actually applicable to other languages as well. Finally, and more specifically, our current plans are to broaden our data sets to include more genres that we haven’t had as much before such as Saints’ lives. Getting data for syntactically analyzed data sets, which will force people competing on analyzing syntax to include Coptic in their tools and is also named entity recognition. Thank you. Alright, I am Jessica Otis and I am presenting on World History Commons, a grant to revitalize and expand the World History Matters website. It’s an award winning NEH-funded collection of world history websites that have been developed at the Roy Rosensweig Center for History and New Media over the past 20 years. Many of these sites were created in the early days of the internet when hand-coded databases and HTML were the norm, and some rely on depreciated technologies, we can all wince in unison, Flash! Their functionality and aesthetic are of the
early internet days as you can see on the slide, while their target audience is teenagers and young adults who have grown up with the web 2.0 But despite these limitations, the World History Matters websites remain immensely popular. They received over 3 million unique visitors, and just shy of 20 million-page views in the last 12 months. These numbers are calculated across the 8 content-based websites and the 9th landing page website. You can see on this slide, the World History Matters site, that’s the landing page, its purpose is to link the other 8 sites together. Just briefly going over them, the Amboyna project focuses on a 17th century conspiracy trial. The Liberty, Equality, and Fraternity, and Imaging the French Revolution sites examine the French Revolution. The latter specifically focusing on images of crowds and crowd violence. Gulag presents an in depth look at life in the Soviet Gulag between 1917 and 1988, while Making the History of 1989 explores the fall of Communism in Eastern Europe. Children & Youth in History and Women in World History are wide ranging across space and time, but thematically focus on the lives of children and women respectively, while world history sources was our original world history website. Now while each of these individual sites is an amazing collection of world history resources, the World History Matters landing page is just that, it’s a landing page. The underlying databases on these sites are not unified in any way, so for example if you wanted to search for material on women in the French Revolution, you’d have to run individual searches across three different websites. So instead of just updating the aging infrastructure, and content of each individual site, we’re creating one new, singular, unified site, World History Commons. This will be a unified resource for scholars, educators, and students interested in world history. You can see here we’ve already begun some design work on the site, which is being completely redone in Drupal with dramatically expanded content and updated media that’s compatible with current computing standards, aka, the iPhone. But the biggest takeaway I’d like to leave you with is that like many digital humanities projects, this is a massive team effort that I am just the front for. Alright, this is co-sponsored by the World History Association, and the Roy Rosensweig Center for History & New Media. Our core project team is listed on this slide, and again, this is just the tip of the iceberg because World History Commons unifies the work of over a hundred scholars, and it is our honor to ensure that this work is accessible for a new generation of world history students. Thank you. Hello, I’m Rebecca Salzer from the University of Alabama. Dance expresses and communicates history and culture. One of the great challenges for dance scholarship and education has been its intangibility. Despite the existence of several dance notation systems, dance remains largely an oral tradition, transferred from teacher to student, and performer to performer. Film and video recording technologies have revolutionized dance education and scholarship, and recorded dance now serves as a version of dance text for analysis, preservation, and transmission. Despite this, online dance resources today are difficult to find, and when found, are often in excerpt form. This for an educator or scholar is like studying the first few measures of a piece of music, the first few lines of a poem, or the corner of a painting. 
The need in the field is so vast, that there are many possible first steps toward filling it. Over the last two years, I have identified a working group of dance scholars, educators, archivists, legal, and technical experts. We’ve been in close communication with subsets of the larger group presenting and gathering feedback at national conferences. This Level 1 Digital Humanities Advancement Grant, will allow the entire group to convene for three days in May at the University of Alabama. Consolidating the information we’ve gathered into a detailed blueprint for either a new online dance resource, or a way to aggregate and enhance existing resources for the purposes of scholarship and research. Following our May meeting, the group, whose members you see listed here, it’s a quite a considerable number of people working together, will publish a white paper obviously, and we’ll host sharing sessions in New York, San Francisco, Chicago, and Atlanta to spark interest in the project, and to gather feedback from
scholars, educators, and artists, the resources future users and contributors. Our next step will then to be seek support for an organize the implementation of the pilot. We’re excited to move forward and we thank the NEH for its support. Thank you! Good afternoon, everyone. I’m Alisea McLeod from Rust College, and I am representing as one of the Co-PIs, our project Freedom’s Movement: Mapping African American Space in War and Reconstruction. This is a digital history project covering the eras of Civil War, and Emancipation and Reconstruction. This project, Freedom’s Movement, will bring together three extant projects that have been in communication with each other over the past three to five years, and we will be moving forward with a linked data model to bring these projects together. These three projects represented by Scott Nesbit, who is our PI, by myself, and about John Clegg, who is in charge of the African American Soldiers Project. And, some of you may be familiar with Scott Nesbit’s project, it’s a former project still in existence, but funded formerly by NEH, and basically what Scott Nesbit has done is to map the official record, any mention made of African Americans encountering Union lines. John Clegg’s project, the African American Soldiers Project, crowd-funded and crowd-sourced just like the other two projects that we’re bringing together, actually documents the service records, all of the data contained in service records of the United States colored troops, and then finally, the project that I’ve been working on for five plus years, the Last Road to Freedom Project, which basically traces African Americans as they encounter Union lines, and their names are recorded. And so, we have been transcribing what we call contraband camp, excuse me, registers that include very rich data, including first and last name of African Americans encountering these lines, birth place, age, complexion, year of birth, year of enlistment, place of enlistment, name, and residence and I could go on and on. We haven’t really seen rich data like this in very many other projects and so this is really going to bring down the wall that African American family history researchers so often talk about. We also are convening a meeting this summer at the University of Georgia, where we will be meeting with other possible collaborators and that includes AfriGeneas, African American family history researchers, and several other projects, and thank you. I’m Bjørn Stillion Southard from the University of Georgia, and I’m here representing the team working on Historic Profiles of American Incarceration. Our project takes inspiration from the lively, public, and scholarly debate over the history of mass incarceration in the United States. From Michelle Alexander’s The New Jim Crow, to Ava DuVernay’s documentary 13th, we are interested in studying the history of American prisons, and prisoners, going back to the first era of extensive prison construction
and reform in the late 18th and early 19th centuries. Our goal is to compile and analyze prison records across a large span of American history to see how earlier patterns of incarceration
compare with more recent ones. The key to our project is the increasingly widespread digitization of prison records. Fortunately, a model for our project exists in the United Kingdom. It is called the Digital Panopticon. For the past six years, a team of British researchers, and website developers, created a searchable database of approximately 90,000 British men and women who were convicted of crimes between the years 1780 and 1920. The database which links together millions of records and offers users an assortment of research and learning aids, is housed at a free and publicly accessible website, digitalpanopticon.org. Citizens interested in genealogy can enter family names and examine the resulting records. Secondary school teachers and students can read the historical background information, and a series of prisoner biographies and attempt their own investigations. Scholars can use the data to develop original research projects, and policy makers can learn from the general and specific patterns of crime and punishment that emerge from the records. Our first task is to identify prison records
that are already digitized or could be digitized in the next two or three years. The vast size of the United States makes a comprehensive prison database impractical, but we intend to select a representative number of prisons from every region of the United States for inclusion in the project. We’ve already begun analyzing digitized records at a handful of prisons in different states and regions. All of our project participants will meet this fall to discuss our preliminary plans to design a historic database of American prisons and prisoners. Hi, my name is Karen Desmond and I’m from Brandeis University. My level I project, Measuring Polyphony: An Online Editor for Late Medieval Music, will allow a variety of modern users: students, experts, musicians, alike, to access and contribute transcriptions of polyphonic music directly linked to digital images of the original medieval manuscripts. For the first phase of this project, which is complete and available at the URL measuringpolyphony.org, I developed a methodology to encode these music compositions in a machine-readable way, and display them online using the note shapes and the rules of the original medieval notations. The data is encoded in XML following the standards of the MEI, which is a community-based initiative similar to the TEI but focused on music. The music notation is displayed using a software called Verovio that can convert the MEI to SVG images of the music notation, and also to MIDI, to allow you to actually hear the music. Music notation is relatively complex to encode because you have to align multiple voices and instrumental parts in order for the music to make any sense and sound good. Medieval notations such as the mensural notation I focus on, add a further layer of complexity because these notations are context-based. So for example, certain shapes may have more than one possible duration, the long highlighted here could be held for three or two beats depending on the notes that follow it. In other cases, the correct interpretation may be dependent on a particular geographic location, or the interpretation may be ultimately unclear as is the case with the semibreves in this example. As a result, the process to create my original online edition of these works was quite fiddly. The medieval notation was translated back to modern notation in order to transcribe it and align it properly. It was then converted to MEI, and then translated back to medieval notation. This current project aims to make the process more straightforward, eliminating the intermediate steps and creating an online editor that will allow the direct entry of the medieval note shapes from the very beginning of the process. So I envision an effort, similar to efforts to transcribe handwritten texts. The images of each manuscript page will be segmented to identify the musical staves, and then the transcriber will simply enter exactly what they see in the medieval manuscript. The computer then can interpret these note shapes and align the parts following the rules for medieval notation. In a broader humanities context then, Measuring Polyphony, investigates how modeling the meaning of signs can lead to new understandings of the interaction between the sign and the signified and of the relationship between notational style and changes or difference in musical style across time and place. 
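As a deliberately over-simplified illustration of the context-dependent durations just described (not Measuring Polyphony’s actual algorithm), a long in mensural notation can be read as perfect (three breves) or imperfect (two breves) depending on what follows it:

```python
# Grossly simplified: real mensural rules involve mensuration signs, dots of
# division, alteration, and more. This only shows why duration is contextual.
def long_duration_in_breves(next_shape):
    """Return how many breves a long is worth, given the following note shape."""
    if next_shape == "breve":
        # A single following breve typically "imperfects" the long.
        return 2
    # Followed by another long (or by nothing), the long stays perfect.
    return 3

for following in ["long", "breve", None]:
    print(f"long followed by {following}: {long_duration_in_breves(following)} breves")
```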
Hello, my name is Mary Furlong Minkoff, I’m the curator of archaeological collections at James Madison’s Montpelier, and the Project Director for the Montpelier Digital Collections Project, which was inspired by the successes and challenges that the Montpelier Foundation faced during the creation of The Mere Distinction of Colour exhibit. This exhibit highlighted the need for an inclusive collections database in order to improve collaboration. It showed us that working with the public created public buy in. It also revealed our limited digital presence, and fourth and most importantly, the success of that exhibit showed that collaborating with descendants of the enslaved community resulted in a unique and powerful product. At Montpelier, we house four different departments that house collections: archaeology, curatorial, research, and the historic preservation. These collections vary drastically in material, size, and organization. Each department uses a different database system to manage their collections, some of which are designed primarily for research, while others primarily to manage objects, none of them successfully meeting the needs of both, and none of these systems are available to the public. Because of the size and variability of our collections, our need to function as a collections-management tool, a research tool, and a public engagement tool, it quickly became evident that we needed a partner’s invest in the project as we are, so for this project we selected Matrix at Michigan State University because of their shared interest in enslaved history, public engagement, and long-term institutional collaboration. When we began to discuss this project, it was important for us to focus on the process as much as we were focusing on the final product. Because of this, we’re engaging with the public at each step along the way, both in person and digitally, #MontpelierCollections. We’ve also begun to build a website, collections.montpelier.org, which features a survey designed to hear from as many people as possible who’d be potential users for this type of database. In July we’re going to hold a two and a half day workshop, hosting 45 participants representing a variety of stakeholders, including digital humanities scholars, museum professionals, educators, descendants, collectors, genealogists, volunteers, and the like. During the workshop participants will develop a plan for the database. While the workshop is happening at Montpelier, it’s also going to be happening digitally, and you’ll be able to watch it live-streaming, and engage on Twitter and other social media platforms with the questions we’ll be posing during the workshop, and incorporate that feedback to the in-person workshop. And our effort in this project is not only to address the needs that Montpelier faces, but also to make sure the work is done in a way that empowers the public to be active participants, not only in the consumption, but in the creation of their own heritage and history, and we believe that this engagement begins early, so that the final project will be one that is useful and valuable to everyone. Thank you. So I’m going to ask you sing with me. Here’s a note. Ah! Mary had a little lamb, little lamb, little lamb, Mary had a little lamb whose fleece was white as snow! So this, like much music, is self similar and so Wattenberg, who’s now the head of Google’s, I think it’s, Big Picture, data visualization group came up with this art diagram approach in 2001. 
There was one problem, though, he didn’t really understand musical themes, so composers might not just do Mary had a little lamb, Mary had a little lamb, or Mary had a little lamb. And, so, basically we’re revamping his visualization approach to have a much better algorithm, so, [music plays]. So that’s the famous theme to Bach’s C Minor Fugue. Here’s the tonal answer, [music plays]. And actually the notation there is an octave below. So, um, composers really adapt themes, they transpose ’em. This one actually happens to be transposed and modified, hence, it’s a tonal answer, not a real answer. Um, and, so a big part of our project is teaching students how to teach computers which are kind of dumb, how to understand these really, you know, things like Gestalts and stuff that humans, are so easy for humans, and so, here’s a binary representation that actually relates those two themes together, and when we use that instead of Wattenberg’s original exact pitch matching we can get transpositions, other modifications, and we get a much better visualization, and so in this case, for this fugue, there’s eight statements of the subject, there’s transposed, both modally and also chromatically as well as certain notes are modified, but it recognizes all of them. I should note, so we’re rolling this out at many of the major music programs in the Southeast, UGA, Emory, GSU, Georgia Tech, FSU, but the way we’re branding it, it’s actually based on African tone languages and music. My area of research is Nigerian music, and it’s based on two years of field work in Nigeria, but we can’t really tell them that because then they’ll think it’s irrelevant to them. So, that’s why I used the Bach example although in my own research I would use examples from Nigerian music. So, thank you very much. Alright. Hi, everyone my name is Benjamin Wiggins, and I’m here representing my colleague, JB Shank. I don’t think our title was shortened. It hasn’t been yet, so I guess this is what you can get away with. I’ll let you read it on your own. Basically, JB and I are studying the Amsterdam publisher John Frederick Bernard’s Religious Ceremonies and Customs of the Peoples of the World, and we affectionately call that the CCR, and that’s not the rock band that John Fogerty led, which many people, I try to explain this to, they hear CCR and that’s all they can think about. The CCR is a seven-folio encyclopedia, and it has 250 really lavish engravings, one of which you can see here from Bernard Picart. This is really an enlightenment print sensation. This is a book that actually treated religions as more or less equal, and kind of anthropologically looked at them. So, this was of course heretical at the time, and one of the interesting things about the CCR is there’s really no original or standard version that exists. We’re studying this because it has, it was distributed as sheets, and people put it together in many different ways, some censored, and some with other books even distributed throughout this encyclopedia, so it’s been called the book that changed Europe, but it’s still really quite understudied compared to other encyclopedias and because there’s so much kind of confusion around this, and there’s no original version, we really aim to bring technology to bear on bringing some order to this text. And so, what we’re going to do is build a portal that would really allow for, that would really allow for study of this in a single environment, and how we’re going to do that is IIF. 
We are thinking about using the International Image Interoperability Framework to portal in from the home institutions into a viewer, which will be Project Mirador, which is developed by Stanford, and what that will do is allow scholars to see these side by side, annotate, and make comparative assessments of these works. And I just want to bring it back to, if you don’t study early modern religious texts or encyclopedias, this might be interesting for you still because of the potential of the, not CCR, IIIF, too many acronyms, and Project Mirador, which could be a really great scholarly portal for any subject matter. Thank you!

Hi, I’m Ansel McLaughlin from Northeastern University’s College of Computer Sciences, and I’m affiliated with the NULab for Texts,
Maps and Networks, so I’ll be presenting Improving Optical Character Recognition and Tracking Reader Annotations in Printed Books by Collating & Transcribing Multiple Exemplars. Optical character recognition for old books can give bad output, with more than 1 in 5 words incorrectly transcribed. This is due not only to unusual fonts and variable image quality, but also because of the traces that readers leave behind. Annotations such as underlines and earlier marks and marginal notes. The OCR of this copy of Rousseau belonging to John Adams contains several examples, but annotations can provide interesting insights into historical reading practices. Projects such as book traces, which crowd-sources interesting annotations to the archaeology of reading, which performed intensive transcription of the reading notes of Elizabethean scholars such as John Dee and Gabriel Harvey testify to the value of this information. Even if we can’t fully transcribe a marginal scroll, knowing which books and which passages were annotated can be valuable. This project will work to solve both problems: correcting noisy OCR, and detecting annotations, by taking advantage of the digitization of multiple copies of the same book. We can collate and different OCR transcripts and align images of pages with and without annotations. This will produce trained data both for language models, for OCR correction, and for computer vision models to detect the annotations in books. Although we will train on the subset of books with multiple exemplars, we’ll be able to apply these methods to books with only a single copy. The project will produce cleaned up OCR of historical books before 1800, and a database of which pages and passages have been annotated. Thank you. The Italian Renaissance is famous for art, architecture, music, and learning, but an integrated experience of these achievements is difficult to grasp, given the dispersal of physical evidence, and the disciplinary confines of our learning. Um, it’s also often gendered as male. This virtual reality project for study of one of Renaissance Italy’s most stunning art spaces and collections, the Studiolo of Isabella d’Este, addresses both of these problems with cross disciplinary digital tools for approaching the period through one of its most important women. Its immersive, interactive features will convey the human scale, cognitive density, and aesthetic specificity of a Renassiance art space, and capture the multi-sensory complexity of rooms that were meant to dazzle visitors with humanist ideals. Individual and collaborative work in this environment will foster new approaches to studying and teaching a multi-media Renaissance and provide models for analogous projects in other periods. Isabella d’Este was a multi-talented Renaissance woman. She was a co-regent of the Italian city-state of Mantua, who wielded both cultural and political power. She was an accomplished musician, a prodigious correspondent, a diplomat, a dynastic mother, and much more, all of which make her an ideal filter for study of the period in general through our larger project, the Isabella d’Este Archive. She is perhaps best known today as a path-breaking art collector who amassed a personal gallery with works by Michelangelo, da Vinci, Mantegna, and others that paved the way for modern American collectors like Peggy Guggenheim, and Isabella Stewart Gardner. 
As a striking instance of personal branding, the Studiolo was unparalleled in Isabella’s own time as an architectural and artistic expression of explicitly feminine Renaissance culture. Sorry, I need to move my thing. Is it this? Yeah. Partnering with museums around the world, the virtual Studiolo will reassemble dispersed artworks of Isabella’s collection in a digitally mastered 3D version of her tiny, art-packed rooms, where gilded ceilings, intarsia panels, frescoes, and personal emblems surrounded the collections of paintings, bronzes, books, cameos, antiquities, and musical instruments that played starring roles in the production of her learning. The project, which uses photogrammetry and Blender for 3D modeling, will invite users into a Renaissance visual and acoustic experience that was meant to be immersive: in the round, floor to ceiling, and textured by sound and reading. We aim to open up the creativity and the spirit of inquiry of this turning point in western culture by harnessing the tools of today’s technologically powered Renaissance. Thank you.

Hello, and good afternoon, my name is Samantha Blickhan, and I’m the digital humanities lead for Zooniverse.org, based at the Adler Planetarium in Chicago. Zooniverse is the largest platform in the world for online, crowd-sourced research, with more than 120 projects and over 1.7 million registered volunteers. Online crowd-sourcing is a process of task distribution, meaning researchers with large data sets upload their data to our site, and our volunteer community helps them to classify or process these data in a variety of ways, from identifying galaxy type based on visual characteristics, to transcribing handwritten text that cannot be machine-read. So this project will support the steadily growing number of text transcription projects being created using our free, do-it-yourself project builder tool, which launched in July of 2015. Since that launch, use of the project builder has grown exponentially; more than 300 substantial projects have been created using this tool, including more than 20 in which text transcription is the main project goal. So one of the unique features of the Zooniverse platform is that all subjects receive multiple classifications. In the case of transcription (the example here is Anti-Slavery Manuscripts at the Boston Public Library, antislaverymanuscripts.org, which was built with the help of the IMLS, so shout out to Ashley back there), this means that an individual document in any given project will be transcribed by multiple people before it’s considered complete, and the transcriptions are then aggregated together for consensus. The intention behind this method is to remove the need for substantial review before the resulting data are made available to the public. So to support this growing demand for text-based projects on the Zooniverse, we’ve made great strides in our work on full text aggregation,
including producing some in-house algorithms for the text transcription tools currently available in the project builder, but we’re aware that many research teams lack the expertise needed to go through the process of aggregating their data and evaluating the accuracy of the results. This ends up being a barrier to participation, since the platform currently lacks a dedicated space in which researchers can view, edit, and manage the resulting data from crowd-sourced transcription projects. So, that’s what we’re going to build. Over a two-year period, we plan to build an interface for research teams to view and edit text-based outcomes of projects built using the Zooniverse project builder. This tool will be available to all, but we’re hoping it will be particularly useful for researchers who might not have as much institutional support or access to large digital humanities labs, or the ability to work with data scientists who can aggregate text-based data. Our hope is that this tool will help to make the project builder accessible for a much wider range of people, and help build their confidence in digital public research methods. Thank you.

Hi, everybody! I’m Donna Thompson Ray, and I’m a Project Director of Faculty Development Programs at the American Social History Project/Center for Media and Learning at the City University of New York Graduate Center. Take a breath, right? I’m so pleased to be here, and I thank the NEH for their generous support of our project, and particularly it was great to meet old friends and new friends, our Program Officers, who have been really terrific in our work with the NEH. Who Built America, the Open Educational Resource, or WBAOER, or the title that’s on that yellow sheet of paper that you have that says something like Open Educational Resource for Who Built America, we answer to all those titles, right? Our project is a multilayered OER, which contains a main narrative, dynamic primary source materials such as audio, text, visuals, maps, charts, and graphs, as well as teaching resources to explore and further kind of fill out the construction of the narrative. This is a completely interactive platform. It is free, publicly accessible, and produced in Drupal, and I’ll talk a little bit more about the other software that is represented in the project as well. This resource draws from our two-volume textbook, Who Built America: Working People and the Nation’s History, which we’ve produced in three editions, and also our interactive primary source and teaching resource, History Matters, an online resource for teaching the U.S. history survey, produced two decades ago. Both of these resources are in great demand, and I must point to a really important partner in our production of History Matters, and with the textbook to some extent too, and that is GMU’s Center for History and New Media, represented by our colleague there, and Sheila’s past work there too. They’ve been really an integral longtime collaborator on all our online resources and on the textbook. So, as you can see from the first slide, the opening page kind of resembles a traditional textbook in its appearance, right? The basic, like, table of contents, and then one can click to the opening essays under part one and different sections of the narrative and then the chapter. This resource, as I mentioned, has its central narrative. Along with it are primary sources that are in conversation with the narrative. As you can see, the design can be used on various devices, right?
Your iPad, your phone, and that is in fact to make it accessible to kids, young people, who have low budgets, so sorry, and lastly, here are the additional teaching resources that are also present on the website, and you can open an account, and, well, you can go to the URL, we’ll share it with you later, thanks, I’m so sorry, I’m so sorry!

Hi, everyone, my name is Brent Seales, and I am Professor and the Chairman of the Computer Science Department at the University of Kentucky. I’m really grateful to the NEH for the funding for this project, because I have a colleague who believes that he will die before I complete this work. Now it’s true, my colleague, who is a well-known Herculaneum papyrologist, was quoted in the New Yorker as saying, “I do not expect this scroll will be read during my lifetime.” Challenge accepted. Which challenge am I accepting, you may wonder. The NEH and I together, Brett, we might not be able to actually save his life, because it’s the NEH, not the NIH, right? Maybe we can at least change some expectations. Next slide. There it is. We will try to read portions of the Herculaneum Collection that have never been read. The challenge is that many of the papyri containing the writing are completely closed, they’re still rolled up, and they’re stuck together, hiding many layers. The Herculaneum corpus contains 1,800 scrolls, and almost 900 of them are still unopened or only partially opened thanks to the carbonization caused by Mt. Vesuvius, and that’s a lot of text. Many believe that the promise of new excavations also could reveal more scrolls. Only in the last decade has technology advanced to the point where noninvasive imaging, together with computer algorithms, gives us a way to look inside them and read the text. Tomography is truly amazing, but it also has some limitations. Certain kinds of inks, like carbon black from
Herculaneum, seem almost impossible to capture using tomography. This work actually aims to show how it can be done. We will use NEH funds to develop a systematic method that we think will indeed reveal the ink of Herculaneum papyrus, allowing us to use our software pipeline, which we call virtual unwrapping, to read the interior and the previously unseen layers. The starting point is to expand a method based on, what else, AI and machine learning. We will scan open fragments where text is visible, and we’re going to build a kind of large-scale reference library. Our hope is that this massive training set will allow us to amplify and read the ink in the tomography of all the hidden, still wrapped-up layers. If we’re correct, and the supervised examples of the open fragments help us see the hidden text, the story will go something like this: the fragmented, open texts of Herculaneum will have redeemed the still unopened ones, making us all feel somewhat better about having basically destroyed so many of them in the first place. It will also broaden the pathway for entire categories of other damaged material, bringing more of the invisible library into our digital library. And finally, it will save us all from my colleague’s predictions about how long he will live, and how long it will take us to succeed. Thank you very much.

Hi. Good afternoon. Thank you very much. I’m Mark Williams from Dartmouth College, and I am so grateful to the NEH for many reasons, not the least of which is that they made the title of my project legible. The Media Ecology Project, which is the research project I direct, is working to cultivate more and better access to moving-image archives, make those materials available online, generate new kinds of 21st-century scholarship about those materials, and scholarship that generates qualitative metadata that goes back to the archives. So in a very real sense it’s an information ecology, but we’re also trying to save moving-image history by enhancing new scholarship about it, so this is a project that works to bring together a number of fascinating metadata
resources into a linked data compendium that will be published in the Scalar platform at the University of Southern California. It expands the work that we’ve initiated with the Library of Congress and their one-of-a-kind collection of early cinema, based on the copyright practices, called the paper prints. This is the Rosetta Stone of moving-image culture, and we’re just so proud to be playing a role in advocating for and helping them to find a reason to digitize these materials. It also works with a one-of-a-kind chronological production log of the American Mutoscope and Biograph studio, one of the founding studios of American cinema history, put together by an extraordinary scholar and former cataloger at the LOC, Paul Spehr, who has produced this chronological log, including information about lost films. We’re also incorporating a resource from the Museum of Modern Art, which is a digitization of the Biograph exhibition catalogs that includes not only descriptive information about each film, but three key frames from each film, including films that are otherwise lost. We’re working with the LOC, the Museum of Modern Art, the British Film Institute, and the extraordinary Eye Filmmuseum in Amsterdam to acquire prints of as many films as we can and make them part of the compendium, and working to expand our own semantic annotation tool, which was funded by the NEH and which really advances the capacity for a linked data compendium. It’s a tool that allows scholars and students
to create time-based annotations about these films, so for example, from 41 seconds to 51 seconds, this is what I see, or this is a keyword, or here’s an object, and it can create multiple time-based annotations, as this suggests. You can select specific parts of the frame to annotate, and so there can be many annotations going on at the same time. And, we’re putting these kinds of information in relation to one another, and they mutually contextualize one another, so time-based annotations, written- and print-based annotations, and some machine-generated annotations via tools like optical flow, to inspire moments of new research questions. Thank you.

So, good afternoon, we’re two of the project co-directors for Migration, Mobility, and Sustainability: the Caribbean Studies Digital Humanities Institute. I’m Laurie Taylor, I’m chair of the Digital Partnerships and Strategies department at the University of Florida, and I’m the Digital
Scholarship Director for the Digital Library of the Caribbean or, DLOC. Hi, and I’m Hélène Huet, I’m the European
Studies librarian at the University of Florida, also a board member of our DH graduate certificate and the Vice Chair of the Florida DH consortium. So, our photo here is of two of our institute
faculty and our collaborators, Maritza Gonzalez and Nacha Rios at the University of Puerto Rico, and they’re at the easternmost point of the United States, in St. Croix. The reason we have this picture is because all of our work is done in relation, from a situated perspective, and in partnership. Alright, so our project is based at UF in partnership with the Digital Library of the Caribbean, also known as DLOC, and so for this project we’re going to host a week-long in-person workshop, and also five additional monthly virtual workshops on collaborative DH and Caribbean studies. Participants will gain DH teaching experience, and an in-depth knowledge of how to use digital collections in teaching. The institute is going to provide training and tools, processes, and resources for developing lessons, modules, and courses. We’re going to have 26 participants (they’ve been selected) who will acquire concrete digital skills and DH approaches for teaching and research using open-access digital collections. They will also be participating in a community of practice for DH, and they will be creating open-access courses and teaching materials that blend DH and Caribbean studies. So this institute is really important to us, and it will give us the needed space to get to know one another and also to learn how to work together. And, so the photos here are some of our community engagement photos: a distributed online collaborative course, and one of our post-docs, Crystal Felima, when she took her students to Haiti for the Haitian Studies Association conference recently. And, what we’re really working on is cultural change and building community, so that’s why I have this, and then just again to reiterate, this is all part of our work of community building. Many people from the generous and generative fields connected with Caribbean studies were the impetus for this project. This project grew out of our shared work in developing digital libraries and courses together, and all of this was done within the spirit of mutual aid, which is defined by the voluntary, reciprocal exchange of resources and services for mutual benefit. Mutual aid, as opposed to charity, does not connote moral superiority of the giver over the receiver. It’s also about shine theory, where if you don’t shine, I can’t shine, we all need to shine together. So, we already know that this institute will
be the first of many. We have 26 confirmed participants, and we had over 93 fantastic people apply, so we know there are so many more people to engage with. We’re looking forward to more future institutes and to sharing the results from this. Thank you.

Hi. I’m Sarah Connell. I’m from the Women Writers Project at Northeastern University, and I am talking about an institute series built around word-embedding models, a technique from natural-language processing. They are incredibly powerful and they are
actually really intuitive methods for discovering relationships between words in large corpora of texts, but as this code snippet might sort of suggest to you, there can be a little bit of a learning curve in getting into these, so we’re trying to design an institute series that will make word-embedding models not just a little bit more accessible, but that will also develop a community of thoughtful practice around applying these to humanities research and teaching questions. We’re going to be encouraging people to think about things like, “What kinds of research questions can even be investigated with word-embedding models? How do you know what your results are telling you? How do you know when you’ve found something significant? How do you evaluate claims made by others, and also how do you teach these in the classroom?” So, what we’re going to be doing will be four institutes in total. There are going to be two focused on research, and two focused on teaching, at both introductory and intensive levels. The intensive ones will actually be taught
with RStudio, which is what you were seeing on the last slide, but the introductory ones will be taught with a web interface that we’ve been developing, and that’s what you’re looking at here. I’m hoping this doesn’t just look a little
bit more user-friendly than the last slide, but also gives you sort of a taste of what word-embedding models are capable of. What you’re looking at here are the words that are closest in vector space to the word learn, and it’s actually a really marvelous list: teach, appreciate, speak, observe, and instruct. We will be encouraging people to bring their own corpora to work on for this project, so the model that you’re looking at here was trained on Women Writers Online. We’re expecting that people will learn better if they’re working with more familiar materials, and that it will give them the chance to dive into their projects right away. This is part of our overall goal of having these institutes have a much broader impact than just the people who show up for those four institutes, so we really want to make these very long-term, and really have a very, very wide audience, and to that end, we will be publishing pretty much everything we end up producing. So that will be curricular materials, code that you can read through and run through with comments, checklists, and then we will also, with our participants, be having this period of discussion and support. So we’ll have virtual office hours, we will also be giving people the opportunity to get feedback and give feedback to each other,
and, we’re really excited about this: we will be publishing the materials that our own participants will be producing on our blog, and on our website. So, we’ll be sort of building together this like menagerie of really thoughtful applications of word-embedding models that are focused very specifically on the humanities. So, if that sounds as exciting to you as it
has been for us, here’s how you can get involved. So, we are actually at this very moment, accepting applications for our July series, which will be introductory research focused. You can also check out our web interface which I was just showing you there. Please do contact us. And, I want to thank the whole project team, including my co-PI Julia Flanders, and the NEH. Hi. My name is Alison Langmead. This is Aisling Quigley, and Chelsea Gunn, and we’re all here from the University of Pittsburgh. We’re presenting today on the series of NEH/ODH advanced workshops entitled Workshops on Sustainability for Digital Projects helpfully shortened. This work grew out of a previous Research and Development grant, received from the NEH Division of Preservation and Access, and we thank both of these divisions for their support of this project. As one of the deliverables for that grant, we produced the documentation for a sustainability workshop entitled The Socio-Technical Sustainability Roadmap, whose sections and hyperlink you see here, and if you go to that hyperlink there’s a link to these workshops and we do still have one sustainability workshop application due for Utah in a while. So, please go if you’d like to come and participate in this advanced workshop. For this series of NEH/ODH advanced workshops we are facilitating five 2-day instantiations of the Socio-Technical Sustainability Roadmap. To date, we have completed three of these five workshops in Pennsylvania, Georgia, and Oklahoma, and are looking forward to our two final convenings in Rhode Island and in Utah. So far, we have welcomed 70 participants from 37 institutions and 23 states. For our upcoming Providence workshop, we hope to welcome an additional 27 participants, and all but one New England state is represented, Vermont. We hope that Utah is equally as expansive and we invite you all to apply. Four additional institutions have asked to fund their own instantiations of the workshop above and beyond those supported by this grant, although only one is currently fully planned, and it is this very week. Later this week we’ll be going to Dennison University for the five colleges of Ohio. Here we see two of the many big post-it notes that we post on Twitter, and produce for each workshop. There are many digital components to this work, Excel spreadsheets and the like, but also an emphasis on face-to-face collaboration, and pen-and-paper work. We found that people come to these workshops looking for things such as planning for the end of a project, strategies for ensuring the ongoing funding of digital projects, and a general understanding of the various infrastructures that impact digital sustainability. You see here on the left some of the ways
that participants define DH sustainability upon departure. Sustainability is about articulating goals, understanding users and their needs, consolidating technical processes, documentation and communication strategies, disaster planning, making interdisciplinarity and iteration the default work process, and of course, “a lot of work.” Participants also say they leave with skills and deliverables such as structured guidelines that will help them move forward in their planning processes and ongoing project development, documentation of their social and technical infrastructures and a clearer understanding of each, and the language needed to communicate to partners the “goals and limits of what we can sustain.” We can offer this group a few of our preliminary findings as well. There’s a critical relationship between sustainability and trust, and as seen here on the right, common sustainability red flags include human burnout, staff turnover, institutional roadblocks, and fly-by-night project management. Thank you.
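To make the word-embedding discussion from the Women Writers Project talk above a little more concrete, here is a small, generic sketch, an editorial illustration rather than the institute’s own R-based materials: it trains a model with the gensim library on a hypothetical corpus file and asks for the nearest neighbors of a word, the same kind of query behind the list of words closest to learn; the results will of course depend entirely on the corpus used.

from gensim.models import Word2Vec

# Hypothetical corpus file, one document per line (not Women Writers Online).
with open("corpus.txt", encoding="utf-8") as f:
    sentences = [line.lower().split() for line in f]

# Train a small word-embedding model (parameter names follow gensim 4.x).
model = Word2Vec(sentences, vector_size=100, window=5, min_count=5, workers=4)

# Words closest in vector space to "learn", as in the example above.
for word, score in model.wv.most_similar("learn", topn=5):
    print(word, round(score, 3))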
