Emerging Technologies – A Day in the Life


The following is an idealized day in the life of an Emerging Technologies Librarian at University of Oklahoma Libraries. While there has never been a day exactly like this, most days (right now) include some significant combination of the activities described below.

OU Virtual Reality Association

8:00-9:00am – Research. Research is definitely the most cognitively intensive activity I engage in, so it has to happen early, while the coffee is still flowing through my bloodstream. Essentially, it involves synthesizing conclusions drawn in various bodies of peer-reviewed literature, including library science, philosophy of mind, and computer science. Right now, my writing is focused on certain key similarities that exist between embodied browsing activities taking place in the physical book stacks and “browsing” that takes place in immersive virtual reality environments. Ideally, this new platform affords the conditions necessary to preserve instances of serendipitous information retrieval.
9:00-10:00am – ETL Meeting. Individually, our formal training includes electrical engineering, library science, and philosophy, and we can cover a lot of ground as a result. This meeting is a way for us to submit problems/hurdles/roadblocks for group consideration and offer support in various ways as we begin a new week. For example, Cody and Stacey are visualizing and 3D printing microscopic imagery associated with the corrosion of metal surfaces (fuel tanks) for the microbiology department. Cody’s 3D printing expertise will combine with Stacey’s 3D scanning knowledge, in this case, and some of the models generated will be deployed on our OVAL virtual reality platform for networked analysis and presentation.

Course Touchdown @ the iHub Vis Lab

10:00am-11:00am – Course Touchdown. Innovation @ the EDGE is an inclusive makerspace that the emerging tech team staffs, maintains, programs for, and markets. Roughly, the focus is visualization – including physical visualization, as it were, in the form of 3D printed objects. At least once a week (and oftentimes twice a week), the EDGE offers free programming to introduce users of all academic backgrounds to the freely available tools, as well as one-off events including Fiber Arts, “Hack Your Home”, and 3D Scanning Campus Crawls. We also support course integrations, whereby faculty hold class in the EDGE for the purpose of exposing students to a specific technology or to complete a technology-centered assignment. More than a dozen unique courses, from the Art + Art History, Journalism, English, Chemistry, Medical Imaging, Anthropology, and Architecture departments/colleges, have touched down in Bizzell for this reason.

Painted 3D Print

11:00am-12:00pm – Prep Conference Talk. Last year I had the rare privilege of traveling extensively for work. The subjects of these talks included indoor navigation, interactive mindfulness technology, and virtual reality for research and instruction. Many of these conferences took me to places I would have otherwise overlooked (like Krakow, Bozeman, or Tucson) to present alongside various faculty with whom we have collaborated to develop and apply one or more emerging technologies. Right now, we are preparing for an April talk at the Coalition for Networked Information’s (CNI) spring membership meeting in Albuquerque, where Zack Lischer-Katz and I will discuss the challenges of preserving and archiving virtual reality session data.
12:00-1:00pm – Lunch! Recommended: Cheeseburger (and fries!) at the Garage; Sesame Ginger Beef Banh Mi at Coriander Café; Stromboli at Sandro’s; Chicken Khao Soi from Thai Delight, if the weather is nasty.
1:00-2:00pm – Architecture Study. This semester, an Interior Design capstone class is the subject of a joint Architecture/Libraries study on the impact of immersive virtual reality on spatial thinking and volumetric design. The students are using our VisLab at the newly constructed Innovation Hub. The four customized VR workstations available there to students (as well as faculty, staff, and the public) allow students to inhabit and annotate architectural designs alongside their teams. Given that these students would traditionally be quite far along in their careers (that is, post-construction) before such a walk-through would be possible, virtual reality analysis puts them ahead of their peers at other institutions in terms of volumetric design.

Impromptu Electronics Workshop

2:00-3:00pm – Admin Duties. Supply requests are due! Our 3D printing services are free of charge and open to the public, so it’s no surprise we initiated more than 600 prints in 2016. That includes children’s prosthetics, WWII-era tanks, weather data visualizations, and 3D scanned art objects for a collaborative program with the Fred Jones Jr. Museum of Art. That means supply requests regularly include PLA filament, along with various peripherals (and maintenance/repair items) related to 3D visualization. Alongside the HCLC, we also hire, train, and manage ~15 undergraduate student employees, spread across three spaces. Beyond facilitating 3D prints, course touchdowns, and upkeep of the space, these students have presented workshops to their peers in the EDGE, and even created campus clubs, like the new OU VR Association.
3:00-4:00pm – Consultation. In 2016, the emerging tech team fielded more than 120 unsolicited technology consultations on a wide range of topics and tools. In the course of this activity, we recommended some combination of tools, experts, and spaces that can augment existing scholarship. Last year we regularly collated technology lists for use in K-12 makerspaces; reviewed undergraduate capstone projects; supported virtual “field trips” for Gateway students; supported exploratory studies on the impact of 360-degree video advertisements; 3D printed prosthetics; and lots more. Please reach out if you are interested in tech for scholarship!

Outreach at The Mercy School, Edmond

4:00-5:00pm – Grant Application. We are working closely with our peers at University of Arizona Libraries to scale up 3D scanning and immersive visualization services across both campuses. Our counterparts have already deployed VR workstations of our design on their campus, and – once the workflows for developing 3D content are refined – users on both ends will have access to shared virtual collections. Professor Pailes, for example, has queued up a range of Chaco Culture artifacts for anthropology students at OU, despite the fact that those scans feature artifacts located hundreds of miles from Norman. In the future, a joint repository of fully interactive (and networked) 3D assets might number in the millions, thereby providing a sort of immediate access to models/artifacts/objects/spaces traditionally only possible via physical analysis.
5:00-? – Beer. A bunch of places around town are serving F5 on draft, and it’s consistently delicious.

Of course, there are also absurd amounts of email correspondence and committee meetings thrown into this mix, and some single events – like the Architecture Study or off-site K-12 outreach in OKC – last more than a single hour. Nevertheless, this description should serve to demonstrate the amazing variety of tasks associated with 21st-century librarianship.

Your friendly neighborhood technology librarians

Matt Cook is Emerging Technologies Coordinator at OU Libraries and can be reached at mncook@ou.edu

In (Academic Social) Medias Res


Within hours of Forbes.com publishing Sarah Bond’s article, Dear Scholars, Delete Your Account at Academia.edu, I began fielding questions from OU faculty members. 3,132 people from the University of Oklahoma have accounts on Academia.edu, and 3,284 have accounts on ResearchGate. Another popular “academic social network,” Mendeley, doesn’t group its accounts in the same way, but it’s likely there are OU-associated accounts there as well. There’s a lot to unpack regarding my feelings about these companies and the “services” they offer, but for the purposes of this post I will hit the high points.

SHAREOK: advancing Oklahoma scholarship, research and institutional memory

My short answer to many of the questions I get about these companies remains: unless you have a personal problem with the way they do business, I don’t see a problem, as long as you also share your work in a repository such as SHAREOK. Services such as Academia.edu offer something different than SHAREOK, so I understand people’s desire to create profiles there.

My longer answer: as with Facebook, Google, etc., you are not the customer when you interact with these companies, even though you may feel like one. Instead, you are the product that these services seek to monetize and/or “offer up” to advertisers. I don’t fault any business for making money; that is why they exist. But I also see these particular companies as an extension of those who monetize what I believe should be freely shared.

Secondly, if these companies are bought, sold, or go out of business, what would happen to the content you’ve placed there? This is one reason why I advise faculty members that the first place to share articles, preprints, postprints, conference posters, proceedings, slide decks, etc. is SHAREOK. Librarians maintain an OU Faculty/Staff collection on SHAREOK where you can deposit your work, and we would be delighted to step you through how to upload to SHAREOK. The items in SHAREOK are indexed by Google and Google Scholar, so they are searchable (and findable) by researchers around the world. You still own the copyright in the work you deposit in SHAREOK, and the OU Libraries maintain the platform, the content, and the links. Most importantly, maintaining and preserving content is one of the core missions of the OU Libraries. We aren’t going out of business, so your content on SHAREOK won’t go away either.

A third consideration with any of these services (including SHAREOK) is the legality of uploading your work there. Most publishers require authors to sign a publication agreement/copyright transfer prior to a manuscript being published. It is important to read and understand this contract, because it outlines what you can and cannot do with your own work in the future. What many people don’t know is that these contracts can be negotiated to give authors more rights in the work they produce (even better for you, working with faculty members to negotiate these publishing contracts is part of my job)! For example, many publishing contracts don’t allow you to share your work without an embargo period, if they allow you to share it at all. In these cases, uploading your work to a site such as Academia.edu may be a violation of the terms of the publishing agreement, whereas uploading it to an institutional repository may not be (or can be negotiated not to be). Several years ago, a major academic publisher actively went after Academia.edu, requiring it to take down all of the publisher’s content that had been illegally uploaded, much to the surprise and dismay of the authors.

“progress” © peter honeyman. Used under a CC BY-NC license

Finally, Academia.edu’s latest tactics are troubling to me from the standpoint of privacy and intellectual freedom. Personally and professionally, I find it distressing that a private company, which doesn’t adhere to the same professional ethics that librarians do, collects information about who is reading what. They then offer to share that information with you if you subscribe to their premium service, which includes analytics. And while the analytics dashboard doesn’t reveal readers’ names, it may provide enough information for you to know exactly who read your work. You may decide not to pay for Academia.edu’s analytics, but even so, what you view and download will still be tracked. This may not be dangerous for you or me (the “I’m not doing anything wrong, so I don’t care” argument), but in my opinion it sets a very bad precedent. What about the researcher who studies terrorism? Or whistleblowing? Or even climate change? How might people at these academic social media companies create profiles and make judgments about you based on what you are reading? And what will they do with the information they collect, especially if asked for it by government entities?

“Privacy” © Metro Centric. Used under a CC BY license

Yes, much of the same can be said for Amazon and Google, and I have my share of misgivings about them too.

In addition to SHAREOK, there are resources, such as SHERPA/RoMEO, that help authors better understand what they can/can’t do with their work; there are sites like the DOAJ and Think Check Submit that help you find the best publisher and journal for your work. There are tools such as Scholar’s Copyright Addendum Engine that make it easier to negotiate a publishing contract. My job (at least part of it) is to help you with these resources and assist you with activities such as academic publishing, copyright, and other tools of scholarly communication.

If you’re interested in reading other posts about this topic, you can start here:

Jen Waller is the Open Educational Resources and Scholarly Communication Coordinator at OU Libraries and can be reached at jenwaller@ou.edu.  


Preserving Virtual Reality at OU Libraries



University of Oklahoma Libraries
Oklahoma Virtual Academic Lab

With the public release of the Oculus Rift virtual reality (VR) headset in March 2016, and with many other affordable VR platforms following, VR is becoming the hottest new platform for video game content this year. At the same time, OU Libraries has been developing a VR workstation, the Oklahoma Virtual Academic Laboratory (OVAL), for academic teaching and research purposes. Instead of (just) shooting zombies or engaging in other ludic activities in virtual worlds, OU Libraries is exploring ways in which VR technology can contribute to learning objectives in classes and support new modes of scholarly inquiry. Already we have had great success integrating OVAL into classes and research projects in the fields of Biochemistry, Architecture and Interior Design, Anthropology, Marketing, English, and many more. Once faculty members see the benefits of visualizing 3D data in an immersive and interactive visualization environment,[i] they begin to see the many ways in which this platform can be used to enhance the teaching of visual and spatial skills in the classroom, and to support their own research endeavors with new tools for analyzing 3D artifacts, such as hominid skulls or biological specimens, that may be too large, small, or fragile to transport or handle directly. The applications of this technology are virtually limitless.

OVAL is still being developed, but its latest build is open to the entire OU community now. A total of eight VR stations are currently available across the OU-Norman campus at Bizzell Library (1st floor, in the EDGE), the Innovation Hub (3 Partners Place, South Campus), and the Law School Library. All of these VR workstations are networked, which enables collaborative classroom use and researcher collaboration in VR across campus.

Virtual Reality Demonstration at OU Libraries

OVAL was only recently released for public use on the OU campus in January 2016, yet we are already thinking about how to make it sustainable into the future. First of all, my position as “Postdoctoral Research Fellow in Virtual Reality Preservation and Archiving for the Sciences” asks me to think about the long-term preservation of research data related to the scholarly use of VR at OU. This fellowship is sponsored by CLIR, the Council on Library and Information Resources, with support from the Sloan Foundation. As of now, common standards and best practices for the archiving and preservation of VR-related data have not been widely adopted, which poses a variety of preservation challenges. On one hand, we have a range of 3D file formats with different degrees of adoption in the VR community, each presenting particular preservation concerns; on the other hand, we have the OVAL platform itself, which combines hardware from several commercial vendors (Oculus, Leap Motion, 3Dconnexion, and others), some proprietary software (Unity3D), and custom code developed in-house. Thus, thinking about how to preserve the 3D objects is closely tied to thinking about how to preserve the technology needed to display them consistently. If a researcher publishes a paper based on an analysis conducted using a VR simulation of a 3D object, then for the sake of reproducibility we need to be able to bring back that 3D model and display it in the same manner. Any software or hardware change to the platform could alter the viewing environment in unexpected ways, so it is important for us to develop policies and procedures to document any changes that happen to the system.

While we cannot freeze the ever-changing flow of digital technology, in my position as research fellow, I am exploring the following possibilities for ensuring long-term access to 3D objects viewed in OVAL, as well as the OVAL platform itself:

  • Assessing the sustainability of different 3D file formats: This will enable us to select and implement only file formats that are likely to be widely supported into the future and are most likely to be suitable for long-term access.
  • Adopting a metadata schema for documenting 3D digital objects: By looking to existing 3D metadata standards and consulting with experts who have extensive experience working with 3D models, we hope to develop a metadata schema that can be used to describe 3D objects and document their technical attributes, intellectual property rights, usage restrictions, and provenance.
  • Developing methods for preserving and maintaining a collection of old OVAL code, such that we could go back to earlier versions of OVAL to display older content.
  • Containerization: In order to make earlier versions of the OVAL platform accessible as technology develops, we are looking into the possibility of using software containers, which create packages that contain all the code and system resources needed to run earlier versions, even as computer system configurations change.
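To make the metadata idea concrete, here is a minimal sketch of what a record for one 3D object might contain. The field names and values are purely illustrative assumptions, not an adopted standard; an actual schema would come out of the standards review and expert consultation described above.

```python
# Hypothetical metadata record for a 3D object viewed in OVAL.
# Every field name here is illustrative, not a settled schema.
record = {
    "identifier": "oval-object-0042",
    "title": "Scanned hominid skull (teaching cast)",
    "file_format": "OBJ",  # chosen after a sustainability assessment
    "capture_technique": "structured-light 3D scan",
    "rights": "CC BY-NC 4.0",
    "usage_restrictions": "classroom and research use only",
    "provenance": {
        # Ties the digital model back to the original artifact and
        # the technique used to create it, as the documentation
        # guidelines above require.
        "source_artifact": "physical cast, OU Anthropology collection",
        "digitized_by": "OU Libraries Emerging Technologies",
        "date_created": "2016-11-01",
    },
    "display_environment": {
        # Records which build of the platform displayed the object,
        # supporting reproducibility of published analyses.
        "oval_version": "0.9.2",
        "headset": "Oculus Rift CV1",
    },
}
```

A record like this makes the reproducibility requirement checkable: given a published claim, one could look up which OVAL build and which digitization technique produced the view being cited.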

Most, if not all, of these concerns depend on a complex of policies and practices of DOCUMENTATION! For instance, documenting which version of OVAL was used to view a particular artifact for a particular research project is related to documenting the relationships between the digital model, the original artifact, and the techniques used to create the digital model. Establishing effective guidelines for consistently and accurately documenting the relationships between the original, the digital copy, and the display platform ensures that any research claims made about data visualized in OVAL can be verified or reproduced by other researchers at different institutions into the future.

As you can see, we have our work cut out for us! Please follow this ongoing project on the project website (coming Winter 2016): vrpreservation.oucreate.com

By Zack Lischer-Katz
Research Fellow in Data Curation
University of Oklahoma Libraries

[i] Elizabeth Pober (OU College of Architecture) and Matthew Cook (OU Libraries) have identified the key visual-spatial benefits of viewing 3D objects in a VR environment in their 2016 article, “The Design & Development of an Immersive Learning System for Spatial Analysis and Visual Cognition,” available here: http://bit.ly/2gmPwbM

Measuring the Impact of Research



  • Online views and/or PDF downloads
  • Social media activity, e.g., Twitter or Facebook mentions
  • Discussion in blog posts
  • “Saves” to bookmarking or bibliographic management sites, e.g., Mendeley
  • Citations from sources besides academic journals, e.g., news stories

    In addition to providing information on different types of usage, altmetrics can help researchers see the impact of nontraditional types of publications. While you may be able to find scholarly citations to your journal articles easily enough, you often can’t do the same for your datasets, software, or blog posts. Altmetrics, however, can sometimes fill that information gap.

    Of course, altmetrics are still an emerging field, and researchers are still learning how to interpret the information that they provide. As has always been the case with citation metrics, it falls to individual researchers, departments, and academic disciplines to use them appropriately.

    Getting Started with Altmetrics:

    Deposit your work with OU’s institutional repository, SHAREOK. The Altmetric badge (from Altmetric.com) will automatically be added to each publication’s page.

    Create a profile with ImpactStory.

    Further Reading:

    By Molly Strothmann, Social & Behavioral Sciences Librarian, University of Oklahoma

    OSF Preprints Repository


    Open Science Framework

    The Center for Open Science has launched OSF Preprints, an open access repository of more than 350,000 preprints of scholarly publications.

    Researchers can upload preprints, create projects, and work collaboratively. With OSF for Meetings, academic organizations can register events and share documents with participants.

    Other key features include document-level metrics, data archiving, and project tracking.

    The open source infrastructure of OSF Preprints also supports three repositories: SocArXiv, PsyArXiv, and engrXiv. These preprint collections, available on independent platforms, specialize in social science, psychology and engineering scholarship, respectively.

    For the latest updates, follow @OSFramework.

    Visit OU Libraries’ Open Access page to learn more about author rights, open access, and copyright.

    By Kristal Boulden, Social Sciences & Humanities Librarian

    Digital Humanities Readings


    The application What is Digital Humanities?, created by Jason Heppler, puts the DH field into perspective. With each refresh of the screen, one gets a new definition of digital humanities, ranging from the laughable to the profound. For those who have been on the fringe of digital humanities, this application helps us realize that there are no hard and fast rules to being a digital humanist. I would argue that all one needs to be a digital humanist is curiosity and the fearlessness to explore and work with digital tools in one’s research and teaching. But still there is the question of where one begins. To gain more insight into how digital humanists work and think, I recommend the following readings as a possible starting point.

    Articles/Posts/Essays

    The Emergence of the Digital Humanities (as the Network is Everting)
    Steven E. Jones, 2016
    (From Debates in Digital Humanities)

    The Hermeneutics of Screwing Around; or What You DO with a Million Books
    Stephen Ramsay, 2014
    (From Pastplay: Teaching and Learning History with Technology)

    Digital Humanities: Where to start
    Jennifer L. Adams and Kevin G. Gunn
    C&RL News: October 2012

    APIs: How Machines Share and Expose Digital Collections

    Neoliberal Tools (and Archives): A Political History of Digital Humanities
    Daniel Allington, Sarah Brouillette, David Golumbia
    Los Angeles Review of Books: May 1, 2016

    Books online

    Debates in Digital Humanities
    Lauren F. Klein, Matthew K. Gold
    2016

    Digital Humanities
    Anne Burdick, Johanna Drucker, Peter Lunenfeld, Todd Presner, Jeffrey Schnapp
    2012

    A Companion to Digital Humanities
    ed. Susan Schreibman, Ray Siemens, John Unsworth
    2004

    — Tara Carlisle, Digital Scholarship Specialist, OU Libraries

    Navigating Research Bazaar


    On September 7 and 8, OU Libraries will host its 2nd Research Bazaar. It will be a packed two days of workshops, presentations, and discussions. While all of the sessions can be attended as standalone events, a few could be grouped together to create a map for developing a new skill. These charts are just a taste of what ResBaz has to offer. Please visit the event website to register and learn more!

    DATA MANAGEMENT

    chart of data management workshops
    Data Management Workshops

    DATA VISUALIZATION

    chart of data viz workshops
    Data Viz Workshops

    Content Management Systems/Web Publishing

    chart of cms workshops
    CMS/Web Publishing Workshops

     

    Is What You Do Data?


    Is What You Do Data? That was the question that my colleague Mark Laufersweiler tossed out to our planning committee as we discussed the theme for our upcoming event, ResBaz (Research Bazaar). Data is the plural of the Latin datum, meaning “something given.” While we often ascribe data to the sciences, all of us are data creators and data consumers. When we fill out a form, write an email, or take a picture, we create data, or what some would refer to as empirical evidence. We all create and work with data more than we realize. The next question, then, is: how do we manage that data, especially in our academic careers as students and teachers?

    We decided that we would offer a series of hands-on workshops and presentations that addressed best practices for creating, collecting, organizing, analyzing, mapping, annotating, and parsing data. We specifically wanted to communicate to students and faculty that this event is for everyone, even those who say they don’t work with data.

    This year’s ResBaz, therefore, offers a host of workshops to give us the tools and methods that will help us work effectively with our particular sets of data. Whether it involves analyzing a writer’s life works, mapping battles in a war, or analyzing experimental data, all research needs some kind of digital tool to help harness large amounts of information.

    ResBaz consists of two days of hands-on workshops that provide opportunities to try new software or to rethink how we are organizing our data. Both days (9/7-9/8) will include lunch and afternoon receptions to give everyone an opportunity to share ideas and connect. The final session on the second day will wrap up with a panel discussion featuring four OU scholars (listed below) who will talk about the different ways they use data.


    Wednesday, September 7th – Thursday, September 8th
    Full schedule and to register

    Discussion Panel: September 8th, 2:00 – 3:00 pm
    (Bizzell Memorial Library, Lower Level 1 Community Room)
    Dr. Laura Bartley, Department of Microbiology and Plant Biology
    Dr. Robert Cichewicz, Department of Chemistry and Biochemistry
    Dr. Henry Neeman, Supercomputing Ctr for Education and Research
    Dr. Sam Huskey, Department of Classics and Letters

    We look forward to seeing you at ResBaz!

    By Tara Carlisle, Digital Scholarship Specialist, OU Libraries

    Text Mining Oklahoma’s Newspapers


    The Oklahoma Historical Society Research Division’s website, Gateway to Oklahoma History, is a fantastic, freely available source that includes digitized Oklahoma newspapers from the 1840s to the 1920s. Like many digitization projects, it’s a work in progress, but already contains a wealth of valuable information.

    The basic search screen, shown here, allows you to search the full text of the database, metadata, title, subject and creator. The “Explore” function in the upper right corner allows you to browse locations, dates, and titles.

    OKgateway

    When I first encountered this database, I did what many of us do, but may be ashamed to admit: looked up my family name. Unfortunately, the first newspaper result is about a bank robber who shares my last name and who may or may not be related. I quickly tried to think of ways to refine my search to the decent Scriveners, but saw no easy way out. Additionally, “scrivener” is a legal term and a profession that appears fairly frequently in historic newspapers.

    Luckily, I remembered my father’s mother’s maiden name, Malahy, and quickly found a much nicer feel-good story: my grandmother had written a letter to Santa Claus that was published in a Shawnee newspaper in 1917, when she was 7 years old.


    Whatever type of research you are doing, here are some helpful hints for the Gateway to Oklahoma History:

    • When you do a keyword search, your terms will be highlighted in yellow within the digitized document.
    • If the term doesn’t appear on the first page, be sure to check the other pages.
      • I prefer to do this by clicking on the front page of the newspaper, and then clicking on “zoom/full page.”
      • At that point, I can easily scroll through the pages until I find the yellow highlighting, or zoom in on the page so I can actually read it.
    • Keep in mind that optical character recognition is not perfect. For example, when I searched for the name Malahy, the word “salary” came up a lot.
    • In this database, like in most that have digitized materials, there is very little after the year 1923.
    • Even though my last name is not nearly as common as some, I still run into many “scriveners” when I do a keyword search because the word was more commonly used in the past.
      • When doing research in any digital database, have patience and be flexible with your keywords.
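As a rough illustration of why OCR noise produces hits like “salary” for “Malahy,” Python’s standard-library difflib can score how similar two strings are. The threshold here is illustrative; the Gateway’s actual search internals are not documented in this post.

```python
from difflib import SequenceMatcher

# Similarity between a family-name search term and the common word an
# OCR engine could confuse it with. The two strings share the "ala"
# core and a final "y", which makes them near neighbors.
ratio = SequenceMatcher(None, "malahy", "salary").ratio()
print(f"similarity of 'malahy' vs 'salary': {ratio:.2f}")
```

A score this high between a surname and an everyday word is exactly the situation where the advice above applies: be patient and flexible with keywords.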

    By Laurie Scrivener, History and Area Studies Librarian

    On the Value of Sharing Failure and Ignorance


    I have gone down many blind alleys in my text-mining project, and I don’t think of them as failures; they are experiments that didn’t work out. The project involves mining the text of the U.S. horror genre television show Supernatural (The CW network, 2005-present) to try to demonstrate, in an objective fashion, the quality of the dialogue. I’ve hit roadblocks and met dead ends in every phase of the project so far, from creating the data, to learning computer-assisted analysis and other tools, to defining the research questions. And of course the darkest alley of all is the one I haven’t reached yet: what if the results don’t back me up?

    The first and by far the most time- and labor-intensive step in the process was creating the datasets – in this case, the dialogue from the first 10 years of the program, comprising 217 episodes (the SPN Corpus), the dialogue from one of the main characters (the Dean Corpus), and the dialogue from the other main character (the Sam Corpus). I attempted to create the SPN Corpus a few different ways: using transcripts of aired episodes made by fans and stripping out everything that wasn’t dialogue and associated speakers; starting with files used in closed captioning and dubbing, stripping the time stamps (which could be done automatically using one of the programs available on the internet for that purpose), and adding the speakers; and enlisting the help of a graduate student who valiantly tried to write a program to automate the process by magically comparing the transcript and caption files. The first method is time-consuming. The second method turned out to be even more time-consuming and not reliably accurate. The third method worked to some extent, but only, at best, about 75% of the time; going through and fixing errors was the most time-consuming of all. This last experiment is one I regret, because it took up quite a bit of the programmer’s time. I hope he is able to make use of what he learned writing the program.
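    For the caption-file route, the timestamp-stripping step can be sketched in a few lines of Python. This is a generic SRT-style example (cue numbers, `-->` timestamp lines), not the actual program used for the project, and the sample dialogue is just a familiar line from the show.

```python
import re

def strip_srt_cues(text):
    """Drop cue numbers and 'HH:MM:SS,mmm --> HH:MM:SS,mmm' timestamp
    lines from SRT-style captions, keeping only the spoken lines."""
    kept = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.isdigit():  # blank line or cue number
            continue
        if re.search(r"\d{2}:\d{2}:\d{2},\d{3}\s*-->", line):
            continue  # timestamp line
        kept.append(line)
    return kept

sample = """1
00:00:01,000 --> 00:00:03,000
Dad's on a hunting trip.

2
00:00:03,500 --> 00:00:05,000
And he hasn't been home in a few days."""

print(strip_srt_cues(sample))
```

    Note that this only removes the timing markup; attributing each line to a speaker, the genuinely slow part described above, still has to be done by hand or by a smarter program.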

    There were other stumbling blocks, the most notable of which I encountered when creating the separate Sam and Dean corpora. There are numerous times in the series when one or both of the characters was possessed, supernaturally influenced, impersonated, hallucinating, sent to an alternate reality, or body-swapped. I wavered considerably while deciding what exactly to count as dialogue spoken by Sam or Dean, and not some version of them. I consulted some fans, got a few opinions, and tried to devise a set of criteria. In the end, there were no rules I could apply universally, and some of the decisions I made were based on instinct. But there was quite a bit of mind-changing and stress in the decision-making.

    Incidentally, from what I can tell, an actual corpus linguist wouldn’t sweat these small details. It likely wouldn’t make much difference to the results. But whether it’s the librarian in me or the fan or a combination of the two, I want precision to the extent possible.

    At various points in this project I have shared my experience. I published a chapter on issues with creating the data and finding and learning a corpus analysis toolkit. I gave a few presentations at a couple of THATCamps and a Research Bazaar (ResBaz), mostly on how to use developer Laurence Anthony’s AntConc, the tool I selected to analyze my corpora. And at each of these points, I learned something. Before writing about or presenting on AntConc I needed to learn how to use it, and at each stage I learned a bit more about computer-assisted textual analysis and corpus linguistics generally. I did a good deal of reading, playing around, and corresponding with Dr. Anthony. I progressed from being able to demonstrate only the most basic concordance and word frequency tools to becoming sufficiently knowledgeable – but not expert – at explaining the more advanced tools in the AntConc kit. I recall trying to show how to compare the text of Frankenstein to Dracula at my first THATCamp and getting completely confused about which was the reference corpus and what I was seeing in the results, until someone in the audience showed me. It was a complete “duh” moment, more so because I was the presenter. However, no one left the room in a huff, and I didn’t hear any murmurs to the effect that the presenter was incompetent. I credit the atmosphere of the THATCamp for this; the nature of the format allows for some stumbling. We’re all learning, after all.
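    For anyone puzzled by the reference-corpus idea that tripped me up: a keyword list ranks words that are unusually frequent in a target corpus compared with a reference corpus. The toy Python sketch below illustrates the concept with a simple smoothed frequency ratio – AntConc itself offers proper statistics such as log-likelihood, and the two miniature “corpora” here are invented for illustration.

```python
from collections import Counter

def keywords(target_text, reference_text, top=5):
    """Rank words that are over-represented in the target corpus
    relative to a reference corpus, using a simple ratio of
    relative frequencies with add-one smoothing."""
    tgt = Counter(target_text.lower().split())
    ref = Counter(reference_text.lower().split())
    tgt_total, ref_total = sum(tgt.values()), sum(ref.values())
    scores = {}
    for word, count in tgt.items():
        tgt_rel = count / tgt_total
        # Smooth so words absent from the reference don't divide by zero
        ref_rel = (ref.get(word, 0) + 1) / (ref_total + 1)
        scores[word] = tgt_rel / ref_rel
    return sorted(scores, key=scores.get, reverse=True)[:top]

print(keywords("monster monster hunt road", "road trip trip map", top=1))
```

    Swap the two arguments and the “keywords” invert – which is precisely the confusion I had on stage.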

    It was while presenting at the ResBaz that I learned some extremely useful skills, and I only learned them because I was open about what I didn’t know. I wanted to create word lists of adjectives, nouns, and verbs used by Sam and Dean, separately. I’d found an easy part-of-speech tagger, but I had two problems: first, the tagger created individual files and placed them in the same folder as the untagged files. That was 217 files I had to pull out of a folder. Second, I didn’t want the same word to appear more than once, since I’m concerned with the breadth of the vocabulary, not word frequency. But I didn’t know how to deal with the duplicates. I could make out some sotto voce discussion between two attendees. They obviously knew something I didn’t, but seemed hesitant to say so. I’m sure they were being polite and didn’t want to call me on my ignorance. But I assured them that I’d be more than grateful to hear what they had to say, no matter how elementary it might seem to them. If these two guys hadn’t happened to be at my little presentation, I would never have learned how to concatenate files using GitBash or how to deduplicate using Excel.

    At the very least, sharing ignorance can confirm that what you don’t know, others also don’t know. Maybe that tool you were hoping for doesn’t exist yet. I asked a few text analysis experts at my second THATCamp if they knew of a tool that could identify synonyms in a word list. No one did. But of course, if anyone reading this knows of one, I hope you’ll share it with me!
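    Since no ready-made synonym-spotting tool turned up, here is a hypothetical sketch of how one might work: if each word can be mapped to one or more concept IDs (in the real world, WordNet synsets via a library like NLTK would be the obvious source), then grouping words that share an ID surfaces the synonyms hiding in a word list. The tiny thesaurus below is hand-made purely for illustration.

```python
from collections import defaultdict

def group_synonyms(word_list, thesaurus):
    """Group words from a word list that share a concept ID.
    `thesaurus` maps each word to a set of concept IDs."""
    by_concept = defaultdict(set)
    for word in word_list:
        for concept in thesaurus.get(word, ()):
            by_concept[concept].add(word)
    # Keep only concepts where two or more listed words coincide
    return [sorted(group) for group in by_concept.values() if len(group) > 1]

# A tiny hand-made thesaurus, for illustration only.
thesaurus = {
    "angry": {"anger"}, "furious": {"anger"},
    "scared": {"fear"}, "afraid": {"fear"},
    "car": {"vehicle"},
}
print(group_synonyms(["angry", "furious", "scared", "car"], thesaurus))
```

    The hard part, of course, is the thesaurus itself – which is exactly why I was hoping someone had already built the tool.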

    By Liorah Golomb, Humanities Librarian, University of Oklahoma
