Text Mining Oklahoma’s Newspapers

The Oklahoma Historical Society Research Division’s website, Gateway to Oklahoma History, is a fantastic, freely available source that includes digitized Oklahoma newspapers from the 1840s to the 1920s. Like many digitization projects, it’s a work in progress, but it already contains a wealth of valuable information.

The basic search screen, shown here, allows you to search by full text, metadata, title, subject, or creator. The “Explore” function in the upper right corner allows you to browse by location, date, and title.

[Image: Gateway to Oklahoma History basic search screen]

When I first encountered this database, I did what many of us do but may be ashamed to admit: I looked up my family name. Unfortunately, the first newspaper result was about a bank robber who shares my last name and who may or may not be related. I tried to think of ways to narrow my search to the decent Scriveners, but saw no easy way out. To make matters worse, “scrivener” is also a legal term and a profession that appears fairly frequently in historic newspapers.

Luckily, I remembered my father’s mother’s maiden name, Malahy, and quickly found a much nicer feel-good story: my grandmother had written a letter to Santa Claus that was published in a Shawnee newspaper in 1917, when she was 7 years old.

Whatever type of research you are doing, here are some helpful hints for the Gateway to Oklahoma History:

  • When you do a keyword search, your terms will be highlighted in yellow within the digitized document.
  • If the term doesn’t appear on the first page, be sure to check the other pages.
    • I prefer to do this by clicking on the front page of the newspaper, and then clicking on “zoom/full page.”
    • At that point, I can easily scroll through the pages until I find the yellow highlighting, or zoom in on the page so I can actually read it.
  • Keep in mind that optical character recognition is not perfect. For example, when I searched for the name Malahy, the word “salary” came up a lot.
  • As in most databases of digitized materials, there is very little here after 1923.
  • Even though my last name is not nearly as common as some, I still run into many “scriveners” when I do a keyword search because the word was more commonly used in the past.
    • When doing research in any digital database, have patience and be flexible with your keywords.
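
If you ever work with OCR text outside the Gateway's own search box, the "Malahy" versus "salary" mix-up above can be illustrated with a small Python sketch. This is only an illustration under stated assumptions, not a description of how the Gateway's search works: the function name `similar_tokens`, the sample word list, and the 0.6 threshold are all hypothetical choices, and the comparison uses the standard library's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def similar_tokens(keyword, tokens, threshold=0.6):
    """Return (token, score) pairs whose similarity to keyword meets the threshold.

    Hypothetical helper for illustration; the threshold is an arbitrary choice.
    """
    keyword = keyword.lower()
    matches = []
    for token in tokens:
        # ratio() is a similarity score between 0.0 and 1.0
        score = SequenceMatcher(None, keyword, token.lower()).ratio()
        if score >= threshold:
            matches.append((token, round(score, 2)))
    # Most similar tokens first
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


# Made-up sample words, standing in for text pulled from a scanned page
sample_tokens = ["Malahy", "salary", "Shawnee", "Santa", "Malahv", "letter"]
print(similar_tokens("Malahy", sample_tokens))
```

With a loose threshold, look-alike words such as "salary" pass the filter alongside the name itself, which is roughly the kind of noise to expect from imperfect character recognition. Raising the threshold trims the false hits but may also drop genuine misreadings of the name.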

By Laurie Scrivener, History and Area Studies Librarian
