Ngrams

The Ngram tool in Gale Digital Scholar Lab visualises the frequency of the most commonly occurring words or phrases within the documents of a Content Set. 

An Ngram is a sequence of N consecutive words, where N is the number of words in the sequence. A 1gram, or unigram, is a single word; a 2gram, or bigram, is a two-word phrase; and so on. In Gale Digital Scholar Lab, the Ngram tool can be used to compile search terms or to find words that are often associated with one another. The tool can help determine whether a Content Set contains specific terminology or phrases that would otherwise be difficult to trace without intensive reading of individual documents. By showing how writers used vocabulary and phrases, Ngrams can give users useful feedback on their close and distant reading strategies.
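To make the definition concrete, here is a minimal Python sketch of counting Ngrams across a small set of documents. It is an illustration only and is not how Gale Digital Scholar Lab implements the tool; the function names (`ngrams`, `top_ngrams`), the simple tokenisation rule, and the sample sentences are all assumptions made for this example.

```python
import re
from collections import Counter


def ngrams(text, n):
    """Return every run of n consecutive words in text as a phrase."""
    # Assumed tokenisation: lowercase, keep letters and apostrophes only.
    tokens = re.findall(r"[a-z']+", text.lower())
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams(documents, n, k=10):
    """Count Ngrams across all documents and return the k most frequent."""
    counts = Counter()
    for doc in documents:
        counts.update(ngrams(doc, n))
    return counts.most_common(k)


# Hypothetical two-document "Content Set", used only for illustration.
docs = [
    "The fair opened on Sunday.",
    "Should the fair open on Sunday?",
]

print(top_ngrams(docs, 1, k=3))  # most frequent unigrams
print(top_ngrams(docs, 2, k=3))  # most frequent bigrams
```

Changing `n` from 1 to 2 switches from single words to two-word phrases, which mirrors the unigram and bigram distinction described above.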

The Ngram tool is versatile: it can also inform the configuration of the Cleaning Tool by identifying commonly misspelled or mis-captured words in the documents.
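As a rough illustration of that idea, the sketch below counts unigrams and lists frequent words that do not appear in a known vocabulary, so that a user could review them as possible misspellings or capture errors. This is an assumption-laden example rather than a description of the Cleaning Tool itself: the function name `cleaning_candidates`, the `min_count` threshold, the tiny vocabulary, and the sample text are all hypothetical.

```python
import re
from collections import Counter


def cleaning_candidates(documents, known_words, min_count=2):
    """List frequent words missing from known_words, most frequent first."""
    counts = Counter()
    for doc in documents:
        # Assumed tokenisation: lowercase alphabetic runs only.
        counts.update(re.findall(r"[a-z]+", doc.lower()))
    return [
        (word, count)
        for word, count in counts.most_common()
        if count >= min_count and word not in known_words
    ]


# Hypothetical vocabulary and documents, used only for illustration.
vocabulary = {"the", "fair", "and", "city", "in"}
sample_docs = ["tbe fair and tbe city", "the fair in tbe city"]

print(cleaning_candidates(sample_docs, vocabulary))
```

A real vocabulary would need to be far larger, and every flagged word should be checked by hand, since names and period spellings will also be absent from a modern word list.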

Learn more about Ngrams in the Gale Digital Scholar Lab Learning Center.

PROJECTS

The Books He Carried: A Study of Lindsley Foote Hall's Reading Habits on His Travels | Julianne Peeling (University of Washington)

Of Christ and Capital: The 'Sunday Question' in the 1893 Columbian Exposition | Marie Peeples, Ian Reinl, Elise Tomasian, Danielle Worthy (University of Washington)

Welcome to the Digital Serpent Lab | Sid, Truc, Karen, Adelina & Courtney (University of Washington)

Roberto Calvi's trial: Suicide or Murder? | Aryan Shah, Livia Ngo, Kody Chantavong and Megan Skrobut (University of Washington)

DISCLAIMER

Any views and opinions expressed in these essays are those of the author in question, and any views or opinions from the original source material are those of the publication in question. Gale, part of Cengage Group, provides facsimile reproductions of original sources and does not endorse or dispute the content contained in them. Author affiliation and information within them are correct as of the original publication date.