UNGA Research Dashboard

This project was funded by the National Science Foundation (Language Across Cultures: The Communication Styles of World Leaders, #2117009) from 2021 to 2025.

Project team

Leah C. Windsor, PI
Director, Institute for Intelligent Systems (IIS)
Associate Professor of Applied Linguistics
Institute for Intelligent Systems + Department of English

Alistair Windsor, co-PI
Associate Professor
Department of Mathematical Sciences
Faculty Affiliate, IIS

Miriam Van Mersbergen, co-PI
Associate Professor
School of Communication Sciences and Disorders
Faculty Affiliate, IIS

Nicholas Simon, co-PI
Associate Professor
Department of Psychology
Faculty Affiliate, IIS

J. Elliott Casal
Assistant Professor of Applied Linguistics
Department of English
Faculty Affiliate, IIS

Deborah Tollefsen
Dean, Graduate School
Professor
Department of Psychology
Faculty Affiliate, IIS

Shaun Gallagher
Professor
Department of Psychology
Faculty Affiliate, IIS

James “Rusty” Haner
Senior Software Developer, IIS

August White
Senior Software Developer, IIS


Dashboard features

This dashboard provides linguistic features from three analytic approaches: Coh-Metrix (107 features), LIWC (118 features), and LDA (50 features). Coh-Metrix provides text-level analysis of lexical and semantic word features. 1,2 LIWC is a word-counting program that gives precise measures of usage rates. 3,4 LDA (also known as “topic modeling”) assigns words to groups based on their proximity and co-occurrence in the corpus. 5,6

What the linguistic features mean

These three approaches to analyzing language span both word-order dependent (Coh-Metrix) and bag-of-words (LIWC, LDA) methods.
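As a rough illustration of the distinction, the Python sketch below compares two made-up sentences that contain the same words in a different order. The sentences and function names are assumptions chosen for the example; none of this is code from Coh-Metrix, LIWC, or LDA.

```python
from collections import Counter


def tokenize(text):
    """Lowercase a sentence and split it on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bag_of_words(text):
    """Return word counts only; the order of the words is discarded."""
    return Counter(tokenize(text))


# Two invented sentences with identical words but opposite meanings.
a = "the delegate thanked the president"
b = "the president thanked the delegate"

# Identical word counts: a bag-of-words method cannot tell these apart.
same_bag = bag_of_words(a) == bag_of_words(b)

# Different token sequences: a word-order dependent method can.
same_order = tokenize(a) == tokenize(b)

print(same_bag, same_order)  # prints: True False
```

The two sentences describe different events, yet their word counts are identical, which is the information a bag-of-words approach gives up in exchange for speed and simplicity.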

Word-order dependent

Word-order dependent approaches preserve the relationships between words in a text (sentence, paragraph, document, etc.) and can be understood in the context of syntax parse trees.

The 2011 article on Coh-Metrix provides a description of the features generated by this software. Coh-Metrix was developed at The University of Memphis in the Institute for Intelligent Systems by Art Graesser, Max Louwerse, and Danielle McNamara. Five high-level aggregate features are generated by a principal components analysis (PCA): syntax simplicity, word concreteness, narrativity, deep cohesion, and shallow cohesion. Syntactic features have been used in many disciplines to understand contextual aspects of language, such as more concrete or abstract word choices and the use of more or less complex phrases.

Bag-of-words

LIWC (Linguistic Inquiry and Word Count) is a dictionary-based program developed at UT Austin by James Pennebaker. The program offers a quick and powerful way to analyze language by counting how often words from each dictionary category appear in a text, primarily closed-class words such as pronouns, conjunctions, articles, and prepositions.
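The dictionary-counting idea can be sketched in a few lines of Python. The four-category dictionary below is a hypothetical stand-in (the actual LIWC dictionaries are much larger and are not reproduced here), and reporting each category as a percentage of all words is an assumption made for this illustration.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary for illustration only; not the LIWC dictionary.
CATEGORIES = {
    "pronoun": {"i", "we", "you", "they", "it"},
    "article": {"a", "an", "the"},
    "conjunction": {"and", "but", "or"},
    "preposition": {"in", "of", "to", "for", "with"},
}


def category_rates(text):
    """Return each category's share of all words in the text, as a percentage."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:
        for category, members in CATEGORIES.items():
            if word in members:
                counts[category] += 1
    total = len(words)
    return {c: (100.0 * counts[c] / total if total else 0.0) for c in CATEGORIES}


# An invented sentence of 13 words.
rates = category_rates("We stand with the people of the world and we call for peace")
for category, rate in rates.items():
    print(f"{category}: {rate:.1f}%")
```

Because the counts are normalized by text length, speeches of different lengths can be compared on the same scale.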

LDA (latent Dirichlet allocation) is an openly available method that can be run in many ways, including in R, Python, and many off-the-shelf programs. LDA assigns words to categories based on their proximity in the corpus, and these groups of terms form “topics”. Researchers then qualitatively assign a name to each topic, based on the relationships among its words and their expertise in the subject matter.
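To make the grouping step more concrete, here is a minimal pure-Python sketch of a collapsed Gibbs sampler, one common way to fit an LDA model. It is not necessarily the procedure used to produce the 50 topics in this dashboard; the toy corpus, the hyperparameters (alpha, beta), the iteration count, and all function names are assumptions for illustration.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA over tokenized documents."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # tokens per topic, per document
    topic_word = [Counter() for _ in range(num_topics)]   # word counts per topic
    topic_total = [0] * num_topics                        # total tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        row = []
        for w in doc:
            k = rng.randrange(num_topics)
            row.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(row)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic given all other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return topic_word, doc_topic


def top_words(topic_word, topic, n=3):
    """Most frequent words currently assigned to a topic."""
    return [w for w, c in topic_word[topic].most_common(n) if c > 0]


# Invented toy corpus; real input would be full speeches.
docs = [
    ["peace", "security", "council", "peace"],
    ["climate", "emissions", "energy", "climate"],
    ["security", "council", "peace", "treaty"],
    ["energy", "climate", "emissions", "carbon"],
]
topic_word, doc_topic = fit_lda(docs, num_topics=2)
for k in range(2):
    print(k, top_words(topic_word, k))
```

The sampler only returns numbered topics with their most frequent words. Deciding that one topic is about, say, security and another about climate is the qualitative naming step that researchers carry out afterward.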

Open the dashboard

How to use the dashboard

1. Determine which features you would like to graph from Coh-Metrix, LIWC, and/or LDA.
2. Select the appropriate X-axis variable, Y-axis variable, chart type, and chart aggregation (if applicable). You can select a line chart, bar graph, or world map.
3. You can download the data as a CSV file, or as a chart or map.
4. To reset the parameters and model a new feature or representation, click “Clear ISO filter.”
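Once the data have been downloaded as a CSV file, a chart aggregation similar to the one in step 2 can be reproduced offline. The sketch below is a hedged example only: the column names (year, iso, feature_value) and the sample rows are invented placeholders, and the layout of the actual export may differ.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Placeholder rows standing in for a downloaded file; the values are invented.
SAMPLE_CSV = """year,iso,feature_value
2000,USA,1.0
2000,FRA,3.0
2001,USA,2.0
2001,FRA,4.0
"""


def mean_by_year(csv_file, value_column="feature_value"):
    """Average one feature across countries for each year."""
    by_year = defaultdict(list)
    for row in csv.DictReader(csv_file):
        by_year[int(row["year"])].append(float(row[value_column]))
    return {year: mean(values) for year, values in sorted(by_year.items())}


yearly = mean_by_year(io.StringIO(SAMPLE_CSV))
print(yearly)  # prints: {2000: 2.0, 2001: 3.0}
```

To run this on a real download, replace `io.StringIO(SAMPLE_CSV)` with an opened file, for example `open("export.csv", newline="")`, and adjust the column names to match the file's header row.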


References