  • Utilizing Time Series Analysis using Facebook’s Prophet to Analyze Firearm Permits

    The debate over gun ownership and regulation in the United States remains a contentious issue, with arguments often centered around interpretations of the Second Amendment, public safety concerns, and the effectiveness of existing policies. To inform these discussions, examining the available data on firearm sales and background checks is crucial.

    In this blog post, we’ll be diving into the FBI’s National Instant Criminal Background Check System (NICS) data, as compiled by the Data Liberation Project. NICS was mandated by the Brady Handgun Violence Prevention Act of 1993 and launched by the FBI in 1998. It’s used by Federal...

  • Convolutional Neural Networks and HDBSCAN for Tagging Handwritten Archival Material

    Handwritten archival materials present a unique challenge for automated tagging and categorization. Traditional optical character recognition (OCR) techniques often still struggle with the variability and complexity of handwritten text. As a result, most handwritten material is manually tagged and categorized by human archivists, which is a time-consuming and labor-intensive process. In many cases, historical documents use a script or writing style that is no longer in common use. Most students are no longer taught cursive, making it difficult for younger generations to read historical documents. While transformer models have shown promise in transcribing handwritten text, they require large...

  • An Undue Burden: A Look at Digital Humanities Conference Travel

    About a year ago, I updated my GIS time-lapse map looking at the disproportionate travel that digital humanities scholars in the Global South undertook to attend the annual Digital Humanities conference. The map itself uses data from the Index of DH Conferences put together by Matthew D. Lincoln, Scott B. Weingart, and Nickoal Eichmann-Kalwara. Over time, there have been updates to the Index, and so I’ve been able to update my map accordingly. To create the visualization, I geocoded the institutional affiliations of authors in the dataset and the conference hosting institutions using ArcGIS’ API through tidygeocoder....

  • Mustaches, Unibrows, and Shalwar Khameezes: How I Learned New Stereotypes about Myself through AI

    Recently, The Verge posted a story about attempting to create an image of an Asian man with a white woman. If you read the original article, you will find that the author struggled to generate the images, with the image generators attempting to give the woman “Asian features.” Because the majority of these image generators are trained on datasets that predominantly feature white individuals, the AI struggled to accurately represent an Asian man without relying on stereotypes and tropes. In addition, stereotypical images of Asian men are prominent throughout the Internet.

    At one point, Meta banned keywords that were related to Asians....

  • Prompt Exploration of AI Image Diffusion Models for Students

    For the last few years, I have had my students explore AI image generation using text prompts. It’s been a fascinating journey to see how the technology has progressed and what students can create with it. Before the technology became so prominent, I used to have the students hold a small competition using Runway. At first, it was very difficult for students to get good results, as the models were not as advanced.

    However, as the technology improved, students were able to create increasingly impressive images with their prompts. Perhaps more importantly, the “wow” factor had gone....

  • Unaccompanied Migrant Children Part 2

    For the last two blog posts, I have worked on the data set that the New York Times released about unaccompanied migrant children in the United States. These are children who have crossed the border into the United States without their parents or legal guardians. In Part 1, I explored the overall trends in the data. The second blog post centers on a searchable database. The hope is that people will be able to look at the data to get a better understanding of the situation and the challenges these children face in their communities. For instance, after searching...

  • Unaccompanied Migrant Children Part 1

    Recently, the New York Times released data looking at the number of unaccompanied migrant children who have crossed into the United States. The U.S. Department of Health and Human Services keeps this data, and the Times was able to gain access to it through the Freedom of Information Act. Conditions for these children are often dire, with many facing violence, abuse, and poverty. As the New York Times has noted, Americans have used these children to build roofs and work the night shift of dangerous jobs. Frequently, federal agencies ignored numerous warnings about the...

  • Hello World (Seriously!)

    While this site may currently look like spam, I am in the process of moving the website from WordPress to Jekyll and hosting it on GitHub. I find WordPress significantly easier to use than Jekyll. Still, I believe that Jekyll will provide a better experience for the data science and cultural analytics work I plan to share on this site. With WordPress, when you have a lot of embedded images or interactive visualizations, the site can become slow to load and navigate. Jekyll, on the other hand, generates static pages that load quickly and efficiently. Additionally, hosting the...