Data Scraping Art History Survey Texts

Holland working at a table in the Broadhead Center, measuring an entry with a ruler

The Dean’s Summer Research Fellowship Grant was pivotal in supporting the research for my undergraduate honors thesis. With the funds provided, I was able to extend the scope of my data collection. To provide context, the first research question my thesis will address is: how do art historical survey texts change across editions? What are the demographics of the artists included and how, if at all, do they change over time? This past spring I examined Janson’s History of Art, cataloguing 9 editions of the text. This summer, I broadened my scope to include another highly regarded introductory art history text: Helen Gardner’s Art Through the Ages.

This grant allowed me to invest my time in manually data scraping 8 of the 16 editions of her text. This manual data scraping consists of recording every two-dimensional artwork produced after c. 1750. The variables I recorded this summer were as follows: Artist Name, Title of Work, Artist Gender, Artist Race, Artist Nationality, Medium, Date of Work, Length of Text of Work in CM, Width of Text of Work in CM, Height of Actual Work in CM, Width of Actual Work in CM, Height of Work in Text in CM, Width of Work in Text in CM, Location of Work, and Page Number. I use a ruler to measure the length and width of the text regarding a particular artwork, as well as the height and width of the image of the work within the text itself.
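To make the list of variables concrete, here is a minimal sketch of how one recorded entry might be represented in Python. The class name, field names, and the two area helpers are illustrative assumptions, not the format actually used for the thesis data; the areas are just one possible way to turn the ruler measurements into a single measure of space on the page.

```python
from dataclasses import dataclass


@dataclass
class ArtworkEntry:
    """One manually recorded artwork from a survey text (hypothetical layout)."""
    artist_name: str
    title: str
    artist_gender: str
    artist_race: str
    artist_nationality: str
    medium: str
    date_of_work: str        # kept as text so values like "c. 1750" fit
    text_length_cm: float    # length of the text discussing the work
    text_width_cm: float     # width of the text discussing the work
    actual_height_cm: float  # height of the physical artwork
    actual_width_cm: float   # width of the physical artwork
    image_height_cm: float   # height of the reproduction in the book
    image_width_cm: float    # width of the reproduction in the book
    location: str
    page_number: int

    def text_area_cm2(self) -> float:
        # Assumed derived measure: page area given to text about the work.
        return self.text_length_cm * self.text_width_cm

    def image_area_cm2(self) -> float:
        # Assumed derived measure: page area given to the reproduction.
        return self.image_height_cm * self.image_width_cm


# Placeholder values only, not a real entry from either textbook.
example = ArtworkEntry(
    artist_name="Artist A", title="Untitled", artist_gender="female",
    artist_race="unknown", artist_nationality="unknown", medium="oil on canvas",
    date_of_work="c. 1800", text_length_cm=10.0, text_width_cm=8.0,
    actual_height_cm=100.0, actual_width_cm=80.0,
    image_height_cm=5.0, image_width_cm=4.0,
    location="unknown", page_number=1,
)
```

Keeping the date as text rather than a number is a deliberate choice here, since approximate dates such as "c. 1750" appear in the recording rule itself.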

Collecting this data will allow me to examine how Helen Gardner’s Art Through the Ages changes over time, looking at the demographics of the artists included. I will then be able to compare the demographics of the artists included by Gardner with those included by Janson.
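One simple way to look at demographic change across editions is to compute, for each edition, the share of entries falling into each category of a variable such as Artist Gender. The sketch below assumes each recorded entry is a dictionary with an "edition" key alongside the demographic field; that key, the function name, and the sample rows are hypothetical placeholders, not real counts from Gardner or Janson.

```python
from collections import Counter, defaultdict


def demographic_shares(rows, field):
    """Return {edition: {category: share of entries}} for one demographic field."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["edition"]][row[field]] += 1

    shares = {}
    for edition, counter in counts.items():
        total = sum(counter.values())
        shares[edition] = {category: n / total for category, n in counter.items()}
    return shares


# Placeholder rows for illustration only; these are not real data.
sample_rows = [
    {"edition": 1, "gender": "male"},
    {"edition": 1, "gender": "male"},
    {"edition": 1, "gender": "male"},
    {"edition": 1, "gender": "female"},
    {"edition": 2, "gender": "male"},
    {"edition": 2, "gender": "female"},
]

shares_by_edition = demographic_shares(sample_rows, "gender")
```

The same function could be run separately on entries from each textbook, so the resulting shares can be placed side by side for comparison.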

The second research question I will work to answer with my manually scraped data is: which variables, if any, predict the likelihood of an artist’s inclusion in art history survey texts? Does a higher median auction price for an artist’s works influence the likelihood of inclusion? Is there a correlation between inclusion in exhibition spaces and inclusion in an introductory art history textbook? How do publications by art experts on an artist affect the likelihood of inclusion? I am working to create a statistical model that best predicts why an artist rises to fame through canonization.