Invited Prof at Diderot (LARCA/CLILLAC-ARP joint initiative) | |
---|---|
http://humanitiesdata.org/ | |
professional webpage | |
Email: | taylor.arnold@acm.org |
Taylor will present Geospatial Analysis and Text Mining as covered in his R textbook (45 minutes). Geospatial Data and Text Mining within the UMS RIATE and the CIST research group (45 minutes). The discussion will offer a comparison with other R packages developed for cartography and with current ongoing projects. Discussant for spatial data: Claude Grasland (CIST & Géographie-cités). Discussant for text mining: Nicolas Ballier.
Discussants: Maria Zimina, Antonio Balvet, Nicolas Ballier
A 45-minute talk presenting the project, plus a short teaser for the book Humanities Data in R.
Between 1935 and 1945, the U.S. Federal Government employed a group of photographers to help build support for New Deal programs and U.S. entry into World War II. The group created a corpus of over 170,000 documentary photographs showing daily life in all of the then 48 states. The collection, known as the Farm Security Administration-Office of War Information (FSA-OWI) archive for the two government agencies that housed the photographers, is now a common source of historical evidence for scholars of 20th-century America. It has been digitized by the Library of Congress; as a work of the federal government, these digitized images are in the public domain. Photogrammar, which I helped found and currently co-direct, is a project for analyzing and visualizing the FSA-OWI collection. The web-based portion of the project (photogrammar.yale.edu) visualizes the photographs over historical maps, by historical classification schemes, and through an analysis of color composition. Computational techniques, predominantly from computer vision, have been used to infer and reconstruct metadata that is also displayed on the site. Details of some of these techniques are described in the DHQ article “Uncovering Latent Metadata in the FSA-OWI Photographic Archive”.
Photogrammar is supported by grants from the National Endowment for the Humanities and the American Council of Learned Societies, and has been very well received since the website’s launch in 2014. We have been invited to present our work at over a dozen institutions, including the Museum of Modern Art and the Smithsonian’s Archives of American Art. The site has attracted nearly 1 million unique visitors and over 6 million page views over the past 12 months. It has received coverage from various media outlets, including the BBC, Atlantic, Slate, Le Monde, and NPR.
Discussant: François Brunet (LARCA)
Summary: This talk will highlight the theoretical and practical applications of data visualization to the study of cultural data. We start by describing a formal structure for data visualization and for the data science process more generally. We then draw connections between these methods and specific formalisms in humanistic fields, including theories of the archive and knowledge production. The second part of the talk focuses on specific applications of exploratory data visualization to the study of the movement of people in New York City. We see how various visual techniques both confirm some “common sense” conclusions and challenge other widely held notions. The talk will finish by extending these techniques to the more complex data format of networks. Once again, data visualization will provide a powerful tool for extracting and displaying new forms of knowledge.
Biographical sketch: Taylor Arnold is Assistant Professor of Statistics at the University of Richmond. Prior appointments include Lecturer of Statistics at Yale University and Senior Scientist at AT&T Labs Research. His work centres on the computational and computing challenges of doing data analysis on large-scale datasets, with a focus on text and image processing. Arnold’s text Humanities Data in R (Springer 2015) addresses these challenges in the context of humanities applications, a main area of application for his work. He holds grants supporting related work from the National Endowment for the Humanities (NEH) and the American Council of Learned Societies (ACLS). A forthcoming text, A Computational Approach to Statistical Learning (CRC Press 2018), further explores the technical issues of applying these techniques at scale.
A 30-minute plenary talk for PhD students in literature and civilisation: a teaser for the more intensive R sessions to take place on May 22-24. This talk is part of a two-day LARCA event for PhD students (18-19 May).
Please register online: https://beta.doodle.com/poll/nrfkqq2pq8em5qna#table
Director: Prof. Natalie Kübler
Centre de Linguistique Inter-langues,
de Lexicologie, de Linguistique Anglaise
et de Corpus-Atelier de Recherche sur la Parole
EA 3967
8 place Paul Ricœur
75013 Paris
Case courrier 7002
5 rue Thomas Mann
75205 Paris cedex 13