HARVARD UNIVERSITY

Data Science Fellow in Deep Learning

Location
Cambridge, Massachusetts (US)
Posted
13 Aug 2020
End of advertisement period
13 Oct 2020
Contract Type
Fixed Term
Hours
Full Time

School    Faculty of Arts and Sciences
Department/Area    The Institute for Quantitative Social Science

Position Description     

This posting is for a data science fellow in deep learning. The fellow will work with the PI, Professor Melissa Dell, to develop document layout analysis and natural language processing methods that will be used to create a computable database of the contents of historical newspapers across thousands of American communities. The resulting data will be used by social scientists to examine questions of central importance to the health and prosperity of American communities. The fellow will be an active participant in the Harvard research community and will have opportunities to develop their own research agenda.

The position requires a thorough knowledge of deep learning methods. An ideal candidate will have a strong background in document layout analysis and/or natural language processing. The position requires a Bachelor's degree. The fellow must be self-directed and able to apply the relevant research frontiers to this use case. The ideal candidate will be planning to apply to PhD programs in computer science or electrical engineering and would benefit from spending time working in a university setting on problems at the intersection of computer vision and natural language processing.

The position has a nine-month term (with a potential opportunity for extension, conditional on funding availability and performance). The start date is flexible. Interested candidates should send a CV and a one-page research statement detailing their experience with deep learning-based methods to melissadell@fas.harvard.edu.

More specifically, the project will develop deep learning-based document layout analysis methods and natural language processing methods to create a computable database for over 7,000 American community newspapers in all 50 states, encompassing over 12 million newspaper editions published between 1850 and 2010. The resulting publicly available American Communities Computable Newspaper Database will substantially enhance and democratize access to historical newspaper data by providing outputs from NLP analyses conducted on over one hundred million full article texts and headlines. This data can be used to address a wide variety of research questions, with a particular emphasis on the role of the media in promoting the rise and decline of trust in scientific medicine and public institutions across 20th-century America.

Basic Qualifications    
Thorough knowledge of deep learning methods required.

Background in document layout analysis and/or natural language processing required.

A Bachelor's degree is required.

The Fellow must be self-directed and able to apply the relevant research frontiers to this use case.

Additional Qualifications    
The ideal candidate will have a strong background in document layout analysis and/or natural language processing.

The ideal candidate will be planning to apply to PhD programs in computer science or electrical engineering and would benefit from spending time working in a university setting on problems at the intersection of computer vision and natural language processing.

Special Instructions    
TO APPLY:

PLEASE DO NOT APPLY ONLINE. Interested candidates should send a CV and a one-page research statement detailing their experience with deep learning-based methods to melissadell@fas.harvard.edu.

Only applicants who follow these instructions will be considered.

Contact Information    
Professor Melissa Dell

Contact Email    melissadell@fas.harvard.edu

Equal Opportunity Employer    
We are an equal opportunity employer, and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability status, protected veteran status, gender identity, sexual orientation, pregnancy and pregnancy-related conditions, or any other characteristic protected by law.