Research Data Engineer, Computational Social Science Lab

Pennsylvania, United States
08 Nov 2020
End of advertisement period
08 Jan 2021
Contract Type
Full Time

University Overview

The University of Pennsylvania, the largest private employer in Philadelphia, is a world-renowned leader in education, research, and innovation. This historic Ivy League school consistently ranks among the top 10 universities in the annual U.S. News & World Report survey. Penn has 12 highly regarded schools that provide opportunities for undergraduate, graduate, and continuing education, all influenced by Penn’s distinctive interdisciplinary approach to scholarship and learning.

Penn offers a unique working environment within the city of Philadelphia. The University is situated on a beautiful urban campus, with easy access to a range of educational, cultural, and recreational activities. With its historical significance and landmarks, lively cultural offerings, and wide variety of atmospheres, Philadelphia is the perfect place to call home for work and play.

The University offers a competitive benefits package that includes excellent healthcare and tuition benefits for employees and their families, generous retirement benefits, a wide variety of professional development opportunities, supportive work and family benefits, a wealth of health and wellness programs and resources, and much more.

Posted Job Title

Research Data Engineer – Computational Social Science Lab

Job Profile Title

Data Analyst D

Job Description Summary

The Computational Social Science Lab will host the research initiatives of Duncan Watts, a University Professor affiliated with the School of Engineering and Applied Science, Wharton School, and Annenberg School for Communication. Combining their resources within the Penn Integrates Knowledge Program, the schools will support the Lab’s projects in the rapidly emerging field of computational social science (CSS).

The Lab will build an innovative data infrastructure of high-quality data on phenomena of social importance from multiple providers, which requires a complex data collection and validation component. To promote mass collaboration in an open-science format among research teams from academia and industry, the Lab needs to ensure secure but flexible access to these data.

The Research Data Engineer will manage the ingestion and processing of data from external providers, implement data quality assurance procedures, manage relationships with Amazon Web Services (AWS) and other external computing clusters, and supervise third-party developers.

Job Description

The Research Data Engineer for the Computational Social Science Lab will:


  • Manage the ingestion of data from external providers
  • Develop and implement rigorous data validation procedures within and across datasets from various providers
  • Design and implement cost-effective procedures for organizing, documenting, cleaning, and storing datasets
  • Administer Watts-Lab AWS and similar external computing clusters
  • In coordination with research project managers, maintain data processing pipelines and ensure the delivery of clean and validated data within project timelines
  • Design, build, and maintain the backend of the Lab’s websites and dashboards
  • Supervise third-party developers of the Lab’s data infrastructure


Qualifications

Bachelor’s degree in computer science, software engineering, or a related field and 3-5 years of work experience in data management, data-intensive research projects, and collaboration with software developers. Subject matter expertise in one or more technical areas: data architecture, or data engineering within “big data” systems such as Hadoop and SQL. Advanced working knowledge of SQL and relational databases, and familiarity with data architecture optimization in RDBMS. Working knowledge of a range of AWS products and experience with cloud-related technologies such as S3, EC2, EMR, IAM policies, Athena, and Glue. Experience with a range of object-oriented languages such as Python and R. Strong organizational skills and an ability to manage multiple complex tasks with diverse audiences and varying deadlines.

Experience with assembling and processing natural language datasets and experience with streaming technologies such as Kafka, Flink, or Spark Streaming are preferred.

Job Location - City, State

Philadelphia, Pennsylvania

Department / School

School of Engineering and Applied Science

Pay Range

$59,703.00 - $168,837.00

Affirmative Action Statement 

Penn adheres to a policy that prohibits discrimination on the basis of race, color, sex, sexual orientation, gender identity, religion, creed, national or ethnic origin, citizenship status, age, disability, veteran status, or any other legally protected class.

Special Requirements 

Background check required after a conditional job offer is made. Consideration of the background check will be tailored to the requirements of the job.
