Biomedical Data Management Engineer
Job Code: 4822
Job Grade: J
The Stanford Center for Genomics and Personalized Medicine (SCGPM), situated in the heart of Silicon Valley, has an exciting opportunity available for a motivated Biomedical Data Management Engineer to help us get on top of the rising tides of Big Data. The ideal person for this position is a systems engineer with a flair for data management and passion for health care. We are looking to hire two candidates for this position.
Research projects involving Big Data solutions are often proposed without adequate attention to the difficulties of large-scale data management. The masses of data that accumulate from high-throughput acquisition techniques (sequencing, mass spectrometry, and the like) require sophisticated, scalable systems to ensure that this data is accessible, searchable, shareable, and secured.
This engineer will address this problem by developing and implementing platform-agnostic biomedical data management solutions that help researchers manage, store, access, share, and protect large stores of biomedical data, and that account for that data’s security, logging, and auditing. These solutions often require extended dialogues with the lab(s) involved about the important metadata to be stored with the data, and about the methods by which the lab(s) will search for and access the particular subsets of data that their analyses require. The datasets we manage are often associated with separate, highly sensitive datasets, so our solutions often need to federate data storage between high-security clinical databases and medium-security research databases, with a seamless interface joining them.
Because many data management problems have similar solutions, we strive to produce reusable systems which can be applied to several different research data challenges. To facilitate the sharing of these reusable solutions with the research community, we often put the code for our solutions in the public domain.
Data from our collaborating laboratories range from medium-scale single-lab datasets to large-scale consortium-wide data repositories, so our data management solutions need to scale both up and down.
TEAM: Our SCGPM core bioinformatics team is a multi-disciplinary group composed of about a dozen scientists, engineers, and software developers with complementary backgrounds, each contributing their own expertise in managing and analyzing complex biomedical data [see http://med.stanford.edu/gbsc/scgpm-team.html]. Projects supported by this team include the Stanford Genome Sequencing Center, the Stanford Center for Excellence in Stem Cell Genomics (funded by CIRM), the VA’s Million Veteran Program, and the ENCODE Functional Genomics Project.
Much of the work of this team is facilitated by the SCGPM computational facility, which provides both a best-in-class on-premises HPC cluster with thousands of cores and 10PB of storage, and managed access to the Google Cloud Platform. This facility also provides computational resources to over 130 labs at Stanford and more than 1,300 researchers.
- The successful candidate must be able to quickly grasp the objectives of research projects and assemble solutions from a range of technologies, standards, and approaches. This person should have an innate desire to learn new methods and technologies and to adapt to the demands of fast-paced research, because our collaborating labs are on the cutting edge of work in their respective areas.
- Previous experience working in an academic environment is a plus.
- The successful candidate will comply with university and government health and safety regulations and policies.
Additional duties include:
- Compare, evaluate, and implement new features and technologies for data management, and integrate them into new solutions. Technologies should be evaluated against our collaborators' use cases in terms of features, ease of use, cost to the user, and long-term maintenance requirements.
- Conceptualize, design, implement, and develop solutions for complex systems/programs independently. Provide technical analysis, design, development, conversion, and implementation work. All development activities will involve presenting technical content to peers, collaborators, business teams, and industry partners.
- Document system builds and application configurations; maintain and update documentation as needed. Follow SCGPM team practices in software development and data analysis. All development activities are expected to be on GitHub, with documentation on the GitHub wiki or Read the Docs as necessary. White papers and peer-reviewed publications are expected.
- Work as a project leader, as needed, for projects of moderate complexity. This engineer is often expected to take ownership of every aspect of their projects.
- Become our domain expert for biomedical data management and serve as a technical resource in the use of public-domain applications developed by the SCGPM bioinformatics team.
- Mentor lower-level developers, end users, and, sometimes, summer interns.
- Other duties may also be assigned.
- Four-year degree in Computer Science, Computational Physics/Biology, Biomedical Informatics, Bioinformatics, or related field.
- Knowledge of and experience with biomedical data formats (FASTQ, FASTA, BAM, proteomics and metabolomics formats, etc.).
- Strong background and interest in one or more of: databases, systems engineering, Big Data, data mining, data warehousing.
- Expert programming skills in Python or Java are highly desired.
- Experience interacting with electronic health records.
- Experience with cloud computing is a plus.
- Ability to rapidly learn new technologies and programming techniques.
- Excellent verbal and written communication skills.
EDUCATION & EXPERIENCE (REQUIRED):
- Bachelor's degree and five years of relevant experience, or a combination of education and relevant experience.
KNOWLEDGE, SKILLS AND ABILITIES (REQUIRED):
- Expertise in designing, developing, testing, and deploying applications.
- Proficiency with application design and data modeling.
- Ability to define and solve logical problems for highly technical applications.
- Strong communication skills with both technical and non-technical clients.
- Ability to lead activities on structured team development projects.
- Ability to select, adapt, and effectively use a variety of programming methods.
- Knowledge of application domain.
PHYSICAL REQUIREMENTS:
- Constantly perform desk-based computer tasks.
- Frequently sit, grasp lightly, and perform fine manipulation.
- Occasionally stand, walk, and write by hand.
- Rarely use a telephone or lift, carry, push, or pull objects that weigh up to 10 pounds.
- Consistent with its obligations under the law, the University will provide reasonable accommodation to any employee with a disability who requires accommodation to perform the essential functions of his or her job.
May work extended hours, evenings, and weekends.
- Interpersonal Skills: Demonstrates the ability to work well with Stanford colleagues and clients and with external organizations.
- Promote Culture of Safety: Demonstrates commitment to personal responsibility and value for safety; communicates safety concerns; uses and promotes safe behaviors based on training and lessons learned.
- Subject to and expected to comply with all applicable University policies and procedures, including but not limited to the personnel policies and other policies found in the University's Administrative Guide, http://adminguide.stanford.edu.