Full Job Description
Guardant Health is a leading precision oncology company focused on helping conquer cancer globally through use of its proprietary blood tests, vast data sets and advanced analytics. Its Guardant Health Oncology Platform is designed to leverage its capabilities in technology, clinical development, regulatory and reimbursement to drive commercial adoption, improve patient clinical outcomes and lower healthcare costs. In pursuit of its goal to manage cancer across all stages of the disease, Guardant Health has launched multiple liquid biopsy-based tests, Guardant360® and GuardantOMNI® , for advanced stage cancer patients, which fuel its LUNAR program, which aims to address the needs of early stage cancer patients with neoadjuvant and adjuvant treatment selection, cancer survivors with surveillance, asymptomatic individuals eligible for cancer screening and individuals at a higher risk for developing cancer with early detection. Since its launch in 2014, Guardant360® has been used by more than 7,000 oncologists, over 50 biopharmaceutical companies and all 27 of the National Comprehensive Cancer Network centers.
The Data Platform team provides an enriched and valuable ecosystem of data sources and data services that drive innovation for internal and external systems. This team is dedicated to developing advanced technology (Big Data , Cloud, Machine Learning), systems and services to make data secure, rich, high quality, and fast therefore enabling Guardant the ability to leverage its data assets in an effective and timely manner to maximize technology/business development in the extraordinarily complex oncology diagnostic and therapeutic landscape.
We connect patients with clinical trials, help clinicians order our test and receive our clinical reports, and deliver valuable genomic datasets to researchers to help uncover important insights into treatment paradigms and drug discovery. Our technology stack reflects our views of using the best tools for the job, employing Scala,Java, Python along with Kubernetes, Apache Spark, Presto, Kafka, Docker, MySQL, MongoDB and a variety of AWS services to analyze and disseminate vast volumes of genomic data.
Drive the design and development of logical data models that power extraction and transformation of genomic data into various data assets.
Work collaboratively with business, bioinformatics scientists and translates business requirements into enterprise information architecture.
Drive the architecture of data integration from various clinical application and stores, research databases and external sources.
Develop the processes for updating and maintaining terminologies, and vocabularies including mapping from local to international standards when applicable.
Review & transform data from various sources, maintain quality of the data throughout the transformation process and deliver data in a standard format and structure.
Understand the machine learning requirements and algorithms in R from the Data science team and productionize those models, which will provide key insights on various datasets.
Design the data-lakes to transfer data from multiple hetergoneous sources into the big data platform.
Architect & build data flows to integrate heterogeneous in-house, external operational data sources and analytical data sources in light of the massive amount of data the company generates and processes.
Build data backend comprised of transactional, analytic and NoSQL databases that interact with various layers of data pipelines. Maintain a data lake that supports fast and reliable analytics over terabytes of data.
Develop and improve the current data architecture, data quality, monitoring and data availability.
Work with key stakeholders, including members of the executive, product, data and design teams to assist with data-related solutions and data infrastructure needs
Build data prototypes for iterative development. Evaluate various big data products to integrate into enterprise product road map.
Define best practices for data lineage, data pipeline across the board, meeting quality principles, regulatory compliance and business needs.
Support the data environment by releasing new features, resolving issues for users, and working with other technology teams.
Minimum 4 years of experience on Big Data Platform components such as Hadoop.
Minimum 3 years of experience designing data models, solutions in Genomics space.
Expertise in data, schema, data model, how to bring efficiency in Big data systems and strong hands-on SQL knowledge.
Strong knowledge of data analysis and databases.
Experience with programming languages such Java and and Scala.
Should be strong with cloud centric deployment architectures such as AWS or Azure and understand the complete stack.
Experience with application performance monitoring and assessment desired.
Understanding of automated QA needs related to Big data.
Understanding of various Visualization platform (Tableau, D3JS, others).
Expert knowledge of healthcare including: Clinical terms and concepts;
Experience with managing data in regulated healthcare environment (HIPAA compliant) is a mandatory.
Proficiency with agile or lean development practices.
Strong object-oriented design and analysis skills.
Has a strong aesthetic sensibility that supports clear visual communication of quantitative information.
Bachelor’s degree in Computer Science or related area.
Guardant Health is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, or protected veteran status and will not be discriminated against on the basis of disability.
All your information will be kept confidential according to EEO guidelines.
Please visit our career page at: http://www.guardanthealth.com/jobs/
To learn more about the information collected when you apply for a position at Guardant Health, Inc. and how it is used, please review our Privacy Notice for Job Applicants.