Data Wrangler

USC - Los Angeles, CA


The Lawrence J. Ellison Institute for Transformative Medicine of USC is a broadly multidisciplinary group of researchers (cell biologists, physical scientists, mass spectroscopists, clinicians, computational modelers) engaged in highly collaborative, translational research. Our vision is to address important clinical questions through biological and engineering innovation tightly coupled with clinically driven science.

Internal Title: Statistician I

Department: Lawrence J. Ellison Institute for Transformative Medicine of USC

Location: Health Sciences Campus/Santa Monica

Employment Type: Full-Time

Position responsibilities

We are seeking an experienced data wrangler with a strong interest in cancer biology to complement our Data Management and Analytics team. In this role, you will develop and implement robust and effective data management solutions as part of a multi-disciplinary team of clinicians, biologists, computer scientists, mathematicians, statisticians, and economists. Together, you will formulate and execute data management plans and ensure the quality and integrity of the data throughout its lifecycle. Data sets may include high-dimensional multi-omic data, time series from live cell microscopy, and complex clinical data from electronic medical records.

Specific responsibilities:
Support the lab researchers in the collection, processing and analysis of experimental, clinical and simulated biomedical data to ensure that it is properly stored both for investigation and for longer-term preservation; explore and set up appropriate storage resources and ensure that appropriate metadata are employed.
Implement and maintain a robust, Institute-wide biomedical data management infrastructure; coordinate with partners and collaborators to assist with data exchange and federation.
Participate in the analysis of data, especially with regard to the more practical aspects like quality assurance, cleaning, aggregation and integration (wrangling).
Test and run data processing pipelines.

Demonstrable Personal Attributes:
Ability to thrive in a fast-paced, multi-disciplinary environment.
Strong organizational skills.
Attention to detail.
Interest in technological advances and solutions.

Knowledge and Skills:
Ability to input, output, and manipulate data using the R statistical analysis platform, including familiarity with the tidyverse ecosystem and the rmarkdown package. Sample code required.
Familiarity with data management practices in biomedical research, including metadata and ontology usages and reproducible research concepts and practices.
Working knowledge of relational databases and other storage solutions and resources (e.g. NoSQL databases, HDF5, …).
Experience with Linux shell scripting, command line tools and administrative tasks.
Facility with Microsoft Excel, Word, and PowerPoint.
Familiarity with software development processes and tools a plus.
Good writing and verbal communication abilities.

Preferred Education and Experience:
Master’s degree in computer science, computational biology, engineering, information sciences or a related discipline. Combined experience/education as substitute for minimum education.
Two or more years’ experience in biomedical data collection, transformation, management, and warehousing.
Posting Salary Range: $58,245 - $97,071

Percentage of Time: 100%

Minimum Education: Master's degree, Combined experience/education as substitute for minimum education
Minimum Experience: 2 years, Combined education/experience as substitute for minimum experience
Minimum Field of Expertise: Biometry, Biostatistics or Statistics; mainframe computer and PC experience; experience with SAS, Epilog, BMDP, GLIM or SPSS.