Senior IT Data Engineer

Adaptimmune - Philadelphia, PA

Full-time

Estimated: $98,000 - $130,000 a year

Data engineering is the practice of making the appropriate data accessible and available to data scientists, business analytics teams and decision makers for the purposes of downstream scientific analysis. The senior IT data engineer will work on building, operating, and scaling data solutions, commercial data platforms and IT tools. This is a hands-on IT role that blends software development, data engineering and bioinformatics. He/she will work closely with scientific and IT experts, business analytics teams, and decision makers to enable data access and integrated data reuse, and to substantially improve time-to-solution for data and analytics initiatives. Scope includes:

Understand data and analytics requirements to choose the right IT tools, ways of working and solutions for the job (e.g. speed versus reuse; ad-hoc versus production)
Develop IT frameworks, architectures, integration ETL schemes, databases, production pipelines, visualization and applications for large-scale data processing from data source ingestion to end user consumption
Apply data governance and data security requirements to solutions
This role will require working both creatively and collaboratively with IT experts, scientists, and line function analytical bioinformaticians. Scope of work will include evangelizing effective data engineering practices and promoting better understanding of data and analytics. The senior data engineer will support fast-paced ad-hoc data analysis, individual projects and longer-term enterprise-wide solutions.


1. Understand data and analytics requirements to choose the right IT tools, ways of working and solutions for the job (e.g. speed versus reuse; ad-hoc versus production)

2. Provide hands-on development of IT frameworks, architectures, integration ETL schemes, databases, production pipelines, visualization and applications for large-scale data processing

Cover the full path from data source ingestion to end-user visualization
Assemble large, complex data sets that meet functional and non-functional business requirements
Write IT requirements, design and test documents to support technical implementation
Set up and operate heavy lifting associated with engineering data for analytics
Move IT solutions effectively into production, and manage / optimize these solutions for end users
Automate manual processes for data preparation and integration, optimize data delivery, and re-design infrastructure for greater scalability to improve productivity
Streamline and prepare data for analysis through understanding of data flow and integration
Create and drive standards for data capture, storage, and transformation
3. Apply data governance and data security requirements to solutions

Participate in ensuring compliance and governance during data use
Ensure, through data governance and compliance initiatives, that data users and consumers use the data provisioned to them responsibly
Work with data governance teams (and information stewards within these teams) and participate in vetting and promoting content created in the business and by data scientists to the curated data catalog for governed reuse
4. Become a data and analytics “evangelist,” “data guru” and “fixer”. Promote the available data and analytics capabilities and expertise and educate users in leveraging these capabilities in achieving their business goals.

5. Other duties as assigned by management in support of a rapidly growing company



A bachelor's or master's degree in computer science, statistics, applied mathematics, computational biology, data management, data science, information systems, bioinformatics or a related field
Combination of IT software engineering, package application programming, data engineering, data integration, and data visualization skills with data science or big data experience
Demonstrated 10 years of hands-on work experience in developing IT solutions in big data, small data, and/or complex data
Familiarity with popular commercial and open source pipeline, data visualization and analysis tools
Solid understanding of bioinformatics-computational experimental lifecycle and model design
Good understanding of in-process manufacture, research and clinical trial data
An understanding of tools for the analysis of high dimensional data
Ability to easily partner with business users and speak the language of data with the business
High-energy, confident, results-oriented, yet easygoing personality

Experience in Next Generation Sequencing-RNA sequencing data analysis and other bioinformatics tools
Deep understanding of computational methods, scripting and programming languages, and relevant concepts in cancer biology, immunology and/or genetics
Outstanding communication and partnering skills
Prior experience implementing centralized integrated data and analytics tools and solutions
Experience in biotechnology field
GxP experience
Immuno-oncology experience


Strong software engineering experience using computational programming languages (e.g. R, Python, Java, C++), pipeline lifecycle tools, and popular database programming languages (e.g. complex SQL, PL/SQL) for relational databases, operational data stores, ETL and data lakes
Strong ability to design, build and manage “production ready” data pipelines for data structures encompassing data transformation, data integration, data models, schemas and meta-data
Experience with integration of data from multiple data sources to support downstream scientific analysis
Strong experience in working with large, heterogeneous datasets in building and optimizing data pipelines, pipeline architectures and integrated datasets using traditional data integration technologies (e.g. ETL/ELT, data replication/CDC, message-oriented data movement, API design and access) and upcoming data ingestion and integration technologies such as stream data integration, complex event processing (CEP) and data virtualization
Open source and commercial software package experience
Basic experience working with popular data discovery, analytics and BI software tools such as Tableau, Qlik and Power BI for semantic-layer-based data discovery and end-user visualization
Experience in working with data science teams in refining and optimizing data science and machine learning models and algorithms
Demonstrated success in working with datasets to extract business value using popular data preparation tools to reduce or even automate parts of the tedious data preparation tasks
Basic experience in working with data governance/data quality and data security teams in moving data pipelines into production with appropriate data quality, governance and security standards and certification
Demonstrated ability to work across multiple deployment environments, including cloud (e.g. AWS), on-premises and hybrid; across multiple operating systems; and with containerization techniques such as Docker
Adept in agile methodologies and capable of applying DevOps and, increasingly, DataOps principles to data pipelines to improve the communication, integration, reuse and automation of data flows between data managers and consumers across an organization

Prior experience as a bioinformatician, biotech software programmer or data architect is a plus
Highly collaborative and supportive of business and of its ideals and strategies
Practical, principle-based approach to decision making, recommendations and problem solving
An understanding of the principles of oncology / immuno-oncology
Prior experience in the complex biotechnology and/or pharmaceutical industry