Your core responsibility will be to maintain and scale our analytics infrastructure as our data volume and needs continue to grow rapidly. This is a high-impact role: you will drive initiatives that affect teams and decisions across the company. You'll be a great fit if you thrive when given ownership, as you will be the key decision-maker on architecture and implementation.
Main duties include: database design, programming, building front-end tools, creating workflows, and system automation
Architect systems and end-to-end solutions that provide fast, efficient, and reliable interfaces to heterogeneous data and metadata for internal users of the analytics infrastructure.
Automate existing processes and create systems that favor self-service data consumption.
Own the quality of our analytics data.
Implement a robust monitoring and logging framework that guarantees traceability when incidents inevitably occur.
Evaluate whether the best solution for each problem at hand is to build, buy, or contract out the work.
Interface with data scientists, analysts, product managers, and all other customers of the analytics infrastructure to understand their needs and expand the infrastructure as we grow.
BS/BA in Computer Science, Engineering, or a relevant technical field, with 2–3 years of experience as a software, data, front-end, or full-stack engineer
Strong Python, Java, and Linux skills
Ability to manage data warehouse plans and communicate them to internal clients.
At least 4 years of experience as a Data Engineer, or in a role that required expertise in data pipeline technologies.
Strong overall programming skills; able to write modular, maintainable, high-quality code
Strong web programming skills (HTML, CSS, PHP)
Experience with one or more data visualization tools or libraries such as Tableau, Plotly, Infogram, or Material Design
Strong Python programming skills, especially with machine learning and data mining libraries (SciPy, pandas, NumPy)
Strong Linux and shell-scripting skills
Specialized experience with at least one of HDFS, EMR, Redshift, Spark, Flink, or Presto.
Experience with SQL RDBMS is required.
Experience with client/server and RESTful architectures and with tools like Jenkins and RunDeck
Experience with the basics of data mining, clustering, and classification, and comfort working effectively with large data matrices
Familiarity with CUDA, Blast, and machine learning frameworks such as TensorFlow, Torch, PyTorch, and DIGITS