Posted about 1 year ago
AI Services is a consulting division of DataRobot. As a Data Engineer at DataRobot AI Services, you will work on real-world problems across a variety of industries, with a primary focus on the telecom industry. You will develop data pipelines for modeling and put models into production. As part of your work, you will interact with data scientists, software engineers, and business users.
An ideal candidate should be able to both perform deeply technical work and communicate with non-technical stakeholders.
- Build end-to-end ETL pipelines to enable training and operationalization of machine learning models.
- Build code for ingesting data from relational databases, NoSQL databases, flat files, and message queues into big data solutions.
- Integrate code produced by data scientists into data pipelines.
- Build code or configuration to push data from big data solutions into reporting tools and other software.
- Diagnose and mitigate performance issues.
- Communicate with customer IT personnel to clarify technical details.
- Document the implementation.
- Assist with the installation and setup of big data solutions and DataRobot.
- 2+ years of production experience building MapReduce jobs, Spark scripts, Oozie workflows, or other Hadoop-based applications.
- Good understanding of distributed data processing.
- Experience creating Spark scripts in either Python or Scala.
- Experience diagnosing and mitigating performance issues in Spark scripts.
- Experience setting up and querying Hive, Presto, or Impala databases.
- Experience diagnosing and mitigating performance issues in Hive, Presto, or Impala queries.
- Nice to have: experience creating streaming solutions and reporting tools.
Individuals seeking employment at DataRobot are considered without regard to race, color, religion, national origin, age, sex, marital status, ancestry, physical or mental disability, veteran status, gender identity, or sexual orientation.