Overview
Dice is the leading career destination for tech experts at every stage of their careers. Our client, Clarkstech, is seeking the following. Apply via Dice today!
Job Location: Berkeley Heights, NJ or Alpharetta, GA
We are seeking highly motivated Senior Data Engineers to join a cross-functional team building next-generation data and machine learning platforms focused on recommendation systems, advanced analytics, and intelligent decision-making.
This role goes beyond traditional ETL development. We are looking for engineers who can design scalable data solutions while contributing directly to machine learning workflows that transform large-scale merchant datasets into actionable insights and personalized recommendations.
You will work at the intersection of Data Engineering, Machine Learning, and MLOps, partnering with data scientists, product teams, and platform engineers to build end-to-end systems that power analytics and recommendation capabilities.
What You'll Do
Data Engineering & Platform Development
Design, build, and maintain scalable data pipelines processing large-scale merchant datasets.
Develop robust data models and analytical datasets to support downstream machine learning use cases.
Implement batch and near real-time data processing solutions.
Ensure high standards for data quality, reliability, and performance across the platform.
Optimize data workflows for scalability, maintainability, and cost efficiency.
Feature Engineering & Machine Learning Integration
Build feature pipelines and model-ready datasets used for recommendation systems and predictive models.
Collaborate with data scientists to operationalize machine learning models.
Develop and integrate model inference workflows into production systems.
Support experimentation frameworks, model evaluation processes, and performance tracking.
Translate business problems into scalable data and ML solutions.
Recommendation Systems Development
Contribute to the design and implementation of recommendation engines using approaches such as:
  Nearest Neighbor techniques
  Collaborative Filtering
  Content-Based Recommendations
  Embedding-based methods
  Machine Learning-driven recommendation models
Support model tuning, validation, and continuous improvement.
MLOps & Production Operations
Build and maintain machine learning deployment pipelines.
Automate model training, deployment, and promotion processes.
Implement monitoring and observability for data pipelines and ML services.
Manage model lifecycle activities, including versioning, retraining, and rollback strategies.
Partner with platform teams to ensure production readiness and operational excellence.
Required Qualifications
Bachelor's or Master's degree in Computer Science, Data Engineering, Information Systems, Statistics, or a related field.
5+ years of experience in Data Engineering or related disciplines.
Strong hands-on programming experience with Python.
Experience building scalable data pipelines and data processing frameworks.
Experience developing analytical datasets and performing feature engineering.
Solid understanding of machine learning workflows and model integration.
Experience supporting machine learning models in production environments.
Strong SQL skills and experience working with large datasets.
Experience designing data models for analytics and machine learning applications.
AWS Experience (Required)
Hands-on experience with AWS services including:
Amazon S3
AWS Glue
Amazon SageMaker
Amazon ECS and/or Fargate
AWS IAM
CloudWatch
Event-driven architectures and orchestration patterns
Experience deploying and operating data and ML workloads within AWS environments is essential.
Data Platform Experience
Experience working with modern data platforms, including:
Snowflake
Data warehousing concepts
Data lake architectures
Metadata management and governance practices
Performance optimization techniques
Preferred Qualifications
Experience building recommendation systems in production environments.
Familiarity with MLOps principles and frameworks.
Experience with CI/CD pipelines for machine learning deployments.
Knowledge of containerization technologies such as Docker.
Exposure to orchestration tools such as Airflow or similar workflow platforms.
Experience with distributed processing technologies such as Spark.
Nice to Have
The following skills are considered advantageous but are not required:
Experience with Generative AI, Agentic AI, or Large Language Model (LLM) applications.
Familiarity with retrieval-augmented generation (RAG) architectures.
Experience integrating AI agents into analytical workflows.
Knowledge of vector databases and semantic search techniques.
What Success Looks Like
Successful candidates will be able to:
Build reliable, scalable data pipelines that transform raw merchant data into trusted analytical assets.
Create feature engineering workflows that accelerate machine learning development.
Operationalize recommendation models that improve business outcomes and customer experiences.
Contribute across the full lifecycle of data and machine learning systems, from ingestion through production inference.
Collaborate effectively within a pod structure, bringing strengths in data engineering, machine learning, or both.
Ideal Candidate Profile
We are looking for Data Engineers who can move beyond traditional pipeline development and help build intelligent systems that generate insights, power recommendations, and drive data-informed decision making.
You thrive in environments where data engineering, machine learning, and production operations converge, and you enjoy solving complex problems using modern cloud-native technologies.
Company:
Jobs via Dice
Level of experience (years):
Senior (5+ years of experience)