Senior Data Engineer w/ AI & GraphRAG

Remote - Prague, Czechia
€40 - €45 per hour
Jimmy Technologies

If you are passionate about Data Engineering, cloud-native architectures, and AI applications, this role with our client offers an exciting opportunity to work on impactful projects!

Job description

We are looking for a Data Engineer to join the team of our Fortune 50 client, which is building scalable, production-grade data and AI-enabled systems. The main objective of the role is to design and develop scalable data pipelines, backend services, and APIs, and to build knowledge graph solutions, including graph data modeling, relationship mapping, and graph traversal logic.

This is a remote-first position with a required overlap with US working hours (2-6 PM CET).

Responsibilities

Data Modeling & ETL Development

Build ETL pipelines.
Design data models.
Design and implement GraphRAG solutions combining knowledge graphs, vector search, metadata, and LLM-based retrieval.
Create Source to Target Mappings (STMs) for ETL specifications.
Implement automated ingestion patterns, incremental (delta) updates, and streaming/CDC workflows.

Collaboration & Agile Development

Gather requirements, set targets, define interface specifications, and conduct design sessions.
Work closely with data consumers to ensure proper integration.
Adapt and learn in a fast-paced project environment.

Work Conditions

Start Date: ASAP
Location: Remote
Working hours: overlap with US time zones required, 2-6 PM CET
Contract: long-term, contract-based role, 6+ months

Job requirements

Strong SQL skills for ETL, data modeling, and performance tuning.
Experience with Neo4j, Cosmos DB or Gremlin API.
Proficiency in Python, especially for handling and flattening complex JSON structures.
Hands-on experience with Microsoft Fabric, Synapse, ADF, or similar cloud data stacks.
Knowledge of GraphRAG, retrieval systems, LLMs, embeddings and RAG pipelines.
Understanding of software engineering and testing practices within an Agile environment.
Experience with Data as Code: version control, small and regular commits, unit tests, CI/CD, and packaging, plus familiarity with containerization tools such as Docker (must have) and Kubernetes (a plus).
Excellent teamwork and communication skills.
Proficiency in English, with strong written and verbal communication skills.
Experience building efficient, high-performance data pipelines for real-time and batch data processing.

Nice to have:

Experience with Databricks or similar data platforms.
Experience with vector databases, Azure AI Search, or semantic search platforms.