About the Role
LendingKart is seeking a highly motivated, hands-on Tech Architect to join its Data Engineering team. The ideal candidate will possess deep technical expertise, a proven track record of implementing robust data pipelines and ensuring data quality, and the ability to mentor and guide team members. This role requires an individual who is comfortable being actively involved in the coding, design, and deployment of complex data solutions, driving technical excellence, and ensuring the stability and performance of our data platform.
Some high-level things you would own, including but not limited to:
Architect the Roadmap: Define and own the technical vision for the data platform, making critical build-vs-buy decisions and selecting appropriate technologies for batch and streaming workloads.
-
Lakehouse Implementation: Lead the design and implementation of a robust Lakehouse Architecture, utilizing the Medallion structure (Bronze, Silver, Gold) to ensure data is organized, governed, and accessible.
-
Data Modeling: Collaborate with stakeholders across the organization to translate complex business logic into efficient, scalable data models (Star Schema, Snowflake Schema, Data Marts) tailored for high-performance analytics.
-
High-Performance ETL/ELT: Lead the design, development, and maintenance of scalable pipelines capable of handling massive datasets with low latency.
-
Real-Time Ingestion: Architect modern data ingestion strategies, focusing on Change Data Capture (CDC) mechanisms (using tools like Debezium or PeerDB) and event-driven architectures via Kafka or Google Pub/Sub.
-
Spark Optimization: Serve as the subject matter expert on Apache Spark (Batch and Structured Streaming), performing deep-dive tuning on memory management, shuffling, and partitioning to optimize costs and runtime.
-
Lead by Example: Remain hands-on (approx. 30-50% coding), writing advanced Python, Scala, and SQL to solve the most complex transformation challenges and build core frameworks for the wider team.
-
Technical Excellence: Foster a culture of engineering rigor by conducting in-depth code reviews, enforcing CI/CD best practices, and automating testing frameworks.
-
Mentorship: Mentor junior and mid-level engineers, guiding their career growth and technical upskilling.
-
Governance & Security: Implement strict protocols for data quality, lineage, role-based access control (RBAC), and compliance across the platform.
What you would possess already:
-
5 to 9 years of experience in Data Engineering.
-
Expert-level proficiency and hands-on experience in Python and Scala.
-
Proficiency in SQL, with advanced knowledge of window functions, complex grouping, and aggregation techniques.
-
Deep hands-on expertise with Apache Spark for large-scale data processing (batch and Structured Streaming ETL) on platforms such as AWS EMR and GCP Dataproc.
-
Proven experience working with and implementing the Lakehouse Architecture using the Medallion Model (Bronze, Silver, Gold layers).
-
Proven experience choosing the right data store for various data lakehouse use cases among S3/GCS/ADLS, MySQL, PostgreSQL, and MongoDB.
-
Solid understanding of various ETL/data streaming architectures, including Kappa and Lambda architectures.
-
Experience with data ingestion techniques, including Change Data Capture (CDC), specifically using tools like Debezium, from data sources such as S3/GCS, MySQL, PostgreSQL, and MongoDB.
-
Hands-on experience with data observability and monitoring using Grafana and Prometheus.
-
Hands-on experience with Message Brokers such as Kafka and/or Google Pub/Sub.
-
Strong knowledge of Data Modeling principles (designing Fact tables, Dimension tables, and Data Marts).
-
Hands-on experience with deployment using Kubernetes.
-
Experience working in a multi-cloud environment, specifically with AWS and/or GCP.
-
Practical experience with cloud-native or modern query engines like BigQuery, Trino, and/or Redshift.