SanjayArepally/README.md

👋 About Me

I’m a Data & AI Engineer (Automation & Integration workflows) with hands-on experience building reliable data pipelines, workflow automation, and scalable data platforms across cloud environments. I focus on designing systems that reduce manual effort, improve data reliability, and support analytics and machine learning use cases.

My work sits at the intersection of data engineering, automation, and cloud infrastructure, spanning ingestion, transformation, orchestration, monitoring, and optimization. I enjoy solving ambiguous data problems, improving system reliability, and building pipelines that downstream teams can trust.

I’m particularly interested in data platform engineering, workflow automation, and AI-enabled systems, and I continuously upskill by experimenting with new tools, patterns, and architectures.

🔧 What I Work On

Designing end-to-end data pipelines (ingestion → transformation → orchestration → monitoring)

Building scalable ETL/ELT workflows using Python, SQL, Spark, and cloud services

Automating data workflows and operational processes to reduce manual effort

Implementing data quality checks, validation, and lineage-aware pipelines

Optimizing performance, cost, and reliability of data systems

Supporting analytics and ML teams with clean, well-modeled datasets

🧠 Core Skills

Data Engineering & Platforms

Python, SQL, PySpark

Apache Spark, Databricks

Data modeling, partitioning, incremental processing

Cloud & Infrastructure

AWS (S3, Glue, Lambda, Redshift, EC2)

Infrastructure as Code (Terraform)

CI/CD, environment automation

Orchestration & Reliability

Airflow, scheduling, dependency management

Monitoring, alerting, retries, and backfills

Debugging data and pipeline failures

Automation & AI

Workflow automation and process optimization

AI-assisted data workflows and RAG-based systems

API integrations and system interoperability

🎯 How I Think About Engineering

Treat data as a product, not just a pipeline

Optimize for reliability, clarity, and scalability

Prefer simple, observable systems over complex ones

Design for failure, not just success

Communicate clearly with technical and non-technical teams

📫 Let’s Connect

GitHub: Sanjay Arepally
