Databricks Architect (m/f)
PwC
- Type: Full Time
- Level: Lead
- Location: Remote (Remote OK)
Job Description
Job Description & Summary

Join PwC's ATS Data & AI team and help build modern lakehouse platforms and analytics solutions on Databricks. We're growing quickly and opening multiple Databricks opportunities across seniority levels, whether you're a hands-on Databricks Engineer or a Databricks Solution Architect who can design end-to-end platforms and lead technical delivery. If you enjoy solving real client problems, working with cutting-edge cloud data stacks, and delivering production-grade solutions, we'd like to meet you.

As a Databricks Architect, you will act as a technical authority for Databricks-based delivery: supporting teams with design decisions, reviewing implementations, and driving best practices for production. This role is ideal for someone who enjoys being close to engineering work while influencing how platforms and pipelines are built at scale.

What you'll work on (our team's project examples):
- Building scalable ingestion and transformation pipelines on Databricks (batch and streaming where relevant)
- Designing lakehouse architectures using Delta Lake and best-practice patterns (e.g., medallion)
- Productionizing workloads with strong DevOps/CI-CD, automation, monitoring, and cost optimization
- Supporting governance and secure access patterns (Unity Catalog where applicable)
- Working with global teams and clients to deliver measurable outcomes

Key Responsibilities:
- Architecture & Standards: Define and promote lakehouse architecture patterns (Delta Lake, medallion approach, batch/streaming where relevant) and engineering standards for Databricks delivery.
- Governance & Security: Help implement secure-by-design patterns (RBAC, secrets management, networking), and support governance approaches such as Unity Catalog where applicable (see the access sketch at the end of this posting).
- DevOps & Delivery Readiness: Support production-grade delivery through CI/CD practices for Databricks assets and automation tooling.
- Operational Excellence: Improve monitoring/alerting, reliability practices, and runbooks to support stable production operations.
- Hands-on Troubleshooting: Diagnose and resolve issues across Databricks clusters, jobs/workflows, notebooks, libraries, and permissions; perform root cause analysis.
- Performance & Cost Optimization: Tune Spark workloads (partitioning, skew, joins, caching, file sizing), improve cluster configurations and policies, and optimize runtime/cost (see the tuning sketch below).
- Enablement: Coach engineers, run knowledge sessions, review code/notebooks, and document reusable playbooks.

Required Skills & Experience:
- Strong hands-on experience with Databricks in production
- Deep knowledge of Apache Spark and Delta Lake (e.g., MERGE, OPTIMIZE/ZORDER, compaction, small-files handling; see the Delta maintenance sketch below)
- Strong programming skills in Python/PySpark (Scala is a plus) and solid SQL
- Experience working with cloud data platforms (Azure/AWS/GCP) and cloud storage (e.g., ADLS/S3)
- Ability to communicate clearly with engineers and stakeholders and take ownership in ambiguous situations

Preferred Skills:
- Unity Catalog an
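To make the Delta Lake items above concrete, here is a minimal PySpark sketch of the named operations (MERGE upserts, OPTIMIZE/ZORDER, small-files handling). It assumes a Databricks runtime with Delta Lake available; all table, column, and path names are hypothetical.

```python
# Hypothetical tables/paths, shown only to illustrate the operations above.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Incoming batch of records (hypothetical source path).
updates = spark.read.format("json").load("/mnt/raw/orders/")

# Upsert into the target table by business key (Delta MERGE).
target = DeltaTable.forName(spark, "silver.orders")
(target.alias("t")
    .merge(updates.alias("u"), "t.order_id = u.order_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute())

# Compact small files and co-locate a frequently filtered column (OPTIMIZE/ZORDER).
spark.sql("OPTIMIZE silver.orders ZORDER BY (customer_id)")

# Drop data files no longer referenced by the table (default retention applies).
spark.sql("VACUUM silver.orders")
```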
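The tuning sketch referenced under Performance & Cost Optimization: a minimal example of the levers listed there (partitioning, skew, join strategy, file sizing), again with hypothetical table names.

```python
# Hypothetical tables, shown only to illustrate the tuning levers above.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Let adaptive query execution coalesce shuffle partitions and split skewed ones.
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.skewJoin.enabled", "true")

facts = spark.table("silver.orders")      # large fact table (hypothetical)
dims = spark.table("silver.customers")    # small dimension table (hypothetical)

# Broadcast the small side so the large table avoids a shuffle.
joined = facts.join(F.broadcast(dims), "customer_id")

# Control the number (and hence size) of output files before writing.
(joined.repartition(64, "order_date")
    .write.format("delta")
    .mode("overwrite")
    .saveAsTable("gold.orders_enriched"))
```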
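The access sketch referenced under Governance & Security: a minimal illustration of Unity Catalog grants and secret-scope usage. Catalog, schema, group, and secret names are hypothetical, and `dbutils` is only available inside Databricks notebooks/jobs.

```python
# Hypothetical catalog/schema/group names, shown only to illustrate the pattern.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Unity Catalog access control via SQL grants.
spark.sql("GRANT USE CATALOG ON CATALOG main TO `data_analysts`")
spark.sql("GRANT USE SCHEMA ON SCHEMA main.silver TO `data_analysts`")
spark.sql("GRANT SELECT ON TABLE main.silver.orders TO `data_analysts`")

# Read credentials from a secret scope instead of hard-coding them.
# (Databricks-provided utility; scope and key names are hypothetical.)
jdbc_password = dbutils.secrets.get(scope="prod-kv", key="warehouse-password")
```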