Senior Data Pipeline Architect - Apache Beam & Google Cloud Platform
Core Information:
- Start Date: Immediate openings available
- Position: Senior Data Pipeline Architect - Apache Beam & Google Cloud Platform
- Compensation: Competitive salary and comprehensive benefits package
- Location: Remote (United States)
- Company: Workwarp
About Us:
Workwarp is a dynamic and innovative technology solutions provider, partnering with industry-leading companies to deliver transformative data-driven outcomes. We are passionate about leveraging cutting-edge technologies, particularly within the Google Cloud ecosystem, to build scalable, reliable, and high-performing data pipelines. We foster a collaborative and supportive environment where talented individuals can thrive, grow their skills, and make a real impact. We are currently seeking a highly motivated and experienced Senior Data Pipeline Architect with deep expertise in Apache Beam and Google Cloud Platform to join our growing team.
The Opportunity:
As a Senior Data Pipeline Architect, you will play a pivotal role in designing, developing, and deploying robust and scalable data pipelines that power our clients' business intelligence and data analytics initiatives. You will be responsible for architecting end-to-end data solutions leveraging Apache Beam and the Google Cloud Platform, ensuring data quality, reliability, and performance. This is a remote-first position offering the flexibility to work from anywhere within the United States. You will collaborate closely with cross-functional teams, including data engineers, data scientists, and business analysts, to understand their needs and deliver impactful solutions. This role offers a significant opportunity for professional growth and the chance to work on challenging and rewarding projects.
Responsibilities:
- Data Pipeline Architecture & Design: Design and architect scalable, resilient, and efficient data pipelines using Apache Beam and Google Cloud Platform services. This includes defining data flow patterns, data transformations, and error handling strategies.
- Apache Beam Development: Develop and maintain high-quality, well-documented Apache Beam pipelines for both batch and stream processing. Leverage Dataflow Flex Templates and reusable data processing frameworks to meet diverse data processing requirements.
- Google Cloud Platform Expertise: Deeply leverage Google Cloud Platform services, including Cloud SQL, BigQuery, Dataflow, Pub/Sub, Cloud Storage, Cloud Functions, Cloud Run, and App Engine, together with Kafka, to build and optimize data pipelines.
- Event-Driven Architecture: Design and implement solutions based on event-driven architectures, utilizing Kafka and related messaging technologies (Confluent Kafka, Google Cloud Pub/Sub) for real-time data processing and event handling.
- Kafka Expertise: Utilize the Kafka Connect framework to build and manage data pipelines that ingest and export data across a variety of sources and destinations. Experience with common connectors (HTTP, JMS, File, SFTP, JDBC) and the Kafka REST Proxy is essential.
- Data Volume Management: Design and optimize pipelines to handle large volumes of streaming data from Kafka, ensuring data integrity and performance under high load.
- Data Modeling & Optimization: Collaborate with data engineers and data scientists to understand data requirements and design efficient data models. Optimize data pipelines for performance, cost, and scalability.
- Deployment & Monitoring: Deploy and manage data pipelines in a production environment using CI/CD pipelines and monitoring tools. Ensure pipeline reliability and proactively address performance issues.
- Collaboration & Communication: Collaborate effectively with cross-functional teams to gather requirements, communicate technical solutions, and provide mentorship to junior team members.
- Staying Current: Stay up-to-date with the latest advancements in Apache Beam, Google Cloud Platform, and data engineering technologies. Contribute to the development of best practices and internal documentation.
Qualifications:
- Experience: 5+ years of hands-on experience developing Apache Beam pipelines in Java, with a proven track record of designing, building, and deploying production-grade data pipelines.
- Google Cloud Platform Proficiency: Strong experience working with Google Cloud SQL, Java on Google Cloud Platform, and Apigee.
- BigQuery & GCP Expertise: Hands-on experience with BigQuery and the broader Google Cloud Platform ecosystem.
- Java & Apache Beam Mastery: Deep understanding of Java and the Apache Beam programming model. Experience with a broad range of Beam transforms, windowing strategies, and I/O connectors.
- Kafka & Streaming Data: Extensive experience with Kafka and real-time streaming data processing. Experience with Confluent Kafka is a plus.
- Cloud Technologies: Familiarity with Cloud SQL, Compute Engine, Cloud Functions, Cloud Run, and App Engine.
- Hadoop Ecosystem: Knowledge of open-source distributed storage and processing utilities within the Apache Hadoop family.
- Data Pipeline Design Principles: Solid understanding of data pipeline design principles, including data quality, data governance, and data security.
- Software Development Best Practices: Proficiency in software development best practices, including version control (Git), testing, and code review.
- Communication & Collaboration: Excellent communication, interpersonal, and collaboration skills.
Bonus Points:
- Experience with data warehousing solutions (e.g., Snowflake, Redshift).
- Experience with data visualization tools (e.g., Tableau, Looker).
- Experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation).
- Contributions to open-source Apache Beam projects.
To Apply:
If you are a passionate and skilled data pipeline architect looking for a challenging and rewarding opportunity, we encourage you to apply. This is a W-2 position; we are not accepting Corp-to-Corp (C2C) arrangements. Please submit your application through the link below.
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
Apply for this job