Pison Technology ETL Tools

Built ETL tools and automation scripts for Boston startup Pison Technology, improving data integration efficiency and reducing manual workload.

Python, SQL, Google Cloud Platform, BigQuery, Cloud Functions, ETL, Data Wrangling, Data Integration, Automation, Jira, Confluence, Agile, Scrum, Startup

Overview

At Pison Technology, a Boston-based startup, I worked on building ETL tools and automation scripts to streamline their data integration processes. This was my first experience working in a professional environment with structured project management tools and methodologies, which was incredibly valuable for my growth as a data engineer.

Google Cloud Platform & Data Engineering

This project was my deep dive into Google Cloud Platform, where I worked with services ranging from Cloud Functions to the BigQuery data warehouse. I built data pipelines that transformed raw data from multiple sources into clean, structured datasets ready for analysis.

The data wrangling process was extensive: I spent significant time cleaning datasets by removing duplicates, renaming columns, breaking down nested fields, flattening complex data types, and normalizing data structures. I also worked on linking and joining data from different microservices and APIs, which taught me a lot about data integration across distributed systems.
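To make the cleaning steps above concrete, here is a minimal sketch in plain Python of flattening nested fields, renaming columns, and dropping duplicate rows. It is an illustration, not the code I wrote at Pison: the field names, the `RENAMES` mapping, and the `flatten` and `clean` helpers are all hypothetical.

```python
from typing import Any

# Hypothetical mapping from flattened raw field names to cleaned column names.
RENAMES = {
    "user.id": "user_id",
    "user.profile.name": "user_name",
    "ts": "event_time",
}


def flatten(record: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts into a single level, joining keys with dots."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def clean(records: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Flatten, rename, and de-duplicate records, keeping first occurrences."""
    seen: set[tuple] = set()
    cleaned: list[dict[str, Any]] = []
    for record in records:
        row = {RENAMES.get(k, k): v for k, v in flatten(record).items()}
        # repr() keeps the fingerprint hashable even for list-valued fields.
        fingerprint = tuple(sorted((k, repr(v)) for k, v in row.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            cleaned.append(row)
    return cleaned
```

Calling `clean` on a list of raw API payloads returns one flat dictionary per unique row, which is the shape a warehouse table load expects.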

I learned about Google service credentials and security keys, which was crucial for securely connecting to various Google Cloud services. This gave me hands-on experience with cloud security best practices and authentication mechanisms.

I focused on understanding their existing data pipeline and identifying bottlenecks in their current processes. The goal was to automate repetitive tasks and create tools that would make data integration more reliable and faster.

My approach involved working closely with their data team to understand their specific needs and pain points, then designing and implementing Python-based ETL tools that would address those challenges.

Learning Professional Development Tools

This project was my introduction to professional software development practices. I learned to use Jira for task tracking and project management, which helped me understand how to break down complex projects into manageable tasks and track progress systematically.

Confluence became my go-to tool for documentation. I documented my ETL processes, data schemas, and troubleshooting guides, which taught me the importance of clear, maintainable documentation in professional environments.

Working in an Agile/Scrum environment was eye-opening. I participated in daily standups, sprint planning sessions, and retrospectives. This helped me understand how to work collaboratively, estimate task complexity, and adapt to changing requirements, all skills that are crucial in any professional development role.

Automation & Business Intelligence

I implemented automated processes using scheduled queries in BigQuery, which would automatically build tables from the cleaned and transformed datasets. This automation was crucial for maintaining up-to-date data for business intelligence dashboards and ad-hoc requests.

Using SQL and Python, I created additional tables designed specifically for visualizations and BI decisions. These tables lived in what we called the "gold layer": the final, clean, business-ready data that supported specific dashboard requirements and fulfilled ad-hoc analytical requests from stakeholders.
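As a rough sketch of what building a gold-layer table looks like, the example below rebuilds a small aggregate table from a cleaned table using SQL driven from Python. It runs against an in-memory SQLite database purely as a local stand-in for BigQuery; the table names (`clean_events`, `gold_daily_events`), the columns, and the sample rows are hypothetical, and in BigQuery the same idea would typically be a scheduled query rather than a script.

```python
import sqlite3

# Rebuild the gold table from scratch on every run so it always reflects
# the current contents of the cleaned table.
GOLD_SQL = """
DROP TABLE IF EXISTS gold_daily_events;
CREATE TABLE gold_daily_events AS
SELECT
    event_date,
    COUNT(*) AS event_count,
    COUNT(DISTINCT user_id) AS active_users
FROM clean_events
GROUP BY event_date;
"""


def build_gold_table(conn: sqlite3.Connection) -> list[tuple]:
    """Run the rebuild and return the resulting rows ordered by date."""
    conn.executescript(GOLD_SQL)
    query = (
        "SELECT event_date, event_count, active_users "
        "FROM gold_daily_events ORDER BY event_date"
    )
    return conn.execute(query).fetchall()


def demo() -> list[tuple]:
    """Load a few hypothetical cleaned rows and build the gold table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE clean_events (event_date TEXT, user_id INTEGER)")
    conn.executemany(
        "INSERT INTO clean_events VALUES (?, ?)",
        [("2024-01-01", 1), ("2024-01-01", 1), ("2024-01-01", 2), ("2024-01-02", 3)],
    )
    return build_gold_table(conn)
```

Rebuilding the whole table on each run keeps the logic simple and idempotent, which suits small dashboard tables; larger tables would usually call for incremental updates instead.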

The main deliverables included automated ETL scripts that could handle their data transformation needs, along with monitoring and logging tools to ensure data quality and track any issues that arose during processing.
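A data quality check of the kind described above can be sketched with Python's standard `logging` module. This is an assumption about the general shape of such a tool, not the actual implementation: the `check_rows` function, the logger name, and the specific checks (missing required fields and duplicate keys) are hypothetical.

```python
import logging
from typing import Any

logger = logging.getLogger("etl.quality")


def check_rows(
    rows: list[dict[str, Any]], required: tuple[str, ...], key: str
) -> list[str]:
    """Report missing required fields and duplicate keys, logging each issue."""
    issues: list[str] = []
    seen_keys: set[Any] = set()
    for index, row in enumerate(rows):
        for field in required:
            if row.get(field) is None:
                issues.append(f"row {index}: missing {field}")
        key_value = row.get(key)
        if key_value is not None:
            if key_value in seen_keys:
                issues.append(f"row {index}: duplicate {key}={key_value}")
            seen_keys.add(key_value)
    for issue in issues:
        logger.warning(issue)
    logger.info("checked %d rows, found %d issues", len(rows), len(issues))
    return issues
```

Returning the issues as a list, in addition to logging them, lets a pipeline decide whether to stop a load or continue and flag the batch for review.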

This project helped me understand the unique challenges that startups face when it comes to data infrastructure: the need for quick, flexible solutions that can scale as the company grows while maintaining reliability and data integrity. More importantly, it gave me my first taste of professional software development practices and team collaboration.