
Meet Omar Sanchez!

I'm a full-stack data engineer and Carnegie Mellon University graduate with a passion for entrepreneurship. I love building products that solve real problems and have a positive impact on people's lives. I excel in fast-paced environments where I can use my technical and leadership skills to build products that make a difference.

When I'm not trying to change the world, I'm probably reading, going to the gym, keeping myself up to date, watching Netflix, or playing video games. I'm always open to new ideas and opportunities, so feel free to reach out!


About me

I'm Omar Sanchez, a Data Engineer with 3 years of work experience who loves fast-paced environments and being part of meaningful endeavors. I've built data warehouses and data lakes on AWS and GCP, and I fully owned the data warehousing and operations analytics at a fintech, where I was the entire data stack. I recently graduated with a master's degree from Carnegie Mellon University, and I have extensive academic knowledge of Deep Learning and Machine Learning.

  • Location: Pittsburgh, PA
  • Age: 25
  • Nationality: Colombian
  • Interests: Entrepreneurship, Gym, Books, Music
  • Study: Carnegie Mellon University
  • Employment: Open to work!

Education

MS, Information Systems Management - Business Intelligence & Data Analytics

Carnegie Mellon University, December 2023
  • CQPA: 4.12/4.0
  • Coursework: 11-785 Introduction to Deep Learning, 10-601 Introduction to Machine Learning, 11-667 Large Language Models, 95-828 Machine Learning for Problem Solving, 95-865 Unstructured Data Analytics.
  • Worked as a Teaching Assistant (TA) for Unstructured Data Analytics (two semesters) and Object Oriented Programming with Java (one semester). Served as an officer of the Colombian Student Association.

BS, Economics and International Finance

Universidad de La Sabana, December 2021
  • CQPA: 4.4/5.0

MS, Informatics Engineering

Universidad de La Sabana, December 2020
  • CQPA: 4.5/5.0
  • Member of the team that reached the semifinals of the “Shield Category” at the 2019 RoboCup robotics football world championship. Developed a path planning algorithm and the main behaviors for the striker role.

Work

Data Engineer

Amazon, May 2023 - Aug 2023
  • Designed and developed an AWS-based batch NLP service for cleaning, storing, processing, and aggregating responses to open-ended questions using AWS S3, AWS Athena, AWS Step Functions, AWS Lambda, and AWS Comprehend, delivering it in one-third of the allotted time.
  • Established a full CI/CD pipeline with unit and integration testing to deploy the solution using AWS CloudFormation, PyTest, and an internal DevOps tool.
  • Integrated the NLP service with a third-party API that serves as the user interface, for enhanced UX.

Data Engineer Trainee

Factored, Apr 2022 - Jul 2022
  • Retrieved data from REST APIs to generate automated reports using Python, AWS Lambda, Docker, AWS S3, AWS DynamoDB, AWS SNS, and AWS Step Functions. Deployed using Serverless Framework.
  • Designed and implemented a layered data lake using AWS S3, the AWS Glue data catalog, and PySpark.
  • Developed a pipeline to extract data from both PostgreSQL and S3 to create a data warehouse following Kimball’s modeling technique in AWS Redshift using Python, Airflow on AWS EC2, AWS Glue, Apache Spark, DBT, and SQL.
  • Implemented CI/CD pipelines to automate infrastructure deployment using GitHub Actions and Terraform.
  • Consumed and transformed streaming data to produce aggregated statistics, store data, and create real-time dashboards using Python, AWS Kinesis, AWS S3, and Splunk.

Data Engineer

Bluetab (an IBM Company), Dec 2021 - Apr 2022
  • Built, tested, and documented several data pipelines for an international bank using Apache Spark, Control-M, and other proprietary big data tools, including Datio.
  • Deployed hundreds of Control-M data workflows and automated the creation of workflow-definition XML files using Python, RegEx, and Pandas, cutting the time-to-production of new pipelines by a factor of 4.
  • Ensured data quality and enforced completeness, consistency, and integrity rules, among others, in line with data governance expectations.

Software Engineer Analyst

Scotiabank, May 2021 - Dec 2021
  • Built a tool using Selenium WebDriver, Python, Bash, and Docker to automatically run tests that evaluate the impact of changes on risk metrics, making the whole process four times faster.
  • Led and orchestrated deployments from development to production using Jenkins and Bitbucket.

Support Data Analyst

Minka, Feb 2020 - Apr 2021
  • Single-handedly and proactively designed, built, and maintained a GCP-hosted BigQuery data warehouse to support complex operations reports, accounting reports, and ad hoc analysis for an online pay tech company in Latin America. The warehouse provided a holistic view across a range of data sources (i.e., MySQL, Google Datastore, Google Cloud Logging, CSV files, spreadsheets, Neo4j/API). Built with GCP (Cloud Functions, Dataflow/Apache Beam, Dataprep, BigQuery, Google Cloud Storage, Pub/Sub), Data Build Tool (DBT), JavaScript, and Python. This helped the business meet several data needs and surface costly mistakes that required C-suite attention.
  • Designed, built, and maintained various analytical reports for C-level executives using SQL and Data Studio (now Looker Studio).
  • Implemented integrations between Google BigQuery and spreadsheets to serve reports and tables, giving non-technical users a familiar, friendly interface to the warehouse.
  • Developed several tools using JavaScript and Node.js to automate and optimize operational processes, reducing the errors that had to be manually checked and fixed by a factor of 10.
  • Identified, analyzed, and resolved complex errors within the service using SQL and BigQuery, drawing on a deep understanding of the REST API services and synchronous/asynchronous messaging styles.
  • Automated several REST API tests using Postman and JavaScript, cutting testing time to one-fifth.

Skills

Here you can see a comprehensive list of my skills!

  • Programming languages: SQL, Python, Bash, JavaScript/TypeScript, Java, Cypher (Neo4j)
  • Databases: BigQuery, MySQL, AWS S3, PostgreSQL, Neo4j, AWS Redshift
  • Data Tools: DBT, Looker Studio (formerly Data Studio), AWS Lambda, AWS Step Functions, Airflow, PySpark, Google Dataflow
  • Software Engineering Skills: AWS, GCP, REST APIs, PyTest, Git, Docker, Terraform/IaC, Selenium, Regex, React
  • Data Science/Analytics: Jupyter Notebooks, PyTorch, Scikit-Learn, Pandas, Weights & Biases (wandb)
  • Languages: Spanish, English, French

Let's connect!

Whether it's about a new opportunity, a collaboration, or just to say hi, feel free to reach out. I'm always open to new ideas.

© Copyright 2024 Omar Sanchez