Senior Cloud Data Engineer in AWS / Azure with diverse industry experience. Profound knowledge in analytics / streaming / IaC
Updated on 13.05.2024
Profile
Freelancer / Self-employed
Remote work
Available from: 03.05.2024
Availability: 100%
of which on-site: 10%
Azure
Databricks
Big Data Analytics
Python
SQL
AWS
Terraform
Docker
Celonis
ETL
Database management system
MS Power BI
GitHub
Atlassian JIRA
Atlassian Confluence
MS Office
VBA
Streaming
Data Lake
Process Mining

Work locations

Germany, Switzerland, Austria
possible

Projects

6 months
2023-12 - 2024-05

Building of a capacity dashboard for a transport company

Data Engineer Python Terraform SQL
Data Engineer

- Methodical support for the client in IT and solution architecture

- Analysis of the status of IT systems, processes and data flows

- Driving forward automation in the ART: Development of Data Lake and Data Warehouse concepts

- Establishing CI/CD Pipelines

- Detailed conception and implementation of IT governance, security, and process topics

- Planning and implementing new visions/features (implemented as PoCs)

- Deriving input for architecture decisions from PoC results
AWS Lake Formation MS Power BI Athena Amazon Redshift GitLab
Python Terraform SQL
Deutsche Bahn
Frankfurt am Main
1 year
2023-01 - 2023-12

Azure data architect for evaluating daily receipts

Data Architect + Data Engineer Python Terraform SQL ...
Data Architect + Data Engineer
- Establishment & implementation of a roles and permissions concept for all of EDEKA Digital, based on Row Level Security and Data Catalogs

- Design and development of the data mesh.
- Designing, creating, monitoring, and maintaining data pipelines.
- Data extraction, data cleansing and transformation, data modeling and processing, data integration, data monitoring and maintenance
- Building a modern data engineering pipeline to streamline data processing and analysis for business insights.
- Creation of Fact and Dimensional Data Model
- Utilizing Azure cloud services (Azure Data Factory) for orchestration and scheduling.
- Databricks as the data transformation and processing engine using PySpark.
- Leveraging Python for custom data extraction, transformation, and validation.
- Storing and managing structured data using SQL databases in Azure.
- Git for version control
- Development of tests for code validation & quality assurance e.g. unit tests, integration tests, functional tests, system tests
Azure GitHub Databricks Azure Data Factory Atlassian JIRA Atlassian Confluence MS Azure SQL Database Azure Blob Storage
Python Terraform SQL Pyspark Spark
EDEKA
Hamburg
1 year 8 months
2021-08 - 2023-03

AWS Product Owner / Data Engineer for marketing campaigns

Product Owner / Data Engineer Python Spark Docker ...
Product Owner / Data Engineer
- Customer segmentation with the aid of various machine learning algorithms

- Introduction and development of a data catalog and SLAs for cross-system interface contracts
- Creating Glue Data Catalog
- Database management and implementing the most cost-effective solutions.
- Data Modelling for big marketing data using the Data Vault approach.
- IT Governance according to EU Data Act:
    - Setting deletion deadlines & handling of personal data
    - Developing and implementing a data access system
- Scale-up of Public Cloud infrastructure using container services.
- Analysis of the business processes and deriving requirements for the data solution
- Stakeholder management with multiple iteration cycles to validate the specifications and prioritize them in development.
- Using GitHub Actions for CI/CD and setting up the CD pipeline
- Implementation of business requirements and translating them into technological processes and tools in the cloud
    - Triggering workflows using AWS Lambda and Step Functions
    - Implementing ETL pipelines using AWS Glue and Spark
    - Implementation of Data Models in different databases
    - Connecting different source systems with REST APIs
    - Pushing and storing data in S3 data lake
    - Transformation to data warehouse from S3 data lake
    - Providing information in Microservices to other departments
    - Presentation in front of an audience of 300 people
    - Data transformation via SQL Server
    - Using internal microservices
- Development of tests for code validation & quality assurance e.g. unit tests, integration tests, functional tests, system tests
- Maintenance and support for users throughout the whole process, from requirements analysis and specification definition to implementation

AWS SageMaker Jupyter Notebook AWS S3 AWS Lambda AWS Redshift AWS Glue Job API Gateway GitHub Atlassian JIRA Atlassian Confluence Tableau Data Vault
Python Spark Docker Kubernetes SQL Terraform
Deloitte
Hanover
9 months
2021-01 - 2021-09

Deploying ML models for users

Azure ML Ops engineer Python TensorFlow Scikit ...
Azure ML Ops engineer
- Providing a PoC for deploying a time series analysis for customers and scaling it with SageMaker (MLflow).

- Conducting a time series analysis on purchase prediction.
- Creation of a CI/CD pipeline to deploy the ML model.
- Data analysis with SageMaker notebooks (Jupyter Core)
- Orchestration of Glue jobs using Step Functions
- Training on SageMaker Studio (Jupyter Core) and pipelines, plus documentation
- Data reporting on SageMaker to assess different existing solutions and choose the best one.
- Big data processing with Spark & PySpark
- Implementation of microservices with SageMaker endpoints

Azure Azure ML Studio Atlassian JIRA Atlassian Confluence MLFlow GitHub Kubernetes Helm MS Azure SQL Database Azure Data Factory
Python TensorFlow Scikit Keras Numpy Pyspark Pandas Blob storage
Telekom MMS GmbH
Munich
7 months
2020-07 - 2021-01

Building a reporting pipeline for HR processes in Azure

Product Owner / Data Engineer Python SQL M
Product Owner / Data Engineer
- Ensuring agile working procedures
- Requirements analysis and solution design
- Educational development & training
- AWS Athena
- Building an ETL pipeline with AWS Glue and Python scripts
- Creation of tables and views with SQL in Athena
- Providing information in Microservices to other departments
- Development in ITIL framework
- Requirement engineering
- Using GitHub Actions for CI/CD
- Process monitoring and data reporting for stable pipelines.
- Translating user stories into technical implementation according to Scrum
- Creation of Fact and Dimensional Data Model
- Development of tests for code validation & quality assurance e.g. unit tests, integration tests, functional tests, system tests

Python MS Azure SQL Database Databricks MS Power Apps MS Power Automate MS Power BI GitHub Atlassian JIRA Atlassian Confluence
Python SQL M
Bayer
Düsseldorf
1 year 6 months
2019-01 - 2020-06

Development of a financial model for an advertising campaign

Data Engineer Python SQL MS Power BI ...
Data Engineer
- Analyzing business processes to develop requirement specifications.
- Pushing and storing data in blob storage data lake
- Database management and implementing the most cost-effective solutions.
- Transformation to data warehouse from blob storage
- Setting up a Git repository CI pipeline in Azure DevOps
- Implementing Azure architecture
- Automating trigger events for pipeline flow
- Creating triggers from Power Apps
- Building a Databricks repo
- Migrating SSIS pipelines in the Azure cloud to Data Factory
- Implementing ETL in Azure Databricks & Synapse with Spark
- Implementing MLflow in Databricks
- Requirements engineering
- Writing agile user stories and presenting them to senior management.
- Data Reporting for Quality Inspection using Databricks and Power BI

Azure Key Vault Azure Data Factory Databricks MS Azure SQL Database Azure DevOps Azure Active Directory Confluence MS SQL Server Integration Services Synapse GitHub Atlassian JIRA
Python SQL MS Power BI Bash
Henkel
Düsseldorf
8 months
2018-05 - 2018-12

Building a production monitoring system for a clothing manufacturer

Data Engineer Python SQL Dash
Data Engineer
- Ensuring agile working procedures
- Creating a public-cloud-based infrastructure for an AI vision project in MS Azure
- Creation of a MySQL database and connection to the Azure VM
- Enhancement of a dashboard in Dash
- Connection between different services using Node-RED
- Scale-up of public cloud infrastructure using container and Docker services.
- Using GitLab CI for CI/CD development
- On-site training of operators for new services
- Definition of a best-practice manual for setting up Azure public cloud infrastructure
- Connecting different source systems using REST APIs
- ETL pipeline development with Azure Databricks & Synapse
- Data reporting as a business use case for end clients
- Setup of MLOps in Azure to monitor reinforcement learning

MS Azure Machine Learning Cosmos DB Azure DevOps Databricks Node.js Kubernetes Helm GitLab Docker
Python SQL Dash
McKinsey & Co.
Venice

Position

Data Engineer / Data Architect


Industries

Automotive / Consumer Goods / Transportation
