Data Engineering on Google Cloud Platform

Duration: 4 days
Price: 2920,00 € + VAT
Time: 10:30 - 17:30
When you pay with Sovelto Access credits, you will be charged the number of credits corresponding to the price in euros. Please contact sales at 020 7776 670 or myyntipalvelu@sovelto.fi for the exact number of credits in your case.

Spoken language: English

We are sorry, but this course is already full. Please try another date or location.

Or contact sales: 020 7776 670 or myyntipalvelu@sovelto.fi

Overview

This four-day instructor-led class provides participants with a hands-on introduction to designing and building data processing systems on Google Cloud Platform.

This class is intended for experienced developers who are responsible for managing big data transformations, including:

  • Extracting, loading, transforming, cleaning, and validating data
  • Designing pipelines and architectures for data processing
  • Creating and maintaining machine learning and statistical models
  • Querying datasets, visualizing query results and creating reports

Objectives

  • Design and build data processing systems on Google Cloud Platform
  • Process batch and streaming data by implementing autoscaling data pipelines on Cloud Dataflow
  • Derive business insights from extremely large datasets using Google BigQuery
  • Train, evaluate, and predict with machine learning models using TensorFlow and Cloud ML
  • Leverage unstructured data using Spark and ML APIs on Cloud Dataproc
  • Enable instant insights from streaming data

Outline

Day 1: Serverless Data Analysis

  • Module 1: Serverless data analysis with BigQuery
  • Module 2: Serverless, autoscaling data pipelines with Dataflow

Day 2: Leveraging unstructured data

  • Module 3: Google Cloud Dataproc Overview
  • Module 4: Running Dataproc Jobs
  • Module 5: Integrating Dataproc with Google Cloud Platform
  • Module 6: Making Sense of Unstructured Data with Google’s Machine Learning APIs

Day 3: Serverless Machine Learning

  • Module 7: Getting started with Machine Learning
  • Module 8: Building ML models with TensorFlow
  • Module 9: Scaling ML models with Cloud ML
  • Module 10: Feature Engineering
  • Module 11: ML architectures

Day 4: Resilient streaming systems

  • Module 12: Need for real-time streaming analytics
  • Module 13: Architecture of streaming pipelines
  • Module 14: Stream data and events into Pub/Sub
  • Module 15: Build a stream processing pipeline
  • Module 16: High throughput and low latency with Bigtable
  • Module 17: Building Dashboards

Prerequisites

To get the most out of this course, participants should have:

  • Completed Google Cloud Fundamentals: Big Data & Machine Learning OR have equivalent experience
  • Basic proficiency with a common query language such as SQL
  • Experience with data modeling and extract, transform, load (ETL) activities
  • Experience developing applications in a common programming language such as Python
  • Familiarity with Machine Learning and/or statistics

 

Places left: many (no participant limit)