Logging, Monitoring and Observability in Google Cloud

This three-day, instructor-led course covers techniques for monitoring, troubleshooting, and improving infrastructure and application performance on Google Cloud Platform.
3-day course
Supporting material
Virtual, Private
Virtual Classroom
A convenient and interactive learning experience that enables you to attend one of our courses from the comfort of your own home or anywhere you can log on. Virtual Classroom is offered on selected live classroom courses; where available, it appears as an option in the location drop-down. These courses can also be booked as Private Virtual Classrooms for exclusive business sessions.
Private
A private training session for your team. Groups can be of any size, at a location of your choice, including our training centres.

This course covers techniques for monitoring, troubleshooting, and improving infrastructure and application performance in Google Cloud Platform (GCP), guided by the principles of Site Reliability Engineering (SRE).

Using a combination of presentations, demos, hands-on labs, and real-world case studies, attendees will gain experience with full-stack monitoring, real-time log management and analysis, debugging code in production, tracing application performance bottlenecks, and profiling CPU and memory usage.

Our Logging, Monitoring and Observability in Google Cloud course is available as a private training session and runs over three consecutive days. It can be delivered at our training facilities in San Francisco or Baltimore, at a location of your choice, or via Virtual Classroom.

Course overview
Who should attend:

This course is intended for cloud architects, administrators, and SysOps personnel, as well as cloud developers and DevOps personnel.

Walk away with the ability to:
  • Plan and implement a well-architected logging and monitoring infrastructure
  • Define Service Level Indicators (SLIs) and Service Level Objectives (SLOs)
  • Create effective monitoring dashboards and alerts
  • Monitor, troubleshoot, and improve GCP infrastructure
  • Analyze and export GCP audit logs
  • Find production code defects, identify bottlenecks, and improve performance
  • Optimize monitoring costs

Attendees should have basic scripting or coding ability and be proficient with command-line tools and Linux operating system environments. They should also understand the principles of GCP as covered in the one-day course Google Cloud Platform Fundamentals: Core Infrastructure, or have equivalent experience.

Course agenda
Module 1: Defining a Monitoring Plan
  • Understand the four golden signals: latency, traffic, errors, and saturation
  • Define SLIs (measures of customer pain)
  • Define critical performance measures
  • Define SLOs and SLAs
  • Define error budgets
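To illustrate how the SLI, SLO, and error budget concepts in this module relate to one another, here is a minimal Python sketch. It assumes a request-based SLI (good events divided by total events). The function name, the field names, and the sample numbers are hypothetical and are not taken from the course material.

```python
def error_budget_report(slo_target, total_events, bad_events):
    """Summarize an SLI and error-budget usage for one measurement window.

    Assumption: the SLI is request-based, i.e. good events / total events.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events <= 0 or not 0 <= bad_events <= total_events:
        raise ValueError("need total_events > 0 and 0 <= bad_events <= total_events")

    # SLI: the fraction of events that were good in this window.
    sli = (total_events - bad_events) / total_events
    # Error budget: the number of bad events the SLO tolerates in this window.
    allowed_bad = (1.0 - slo_target) * total_events
    return {
        "sli": sli,
        "slo_met": sli >= slo_target,
        "allowed_bad_events": allowed_bad,
        # Values above 1.0 mean the budget for this window is overspent.
        "budget_consumed": bad_events / allowed_bad,
    }


if __name__ == "__main__":
    # Hypothetical window: 1,000,000 requests, 250 failures, 99.9% SLO.
    print(error_budget_report(0.999, 1_000_000, 250))
```

With a 99.9% target over one million requests, the budget is roughly 1,000 failed requests, so 250 failures use about a quarter of it.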
Module 2: Introduction to GCP Monitoring Tools
  • The purpose and capabilities of GCP operations-focused components (Logging, Monitoring, Error Reporting, and Service Monitoring)
  • The purpose and capabilities of GCP components focused on application performance management (Debugger, Trace, and Profiler)
Module 3: Monitoring Critical Systems
  • Use the default dashboards appropriately
  • Build custom dashboards to show resource consumption and application load
  • Define uptime checks to track aliveness and latency
Module 4: Alerting Policies
  • Define alerting policies
  • Define alerts based on policy violations
  • Optimize alerts for actionability
  • Know types of alerts and common uses for each
  • Implement best practices for alerting policies
  • Define and alert on resource groups
  • Manage alerting policies programmatically using the GCP Monitoring API
Module 5: Configuring GCP Services for Observability
  • Define the monitoring project architecture in accordance with best practices
  • Define Cloud IAM roles for monitoring
  • Define labels and tags for resources
  • Bake agents into VM images for app visibility in Compute Engine
  • Install Kubernetes Monitoring
  • Expose app data for Kubernetes Engine apps using Prometheus and OpenCensus
Module 6: Advanced Logging and Analysis
  • Know and choose among resource tagging approaches
  • Connect application errors to Logging using Error Reporting
  • Define log sinks (inclusion filters) and exclusion filters; understand the batch-vs.-realtime nature of data availability in log sinks
  • Create metrics based on logs
  • Define custom metrics
  • Export logs to BigQuery
  • Analyze logs using BigQuery
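The relationship between a sink's inclusion filter, its exclusion filters, and a simple logs-based counter can be sketched in plain Python. This is a conceptual model only: it does not call the Cloud Logging API, and the entry fields, the severity subset, and the function names are hypothetical. The one behavior it relies on is that a sink routes an entry when the entry matches the inclusion filter and matches none of the exclusion filters.

```python
# Simplified severity ordering (a subset of the levels Cloud Logging defines).
SEVERITY_RANK = {"DEBUG": 0, "INFO": 1, "WARNING": 2, "ERROR": 3, "CRITICAL": 4}


def route_entries(entries, inclusion, exclusions=()):
    """Return the entries that match `inclusion` and none of `exclusions`."""
    return [
        entry for entry in entries
        if inclusion(entry) and not any(excl(entry) for excl in exclusions)
    ]


def count_matching(entries, predicate):
    """A toy logs-based counter metric: how many entries match a predicate."""
    return sum(1 for entry in entries if predicate(entry))


def at_least_warning(entry):
    return SEVERITY_RANK[entry["severity"]] >= SEVERITY_RANK["WARNING"]


def is_health_check(entry):
    return entry["path"] == "/healthz"


# Hypothetical log entries, represented as plain dictionaries.
SAMPLE_ENTRIES = [
    {"severity": "INFO", "path": "/checkout"},
    {"severity": "ERROR", "path": "/checkout"},
    {"severity": "WARNING", "path": "/healthz"},
    {"severity": "CRITICAL", "path": "/login"},
]

if __name__ == "__main__":
    routed = route_entries(SAMPLE_ENTRIES, at_least_warning, [is_health_check])
    print(routed)
    print(count_matching(SAMPLE_ENTRIES, at_least_warning))
```

In this sample, the warning from the health-check path matches the inclusion predicate but is dropped by the exclusion predicate, so only the two remaining high-severity entries are routed.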
Module 7: Analyzing GCP Audit Logs
  • Use Admin Activity audit logs to track changes to the configuration or metadata of resources
  • Use Data Access audit logs to track accesses or changes to user-provided resource data
  • Use System Event audit logs to track GCP administrative actions
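As a small illustration of telling these three audit log types apart, the sketch below classifies a log entry by its log name. It assumes the usual naming pattern in which audit log names end in a `cloudaudit.googleapis.com%2F...` log ID; the function name and the sample project ID are hypothetical.

```python
from typing import Optional

# Log IDs for the three audit log types covered in this module.
AUDIT_LOG_TYPES = {
    "cloudaudit.googleapis.com%2Factivity": "Admin Activity",
    "cloudaudit.googleapis.com%2Fdata_access": "Data Access",
    "cloudaudit.googleapis.com%2Fsystem_event": "System Event",
}


def audit_log_type(log_name: str) -> Optional[str]:
    """Return the audit log type for a full log name, or None if it is not one.

    Expects names shaped like 'projects/<project-id>/logs/<log-id>'.
    """
    log_id = log_name.rsplit("/logs/", 1)[-1]
    return AUDIT_LOG_TYPES.get(log_id)


if __name__ == "__main__":
    # 'my-project' is a placeholder project ID.
    name = "projects/my-project/logs/cloudaudit.googleapis.com%2Fdata_access"
    print(audit_log_type(name))
```

A lookup like this can be handy when grouping exported audit log entries by type before analysis.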
Module 8: Managing Incidents
  • Define incident management roles and communication channels
  • Mitigate incident impact
  • Troubleshoot root causes
  • Resolve incidents
  • Document incidents in a post-mortem process
Module 9: Investigating Application Performance Issues
  • Use Debugger to identify code defects in production
  • Use Trace to find performance bottlenecks in production
  • Use Profiler to find resource-intensive functions in an application
Module 10: Optimizing the Costs of Monitoring
  • Understand the billing of monitoring components within GCP
  • Analyze the resource utilization of monitoring components within GCP
  • Implement best practices for controlling the cost of monitoring within GCP