Logging, Monitoring and Observability in Google Cloud
This three-day instructor-led course covers techniques for monitoring, troubleshooting, and improving infrastructure and application performance on Google Cloud Platform.
Private
A private training session for your team. Groups can be of any size, and the session can be held at a location of your choice, including our training centres.
This course covers techniques for monitoring, troubleshooting, and improving infrastructure and application performance on Google Cloud Platform (GCP), guided by the principles of Site Reliability Engineering (SRE).
Using a combination of presentations, demos, hands-on labs, and real-world case studies, attendees will gain experience with full-stack monitoring, real-time log management and analysis, debugging code in production, tracing application performance bottlenecks, and profiling CPU and memory usage.
Jellyfish has recently been named a Google Cloud Specialisation Partner of the Year. This title recognises our commitment to providing world-leading cloud-based training solutions that help our clients succeed. Our Logging, Monitoring and Observability in Google Cloud course is available as a live Virtual Classroom and runs over three consecutive days. It can be delivered as a private training session at our training venue in the Arenco Tower, at a location of your choice, or via Virtual Classroom.
Course overview
Who should attend:
This course is intended for cloud architects, administrators, and SysOps personnel, as well as cloud developers and DevOps personnel.
Walk away with the ability to:
Plan and implement a well-architected logging and monitoring infrastructure
Define Service Level Indicators (SLIs) and Service Level Objectives (SLOs)
Create effective monitoring dashboards and alerts
Monitor, troubleshoot, and improve GCP infrastructure
Analyse and export GCP audit logs
Find production code defects, identify bottlenecks, and improve performance
Optimise monitoring costs
Prerequisites
Attendees should have basic scripting or coding ability and be proficient with command-line tools and Linux operating system environments. They should also understand the principles of GCP as covered in the one-day course Google Cloud Platform Fundamentals: Core Infrastructure, or have equivalent experience.
Course agenda
Module 1: Introduction to Google Cloud Monitoring Tools
Understand the purpose and capabilities of Google Cloud operations-focused components (Logging, Monitoring, Error Reporting, and Service Monitoring)
Understand the purpose and capabilities of Google Cloud application performance management focused components (Debugger, Trace, and Profiler)
Module 2: Avoiding Customer Pain
Construct a monitoring base on the four golden signals: latency, traffic, errors, and saturation
Measure customer pain with SLIs
Define critical performance measures
Create and use SLOs and SLAs
Achieve developer and operations harmony with error budgets
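To give a flavour of the arithmetic behind SLOs and error budgets, here is a minimal sketch in Python. It is not taken from the course material: the function names, the 30-day window, and the example figures are illustrative assumptions. It relies only on the standard definition that the error budget is the share of time or events an SLO allows to fail (1 minus the SLO target).

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over a window.

    slo_target is a fraction such as 0.999 (hypothetical example value).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, good_events: int, total_events: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent).

    The SLI here is the simple ratio good_events / total_events.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events <= 0 or not 0 <= good_events <= total_events:
        raise ValueError("need 0 <= good_events <= total_events and total_events > 0")
    allowed_bad = (1.0 - slo_target) * total_events
    actual_bad = total_events - good_events
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    # A 99.9% availability SLO over 30 days allows roughly 43.2 minutes of downtime.
    print(round(error_budget_minutes(0.999, 30), 1))
    # 500 failed requests out of 1,000,000 spends about half of a 99.9% budget.
    print(round(budget_remaining(0.999, 999_500, 1_000_000), 3))
```

In this sketch, a team would typically slow down releases as the remaining budget approaches zero and ship more freely while budget is left, which is the developer and operations trade-off the module title refers to.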
Module 3: Alerting Policies
Develop alerting strategies
Define alerting policies
Add notification channels
Identify types of alerts and common uses for each
Construct and alert on resource groups
Manage alerting policies programmatically
Module 4: Monitoring Critical Systems
Choose best practice monitoring project architectures
Differentiate Cloud IAM roles for monitoring
Use the default dashboards appropriately
Build custom dashboards to show resource consumption and application load
Define uptime checks to track aliveness and latency
Module 5: Configuring Google Cloud Services for Observability
Integrate logging and monitoring agents into Compute Engine VMs and images
Enable and use Kubernetes Monitoring
Extend and clarify Kubernetes monitoring with Prometheus
Expose custom metrics through code, and with the help of OpenCensus
Module 6: Advanced Logging and Analysis
Identify and choose among resource tagging approaches
Define log sinks (inclusion filters) and exclusion filters
Create metrics based on logs
Define custom metrics
Link application errors to Logging using Error Reporting
Export logs to BigQuery
Module 7: Monitoring Network Security and Audit Logs
Collect and analyse VPC Flow logs and Firewall Rules logs
Enable and monitor Packet Mirroring
Explain the capabilities of Network Intelligence Center
Use Admin Activity audit logs to track changes to the configuration or metadata of resources
Use Data Access audit logs to track accesses or changes to user-provided resource data
Use System Event audit logs to track GCP administrative actions
Module 8: Managing Incidents
Define incident management roles and communication channels