
In this codelab, a provided script will automatically deploy the following lab environment to a Google Cloud Platform project.

[Diagram: Networking 101 lab environment]

These exercises are ordered to reflect a common cloud developer experience as follows:

  1. Set up your lab environment and learn how to work with your GCP environment.
  2. Use common open source tools to explore the network around the world.
  3. Test and monitor your network and instances.
  4. Clean up.

As you progress, you'll learn how to perform basic networking tasks on Google Cloud Platform (including Compute Engine instances) and how GCP might differ from an on-premises setup. As indicated above, we'll set up a demo environment with a network and five subnetworks that you will use throughout the lab.

What you'll learn

  • How to measure latency between Google Compute Engine regions and zones
  • How to test network connectivity and performance using open source tools
  • How to inspect network traffic using open source tools

What you'll need

  • A web browser (Google Cloud Shell provides all the other tools you'll need)
  • A Google account and a Google Cloud Platform project with billing enabled

Self-paced environment setup

If you don't already have a Google Account (Gmail or Google Apps), you must create one. Sign in to the Google Cloud Platform Console (console.cloud.google.com) and create a new project:

[Screenshot: creating a new project]

Remember the project ID; it's a unique name across all Google Cloud projects (the name above has already been taken and will not work for you, sorry!). It will be referred to later in this codelab as PROJECT_ID.

Next, you'll need to enable billing in the Developers Console in order to use Google Cloud resources and enable the Compute Engine API.

Running through this codelab shouldn't cost you more than a few dollars, but it could be more if you decide to use more resources or if you leave them running (see "cleanup" section at the end of this document). Google Compute Engine pricing is documented here.

New users of Google Cloud Platform are eligible for a $300 free trial.

Use Google Cloud Shell

To interact with Google Cloud Platform, we will use Google Cloud Shell throughout this codelab.

Google Cloud Shell is a Debian-based virtual machine pre-loaded with all the development tools you'll need. It can be provisioned automatically from the Cloud Console, which means that all you need for this codelab is a browser (yes, it works on a Chromebook).

To activate Google Cloud Shell, simply click the button on the top right-hand side of the Cloud Console (it should only take a few moments to provision and connect to the environment):

[Screenshot: the Activate Cloud Shell button]

[Screenshot: the Cloud Shell welcome message]

Once connected to Cloud Shell, you should see that you are already authenticated and that the project is already set to your PROJECT_ID. Run the following commands and you should see this output:

gcloud auth list

Command output

Credentialed accounts:
 - <myaccount>@<mydomain>.com (active)

gcloud config list project

Command output

[core]
project = <PROJECT_ID>

If for some reason the project is not set, simply issue the following command:

gcloud config set project <PROJECT_ID>

Looking for your PROJECT_ID? It's the ID you used in the setup steps. You can find it in the Console dashboard at any time:

[Screenshot: the project ID on the Console dashboard]

First, we will set up the lab network through Google Cloud Shell by using a predefined Deployment Manager script.

Let's enable the Deployment Manager API on your project.

Navigate to https://console.developers.google.com/apis/api/deploymentmanager/overview in your browser, make sure your <PROJECT_ID> project is selected at the top, and hit the Enable button to enable the Google Cloud Deployment Manager V2 API (if it is not already enabled).

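Alternatively, you can enable the API from Cloud Shell; a minimal sketch, assuming a reasonably recent Cloud SDK (the service name below is the standard one for Deployment Manager):

gcloud services enable deploymentmanager.googleapis.com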

To run the script, open Cloud Shell in your lab project and copy the scripts to your local environment with gsutil:

mkdir nw-testing
cd nw-testing
gsutil cp gs://nw-testing/* .

If you list the files in the nw-testing folder, you should see a set of .yaml and .jinja files used to simplify the deployment of resources. Now create the initial deployment with the following command:

gcloud deployment-manager deployments create nw-testing  \
--config networking-lab.yaml

This command sets up the environment consisting of one network with five subnetworks in different regions and five Debian VMs in those subnetworks. Some basic networking tools are also pre-installed by the deployment manager script.

The following diagram shows the setup created:

[Diagram: Networking 101 lab environment]

Optional: While the deployment is completing, feel free to look at the Deployment Manager configuration (the .yaml and .jinja files you downloaded) with the Cloud Shell Code Editor. Launch the Code Editor by selecting it in the shell window on the right.

Once the deployment is finished, you'll see output like this:

Waiting for create operation-1483456646603-545322a75e0f9-b7026fac-84fde2fd...done.
Create operation operation-1483456646603-545322a75e0f9-b7026fac-84fde2fd completed successfully.
NAME            TYPE                   STATE      ERRORS  INTENT
nw-testing      compute.v1.network     COMPLETED  []
asia-east1      compute.v1.subnetwork  COMPLETED  []
asia1-vm        compute.v1.instance    COMPLETED  []
e1-vm           compute.v1.instance    COMPLETED  []
eu1-vm          compute.v1.instance    COMPLETED  []
europe-west1    compute.v1.subnetwork  COMPLETED  []
us-east1        compute.v1.subnetwork  COMPLETED  []
us-west1-s1     compute.v1.subnetwork  COMPLETED  []
us-west1-s2     compute.v1.subnetwork  COMPLETED  []
w1-vm           compute.v1.instance    COMPLETED  []
w2-vm           compute.v1.instance    COMPLETED  []
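You can also verify the created resources directly from Cloud Shell; for example (the network created by the deployment is named nw-testing, as shown in the output above):

gcloud compute instances list
gcloud compute networks subnets list --network nw-testing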

To verify, return to the Cloud Console and navigate to Compute Engine. Then hit refresh under VM instances. You should see five VMs like this:

[Screenshot: the five VM instances]

Try to connect to one of the VMs by clicking the SSH button in the Console.

You should succeed and get a command prompt!


Use ping to measure the latency between instances within a zone, within a region, and between all the regions.

For example, to observe the latency from the US East region to the Europe West region run the following command after opening an SSH window on the e1-vm:

ping eu1-vm

Use Ctrl-C to exit the ping.

The latency you get back is the round-trip time (RTT): the time a packet takes to get from e1-vm to eu1-vm, plus the time the reply takes to travel from eu1-vm back to e1-vm.

Ping uses ICMP Echo Request and Echo Reply messages to test connectivity.

❔What is the latency you see between regions? What would you expect under ideal conditions? What is special about the connection from eu1-vm to asia1-vm?❔

See the answer on the next page.

Answer to the question

Under ideal conditions, the latency would be limited by the speed of light in fiber, which is roughly 202,562 km/s (125,866 miles/s); the actual achievable speed is still a bit lower than that.

You can estimate the length of the fiber either by distance as the crow flies (straight line) or by land transport. You have to multiply the result by two to account for a round trip.

Between continents, distance as the crow flies is usually the only option. If you want to estimate latency for a customer before testing, road distance is usually the better estimate, as roads, like fibers, don't follow ideal paths. You can use any mapping tool to estimate the distance.

For the available GCE regions, we know the location. We can calculate the ideal latency as shown in the following example:

VM 1: e1-vm (Berkeley County, South Carolina)

VM 2: eu1-vm (St. Ghislain, Belgium)

Distance as the crow flies: 6837.20km

Ideal latency: 6837.20 km / 202562 km/s * 1000 ms/s * 2 = 67.51 ms

Observed latency: 93.40 ms (use the minimum of the observed RTTs)
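If you want to reproduce this calculation, here is a minimal sketch you can run in Cloud Shell (the coordinates are rough approximations of the two locations, not exact data center positions):

python3 - <<'EOF'
import math

# Approximate (lat, lon) of the two locations - assumptions for illustration
e1 = (33.07, -80.04)   # Berkeley County, South Carolina
eu1 = (50.47, 3.86)    # St. Ghislain, Belgium

def haversine_km(a, b):
    # Great-circle ("as the crow flies") distance between two (lat, lon) points
    lat1, lon1, lat2, lon2 = map(math.radians, a + b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))

dist = haversine_km(e1, eu1)
rtt = dist / 202562 * 1000 * 2  # there and back at ~202,562 km/s in fiber
print("distance: %.0f km, ideal RTT: %.2f ms" % (dist, rtt))
EOF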


The difference is due to a non-ideal path (for example, transatlantic fibers all land in the NY/NJ area) as well as active equipment in the path (a much smaller contribution).

See this table for all ideal / observed latencies:

[Table: ideal vs. observed latencies between regions]

As you can see the latency between the EU and Asia locations is very high. This is the case because Google Compute Engine does not have a direct link it can use between Europe and Asia at this time.


From a networking point of view, if you run a service in only ONE global location, the recommendation is to place it in the central US. Depending on how your user base is split, US East or US West might also be good choices.

Pinging external hosts

You can also ping well-known hosts (hosts whose physical location you know) to see how the latency compares to the ideal latency (for example, ping co.za in South Africa).

Ping can also be used to measure packet loss: at the end of a run, it reports the number of packets lost and the loss percentage. You can use several flags to improve testing. For example:

ping -i0.2 w2-vm # send a ping every 200 ms
sudo ping -i0.05 w2-vm -c 1000 # send a ping every 50 ms, 1000 times
sudo ping -f -i0.05 w2-vm # flood ping: prints a dot for every sent packet and erases one for every reply; be careful with flood ping without an interval, as it sends packets as fast as possible, which within the same zone is very fast
sudo ping -i0.05 w2-vm -c 100 -s 1400 # send larger packets - does it get slower?

Traceroute is a tool to trace the path between two hosts.

Because a traceroute can be a helpful first step in uncovering many different network problems, support or network engineers often ask for one when diagnosing network issues.

Let's try it out.

From any VM (e.g. e1-vm) run a traceroute, for example:

traceroute www.icann.org

Now try a few other destinations and also from other sources:

  • VMs in the same region or another region (eu1-vm, asia1-vm, w2-vm)
  • www.wikipedia.org
  • www.adcash.com
  • bad.horse (works best if you increase max TTL, so traceroute -m 255 bad.horse)
  • Anything else you can think of

Use Ctrl-C if at any time you want to return to the command line.

❔ What do you notice with the different traceroutes? ❔

See the answer on the next page.

Answer to the question

You might have noticed some of the following things:

  • The last hop of the traceroute is not the destination: This is true for nearly all external examples. The reason is that traceroute performs a reverse DNS lookup for every host in the path, and the reverse lookup for the last host might not be implemented (e.g. www.stackoverflow.com) or might differ from the name given in the forward DNS (e.g. www.gnu.org).
  • Traceroute shows only stars at the end: This means there is probably a firewall in between that blocks the incoming UDP/ICMP packets, the outgoing ICMP packets, or both. With some hosts (e.g. www.wikipedia.org) you observe different behavior with traceroute and mtr, which shows that only the UDP packets seem to be discarded.
  • Other VMs (even on different continents), www.google.com, and www.adcash.com seem to be only one hop away: This is due to the network virtualization layer. In certain settings, the TTL of the inner packet is never decreased, even though there are many physical hosts in between. www.google.com and www.adcash.com (a website hosted on Google Cloud Platform) are both cases where routing happens mostly on encapsulated packets that stay inside the Google software-defined network.
  • Multiple paths showing: Traceroute sends three packets with the same TTL, and those might be routed over different paths (for example, different MPLS TE paths or ECMP routing), so this is nothing to worry about.
  • Traceroute shows stars in the middle: A host in the middle might not respond correctly with TTL exceeded messages, or those messages might be filtered somewhere along the way.
  • The traceroute to bad.horse looks funny: This is an intentional easter egg, built with a bunch of public IPs and virtual routers. See this post on how to create such a traceroute if you're interested.

MTR

You can also use the tool "mtr" (Matt's traceroute) to run a continuous traceroute to the destination and capture occasional packet loss as well. It combines the functionality of traceroute and ping, and it uses ICMP echo request packets instead of UDP for the outgoing packets.

Try:

mtr www.icann.org

and any other hosts. Use q to quit.
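If you need a one-shot summary to share (for example, when a support engineer asks for it), mtr also has a non-interactive report mode. A minimal sketch (flags as found in common mtr builds):

mtr -rwc 50 www.icann.org # -r report mode, -w wide (full) hostnames, -c 50 probes per hop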

Some important caveats when working with traceroute/mtr:

  • Traceroutes only show the route from the source to the destination host. Note that IP routes can be asymmetric, and the return path can be very different from the forward path, so to get the full picture you need a traceroute from the source to the destination as well as one from the destination back to the source. Often a forward traceroute suddenly "jumps in latency" for no visible reason, and the cause is only apparent on a very different reverse path between the hops (see the example after this list).
  • High latency or even loss on intermediate hops does not necessarily indicate a problem. Many hardware routers process packets destined for or originating from the router itself in software, which is slow, while packets passing through are forwarded in hardware.
  • The number of hops is largely irrelevant, and a high number of hops does not indicate a problem.
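For example, to see both directions between two of the lab VMs, run each traceroute in an SSH window on the respective VM:

traceroute eu1-vm # on e1-vm: the forward path
traceroute e1-vm # on eu1-vm: the reverse path, which may differ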

You can use iperf to test the performance between two hosts. One side needs to be set up as the iperf server to accept connections.

First do a very simple test:


On eu1-vm run:

iperf -s #run in server mode

On e1-vm run:

iperf -c eu1-vm #run in client mode, connecting to eu1-vm

You will see some output like this:

------------------------------------------------------------
Client connecting to eu1-vm, TCP port 5001
TCP window size: 45.0 KByte (default)
------------------------------------------------------------
[  3] local 10.20.0.2 port 35923 connected with 10.30.0.2 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec   298 MBytes   249 Mbits/sec

On eu1-vm use Ctrl-C to exit the server side when not needed anymore.

Test this between different VMs. You will see that within a region, the bandwidth is limited by the 2 Gbit/s per-core egress cap.

Between regions, you reach much lower limits, mostly due to limits on TCP window size and single-stream performance. You can increase the bandwidth between hosts by using other parameters, for example by switching to UDP:

On eu1-vm run:

iperf -s -u #iperf server side

On e1-vm run:

iperf -c eu1-vm -u -b 2G #iperf client side - send 2 Gbit/s

This should achieve a higher speed between the EU and US.

Even higher speeds can be achieved by running several TCP iperfs in parallel.

On eu1-vm run:

iperf -s

On e1-vm run:

iperf -c eu1-vm -P 20

The combined bandwidth should be very close to the maximum achievable bandwidth.

Test a few more combinations; if you use Linux on your laptop, you can also test against your laptop. You can also try iperf3, which is available for many OSes, but it is not part of this lab.

As you can see, a single TCP stream (for example, a file copy) is not sufficient to reach the maximum bandwidth; you need several TCP sessions in parallel. The reasons are TCP parameters such as window size and mechanisms such as slow start (see TCP/IP Illustrated for excellent information on this and all other TCP/IP topics). Tools like bbcp can help you copy files as fast as possible by parallelizing transfers and using a configurable window size.
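You can also experiment with the TCP window size directly, which iperf exposes as a flag. A sketch (iperf 2.x syntax):

iperf -s -w 512K # on eu1-vm: server with a larger TCP window
iperf -c eu1-vm -w 512K # on e1-vm: client requesting the same larger window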

Running TCPDump interactively

On w1-vm (or any other VM) run:

sudo tcpdump -c 1000 -i eth0 not tcp port 22

Now on w2-vm (or any other VM) run:

ping -c 100 w1-vm

Switch your window to w1-vm and you should see the incoming ICMP packets (along with some organic traffic). You can exit the tcpdump via Ctrl-C.
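If the organic traffic gets in the way, you can tighten the capture filter to just ICMP:

sudo tcpdump -i eth0 icmp # capture only ICMP packets, e.g. the pings from w2-vm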

Saving a packet capture file

Now let's collect a full packet capture for an HTTP request, similar to what a support engineer might request from you.

On w1-vm, install the Apache web server:

sudo apt-get -y install apache2

Start collecting a packet capture of port 80 traffic (the -s 1460 option tells tcpdump to capture the full packets, not just the headers):

sudo tcpdump -i eth0 -c 1000 -s 1460 -w webserver.pcap tcp port 80

In another window, on w2-vm, make an HTTP request to the web server for an existing page, and another for a non-existent page:

curl w1-vm

curl w1-vm/404.php

You should not see any output from tcpdump, as it is being written to a file. Stop the tcpdump on w1-vm by pressing Ctrl-C.

The webserver.pcap file now contains a capture of the packets.

[Optional] Analyzing the packet capture file

First, we can "read" the packet capture using tcpdump:

sudo tcpdump -nr webserver.pcap

This shows some details, but not much more than the basic protocol, source, and destination information.
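If you want a bit more detail without leaving the terminal, tcpdump itself can go deeper; for example:

sudo tcpdump -nvr webserver.pcap # -v prints verbose protocol details
sudo tcpdump -nXr webserver.pcap # -X also dumps packet contents in hex and ASCII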

To get more information, you can use tools like Wireshark or CloudShark. But first, you need to copy the file to your own machine.

Laptop with Google Cloud SDK (gcloud) installed

If you have gcloud on your local laptop and are authenticated to your project, you can use the following command to copy the webserver.pcap file to the current directory:

gcloud config set project <your-project-name> #only if not done before

gcloud compute copy-files w1-vm:~/webserver.pcap webserver.pcap --zone us-west1-b

No Google Cloud SDK (gcloud) on local laptop

If you don't have gcloud installed on your local laptop (for example, Chromebook), you can use the Cloud Shell to copy the file to Google Cloud Storage. Run the following commands:

gcloud compute copy-files w1-vm:~/webserver.pcap webserver.pcap --zone us-west1-b

gsutil mb gs://username-lab #replace username with a unique username

gsutil cp webserver.pcap gs://username-lab/

You can now download the webserver.pcap file to your local machine with the Storage Browser (click the bucket, then the file). After copying the file to your local machine, delete the bucket again:

gsutil rm -r gs://username-lab/

If you have a Windows, Mac, or Linux laptop, you can use Wireshark to open and analyze the pcap file.

After you open the file, you can click on the different packets and inspect the headers and content in the lower pane (not shown in this lab; ask your facilitators for assistance if required).

If you don't want to install software and the packet capture does NOT contain confidential information (as in this case), you can use CloudShark instead, a simple cloud version of Wireshark.

  1. Sign up (you can use your Google account).
  2. Upload the packet capture by dragging it into the "Upload Files" box, or by clicking the "Upload Files" box and selecting the file.
  3. Select the webserver.pcap file to view it. You should see a screen like this:
     [Screenshot: a packet capture opened in CloudShark]

You can now see information similar to tcpdump's in the top pane, where each packet is listed. However, CloudShark (and Wireshark) are aware of some L7 protocols (for example, HTTP) and can decode them. When you select a packet, the middle pane lets you drill into the protocols from the outer (Ethernet frame) to the inner (HTTP) layer and expand the different sections and headers. At the bottom, you see the Ethernet frame in hexadecimal, showing exactly where the information that CloudShark/Wireshark decodes is located.

Play around with CloudShark/Wireshark a bit more. Note that advanced functionality, like following TCP streams, is only available in the paid version of CloudShark; in Wireshark it is available for free.

Let's release the resources created during the codelab. Make sure you run these commands in Cloud Shell (not on one of your VM instances).

To delete the automatically created deployment (including the network, subnetworks, and VM instances), run the following command in Cloud Shell:

gcloud deployment-manager deployments delete nw-testing
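Afterwards, you can confirm that everything is gone:

gcloud deployment-manager deployments list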

You have completed the Network Performance Testing codelab!

What we've covered

  • How to measure latency between Google Compute Engine regions and zones
  • How to test network connectivity and performance using open source tools
  • How to inspect network traffic using open source tools

Next Steps

  • Use this knowledge in your own deployments
  • Learn more about Networking on Google Cloud Platform

Learn More