{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# EG vs. DigitalOcean\n", "\n", "In August 2016 we saw claims that encrypted connections from Egypt to DigitalOcean hosts were heavily throttled.\n", "\n", "The `digitalocean.pcap` file, captured during our testing, records several thousand attempts to establish an HTTPS connection to [alexmerkel.com](https://alexmerkel.com) from an Egyptian ISP." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Massaging the data" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [], "source": [ "!tshark -r digitalocean.pcap -n -T fields \\\n", " -e frame.number \\\n", " -e frame.time_epoch \\\n", " -e tcp.flags.str \\\n", " -e tcp.srcport -e tcp.dstport \\\n", " -e tcp.seq -e tcp.ack \\\n", " -e tcp.options.timestamp.tsval -e tcp.options.timestamp.tsecr >pcap.tsv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`RST` packets from our host carry no [TCP timestamp](https://en.wikipedia.org/wiki/Transmission_Control_Protocol#TCP_timestamps) option and provide no useful information here, so it's easier to filter them out." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "+ awk !($5 == 443 && $3 == \"*********R**\") pcap.tsv\r\n" ] } ], "source": [ "!set -o xtrace; awk '!($$5 == 443 && $$3 == \"*********R**\")' pcap.tsv >good.tsv" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 196941 pcap.tsv\r\n", " 196922 good.tsv\r\n", " 393863 total\r\n" ] } ], "source": [ "!wc -l pcap.tsv good.tsv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The dump has ~200k packets; the filter above removed only 19 `RST`s." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import datetime\n", "import matplotlib\n", "import matplotlib.pylab as plt\n", "fromtimestamp = datetime.datetime.fromtimestamp" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [], "source": [ "head = 'no time_epoch flags srcport dstport seq ack tsval tsecr'.split()\n", "types = [np.uint32, np.float64, str, np.uint16, np.uint16, np.uint32, np.uint32, np.uint32, np.uint32]\n", "d = pd.read_csv('good.tsv', delimiter='\\t',\n", " names=head,\n", " dtype=dict(zip(head, types)))\n", "d.time_epoch -= d.time_epoch.min() # to see milliseconds better" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are some TCP packets arriving from the server. TCP timestamps help to estimate the RTT to the server, because the server echoes back the most recent TCP timestamp it has seen from the client. The PCAP records the exact wall-clock time at which the client sent the packet carrying that timestamp, so the gap between that time and the arrival of the server's echo is an RTT estimate." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": true }, "outputs": [], "source": [ "smpl = (54832, 15126214) # just a sample" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "no 30313\n", "time_epoch 1723.59\n", "flags *******A****\n", "srcport 443\n", "seq 1\n", "ack 297\n", "tsval 241524960\n", "Name: (54832, 15126214), dtype: object" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "resp = d[d.srcport == 443]\n", "resp = resp.drop_duplicates(['dstport', 'tsecr'], keep='first')\n", "resp = resp.set_index(['dstport', 'tsecr'], verify_integrity=True)\n", "resp.index.rename(['port', 'tcpts'], inplace=True)\n", "rs = resp.loc[smpl]\n", "rs" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", " | no | \n", "time_epoch | \n", "flags | \n", "srcport | \n", "dstport | \n", "seq | \n", "ack | \n", "tsval | \n", "tsecr | \n", "
---|---|---|---|---|---|---|---|---|---|
30289 | \n", "30294 | \n", "1722.474151 | \n", "*******A**** | \n", "54832 | \n", "443 | \n", "1 | \n", "1 | \n", "15126214 | \n", "241524681 | \n", "
30290 | \n", "30295 | \n", "1722.475029 | \n", "*******AP*** | \n", "54832 | \n", "443 | \n", "1 | \n", "1 | \n", "15126214 | \n", "241524681 | \n", "