Tievolu's Tomato STM Monitor

Contents

Introduction
Setup Instructions
   Option 1 - Trusting
   Option 2 - Secure
General Information
STM Prevention
Intelligent STM Prevention/Mitigation
Customising Settings
Change History
Feedback

Introduction

This script is for use on routers running Tomato, connected to the Internet through a Virgin Media UK cable connection. VM's cable connections implement Subscriber Traffic Management (STM) i.e. the WAN bandwidth is limited for a set time (typically five hours) when a predefined amount of data is transferred between certain hours of the day. When STM is triggered, your upload or download bandwidth is cut by 75% and your perfect finely tuned QOS settings are rendered utterly useless.

This script aims to cope with STM by monitoring the WAN bandwidth usage and applying suitable QOS settings. There are three modes of operation:

  1. STM Mitigation - This is the default mode. If an STM limit is exceeded, QOS limits are applied in accordance with the bandwidth limits imposed by STM.
  2. STM Prevention - If bandwidth usage is on target to exceed an STM limit, QOS limits are applied to prevent that from happening. Activated by the "-p" option (see below).
  3. Intelligent STM Prevention/Mitigation - If bandwidth usage is on target to exceed an STM limit, the script applies STM Prevention only if it will be better (in terms of long term data transfer) than allowing STM to be triggered and mitigated. Activated by the "-i" option (see below).

When the STM sentence has been served the QOS settings return to normal.

Setup Instructions

Option 1 - Trusting

Add the following to your "WAN Up" script, save and reboot, to have your router download the latest version of the script at startup and set it to run every two minutes:

    wget -O /tmp/stm-monitor.sh http://www.tievolu.co.uk/stm/stm-monitor.sh
    chmod 755 /tmp/stm-monitor.sh
    cru a STM-Monitor "1-59/2 * * * * /tmp/stm-monitor.sh [broadband type] [-p|-i]"
    logger -t STM-Monitor "Downloaded `head /tmp/stm-monitor.sh | grep \"STM Monitor\" | sed 's/# //g'`"

Where [broadband type] = S5, M, L, XL, ML20, L30, XL30, XXL, XL60 or XXL100, [-p] activates STM Prevention, and [-i] activates Intelligent STM Prevention/Mitigation.

Note: Default tier settings assume that your upload speed has been upgraded to the latest 2012 speeds. If you haven't been upgraded yet you can customise the configuration accordingly.
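
As an illustration of how the placeholders get filled in, here is a small sketch that builds the cru command for a given broadband type and mode. The function name stm_cru_command is made up for this example and is not part of stm-monitor.sh; it only checks the arguments against the lists given above and prints the command, it does not run it.

```sh
#!/bin/sh
# Hypothetical helper, not part of stm-monitor.sh: prints the cru command
# for a given script path, broadband type and optional mode flag.
stm_cru_command() {
    script=$1 tier=$2 mode=$3
    case "$tier" in
        S5|M|L|XL|ML20|L30|XL30|XXL|XL60|XXL100) ;;
        *) echo "unknown broadband type: $tier" >&2; return 1 ;;
    esac
    case "$mode" in
        ""|-p|-i) ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    # ${mode:+ $mode} appends the flag only when one was given.
    echo "cru a STM-Monitor \"1-59/2 * * * * $script $tier${mode:+ $mode}\""
}

# XL connection with Intelligent STM Prevention/Mitigation:
stm_cru_command /tmp/stm-monitor.sh XL -i
```

For an XL connection in Intelligent mode this prints cru a STM-Monitor "1-59/2 * * * * /tmp/stm-monitor.sh XL -i", which is the line you would put in your WAN Up script.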

Option 2 - Secure

Alternatively, if you're the security-conscious type and you don't trust me not to modify the script to do nasty things to your router (I won't, but I could :o) ), you can download the script to your JFFS partition permanently and run it from there. Of course, this way you won't automatically download any improvements I make.

So, assuming your router has enough flash, and you like the script the way it is right now, create and mount your JFFS partition using the Tomato GUI (Administration -> JFFS2) and then issue the following commands:

    wget -O /jffs/stm-monitor.sh http://www.tievolu.co.uk/stm/stm-monitor.sh
    chmod 755 /jffs/stm-monitor.sh

These commands only need to be run once i.e. just connect to your router via telnet/ssh and run them - don't add them to your WAN Up or Startup script.

Then you just need the following command in your WAN Up or Startup script to set up the cron job when the router boots:

    cru a STM-Monitor "1-59/2 * * * * /jffs/stm-monitor.sh [broadband type] [-p|-i]"

Where [broadband type] = S5, M, L, XL, ML20, L30, XL30, XXL, XL60 or XXL100, [-p] activates STM Prevention, and [-i] activates Intelligent STM Prevention/Mitigation.

Note: Default tier settings assume that your upload speed has been upgraded to the latest 2012 speeds. If you haven't been upgraded yet you can customise the configuration accordingly.

General Information

The upload QOS bandwidth values are set at ~91% of the cable modem bandwidth limiter (download set at 100%), which works well for my XL connection. I'm assuming this should work ok for most users. If it doesn't, you can customise your settings accordingly.
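
If you need values for an upload speed that isn't covered by the defaults, the arithmetic is simple enough to do in the shell. The sketch below assumes the 91% factor described above and a 75% STM cut (i.e. 25% of the normal QOS value); the function names and the round-to-nearest behaviour are my own choices for illustration, not something taken from the script.

```sh
#!/bin/sh
# Hypothetical helpers, not part of stm-monitor.sh. All values are in kb/s.

# Normal upload QOS limit: ~91% of the modem's upload rate, rounded to nearest.
qos_outbound() {
    echo $(( ($1 * 91 + 50) / 100 ))
}

# Upload QOS limit while STM is active: 25% of the normal QOS limit
# (a 75% cut), rounded to nearest.
qos_outbound_stm() {
    echo $(( ($1 * 25 + 50) / 100 ))
}

normal=$(qos_outbound 2048)          # 2Mb upload = 2048 kb/s
stm=$(qos_outbound_stm "$normal")
echo "STM_OUTBOUND_BANDWIDTH_WITHOUT_STM=$normal"
echo "STM_OUTBOUND_BANDWIDTH_WITH_STM=$stm"
```

For a 2Mb upload this gives 1864 and 466, which are the values to use in the configuration file described under Customising Settings.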

The details of the STM policy are based on the information published by VM.

Events are recorded in the system log, and details are displayed in a small web page located at:

    http://[router IP address]/ext/stm-monitor.htm   (or https://... if you've enabled SSL)

    e.g. http://192.168.1.1/ext/stm-monitor.htm

The page will look something like this:

Tomato STM Monitor v1.02

Status at 16:05

STM Description          Start  End    Bandwidth Limit  Bandwidth Used  Average Rate
Daytime Downstream (XL)  10:00  15:00  7000 MB          N/A             N/A
Evening Downstream (XL)  16:00  21:00  3500 MB          0.86 MB         3.69 KB/s (29.52 kb/s)
Evening Upstream (XL)    15:00  20:00  1400 MB          278.76 MB       74.34 KB/s (594.72 kb/s)

STM Prevention: Inactive

STM Mitigation: Inactive

Current QOS bandwidth: 20480 kb/s down, 700 kb/s up

Customisable settings:

  • STM Prevention Threshold: 80%
  • Normal Inbound Bandwidth: 20480 kb/s
  • STM Inbound Bandwidth: 5120 kb/s
  • Normal Outbound Bandwidth: 700 kb/s
  • STM Outbound Bandwidth: 174 kb/s

Latest Log Entries

Feb 1 17:37:47 tomato user.notice STM-Monitor: Downloaded Tomato STM Monitor v1.01
Feb 1 17:39:07 tomato user.notice STM-Monitor: STM sentence resumed: applying STM QOS configuration (until 22:29)
Feb 1 22:31:01 tomato user.notice STM-Monitor: STM sentence served: applying normal QOS configuration
Feb 2 15:03:07 tomato user.notice STM-Monitor: Setting preventative outbound QOS limit of 575 kbits/s
Feb 2 15:47:08 tomato user.notice STM-Monitor: Removing preventative outbound QOS limit
Feb 2 16:03:13 tomato user.notice STM-Monitor: Setting preventative inbound QOS limit of 1187 kbits/s
Feb 2 16:33:07 tomato user.notice STM-Monitor: Removing preventative inbound QOS limit

The entries in the table are self-explanatory, except perhaps for the Average Rate column. The script takes the average transfer rate for each active STM period and compares it to the maximum permissible rate, i.e. the rate at which exactly 100% of the bandwidth limit would be used by the end of the period. If the actual transfer rate is less than 90% of the maximum rate the cell is green, if it's between 90% and 100% it is amber, and anything greater than 100% is red. This lets you check easily whether or not you are heading for some STM punishment!
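
The colour rule can be sketched roughly as follows. This is an illustration only, not the script's actual code: the function names are invented, and rates are rounded down to whole KB/s to keep the integer arithmetic small, so results right on a boundary may differ slightly from the real page.

```sh
#!/bin/sh
# Hypothetical sketch of the Average Rate colouring, not taken from the script.

# Maximum permissible rate in whole KB/s: the whole limit (MB) spread
# evenly over the STM period (seconds).
max_rate_kbs() {
    echo $(( $1 * 1024 / $2 ))
}

# Colour for an average rate compared with the maximum rate (both in KB/s).
rate_colour() {
    avg=$1 max=$2
    if [ $((avg * 100)) -lt $((max * 90)) ]; then
        echo green      # below 90% of the maximum
    elif [ "$avg" -le "$max" ]; then
        echo amber      # between 90% and 100%
    else
        echo red        # on course to trigger STM
    fi
}

# Evening Upstream row from the sample page: 1400 MB over five hours,
# averaging roughly 74 KB/s.
rate_colour 74 "$(max_rate_kbs 1400 18000)"
```

With those figures the maximum rate works out at 79 KB/s, so an average of 74 KB/s falls in the amber band.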

The script assumes that the time on your router is accurate to within a minute or so, so make sure your NTP settings are correct. It also only works correctly if your router is left on throughout each STM period, because rebooting destroys the bandwidth stats it relies on.

Note that the script uses NVRAM variables to track STM sentences, so if an STM limit is exceeded the sentence will continue to be served correctly even if the router is rebooted part way through. However, the same is not true for a power cycle, because the NVRAM variables are not committed to flash.

STM Prevention

With STM prevention enabled, the script monitors the average transfer rate, checking whether it has exceeded the maximum rate (i.e. when the cell is red and you're on course to trigger STM). If this happens, the script calculates how much of your bandwidth limit you have left in this STM period, and limits the inbound or outbound QOS such that you will not quite be able to transfer that amount of data before the STM period ends. The actual limit imposed differs for inbound and outbound QOS - for outbound QOS the rate is set such that 90% of the remaining bandwidth could be uploaded, but for inbound QOS the rate is set such that only 50% of the remaining bandwidth can be transferred.
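
To make the calculation concrete, here is a rough sketch of how such a limit could be worked out. It is not the script's own code: the function name and argument order are invented for the example, 1 MB is taken as 8192 kbit, and the result is rounded down.

```sh
#!/bin/sh
# Hypothetical sketch of the preventative limit, not taken from the script.
# Arguments: direction (inbound|outbound), remaining allowance in MB,
# seconds left in the STM period. Prints a QOS limit in kbit/s.
prevention_limit() {
    direction=$1 remaining_mb=$2 secs_left=$3
    case "$direction" in
        outbound) pct=90 ;;   # allow 90% of what is left to be uploaded
        inbound)  pct=50 ;;   # allow only 50% of what is left to be downloaded
        *) echo "direction must be inbound or outbound" >&2; return 1 ;;
    esac
    [ "$secs_left" -gt 0 ] || return 1
    # 1 MB = 1024 KB = 8192 kbit; divide by 100 early to keep numbers small.
    echo $(( remaining_mb * 8192 / 100 * pct / secs_left ))
}

# 150 MB left with 4 hours 52 minutes (17520 seconds) to go:
prevention_limit outbound 150 17520
prevention_limit inbound 150 17520
```

With 150 MB left and 4 hours 52 minutes to go, this gives 63 kbit/s outbound (about 7.9 KB/s) and 35 kbit/s inbound.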

Unfortunately it has to be as harsh as 50% to work properly for inbound QOS. If it's set any higher we won't get the result we want because VM counts all the data that arrives at the router before QOS throws some of it away in accordance with the temporary limit we've imposed, and it's in TCP's nature to keep stuffing more data down the pipe to try and increase the transfer rate. I've done extensive testing and 50-55% is about as high as you can go.

For both inbound and outbound QOS, the temporary limit will be removed when the average rate drops below 90% of the maximum, or when the STM period ends.

STM prevention only becomes active once a certain portion of your bandwidth limit has been transferred. The "STM prevention threshold" is, by default, 80% of the limit for the relevant STM period. This setting can be modified as per the instructions below.
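
Putting the threshold and the 90%/100% rate checks together, the decision made on each run might look something like the sketch below. This is my reading of the rules described above rather than the script's actual logic; the function name, the argument list and the set/keep/remove/none labels are all invented for the example, and the end of the STM period is not modelled.

```sh
#!/bin/sh
# Hypothetical sketch of the per-run decision, not taken from the script.
# Arguments: data used (MB), STM limit (MB), threshold (%),
# average rate and maximum rate (same units), and whether a
# preventative limit is currently active (1) or not (0).
prevention_decision() {
    used_mb=$1 limit_mb=$2 threshold=$3 avg=$4 max=$5 active=$6
    if [ "$active" = 1 ]; then
        # Remove the limit once the average drops below 90% of the maximum.
        if [ $((avg * 100)) -lt $((max * 90)) ]; then
            echo remove
        else
            echo keep
        fi
    elif [ $((used_mb * 100)) -ge $((limit_mb * threshold)) ] \
         && [ "$avg" -gt "$max" ]; then
        # Past the threshold and on course to exceed the limit.
        echo set
    else
        echo none
    fi
}

# 600 MB of a 750 MB limit used (exactly 80%), averaging 1280 KB/s
# against a maximum permissible rate of 42 KB/s:
prevention_decision 600 750 80 1280 42 0
```

The gap between the 100% trigger and the 90% release means the limit is not switched on and off every two minutes when the average rate hovers around the maximum.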

The end result of STM prevention is that you may take a more severe bandwidth cut than if you triggered STM, but that cut will usually last for a much shorter period of time.

Note that triggering STM Prevention under these circumstances is not always beneficial. For example, on M+ in the 1600-2100 evening downstream STM period, with STM Prevention enabled with the default 80% threshold, you could download 600MB (80% of 750MB) at full speed before STM Prevention kicked in. Assuming you downloaded that amount at the full 10Mb/s, STM Prevention would kick in at 16:08 and limit your bandwidth to 7.9KB/s so that you could only download 90% of the remaining 150MB over the next 4 hours 52 mins. If STM Prevention was not enabled you'd be STM'd two minutes later at 16:10 and receive five hours limited to 320KB/s.

So in this scenario STM Prevention is clearly much worse than allowing STM to be triggered. Intelligent STM Prevention/Mitigation aims to address this.

Intelligent STM Prevention/Mitigation

The Intelligent STM Prevention/Mitigation mode automatically decides which course of action is best - Prevention or Mitigation. If you are on target to exceed an STM limit, the script works out which option will allow more data to be transferred in the long term. Specifically, the algorithm works like this:

  1. If the average transfer rate indicates that STM will be triggered (i.e. the cell in the table is red), calculate how long it will be before that happens.
  2. Calculate how much data could be transferred over that time at the uncapped rate, plus the amount of data that could be transferred during the STM sentence at the capped rate.
  3. Calculate how much data could be transferred over the same total time period if we enabled STM Prevention to prevent STM from being triggered.
  4. Compare the values from (2) and (3), and proceed with the option that will allow more data to be transferred.
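
The comparison in steps 2 to 4 can be sketched as below. This is an illustration under stated assumptions, not the script's implementation: the function names are invented, rates are in whole KB/s and times in seconds, and I have assumed that the preventative limit applies until the STM period ends, after which the normal rate is available again. A tie is resolved in favour of mitigation, which is also an assumption.

```sh
#!/bin/sh
# Hypothetical sketch of the Prevention-vs-Mitigation comparison.
# Rates are in KB/s, times in seconds, totals in KB.

# Step 2: data moved if STM is allowed to trigger and is then mitigated.
mitigation_total() {
    secs_to_trigger=$1 sentence_secs=$2 normal_rate=$3 stm_rate=$4
    echo $(( secs_to_trigger * normal_rate + sentence_secs * stm_rate ))
}

# Step 3: data moved over the same total time with STM Prevention.
# Assumption: the preventative limit lasts until the STM period ends,
# then the normal rate applies for whatever time is left.
prevention_total() {
    total_secs=$1 secs_to_period_end=$2 prevent_rate=$3 normal_rate=$4
    limited=$secs_to_period_end
    if [ "$limited" -gt "$total_secs" ]; then
        limited=$total_secs
    fi
    echo $(( limited * prevent_rate + (total_secs - limited) * normal_rate ))
}

# Step 4: choose whichever option moves more data.
choose_mode() {
    if [ "$1" -ge "$2" ]; then echo mitigation; else echo prevention; fi
}

# The 750 MB evening downstream scenario from the STM Prevention section,
# at 16:08: 120 s until STM triggers, a 5 hour sentence, 1280 KB/s normal,
# 320 KB/s under STM, roughly 8 KB/s under prevention until 21:00.
m=$(mitigation_total 120 18000 1280 320)
p=$(prevention_total 18120 17520 8 1280)
choose_mode "$m" "$p"
```

For that scenario the totals are 5913600 KB with mitigation against 908160 KB with prevention, so the sketch picks mitigation.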

The details of the algorithm's decisions are written to the system log and can be seen in the Latest Log Entries section of the status page.

Customising Settings

Most settings can be customised by creating a file called stm-monitor.cfg in either /tmp or /jffs (the file in /tmp takes precedence). The file should define one or more of the following variables:

    STM_PERIOD_BW_LIMIT_1
    STM_PERIOD_BW_LIMIT_2
    STM_PERIOD_BW_LIMIT_3
    STM_PREVENTION_THRESHOLD
    STM_INBOUND_BANDWIDTH_WITHOUT_STM
    STM_INBOUND_BANDWIDTH_WITH_STM
    STM_OUTBOUND_BANDWIDTH_WITHOUT_STM
    STM_OUTBOUND_BANDWIDTH_WITH_STM

If you're not using JFFS, you can create the file in /tmp every time the router boots by entering commands like these in your WAN Up script:

    echo "STM_PERIOD_BW_LIMIT_1=" >> /tmp/stm-monitor.cfg
    echo "STM_PERIOD_BW_LIMIT_2=" >> /tmp/stm-monitor.cfg
    echo "STM_PERIOD_BW_LIMIT_3=" >> /tmp/stm-monitor.cfg
    echo "STM_PREVENTION_THRESHOLD=" >> /tmp/stm-monitor.cfg
    echo "STM_INBOUND_BANDWIDTH_WITHOUT_STM=" >> /tmp/stm-monitor.cfg
    echo "STM_INBOUND_BANDWIDTH_WITH_STM=" >> /tmp/stm-monitor.cfg
    echo "STM_OUTBOUND_BANDWIDTH_WITHOUT_STM=" >> /tmp/stm-monitor.cfg
    echo "STM_OUTBOUND_BANDWIDTH_WITH_STM=" >> /tmp/stm-monitor.cfg

For example, to set the STM prevention threshold to 50%:

    echo "STM_PREVENTION_THRESHOLD=50" >> /tmp/stm-monitor.cfg
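
If you set several values, an alternative is to write the whole file in one go with a heredoc. Because this uses ">" rather than ">>", the file is replaced each time, so the lines don't pile up if the WAN Up script runs more than once between reboots. The sketch below is only an illustration: the function name write_stm_cfg and the STM_CFG variable are invented for the example, and the values are the ones shown on the sample status page earlier.

```sh
#!/bin/sh
# Hypothetical helper, not part of stm-monitor.sh: writes a complete
# config file to the given path, replacing any existing file.
write_stm_cfg() {
    cat > "$1" <<EOF
STM_PREVENTION_THRESHOLD=50
STM_OUTBOUND_BANDWIDTH_WITHOUT_STM=700
STM_OUTBOUND_BANDWIDTH_WITH_STM=174
EOF
}

# On the router the path would be /tmp/stm-monitor.cfg; STM_CFG is only
# here so the sketch can be tried somewhere else first.
write_stm_cfg "${STM_CFG:-/tmp/stm-monitor.cfg}"
```

Any variable you leave out of the file keeps its default value.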

Default values are as follows:

    XXL100 (100Mb down, 10Mb up)
    STM_PERIOD_BW_LIMIT_1=20000
    STM_PERIOD_BW_LIMIT_2=10000
    STM_PERIOD_BW_LIMIT_3=12000
    STM_PREVENTION_THRESHOLD=80
    STM_INBOUND_BANDWIDTH_WITHOUT_STM=102400 # Not sure if this is accurate
    STM_INBOUND_BANDWIDTH_WITH_STM=51200
    STM_OUTBOUND_BANDWIDTH_WITHOUT_STM=9318 # 10240 * 0.91
    STM_OUTBOUND_BANDWIDTH_WITH_STM=2330 # 9318 * 0.25

    XL60 (60Mb down, 6Mb up)
    STM_PERIOD_BW_LIMIT_1=10000
    STM_PERIOD_BW_LIMIT_2=5000
    STM_PERIOD_BW_LIMIT_3=7000
    STM_INBOUND_BANDWIDTH_WITHOUT_STM=64453 # 66000000 bps
    STM_INBOUND_BANDWIDTH_WITH_STM=30720
    STM_OUTBOUND_BANDWIDTH_WITHOUT_STM=5591 # 6144 * 0.91
    STM_OUTBOUND_BANDWIDTH_WITH_STM=1398 # 5591 * 0.25

    XXL (50Mb down, 5Mb up)
    STM_PERIOD_BW_LIMIT_1=10000
    STM_PERIOD_BW_LIMIT_2=5000
    STM_PERIOD_BW_LIMIT_3=6000
    STM_INBOUND_BANDWIDTH_WITHOUT_STM=51757 # 53000000 bps
    STM_INBOUND_BANDWIDTH_WITH_STM=25879
    STM_OUTBOUND_BANDWIDTH_WITHOUT_STM=4659 # 5120 * 0.91
    STM_OUTBOUND_BANDWIDTH_WITH_STM=1630 # 4659 * 0.35

    XL30 (30Mb down, 3Mb up)
    STM_PERIOD_BW_LIMIT_1=7000
    STM_PERIOD_BW_LIMIT_2=3500
    STM_PERIOD_BW_LIMIT_3=4200
    STM_INBOUND_BANDWIDTH_WITHOUT_STM=32548 # 33330000 b/s
    STM_INBOUND_BANDWIDTH_WITH_STM=16274
    STM_OUTBOUND_BANDWIDTH_WITHOUT_STM=2962 # 3333000 b/s * 0.91
    STM_OUTBOUND_BANDWIDTH_WITH_STM=740 # 2962 * 0.25

    L30 (30Mb down, 3Mb up)
    STM_PERIOD_BW_LIMIT_1=7000
    STM_PERIOD_BW_LIMIT_2=3500
    STM_PERIOD_BW_LIMIT_3=4200
    STM_INBOUND_BANDWIDTH_WITHOUT_STM=32548 # 33330000 b/s
    STM_INBOUND_BANDWIDTH_WITH_STM=16274
    STM_OUTBOUND_BANDWIDTH_WITHOUT_STM=2962 # 3333000 b/s * 0.91
    STM_OUTBOUND_BANDWIDTH_WITH_STM=740 # 2962 * 0.25

    ML20 (20Mb down, 2Mb up)
    STM_PERIOD_BW_LIMIT_1=7000
    STM_PERIOD_BW_LIMIT_2=3500
    STM_PERIOD_BW_LIMIT_3=3000
    STM_INBOUND_BANDWIDTH_WITHOUT_STM=20480
    STM_INBOUND_BANDWIDTH_WITH_STM=5120
    STM_OUTBOUND_BANDWIDTH_WITHOUT_STM=1864 # 2048 * 0.91
    STM_OUTBOUND_BANDWIDTH_WITH_STM=466 # 1864 * 0.25

    XL (20Mb down, 2Mb up)
    STM_PERIOD_BW_LIMIT_1=7000
    STM_PERIOD_BW_LIMIT_2=3500
    STM_PERIOD_BW_LIMIT_3=3000
    STM_INBOUND_BANDWIDTH_WITHOUT_STM=20480
    STM_INBOUND_BANDWIDTH_WITH_STM=5120
    STM_OUTBOUND_BANDWIDTH_WITHOUT_STM=1864  # 2048 * 0.91
    STM_OUTBOUND_BANDWIDTH_WITH_STM=466 # 1864 * 0.25

    L (10Mb down, 1Mb up)
    STM_PERIOD_BW_LIMIT_1=3000
    STM_PERIOD_BW_LIMIT_2=1500
    STM_PERIOD_BW_LIMIT_3=1500
    STM_INBOUND_BANDWIDTH_WITHOUT_STM=10240
    STM_INBOUND_BANDWIDTH_WITH_STM=2560
    STM_OUTBOUND_BANDWIDTH_WITHOUT_STM=932  # 1024 * 0.91
    STM_OUTBOUND_BANDWIDTH_WITH_STM=233 # 932 * 0.25

    M (10Mb down, 1Mb up)
    STM_PERIOD_BW_LIMIT_1=1500
    STM_PERIOD_BW_LIMIT_2=750
    STM_PERIOD_BW_LIMIT_3=750
    STM_INBOUND_BANDWIDTH_WITHOUT_STM=10240
    STM_INBOUND_BANDWIDTH_WITH_STM=2560
    STM_OUTBOUND_BANDWIDTH_WITHOUT_STM=932  # 1024 * 0.91
    STM_OUTBOUND_BANDWIDTH_WITH_STM=233 # 932 * 0.25

    S5 (5Mb down, 512Kb up)
    STM_PERIOD_BW_LIMIT_1=500
    STM_PERIOD_BW_LIMIT_2=250
    STM_PERIOD_BW_LIMIT_3=200
    STM_INBOUND_BANDWIDTH_WITHOUT_STM=10240
    STM_INBOUND_BANDWIDTH_WITH_STM=2560
    STM_OUTBOUND_BANDWIDTH_WITHOUT_STM=932 # 1024 * 0.91
    STM_OUTBOUND_BANDWIDTH_WITH_STM=233 # 932 * 0.25

Change History

Feedback

I now have very little time to work on this script, but please direct any questions, comments or suggestions to this thread on the Tomato forum, and I'll see what I can do.