Sleep Tight
===========

Overview
--------
Sleep tight is a Python script, _sleep_tight.py_, which interacts with the AWS EC2 RESTful API to start and stop machines. It would generally be run under cron or as a scheduled task, though it can be run manually. Sleep tight queries the tags on each instance to determine the actions it should perform.


Running
-------
Sleep tight can be run from the command line or as a cron job.
There are currently no command line options; its operation is controlled by tags on instances.

Command line:
`python sleep_tight.py`

Suggested crontab:
`*/3 * * * * python /home/tshtstsync/bin/sleep_tight.py >> /home/tshtstsync/sleep_tight.log 2>&1`

This runs the script every three minutes, all day, every day. For three runs per hour, the minute field would be `*/20` instead.


Basic Logic Flow
----------------
In simple terms, sleep tight performs the following logic:

```flow
st=>start: Start
e=>end

firstInstance=>operation: Get first instance
getTags=>operation: Get tags from instance
isActionDay=>condition: Is today in ACTION_DAYS?
noKill=>condition: Is now >= NO_KILL_UNTIL?
lastStop=>condition: Haven't stopped this hour?
hourStop=>condition: STOP_HOUR_GMT = now?
stateStop=>condition: Is in running state?
stop=>operation: Stop instance
noStart=>condition: Is now >= NO_BOOT_UNTIL?
lastStart=>condition: Haven't started this hour?
hourStart=>condition: START_HOUR_GMT = now?
stateStart=>condition: Is in stopped state?
startUp=>operation: Start instance
moreInstances=>operation: Get next instance
hasInstance=>condition: End of Instances?

st->firstInstance->getTags->isActionDay
noKill->lastStop->hourStop->stateStop->stop
noStart->lastStart->hourStart->stateStart->startUp
moreInstances->hasInstance->getTags


getTags->isActionDay
isActionDay(yes)->noKill
isActionDay(no)->moreInstances
noKill(yes)->lastStop
noKill(no)->noStart
lastStop(yes)->hourStop
lastStop(no)->noStart
hourStop(yes)->stateStop
hourStop(no)->noStart
stateStop(yes)->stop
stateStop(no)->noStart
stop->noStart
noStart(yes)->lastStart
noStart(no)->moreInstances
lastStart(yes)->hourStart
lastStart(no)->moreInstances
hourStart(yes)->stateStart
hourStart(no)->moreInstances
stateStart(yes)->startUp
stateStart(no)->moreInstances
startUp->moreInstances
moreInstances->hasInstance
hasInstance(no)->getTags
hasInstance(yes)->e
```

Or in ASCII
```


               (START)
                  |
                  |--------<--------
                  |                 |
                 / \                |
               /     \              |
(END) ---No- /  Next   \            |
             \Instance?/            |
               \     /              |
                 \ /                |
                  |                 ^
                 Yes                |
                  |                 |
             ------------           |
             | Get Tags |           |
             ------------           |
                  |                 |
                 / \                |
               / Is  \              ^
             /  Action \ -No----->--|
             \  Day?   /            |
               \     /              |
                 \ /                |
                  |                 |
                 Yes                |
                  |                 |
                 / \                ^
               / >=  \              |
             / NO KILL \ -No-->-    |
             \  UNTIL? /        |   |
               \     /          |   |
                 \ /            |   |
                  |             |   |
                 Yes            |   |
                  |             |   ^
                  |             |   |
                 / \            v   |
               / Have\          |   |
             / Stopped \ -Yes->-|   |
             \   this  /        |   |
               \ hour/          |   |
                 \ /            |   |
                  |             |   ^
                 No             |   |
                  |             v   |
                  |             |   |
                 / \            |   |
               /     \          |   |
             /STOP_HOUR\ -No-->-|   |
             \  now?   /        |   |
               \     /          |   ^
                 \ /            |   |
                  |             v   |
                 Yes            |   |
                  |             |   |
                  |             |   |
                 / \            |   |
               /     \          |   |
             /  State  \ -No-->-|   ^
             \ Running?/        |   |
               \     /          v   |
                 \ /            |   |
                  |             |   |
                 Yes            |   |
                  |             |   |
                  |             |   |
              --------          |   ^
              | STOP |          |   |
              --------          v   |
                  |             |   |
                  |------<------    |
                  |                 |
                 / \                |
               / >=  \              |
             / NO STRT \ -No----->--^
             \  UNTIL? /            |
               \     /              |
                 \ /                |
                  |                 |
                 Yes                |
                  |                 |
                  |                 |
                 / \                ^
               / Have\              |
             / Started \ -Yes---->--|
             \   this  /            |
               \ hour/              |
                 \ /                |
                  |                 |
                 No                 |
                  |                 ^
                  |                 |
                 / \                |
               /     \              |
             /STRT_HOUR\ -No----->--|
             \  now?   /            |
               \     /              |
                 \ /                |
                  |                 ^
                 Yes                |
                  |                 |
                  |                 |
                 / \                |
               /     \              |
             /  State  \ -No----->--|
             \ Stopped?/            |
               \     /              ^
                 \ /                |
                  |                 |
                 Yes                |
                  |                 |
                  |                 |
              ----------            |
             | START UP |           |
              ----------            ^
                  |                 |
                   -------------->--
                  
```
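
The decision flow above can be sketched as a single pure function. This is only an illustration under stated assumptions, not the code in _sleep_tight.py_: the function name `decide_actions`, the `"running"`/`"stopped"` state strings, and the `YYYY-MM-DD HH` format assumed for the `LAST_AUTO_*` tags are hypothetical. The `NO_KILL_UNTIL` and `NO_BOOT_UNTIL` checks are read as "today is on or after the date", following the tag descriptions.

```python
from datetime import datetime

# Day-of-week characters indexed by datetime.weekday() (Monday == 0).
DAY_CHARS = "MTWHFSU"


def _to_date(value):
    """Parse a YYYY-MM-DD tag value into a date."""
    return datetime.strptime(value, "%Y-%m-%d").date()


def decide_actions(tags, state, now):
    """Return the actions ("stop", "start") the flow would take.

    tags  -- dict of tag name -> string value, defaults already applied
    state -- "running" or "stopped" (hypothetical simplification)
    now   -- current time as a naive datetime in GMT/UTC
    """
    actions = []
    if DAY_CHARS[now.weekday()] not in tags["ACTION_DAYS"]:
        return actions  # not an action day: move on to the next instance

    this_hour = now.strftime("%Y-%m-%d %H")  # assumed LAST_AUTO_* format
    today = now.date()

    # Stop branch: every condition must hold, otherwise fall through.
    if (today >= _to_date(tags["NO_KILL_UNTIL"])
            and tags.get("LAST_AUTO_STOP") != this_hour
            and int(tags["STOP_HOUR_GMT"]) == now.hour
            and state == "running"):
        actions.append("stop")

    # Start branch: checked after the stop branch, as in the flow chart.
    if (today >= _to_date(tags["NO_BOOT_UNTIL"])
            and tags.get("LAST_AUTO_START") != this_hour
            and int(tags["START_HOUR_GMT"]) == now.hour
            and state == "stopped"):
        actions.append("start")
    return actions
```

Keeping the decision separate from the AWS calls means the logic can be checked against fixed timestamps without touching any instances.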

Tags
----
USER tags are those configurable by the user.
APP tags, or application tags, are those used by sleep tight to store state.
Any non-optional tags are created and set to their default values if they are not present on the instance.

Tag Type | Tag             | Default    | Description
-------- | --------------- | ---------- | --------------
USER     | Name            |            | (Optional) Used in output messages to help identify an instance.
USER     | START_HOUR_GMT  | 99         | The hour in GMT/UTC which the instance is to be started.
USER     | STOP_HOUR_GMT   | 10         | The hour in GMT/UTC which the instance is to be stopped.
USER     | NO_BOOT_UNTIL   | 2030-01-01 | Only attempt to start the instance after this date.
USER     | NO_KILL_UNTIL   | 2000-12-31 | Only attempt to stop the instance after this date.
USER     | ACTION_DAYS     | MTWHF__    | The days of the week to start or stop.
APP      | LAST_AUTO_START |            | The date and hour GMT which sleep tight started the instance
APP      | LAST_AUTO_STOP  |            | The date and hour GMT which sleep tight stopped the instance
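
As a rough sketch of how the defaults in the table could be applied, the snippet below fills in any missing non-optional USER tags. The names `DEFAULT_TAGS` and `with_defaults` are hypothetical, and the values are copied from the table above; the real script may store or apply them differently.

```python
# Default values for the non-optional USER tags, taken from the table above.
DEFAULT_TAGS = {
    "START_HOUR_GMT": "99",
    "STOP_HOUR_GMT": "10",
    "NO_BOOT_UNTIL": "2030-01-01",
    "NO_KILL_UNTIL": "2000-12-31",
    "ACTION_DAYS": "MTWHF__",
}


def with_defaults(instance_tags):
    """Return a copy of instance_tags with missing defaults filled in."""
    merged = dict(DEFAULT_TAGS)
    merged.update(instance_tags)  # tags already on the instance win
    return merged


def missing_tags(instance_tags):
    """Return the default tags that would need creating on the instance."""
    return {k: v for k, v in DEFAULT_TAGS.items() if k not in instance_tags}
```

For example, an instance carrying only `Name` and `STOP_HOUR_GMT` would have the other four tags created with their defaults.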

#### Name
The name is an optional tag which is queried from the instance for use in log messages. If this tag is not present, the instance id is used.
e.g.
2014-11-28 07 : i-18592226 80 99 10  ---  Do Nothing for TSH-TST-EPG-SERVER
                  ^ instance id used if name tag is missing      ^ name is appended to end of message.

#### START_HOUR_GMT
If the current GMT hour is the same as this value, sleep tight may attempt to start the instance, provided other conditions are met. To stop sleep tight from starting the instance, set it to a number larger than 24.
**Default:** *99*
> **Note:** This value must be an integer.
> **Note:** START_HOUR_GMT and STOP_HOUR_GMT cannot be the same value even if _disabled_ with a larger value.

#### STOP_HOUR_GMT
If the current GMT hour is the same as this value, sleep tight may attempt to stop the instance, provided other conditions are met. To prevent sleep tight from stopping the instance, set it to a number larger than 24.
**Default:** *10*
> **Note:** This value must be an integer.
> **Note:** START_HOUR_GMT and STOP_HOUR_GMT cannot be the same value even if _disabled_ with a larger value.
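
The two notes above can be expressed as a small validation step. This is a sketch only: `parse_hour_tags` is a hypothetical helper, and the source does not say how _sleep_tight.py_ reacts to invalid values, so raising `ValueError` here is an assumption.

```python
def parse_hour_tags(start_value, stop_value):
    """Validate START_HOUR_GMT and STOP_HOUR_GMT and return them as ints."""
    start = int(start_value)  # raises ValueError if not an integer
    stop = int(stop_value)
    if start == stop:
        # Applies even when both are "disabled" with a large value.
        raise ValueError(
            "START_HOUR_GMT and STOP_HOUR_GMT cannot be the same value")
    return start, stop


# A value such as 99 never equals a real GMT hour (0-23), so the matching
# action is effectively disabled.
start, stop = parse_hour_tags("99", "10")
start_can_fire = any(start == hour for hour in range(24))
stop_can_fire = any(stop == hour for hour in range(24))
```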

#### NO_BOOT_UNTIL
Sleep tight will only attempt to start the instance if the current GMT hour is past midnight GMT of the supplied date.
The date must be in the format YYYY-MM-DD where YYYY is a 4 digit year, MM is a two digit month padded with zeros for months Jan-Sept, DD is a two digit date padded with zeros for days 1-9.
The following are valid values:
: 2014-01-12
: 2014-09-09
: 2014-12-31
**Default:** *2030-01-01*
> **Note:** If the date supplied is not a valid date sleep tight will apply the default NO_BOOT_UNTIL date to logic processing.
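
A minimal sketch of the date handling described above, assuming a hypothetical helper `parse_until`: the value must match the zero-padded YYYY-MM-DD form and be a real calendar date, otherwise the default is used. The same approach would apply to NO_KILL_UNTIL.

```python
import re
from datetime import datetime

DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def parse_until(value, default):
    """Parse a YYYY-MM-DD tag value, falling back to default if invalid."""
    # strptime alone accepts unpadded values such as 2014-9-9, so the
    # pattern check enforces the zero padding required by the format.
    if isinstance(value, str) and DATE_PATTERN.fullmatch(value):
        try:
            return datetime.strptime(value, "%Y-%m-%d").date()
        except ValueError:
            pass  # right shape, but not a real date (e.g. 2014-02-30)
    return datetime.strptime(default, "%Y-%m-%d").date()
```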

#### NO_KILL_UNTIL
Sleep tight will only attempt to stop the instance if the current GMT hour is past midnight GMT of the supplied date.
The date must be in the format YYYY-MM-DD where YYYY is a 4 digit year, MM is a two digit month padded with zeros for months Jan-Sept, DD is a two digit date padded with zeros for days 1-9.
The following are valid values:
: 2014-01-12
: 2014-09-09
: 2014-12-31
**Default:** *2000-12-31*
> **Note:** If the date supplied is not a valid date sleep tight will apply the default NO_KILL_UNTIL date to logic processing.

#### ACTION_DAYS
Specify the days of the week on which to apply start or stop actions. Sleep tight will look up the character for the current day in the supplied tag; if that character is present, action processing will continue. Any characters other than those listed have no effect, and the list of days may be in any order.

Character | Day of Week 
--------- | ------------
M         | Monday
T         | Tuesday
W         | Wednesday
H         | Thursday
F         | Friday
S         | Saturday
U         | Sunday

The following are valid values:
: M
: MTWHFSU
: WMFUTHS
: __--++U++__--ZZZ
**Default:** *MTWHF__*
> **Note:** Thursday and Sunday are not represented by the first letter of their name. The value representing Thursday is **H**, while Sunday is **U**.
> **Note:** Currently only uppercase letters are searched for. Lowercase letters will be ignored.
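
The lookup described above amounts to a membership test on the tag string. The sketch below assumes a hypothetical helper `is_action_day`; the character order follows the table, which happens to match Python's `weekday()` numbering (Monday is 0).

```python
from datetime import date

# One character per day, Monday first, matching date.weekday().
DAY_CHARS = "MTWHFSU"


def is_action_day(action_days, day):
    """True if the character for `day` appears in the ACTION_DAYS value."""
    # Case-sensitive on purpose: lowercase letters are ignored.
    return DAY_CHARS[day.weekday()] in action_days
```

With this reading, `__--++U++__--ZZZ` enables Sunday only, because `U` is the only listed character it contains.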


Logging
-------
Sleep tight does not log to a file. All messages are printed to standard output. When running it as a cron job, it is recommended that you redirect this output to a file.
The messages are generally in the form:
DATE HOUR : INSTANCE_ID INSTANCE_STATE START_HOUR STOP_HOUR  --- ACTION_MESSAGE for INSTANCE_NAME
e.g.
2014-11-27 07 : i-09eca937 80 99 10  ---  Stopping - no action already shutting/shutdown for EBRIO-Bastion
2014-11-27 10 : i-18592226 16 99 10  ---  Stopping instance for TSH-TST-EPG-SERVER
2014-11-28 07 : i-5bb72794 80 99 10  ---  Do Nothing for COCT-EPG-TEST
2014-11-28 07 : i-def4b1e0 16 98 99  ---  (boot:2030-01-01) (kill:2030-01-01) no kill until for TSH-Sync-Server
2014-11-28 07 : i-c1ac300e 16 23 10  ---  (boot:2000-01-01) (kill:2030-01-01) no kill until(boot:2000-01-01) (kill:2030-01-01) no boot until for pfsense_VPN

> **Note:** Exceptions from python or Amazon may be present in the logs in their standard representation.
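
For anyone post-processing the redirected log, the message form above can be matched with a regular expression. This is a sketch based only on the sample lines shown; `parse_log_line` and the field names are hypothetical, and lines that do not fit (such as exception tracebacks) return `None`.

```python
import re

LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<hour>\d{2}) : "
    r"(?P<instance_id>i-[0-9a-f]+) (?P<state>\d+) "
    r"(?P<start_hour>\d+) (?P<stop_hour>\d+)\s+---\s+"
    r"(?P<message>.*) for (?P<name>.+)$"
)


def parse_log_line(line):
    """Split one log line into its fields, or return None if it differs."""
    match = LOG_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    record = match.groupdict()
    # In the EC2 API, state code 16 means running and 80 means stopped.
    for key in ("hour", "state", "start_hour", "stop_hour"):
        record[key] = int(record[key])
    return record
```

The greedy `message` group means the name is taken from after the last " for " on the line.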


Python Libraries
----------------
Sleep tight uses the following python libraries:
boto.ec2
datetime
sys
time


AWS Configuration
-----------------
Sleep tight looks for a configuration file at ~/.boto.
The file should contain a [sleep_tight] section header with the following information:

[sleep_tight]
region = ap-southeast-2
aws_access_key_id = <replace-with-access-key-id>
aws_secret_access_key = <replace-with-secret-access-key>
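
To check that a file has the expected section and keys, something like the following could be used. It is a sketch that reads the INI text with the Python 3 standard library `configparser`; it is not how _sleep_tight.py_ or boto necessarily loads the file, and `parse_sleep_tight_config` is a hypothetical name.

```python
import configparser

REQUIRED_KEYS = ("region", "aws_access_key_id", "aws_secret_access_key")


def parse_sleep_tight_config(text):
    """Return the [sleep_tight] settings from INI text as a dict."""
    # interpolation=None keeps any '%' in a key value literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section("sleep_tight"):
        raise KeyError("missing [sleep_tight] section")
    section = parser["sleep_tight"]
    missing = [key for key in REQUIRED_KEYS if key not in section]
    if missing:
        raise KeyError("missing keys: " + ", ".join(missing))
    return {key: section[key] for key in REQUIRED_KEYS}
```

To check a real file, pass in its contents, for example the text read from `os.path.expanduser("~/.boto")`.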


AWS IAM Permissions
-------------------
Create a new IAM user or modify an existing user.
At a minimum, sleep tight requires the EC2 Describe* (describe anything), StartInstances and StopInstances permissions.

Describe permissions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ec2:Describe*",
      "Resource": "*"
    }
  ]
}

Start and Stop Instances permissions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:StartInstances",
        "ec2:StopInstances"
      ],
      "Resource": [
        "arn:aws:ec2:ap-southeast-2:786476282193:instance/*"
      ]
    }
  ]
}


Known Bugs and Limitations
--------------------------
Multiple regions have not yet been tested.
Sleep tight currently does not assign an elastic IP address to an instance on startup.
Defaults to stopping instances at 10am GMT (6pm Perth time). There is no configuration file or similar; changing this default requires editing a variable in the code.
No support for timezones other than GMT.
NO_BOOT_UNTIL and NO_KILL_UNTIL fall back to their defaults if the dates are not in the correct format.
Currently the instance appears to need a public/elastic IP address to talk to AWS, though this could be down to VPC configuration.


    > Good night, sleep tight, don't let the bed bugs bite.