Category Archives: Tech Blog

CodeIgniter session race conditions

About a year ago, HiretheWorld upgraded from CodeIgniter 1.7 to 2.0. During the upgrade process, we discovered a major problem with the session library: there was a race condition that effectively reset user sessions whenever an ajax request was made during the session rotation window. This bug has been somewhat mitigated in CodeIgniter 3, but since it hasn’t been “fixed”, we still need to use our fix from a year ago for our migration to CodeIgniter 3.

The problem exists because of the way browsers make ajax requests (or at least the way we make ajax requests), and the way CodeIgniter handles session rotation. We used to make two simultaneous ajax requests every poll period to update message counters on the client side. This was quite inefficient, and we’ve since consolidated those requests into a single request, but it revealed the race condition in the session library. Most of the time it should not happen to us any more, but given sufficiently long sessions, we should expect to hit the race condition regularly.

In fact, that was how we discovered the problem: sessions appeared to be reset after a random period, and because that was how we kept track of logged in users, users complained that they were automatically logged out of the system when they still had their browser open. Although many sites time out user sessions for security or performance reasons, that was not our intention, and in fact for some users, the effective timeout period was quite short.

Looking into our logs, we discovered that people were being logged out, not quite randomly, but at multiples of the session rotation period. To confirm the problem, we dropped the session rotation time down to a few seconds and were able to consistently reproduce the problem in a usably short time. We pinned the problem down to ajax requests because we noticed that “random” logouts would only happen after ajax requests. If we increased our ajax poll period to much longer than the session rotation period, sessions would only be lost at multiples of the ajax poll period. Having done a high level behavioural analysis on the system and come up with a recipe for triggering the bug, it was time to look into the session code to figure out what was wrong.

CodeIgniter stores sessions in a table indexed by a session_id field. This is a unique identifier that is randomly generated both at session creation and rotation. The session_id is sent out to the browser as a cookie after every request, and this overwrites whatever session_id is stored in the cookie. This system works quite well during normal operation when only one request is sent between the client and the server at any time. During normal operation, the same cookie would be exchanged between the client and the server. At session rotation, the server would send a new cookie to the client which would store it and re-send the new cookie back to the server on the next request. This also works properly.

What doesn’t work is when the server receives two simultaneous requests during a session rotation (simultaneous meaning that a request is sent to the server before the responses to all other in-flight requests have been received). Normally, what should happen during session rotation is (client is ‘C’; server is ‘S’; session_id is the numerical suffix; S1 (S2) is the session rotation operation):

request 1:  C1  --> S1 (S2) --> C2
request 2:  C2  --> S2      --> C2

During a race condition, this is what happens:

request 1:  C1  --> S1*
request 2:  C1  --> S1 (S2) --> C2
request 1 (cont.):  S1*     --> C2

The first request completes after the second request, but the second request is where the session rotation was performed. The browser receives a set-cookie when the second request replies, but then it receives another set-cookie when the first request replies. Even though the server correctly stored the new cookie, the new cookie will never be sent back to the server because it was immediately overwritten by the old cookie. Since the server expects the new cookie, and doesn’t keep track of the old cookie any more after rotation, the new session is effectively abandoned. On a subsequent request, when the client sends the old cookie back to the server, it is not recognized by CodeIgniter, and CodeIgniter creates a new session for the client (along with a new cookie).
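
The sequence above can be reproduced with a small simulation. This is only a toy model written in Ruby, not CodeIgniter code: the NaiveServer class, its method names, and the in-memory hash standing in for the sessions table are all made up for illustration. It assumes a server that forgets the old session_id as soon as it rotates.

```ruby
require "securerandom"

# Toy model (hypothetical names): the server knows only the current
# session id and forgets the old one as soon as it rotates.
class NaiveServer
  attr_reader :sessions

  def initialize
    @sessions = {} # session_id => user data
  end

  def create_session(data)
    id = SecureRandom.hex(16)
    @sessions[id] = data
    id
  end

  # Handle one request and return the session id for Set-Cookie.
  # An unrecognized id gets a brand-new, empty (logged-out) session.
  def handle(cookie_id, rotate: false)
    return create_session({}) unless @sessions.key?(cookie_id)
    return cookie_id unless rotate

    new_id = SecureRandom.hex(16)
    @sessions[new_id] = @sessions.delete(cookie_id)
    new_id
  end
end

server = NaiveServer.new
c1 = server.create_session({ user: "alice" })

# Both requests leave the browser carrying C1.
# Request 1 is handled before the rotation and echoes C1 back.
reply_1 = server.handle(c1)
# Request 2 triggers the rotation and replies with C2.
reply_2 = server.handle(c1, rotate: true)

# Reply 2 reaches the browser first; the slower reply 1 then overwrites it.
browser_cookie = reply_2
browser_cookie = reply_1

# The next request carries the stale C1, which the server no longer knows.
reply_3 = server.handle(browser_cookie)
logged_in = server.sessions[reply_3].key?(:user)
puts "still logged in? #{logged_in}"
```

The rotated session still exists on the server under C2, but no client will ever present that id again.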

CodeIgniter 3 mitigates this problem by not performing a session rotation if it detects an ajax request. It does this in the sess_update function which contains the following check:

/* Changing the session ID during an AJAX call causes problems,
 * so we'll only update our last_activity
 */
if ($this->CI->input->is_ajax_request())

Although this avoids the race condition where it normally occurs, it doesn’t fix the underlying problem. Given enough time, or requests that are not identified as ajax-type, the problem will still occur. The standard solution to this problem is to store the old session_id key so that we still have the value to look up if the client sends it to the server. Because both the old and new session_id keys are available, if the server detects that an old session_id was sent by the client, it can send the updated session_id back to the client as a new set-cookie. With the default session rotation time of 5 minutes, only two session_id values need to be stored because each request round trip time is expected to be well under the rotation time.
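
As a rough illustration of that idea, here is a minimal Ruby sketch. It is not the CodeIgniter patch itself: the RotatingSessionStore class, the Row struct, and the in-memory array standing in for the sessions table are assumptions made for the example. Each row remembers the id it had before its most recent rotation, and a request that arrives with the stale id is answered with the current one.

```ruby
require "securerandom"

# Sketch (hypothetical names): each session row keeps the id it had
# before its most recent rotation.
class RotatingSessionStore
  Row = Struct.new(:session_id, :old_session_id, :data)

  def initialize
    @rows = []
  end

  def create(data)
    row = Row.new(SecureRandom.hex(16), nil, data)
    @rows << row
    row.session_id
  end

  # Look a cookie value up against both the current and the previous id.
  def find(cookie_id)
    @rows.find { |r| r.session_id == cookie_id || r.old_session_id == cookie_id }
  end

  # Handle one request; returns [session id for Set-Cookie, session data].
  def handle(cookie_id, rotate: false)
    row = find(cookie_id)
    return [create({}), {}] if row.nil?

    # A request that came in on the stale id never rotates again; it is
    # simply answered with the current id so the client catches up.
    if rotate && row.session_id == cookie_id
      row.old_session_id = row.session_id
      row.session_id = SecureRandom.hex(16)
    end
    [row.session_id, row.data]
  end
end

store = RotatingSessionStore.new
c1 = store.create({ user: "alice" })

reply_1, = store.handle(c1)               # handled before the rotation
reply_2, = store.handle(c1, rotate: true) # performs the rotation
browser_cookie = reply_2
browser_cookie = reply_1                  # the late reply restores the old id

# The stale id is still recognized, and the reply carries the new id.
reply_3, data = store.handle(browser_cookie)
recovered = (reply_3 == reply_2) && (data[:user] == "alice")
puts "recovered session? #{recovered}"
```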

A pull request has already been sent to EllisLab, but has not been accepted yet. In the meantime, we’ve overridden some functions in the session library to achieve the same functionality. These changes are included below:

/*********************************************************************
 * overrides to codeigniter session library
 *
 * Handle race condition during session_id updates.
 * - Keep the old session id around in case we have to handle race
 *   conditions.
 * - Changes are marked with the string "old_session_id_changes".
 *
 * session table changes:
 * ALTER TABLE `sessions` ADD COLUMN `old_session_id` VARCHAR(40)  DEFAULT NULL COMMENT 'old session id' AFTER `user_data`,
 *  ADD INDEX `old_session_id`(`old_session_id`);
 * DELETE FROM `sessions`;
 *********************************************************************/
 
/**
 * Fetch the current session data if it exists
 *
 * @return  bool
 */
public function sess_read()
{
  // Fetch the cookie
  $session = $this->CI->input->cookie($this->sess_cookie_name);
 
  // No cookie?  Goodbye cruel world!...
  if ($session === NULL)
  {
    log_message('debug', 'A session cookie was not found.');
    return FALSE;
  }
 
  // Decrypt the cookie data
  if ($this->sess_encrypt_cookie === TRUE)
  {
    $session = $this->CI->encrypt->decode($session);
  }
  else
  {
    // encryption was not used, so we need to check the md5 hash
    $hash  = substr($session, strlen($session)-32); // get last 32 chars
    $session = substr($session, 0, strlen($session)-32);
 
    // Does the md5 hash match?  This is to prevent manipulation of session data in userspace
    if ($hash !==  md5($session.$this->encryption_key))
    {
      log_message('error', 'The session cookie data did not match what was expected. This could be a possible hacking attempt.');
      $this->sess_destroy();
      return FALSE;
    }
  }
 
  // Unserialize the session array
  $session = $this->_unserialize($session);
 
  // Is the session data we unserialized an array with the correct format?
  if ( ! is_array($session) OR ! isset($session['session_id'], $session['ip_address'], $session['user_agent'], $session['last_activity']))
  {
    $this->sess_destroy();
    return FALSE;
  }
 
  // Is the session current?
  if (($session['last_activity'] + $this->sess_expiration) < $this->now)
  {
    $this->sess_destroy();
    return FALSE;
  }
 
  // Does the IP match?
  if ($this->sess_match_ip === TRUE && $session['ip_address'] !== $this->CI->input->ip_address())
  {
    $this->sess_destroy();
    return FALSE;
  }
 
  // Does the User Agent Match?
  if ($this->sess_match_useragent === TRUE && trim($session['user_agent']) !== trim(substr($this->CI->input->user_agent(), 0, 120)))
  {
    $this->sess_destroy();
    return FALSE;
  }
 
  // Is there a corresponding session in the DB?
  if ($this->sess_use_database === TRUE)
  {
    /*
     * begin old_session_id_changes
     *
     * Search both session_id and old_session_id fields for the
     * incoming session id.
     *
     * used to be:
     * $this->CI->db->where('session_id', $session['session_id']);
     *
     * Manually create the OR condition because it causes the least
     * disturbance to existing code.
     *
     * Store the session id from the cookie so that we can see if we
     * came in through the old session id later.
     */
    $this->CI->db->where( '(session_id = ' . $this->CI->db->escape($session['session_id']) . ' OR old_session_id = ' . $this->CI->db->escape($session['session_id']) . ')' );
    $this->cookie_session_id = $session['session_id'];
    /*
     * end old_session_id_changes
     */
 
    if ($this->sess_match_ip === TRUE)
    {
      $this->CI->db->where('ip_address', $session['ip_address']);
    }
 
    if ($this->sess_match_useragent === TRUE)
    {
      $this->CI->db->where('user_agent', $session['user_agent']);
    }
 
    $query = $this->CI->db->limit(1)->get($this->sess_table_name);
 
    // No result?  Kill it!
    if ($query->num_rows() === 0)
    {
      $this->sess_destroy();
      return FALSE;
    }
 
    // Is there custom data?  If so, add it to the main session array
    $row = $query->row();
    if ( ! empty($row->user_data))
    {
      $custom_data = $this->_unserialize($row->user_data);
 
      if (is_array($custom_data))
      {
        foreach ($custom_data as $key => $val)
        {
          $session[$key] = $val;
        }
      }
    }
 
    /*
     * begin old_session_id_changes
     *
     * Pull the session_id from the database to populate the current
     * session id because the old one is stale.
     *
     * Pull the old_session_id from the database so that we can
     * compare the current (cookie) session id against it later.
     */
    $session['session_id'] = $row->session_id;
    $session['old_session_id'] = $row->old_session_id;
    /*
     * end old_session_id_changes
     */
  }
 
  // Session is valid!
  $this->userdata = $session;
  unset($session);
 
  return TRUE;
}
 
// --------------------------------------------------------------------
 
/**
 * Write the session data
 *
 * @return  void
 */
public function sess_write()
{
  // Are we saving custom data to the DB?  If not, all we do is update the cookie
  if ($this->sess_use_database === FALSE)
  {
    $this->_set_cookie();
    return;
  }
 
  // set the custom userdata, the session data we will set in a second
  $custom_userdata = $this->userdata;
  $cookie_userdata = array();
 
  // Before continuing, we need to determine if there is any custom data to deal with.
  // Let's determine this by removing the default indexes to see if there's anything left in the array
  // and set the session data while we're at it
  foreach (array('session_id','ip_address','user_agent','last_activity') as $val)
  {
    unset($custom_userdata[$val]);
    $cookie_userdata[$val] = $this->userdata[$val];
  }
 
  /*
   * begin old_session_id_changes
   *
   * old_session_id has its own field, but it doesn't need to go into
   * a cookie because we'll always retrieve it from the database.
   */
  unset($custom_userdata['old_session_id']);
  /*
   * end old_session_id_changes
   */
 
  // Did we find any custom data?  If not, we turn the empty array into a string
  // since there's no reason to serialize and store an empty array in the DB
  if (count($custom_userdata) === 0)
  {
    $custom_userdata = '';
  }
  else
  {
    // Serialize the custom data array so we can store it
    $custom_userdata = $this->_serialize($custom_userdata);
  }
 
  // Run the update query
  $this->CI->db->where('session_id', $this->userdata['session_id']);
  $this->CI->db->update($this->sess_table_name, array('last_activity' => $this->userdata['last_activity'], 'user_data' => $custom_userdata));
 
  // Write the cookie.  Notice that we manually pass the cookie data array to the
  // _set_cookie() function. Normally that function will store $this->userdata, but
  // in this case that array contains custom data, which we do not want in the cookie.
  $this->_set_cookie($cookie_userdata);
}
 
// --------------------------------------------------------------------
 
/**
 * Update an existing session
 *
 * @return  void
 */
public function sess_update()
{
  // We only update the session every five minutes by default
  if (($this->userdata['last_activity'] + $this->sess_time_to_update) >= $this->now)
  {
    return;
  }
 
  // _set_cookie() will handle this for us if we aren't using database sessions
  // by pushing all userdata to the cookie.
  $cookie_data = NULL;
 
  /*
   * begin old_session_id_changes
   *
   * Don't need to regenerate the session if we came in by indexing to
   * the old_session_id, but send out the cookie anyway to make sure
   * that the client has a copy of the new cookie.
   *
   * Do an isset check first in case we're not using the database to
   * store extra data.  The old_session_id field only exists in the
   * database.
   */
  if ((isset($this->userdata['old_session_id'])) &&
      ($this->cookie_session_id === $this->userdata['old_session_id']))
  {
    // set cookie explicitly to only have our session data
    $cookie_data = array();
    foreach (array('session_id','ip_address','user_agent','last_activity') as $val)
    {
      $cookie_data[$val] = $this->userdata[$val];
    }
 
    $this->_set_cookie($cookie_data);
    return;
  }
  /*
   * end old_session_id_changes
   */
 
  // Save the old session id so we know which record to
  // update in the database if we need it
  $old_sessid = $this->userdata['session_id'];
  $new_sessid = '';
  do
  {
    $new_sessid .= mt_rand(0, mt_getrandmax());
  }
  while (strlen($new_sessid) < 32);
 
  // To make the session ID even more secure we'll combine it with the user's IP
  $new_sessid .= $this->CI->input->ip_address();
 
  // Turn it into a hash and update the session data array
  $this->userdata['session_id'] = $new_sessid = md5(uniqid($new_sessid, TRUE));
  $this->userdata['last_activity'] = $this->now;
 
  // Update the session ID and last_activity field in the DB if needed
  if ($this->sess_use_database === TRUE)
  {
    // set cookie explicitly to only have our session data
    $cookie_data = array();
    foreach (array('session_id','ip_address','user_agent','last_activity') as $val)
    {
      $cookie_data[$val] = $this->userdata[$val];
    }
 
    /*
     * begin old_session_id_changes
     *
     * Save the old session id into the old_session_id field so that
     * we can reference it later.
     *
     * Rewrite the cookie's session id if there are zero affected rows
     * because that means that another request changed the database
     * under the current request.  In this case, we want to return a
     * value consistent with the previous request.  Reread immediately
     * after the update call here to minimize timing problems.  This
     * should be in a transaction for databases that support them.
     *
     * Also rewrite the userdata so that future calls to sess_write
     * will output the correct cookie data.
     *
     * used to be:
     * $this->CI->db->query($this->CI->db->update_string($this->sess_table_name, array('last_activity' => $this->now, 'session_id' => $new_sessid), array('session_id' => $old_sessid)));
     */
    $this->CI->db->query($this->CI->db->update_string($this->sess_table_name, array('last_activity' => $this->now, 'session_id' => $new_sessid, 'old_session_id' => $old_sessid), array('session_id' => $old_sessid)));
 
    if ($this->CI->db->affected_rows() === 0)
    {
      $this->CI->db->where('old_session_id', $this->cookie_session_id);
      $query = $this->CI->db->get($this->sess_table_name);
 
      // We've lost track of the session if there are no results, so
      // don't set a cookie and just return.
      if ($query->num_rows() == 0)
      {
        return;
      }
 
      $row = $query->row();
      foreach (array('session_id','ip_address','user_agent','last_activity') as $val)
      {
        $this->userdata[$val] = $row->$val;
        $cookie_data[$val] = $this->userdata[$val];
      }
 
      // Set the request session id to the old session id so that we
      // won't try to regenerate the cookie again on this request --
      // just in case sess_update is ever called again (which it
      // shouldn't be).
      $this->cookie_session_id = $this->userdata['old_session_id'];
    }
    /*
     * end old_session_id_changes
     */
  }
 
  // Write the cookie
  $this->_set_cookie($cookie_data);
}
 
/*********************************************************************
 * end overrides to codeigniter session library
 *********************************************************************/

The major drawback to this patch is that it adds a new field, old_session_id, which requires a schema change that will not be present in a standard CodeIgniter installation. We think this isn’t too much of a concern given the benefits of having “correct” code.

As for stability, we’ve been using it for more than a year now, and haven’t run into any problems.

Posted in Tech Blog | Leave a comment

CodeIgniter3 and PHPUnit

HiretheWorld is upgrading to CodeIgniter 3. As part of the upgrade process, we had to make sure that automated testing still passed.

We use PHPUnit internally for unit testing some of our models, and that broke after the CodeIgniter upgrade. There were a few problems:

  • _call_hook was renamed to call_hook
  • a new define VIEWPATH was introduced
  • the function get_mimes was added to system/core/Common.php

Fortunately, we didn’t run into any more problems with:

  • the removal of the EXT define
  • core/Input functions that return NULL instead of false

The first problem was fixed by replacing all calls to _call_hook with calls to call_hook in PHPUnit’s copy of CodeIgniter.php which we have in application/third_party/CIUnit/core/CodeIgniter.php.

The second problem was fixed by adding the VIEWPATH define to bootstrap_phpunit.php. We have this file in application/third_party/CIUnit/bootstrap_phpunit.php. This was copied directly from our index.php.

    // The path to the "views" folder
    $view_folder = (isset($view_folder)) ? $view_folder : null;
    if ( ! is_dir($view_folder))
    {
      if ( ! empty($view_folder) && is_dir(APPPATH.$view_folder.'/'))
      {
        $view_folder = APPPATH.$view_folder;
      }
      elseif ( ! is_dir(APPPATH.'views/'))
      {
        header('HTTP/1.1 503 Service Unavailable.', TRUE, 503);
        exit('Your view folder path does not appear to be set correctly. Please open the following file and correct this: '.SELF);
      }
      else
      {
        $view_folder = APPPATH.'views';
      }
    }
 
    if (($_temp = realpath($view_folder)) !== FALSE)
    {
      $view_folder = realpath($view_folder).'/';
    }
    else
    {
      $view_folder = rtrim($view_folder, '/').'/';
    }
 
    define('VIEWPATH', $view_folder);

The third problem was fixed by copying the get_mimes function from CodeIgniter’s core/Common.php to PHPUnit’s CIUnit/core/Common.php which we have in application/third_party/CIUnit/core/Common.php.

if ( ! function_exists('get_mimes'))
{
  /**
   * Returns the MIME types array from config/mimes.php
   *
   * @return  array
   */
  function &get_mimes()
  {
    static $_mimes = array();
 
    if (defined('ENVIRONMENT') && is_file(APPPATH.'config/'.ENVIRONMENT.'/mimes.php'))
    {
      $_mimes = include(APPPATH.'config/'.ENVIRONMENT.'/mimes.php');
    }
    elseif (is_file(APPPATH.'config/mimes.php'))
    {
      $_mimes = include(APPPATH.'config/mimes.php');
    }
 
    return $_mimes;
  }
}

All this wasn’t too difficult, but it was another TODO, especially after a google search didn’t turn up any results for CodeIgniter 3 and PHPUnit. Complete versions of the 3 files that needed modifications are below (these are .php files despite the .txt extension).

Apache rewrite and quantum superposition

or what happens when a programmer tries to work with server configuration files

As you know, HiretheWorld runs CodeIgniter on top of a standard LAMP stack. Today we’re going to talk about the problems we had with the ‘A’, and how programmers really shouldn’t be messing around with configuration files.

The goal was pretty simple: put basic http authentication onto a development server so that a password prompt is given to off-site users. This is easily done with the following:

<VirtualHost>
  ServerName target.hiretheworld.com
  DocumentRoot /var/www/target
 
  <Directory /var/www/target>
    AuthType Basic
    AuthName "HTW Development Server"
    AuthUserFile /var/www/htpasswd
 
    Order allow,deny
    Allow from all
    Deny from 10.0.0.1
    Require valid-user
    Satisfy any
  </Directory>
</VirtualHost>

10.0.0.1 is the ip address of the firewall, and we want to disallow requests from it unless they are authenticated with valid credentials (what the Satisfy any is for). Order allow,deny is sufficient for us because we ordinarily want anyone inside the local network to access the server, and only deny remote users. This is a pretty standard configuration and it works.

The problem started when someone wanted remote access to a path: target.hiretheworld.com/noAuthRequired/....

At first it didn’t seem like a difficult problem because we already make exceptions to paths like that in our .htaccess file. The challenge was how to do that inside httpd.conf inside a VirtualHost block. mod_access allows us to allow and deny from ip addresses and domains quite easily, but how would one allow access to a particular path? It turns out that the SetEnvIf directive is quite helpful because it gives access to the Request_URI among other things, which means that we’d need something like SetEnvIf Request_URI ^/noAuthRequired/.

The first iteration of the solution might look something like this:

Order allow,deny
Allow from all
Deny from 10.0.0.1
 
SetEnvIf Request_URI ^/noAuthRequired/ AuthNotRequired
Allow from env=AuthNotRequired
 
Require valid-user
Satisfy any

Except that the order is allow,deny, which we don’t want to change, and which gives precedence to Deny from 10.0.0.1. This isn’t too much of a problem because logical transforms are what programmers do every day. The most obvious next step is to use AuthNotRequired as a control variable for Deny from 10.0.0.1 too.

Order allow,deny
Allow from all
 
SetEnvIf Remote_Addr 10.0.0.1 authState=1
SetEnvIf Request_URI ^/noAuthRequired/ authState=2
 
SetEnvIf authState 1 AuthRequired
Deny from env=AuthRequired
 
Require valid-user
Satisfy any

Logically, this seems to work: we keep track of the AuthRequired variable and deny access if the variable is set. However it doesn’t work with our setup.

The logic says that if we come in from the firewall, we would go to authState=1. Then, if the requested uri happened to be /noAuthRequired/..., we would go to authState=2. What actually happens is that authentication is always required, both to an unprotected path and to /noAuthRequired/.

Thinking that something else was broken in the logic, I tried SetEnvIf authState 2 AuthRequired, the effect of which was to not require authentication to unprotected paths, but to require authentication to /noAuthRequired/, which is the opposite of what we wanted.

It also presented a logical impossibility: how could authState have the value of 1 and 2 at the same time? This was proven from the behaviour of the server always wanting authentication for /noAuthRequired/, and only wanting authentication for unprotected paths when authState=1. At first it seemed that authState never went to 2, but it also clearly went to 2 because it was possible for unprotected paths to be accessed without authentication.

After a long while of reading through mostly unhelpful documentation, with Google not helping much either, I remembered that we use mod_rewrite, and that it goes through a few passes to route an incoming request to the final request uri. As mentioned at the beginning of this post, we use CodeIgniter. CodeIgniter, like most PHP frameworks, is helpful in that it provides an .htaccess file that contains the rewrite rule RewriteRule ^(.*)$ index.php/$1 [L,QSA]. This helps keep the request uri clean, but it also makes mod_rewrite go through its routing rules at least twice: once to resolve index.php, and once to pass the final request uri through index.php.

Working inside httpd.conf without the reminder that index.php was being rewritten, I kept asking how it was possible for authState to be two values at the same time. The theories went as far as: maybe Apache treats each variable as a set (which was proven false experimentally). mod_env was also unhelpful because no SetEnv directive runs until all the SetEnvIf directives are run. It turns out that authState was 1 on one pass of the rewrite, and 2 on the next, which was why it appeared to take on both values at the same time. Realizing this, the solution to the problem was again quite simple (no thanks to mod_rewrite).

Order allow,deny
Allow from all
 
SetEnvIf Remote_Addr 10.0.0.1 authRequired
SetEnvIf Request_URI ^/noAuthRequired/ !authRequired
SetEnvIf Request_URI ^/index.php !authRequired
 
Deny from env=authRequired
 
Require valid-user
Satisfy any

authRequired is set on an incoming request from 10.0.0.1, but is immediately unset by matching the incoming request to ^/index.php. This allows mod_rewrite to move to the next pass where Remote_Addr is still 10.0.0.1, but Request_URI is now the final uri /noAuthRequired/, which causes authRequired to be unset, and which is what we’re interested in. The other case is trivially shown to work because if /noAuthRequired/ is not matched, authRequired is still set and the request is denied by mod_access.
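
The three behaviours described in this post can be replayed with a toy model in Ruby. This is only a sketch of the explanation given here, not of Apache’s internals: it assumes that each rewrite pass re-evaluates the SetEnvIf rules from scratch against that pass’s uri, and that a password prompt appears if any pass ends with the deny variable set. All function names are made up for the example.

```ruby
# Toy model only (assumptions, not Apache internals): every rewrite pass
# re-evaluates the SetEnvIf rules from scratch against that pass's uri,
# and authentication is demanded if any pass ends with the deny variable set.

FIREWALL = "10.0.0.1"

# The uris seen by the two passes, assuming the CodeIgniter rewrite rule.
def passes(uri)
  [uri, "/index.php#{uri}"]
end

# The authState attempt; trigger is the value that sets AuthRequired.
def auth_state_rules(remote_addr, uri, trigger)
  env = {}
  env["authState"] = "1" if remote_addr == FIREWALL
  env["authState"] = "2" if uri.start_with?("/noAuthRequired/")
  env["AuthRequired"] = "1" if env["authState"] == trigger
  env.key?("AuthRequired")
end

# The final configuration: set the flag for the firewall, then clear it
# for the exempt path and for the rewritten /index.php pass.
def final_rules(remote_addr, uri)
  env = {}
  env["authRequired"] = "1" if remote_addr == FIREWALL
  env.delete("authRequired") if uri.start_with?("/noAuthRequired/")
  env.delete("authRequired") if uri.start_with?("/index.php")
  env.key?("authRequired")
end

# True if a request from the firewall would be asked for a password.
def auth_prompt?(uri)
  passes(uri).any? { |pass_uri| yield(FIREWALL, pass_uri) }
end

%w[/noAuthRequired/page /other/page].each do |uri|
  with_1 = auth_prompt?(uri) { |ip, u| auth_state_rules(ip, u, "1") }
  with_2 = auth_prompt?(uri) { |ip, u| auth_state_rules(ip, u, "2") }
  final  = auth_prompt?(uri) { |ip, u| final_rules(ip, u) }
  puts "#{uri}: authState 1 => #{with_1}, authState 2 => #{with_2}, final => #{final}"
end
```

Under these assumptions the model prompts for both paths with authState 1, prompts only for /noAuthRequired/ with authState 2, and prompts only for the unprotected path with the final rules, which matches what we saw on the server.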

Lessons learned? Apache configuration files are not your standard imperative programming environment where everything starts off from the top of the file (where they logically should). Instead, directives are interpreted in different orders according to Apache’s rules, of which there are many. Once mod_rewrite is introduced into the mix, things get even more convoluted because a configuration file may be run more than once (behind your back), possibly producing what looks to an outside observer like variables that take on two states at the same time.

That, and not having breakpoints or any way of examining variable states when debugging hurts.

Migrating Unit Tests from Selenium to Watir Webdriver

Until recently, HiretheWorld did all its front-end testing using Selenium. Our tests were quite old, slow, and unstable: they would fail fairly often for no apparent reason, then pass when run again. I was tasked with improving our front-end testing process, and the options were to optimize our Selenium tests or to find a new testing framework. Since both would require going through every test one by one, I decided to research what frameworks were out there.

Truth is, there aren’t that many – the big two are Selenium and Watir, with a bunch of other, much smaller, less well supported frameworks like windmill.

Watir

We eventually decided to use Watir because it has a very clean and straightforward API, and because Watir tests are written in Ruby, like our old Selenium-based tests were, which made the transition easier. We used watir-webdriver, which uses Selenium WebDriver to drive the browser. The main difference between using WebDriver through Watir rather than through Selenium is that Watir’s API is easier to use and has more features, making it much easier to write clean, efficient tests. Watir also has a ton of documentation and examples; this cheat sheet as well as the watir-webdriver site were great resources.

Diagram showing how our front-end testing setup fits together - test-unit plus Watir, driven by a test script - talking to web-driver, which drives the browser, which then outputs to the screen, or into a virtual framebuffer

Setting Up Watir

Setting up Watir is easy: you just need Ruby and a few gems. Some additional downloads can make the whole process better, such as xvfb for headless testing (no browser window) and chromedriver for using Google Chrome as your test browser, which will speed things up. A unit testing framework is also needed to make batch testing easier; we use the test-unit framework.

  • Install Ruby
  • Install Ruby Gems
    • Ubuntu: sudo apt-get install rubygems
    • Windows: comes with installer
  • Install xvfb (if you want headless tests, linux only)
    • Ubuntu: sudo apt-get install xvfb
  • Install the following gems: watir-webdriver, test-unit, headless (if you want headless tests)
    • Ubuntu: sudo gem install watir-webdriver etc...
    • Windows: gem install watir-webdriver etc...
  • Download chromedriver if you want to use google chrome for testing (recommended)
  • More detailed instructions here: http://watir.com/installation/

Setting Up Tests with Test-Unit

First you need to set up your test-unit class in a new Ruby file:

require "rubygems"
gem "test-unit"
require "test/unit"
require "watir-webdriver"
 
class TestExample < Test::Unit::TestCase

Next we create the setup and teardown functions – these are the functions that test-unit will run before and after each test:

# setup is run before every test
def setup
  $browser = 'chrome' if $browser.nil?
  $site = 'http://test.localhost' if $site.nil?
 
  if $headless
    require 'headless'
    $headless = Headless.new
    $headless.start
  end
  if $browser == 'chrome'
    $b = Watir::Browser.new :chrome
  elsif $browser == 'firefox'
    $b = Watir::Browser.new :ff
  elsif $browser == 'ie'
    $b = Watir::Browser.new :ie
  end
 
  $b.goto $site
end
 
# teardown is run after every test
def teardown
  $b.close
  if $headless
    $headless.destroy
  end
end

The setup code will pick a default browser and site if none is specified, then create the browser and go to the specified site. The teardown will close the browser; it can be useful to take a screenshot here so you can see where your test failed:

# take screenshot of end of test, useful for failures/errors
time = Time.new
$b.driver.save_screenshot(File.dirname(__FILE__) + '/screenshots/' + @method_name + '_' + time.strftime('%Y%m%d_%H%M%S') + '.png');

We also add a bit of code at the start of the file, after the requires, to handle arguments so that it’s easy to pick your browser and turn headless on/off using command line arguments:

require "rubygems"
gem "test-unit"
require "test/unit"
require "watir-webdriver"
 
# check arguments for browser or headless specification
ARGV.each { |arg|
    if arg.downcase.include? 'chrome'
        $browser = 'chrome'
    elsif arg.downcase.include? 'firefox'
        $browser = 'firefox'
    elsif arg.downcase.include? 'ff'
        $browser = 'firefox'
    elsif arg.downcase.include? 'ie'
        $browser = 'ie'
    elsif arg.downcase.include? 'headless'
        $headless = true
    end}
 
class TestExample < Test::Unit::TestCase

Writing Tests

For each test case you need a Ruby method whose name starts with the word test. Tests generally start with some manipulations of your site, followed by asserts to make sure everything behaved as expected. An example for our site is to fill in some of our forms, press the save button, then go back and see if everything is still there.

In test-unit you have three main assert functions: assert_true, assert_false and assert_equal. Your test will pass if the expression after the assert function gives the expected outcome.

assert_equal 'Magic/More Magic', $b.text_field(:name, 'organization_name').value
assert_equal 'As mentioned above, we make magic and more magic.', $b.text_field(:name, 'question_38').value
assert_equal 'People who like magic and more magic, as opposed to less magic.', $b.text_field(:name,'question_39').value
assert_equal 'Im putting stuff into question 41', $b.text_field(:name,'question_41').value

Running Tests

You can run your tests just like any other Ruby script, and you can pass arguments, which the argument handling code above uses to change things like the browser.

  • Run test:
    • ruby testexample.rb
  • Run test in headless firefox:
    • ruby testexample.rb headless firefox

Tips for writing good tests

The key to writing efficient tests is to avoid fixed sleeps. Our old tests were full of sleep calls to wait for pages or ajax to load, which made them both slow and unstable. Instead, you should wait for an event, and watir makes this very easy. I have a few other tips and tricks that I learned while creating our tests and researching online.
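
To make the difference concrete, here is a minimal sketch in plain Ruby of waiting on a condition rather than sleeping for a fixed time. poll_until is a hypothetical helper written for this post, not a watir function; in the real tests we use watir's own wait methods, shown further down.

```ruby
# Hypothetical helper: re-checks the block until it returns true or the
# timeout (in seconds) runs out. Returns true on success, false on timeout.
def poll_until(timeout = 30, interval = 0.1)
  deadline = Time.now + timeout
  until yield
    return false if Time.now >= deadline
    sleep interval
  end
  true
end

# The wait ends as soon as the condition holds, so a fast page costs
# almost nothing, while a fixed `sleep 5` always costs five seconds.
started = Time.now
ready = poll_until(5) { Time.now - started > 0.3 }
puts "ready: #{ready}"
```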

Put everything you can in functions.

Putting everything you can in functions makes it easy to change repeated code, and when writing tests you will repeat things often. This will make it much easier to change your tests later while debugging or when you change your site.

def browse_to_new_project
  $b.goto $site + "/designtourney/projects/new"
end
 
def click_logo_design
  $b.link(:class, 'logo-design').click
end
 
def form_fill_first_page
  $b.text_field(:name, 'organization_name').set('Magic/More Magic')
  $b.text_field(:name, 'question_38').set('As mentioned above, we make magic and more magic.')
  $b.text_field(:name, 'question_39').set('People who like magic and more magic, as opposed to less magic.')
  $b.link(:id=> 'show-more').click
  $b.text_field(:name, 'question_41').set('Im putting stuff into question 41')
  $b.text_field(:name, 'question_45').set('Im putting stuff into question 45')
end

Waiting for Ajax

Watir will automatically wait for a page to load, but that won’t help you if your site uses a lot of ajax. A really handy function in watir for dealing with ajax loads is wait_while_present. It’s good practice to display an ajax loader animation while your site makes an ajax call, and you can take advantage of this by waiting while the loader is present:

def wait_for_ajax
  $b.div(:id, 'ajax-loader').wait_while_present
end

Waiting for javascript animation

We use the fallr jquery plugin on HiretheWorld. It’s a slick looking animated modal box for prompts and alerts. These animations can cause trouble for automated tests. Using the wait functions in watir and a little javascript, you can make this easier. I still had to use a short sleep here, as waiting on a javascript function can be finicky, but it still helps make the test more stable.

$b.div(:id, 'fallr').wait_until_present
$b.wait_until{ $b.execute_script('return $(\'#fallr-wrapper\').is(\':animated\')') == false }
sleep 0.5
$b.link(:id, 'fallr-button-yes').click
$b.div(:id, 'fallr-overlay').wait_while_present

Dealing with timeouts

No matter how good you make your tests, you will still occasionally get timeouts. You can make watir retry after a timeout using Ruby’s Timeout module. I based this off of code I found here

require 'timeout'

# run the block, retrying it whenever it takes longer than waittime seconds
def load_link(waittime)
  begin
    Timeout::timeout(waittime) do
      yield
    end
  rescue Timeout::Error => e
    puts "Page load timed out: #{e}"
    retry
  end
end
 
def browse_to_new_project
  load_link(30){ $b.goto $site + "/designtourney/projects/new" }
end
 
def click_logo_design
  load_link(30){ $b.link(:class, 'logo-design').click }
end
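
One caveat with load_link as written: retry has no limit, so a page that never loads would keep the test spinning forever. A possible variation, sketched here with a made-up method name and max_tries parameter, gives up after a few attempts and lets the timeout error fail the test:

```ruby
require 'timeout'

# Hypothetical variation of load_link: retries the block when it times out,
# but re-raises Timeout::Error once max_tries attempts have been used.
def load_link_with_limit(waittime, max_tries = 3)
  tries = 0
  begin
    tries += 1
    Timeout::timeout(waittime) do
      yield
    end
  rescue Timeout::Error => e
    puts "Page load timed out (attempt #{tries} of #{max_tries}): #{e}"
    retry if tries < max_tries
    raise
  end
end
```

It is called the same way as load_link, for example load_link_with_limit(30){ $b.goto $site }.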

Logging errors to a file

Another useful trick is to log your errors to a file. You can do this by overriding the test-unit run method. Do this at the top of your test script. I based this code off of this stackoverflow post

module Test
  module Unit
    class TestSuite
      alias :old_run :run
      def run(result, &progress_block)
        old_run(result, &progress_block)
        File.open('errors.log', 'w'){|f|
          result.faults.each{|err|
            case err
            when Test::Unit::Error, Test::Unit::Failure
              f << err.test_name
              f << "\n"
            when Test::Unit::Pending, Test::Unit::Notification, Test::Unit::Omission
              # not written to the log file
            end
          }
        }
      end
    end
  end
end

Retrying errors

Even with well written tests, you can still occasionally get random errors for no apparent reason. To help combat this, I created a script that will retry all the errors in the error log file from above. You can simply run this script after your tests if any errors happen, to see if they were repeatable errors.

# create string of all args
args = ""
ARGV.each { |arg| args+=" "+arg }
abort "Unable to open file..." unless File.exist?("errors.log")
 
# read one test name per line, dropping newlines and blank lines
errors = File.readlines("errors.log").map { |line| line.chomp }
errors.reject! { |line| line.empty? }
 
if errors.length > 0
  puts 'Attempting to resolve errors'
  try = 1
  while try <= 3
    puts "Try number: "+try.to_s
    errors.each_with_index{|name, i|
      test = /(.+?)\((.+?)\)/.match(name)
      # quote only the file name; the extra args follow it unquoted
      if system "ruby \""+test[2]+".rb\""+args
        errors[i] = false
      end
    }
    errors.delete(false)
    if errors.length == 0
      puts 'All errors resolved successfully!'
      break
    end
    try+=1
  end
  File.open('errors.log', 'w'){|f|
    errors.each{|error|
      f << error
      f << "\n"
    }
  }
  if errors.length != 0
    puts 'Errors unresolved'
  end
else
  puts 'There are no errors in errors.log'
end
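
Each line in errors.log has the form test_name(ClassName), which is what the regular expression in the script picks apart. As a sketch of that step in isolation, the snippet below parses a line and builds the command as an argument list, so no manual quoting is needed. parse_fault_line and retry_command are hypothetical names, and the assumption that the test file is named after the class comes from the script above.

```ruby
# Hypothetical helper: splits "test_name(ClassName)" into its two parts.
# Returns nil for lines that do not match, such as blank lines.
def parse_fault_line(line)
  match = /(.+?)\((.+?)\)/.match(line.strip)
  match && { :test => match[1], :klass => match[2] }
end

# Hypothetical helper: builds the argument list for Kernel#system. Passing
# separate arguments avoids going through the shell.
def retry_command(fault, extra_args = [])
  ['ruby', "#{fault[:klass]}.rb"] + extra_args
end

fault = parse_fault_line("test_save_for_later(TestExample)\n")
puts retry_command(fault, ['headless', 'firefox']).inspect
```

The resulting list can be handed straight to system with a splat, as in system(*retry_command(fault, ARGV)).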

Conclusion

Rewriting our tests turned out to be a great success for us. Not only are our tests a lot more stable now, but we also saw a 40% speed increase in running our entire testing suite! Below is the final code for this example:

require "rubygems"
gem "test-unit"
require "test/unit"
require "watir-webdriver"
require "timeout"
 
# check arguments for browser or headless specification
ARGV.each { |arg|
    if arg.downcase.include? 'chrome'
        $browser = 'chrome'
    elsif arg.downcase.include? 'firefox'
        $browser = 'firefox'
    elsif arg.downcase.include? 'ff'
        $browser = 'firefox'
    elsif arg.downcase.include? 'ie'
        $browser = 'ie'
    elsif arg.downcase.include? 'headless'
        $headless = true
    end}
 
module Test
  module Unit
    class TestSuite
      alias :old_run :run
      def run(result, &progress_block)
        old_run(result, &progress_block)
        File.open('errors.log', 'w'){|f|
          result.faults.each{|err|
            case err
            when Test::Unit::Error, Test::Unit::Failure
              f << err.test_name
              f << "\n"
            when Test::Unit::Pending, Test::Unit::Notification, Test::Unit::Omission
              # not written to the log file
            end
          }
        }
      end
    end
  end
end
 
class TestExample < Test::Unit::TestCase
  # setup is run before every test
  def setup
    $browser = 'chrome' if $browser.nil?
    $site = 'http://test.localhost' if $site.nil?
 
    if $headless
      require 'headless'
      $headless = Headless.new
      $headless.start
    end
    if $browser == 'chrome'
      $b = Watir::Browser.new :chrome
    elsif $browser == 'firefox'
      $b = Watir::Browser.new :ff
    elsif $browser == 'ie'
      $b = Watir::Browser.new :ie
    end
 
    $timeout_length = 30
 
    load_link($timeout_length){ $b.goto $site }
  end
 
  # teardown is run after every test
  def teardown
    # take screenshot of end of test, useful for failures/errors
    time = Time.new
    $b.driver.save_screenshot(File.dirname(__FILE__) + '/screenshots/' + @method_name + '_' + time.strftime('%Y%m%d_%H%M%S') + '.png');
    $b.close
    if $headless
      $headless.destroy
    end
  end
 
  def browse_to_new_project
    load_link($timeout_length){ $b.goto $site + "/designtourney/projects/new" }
  end
 
  def click_logo_design
    load_link($timeout_length){ $b.link(:class, 'logo-design').click }
  end
 
  def form_fill_first_page
    $b.text_field(:name, 'organization_name').set('Magic/More Magic')
    $b.text_field(:name, 'question_38').set('As mentioned above, we make magic and more magic.')
    $b.text_field(:name, 'question_39').set('People who like magic and more magic, as opposed to less magic.')
    $b.link(:id=> 'show-more').click
    $b.text_field(:name, 'question_41').set('Im putting stuff into question 41')
    $b.text_field(:name, 'question_45').set('Im putting stuff into question 45')
  end
 
  def first_page_asserts type = 'regular'
    assert_equal 'Magic/More Magic', $b.text_field(:name, 'organization_name').value
    assert_equal 'As mentioned above, we make magic and more magic.', $b.text_field(:name, 'question_38').value
    assert_equal 'People who like magic and more magic, as opposed to less magic.', $b.text_field(:name,'question_39').value
    assert_equal 'Im putting stuff into question 41', $b.text_field(:name,'question_41').value
  end
 
  def wait_for_ajax
    $b.div(:id, 'ajax-loader').wait_while_present
  end
 
  def load_link(waittime)
    begin
      Timeout::timeout(waittime) do
        yield
      end
    rescue Timeout::Error => e
      puts "Page load timed out: #{e}"
      retry
    end
  end
 
  def test_save_for_later
    browse_to_new_project
 
    click_logo_design
 
    form_fill_first_page
    $b.link(:class, 'save').click
    wait_for_ajax
 
    assert_true $b.div(:id, 'fallr').visible?
 
    browse_to_new_project
 
    $b.div(:id, 'fallr').wait_until_present
    $b.wait_until{ $b.execute_script('return $(\'#fallr-wrapper\').is(\':animated\')') == false }
    sleep 0.5
    $b.link(:id, 'fallr-button-yes').click
    $b.div(:id, 'fallr-overlay').wait_while_present
 
    # These assertions make sure the stuff for the first page is still all there
    first_page_asserts
  end
end

If you’ve got any questions about this, just leave a comment and we’ll try to help you out!

Posted in Tech Blog | 2 Comments

Comment me (out) if you want to live!

What a day to launch our tech blog (on a Friday, no less).

For the last couple of weeks, we’ve been kicking around the idea of starting a tech blog where we could talk about the things that go on internally behind the scenes, and inside the servers, at HiretheWorld. Today, we actually launched it.

The first article to be posted was Importing contacts with Cloudsponge in CodeIgniter, because that happened to be the latest feature that we launched – or so we thought.

It turns out that at the same time, we also decided to start using Markdown on our blog. The combination wasn’t good because the next thing we knew, the server appeared to die. For us, server death is normally a consequence of too much memory being used, causing excessive swapping to disk. After the panic and subsequent restart of the server, we did what most good server admins do: we looked into the server logs to figure out what happened.

Yes, it was excessive memory usage, as we thought, and as top showed, but since everything was working before the post, we thought that there was no reason for the server to die. Tracing into apache’s error log, we had line after line of (e.g.):

[Fri May  4 16:40:18 2012] [apc-warning] Unable to allocate memory for pool. in /hiretheworld.com/blog/wp-includes/comment-template.php on line 913.
[Fri May  4 16:40:18 2012] [apc-warning] Unable to allocate memory for pool. in /hiretheworld.com/blog/wp-includes/script-loader.php on line 25.
[Fri May  4 16:40:18 2012] [apc-warning] Unable to allocate memory for pool. in /hiretheworld.com/blog/wp-content/plugins/facebook-comments-for-wordpress/facebook-comments.php on line 30.
[Fri May  4 16:40:18 2012] [apc-warning] Unable to allocate memory for pool. in /hiretheworld.com/blog/wp-content/plugins/shortcode-exec-php/shortcode-exec-php.php on line 45.
[Fri May  4 16:40:18 2012] [apc-warning] Unable to allocate memory for pool. in /hiretheworld.com/blog/wp-content/plugins/shortcodes-ultimate/shortcodes-ultimate.php on line 15.

etc...

which brought us to the conclusion that we were probably loading too many plugins inside wordpress.

And if we were just good programmers, we would have left it there: increase the memory, restart apache, and hope everything would continue working. But we’re better than that: we’re semi-paranoid server admins, so we also looked into apache’s access log to see what the system was doing at the time it died.

And here’s what we found:

37.59.16.126 - - [04/May/2012:16:32:38 -0700] "HEAD /blog/tech-blog/importing-contacts-with-cloudsponge-in-codeigniter?utm_source=rss&utm_medium=dlvr.it&utm_campaign=dlvr.it&https%3A%2F%2Fwww.hiretheworld.com%2F%3Futm_source=dlvr.it HTTP/1.1" 200 - "-" "Mozilla/5.0 (compatible; PaperLiBot/2.1; http://support.paper.li/entries/20023257-what-is-paper-li)"
37.59.16.129 - - [04/May/2012:16:33:30 -0700] "HEAD /blog/company/site-updates/invite-your-friends-make-money-helping-your-friends-find-work-or-workers?utm_source=rss&utm_medium=dlvr.it&utm_campaign=dlvr.it&https%3A%2F%2Fwww.hiretheworld.com%2F%3Futm_source=dlvr.it HTTP/1.1" 200 - "-" "Mozilla/5.0 (compatible; PaperLiBot/2.1; http://support.paper.li/entries/20023257-what-is-paper-li)"
37.59.16.129 - - [04/May/2012:16:33:39 -0700] "HEAD /blog/company/site-updates/invite-your-friends-make-money-helping-your-friends-find-work-or-workers?utm_source=rss&utm_medium=dlvr.it&utm_campaign=dlvr.it&https%3A%2F%2Fwww.hiretheworld.com%2F%3Futm_source=dlvr.it HTTP/1.1" 200 - "-" "Mozilla/5.0 (compatible; PaperLiBot/2.1; http://support.paper.li/entries/20023257-what-is-paper-li)"
37.59.16.137 - - [04/May/2012:16:33:44 -0700] "HEAD /blog/company/site-updates/invite-your-friends-make-money-helping-your-friends-find-work-or-workers?utm_source=rss&utm_medium=dlvr.it&utm_campaign=dlvr.it&https%3A%2F%2Fwww.hiretheworld.com%2F%3Futm_source=dlvr.it HTTP/1.1" 200 - "-" "Mozilla/5.0 (compatible; PaperLiBot/2.1; http://support.paper.li/entries/20023257-what-is-paper-li)"
50.57.110.189 - - [04/May/2012:16:38:16 -0700] "GET /blog_integration/getIntegrationData/?share_url=&share_ref=&og_title= HTTP/1.1" 301 313 "-" "Mozilla/5.0 (compatible; PaperLiBot/2.1; http://support.paper.li/entries/20023257-what-is-paper-li)"
50.57.110.189 - - [04/May/2012:16:38:23 -0700] "GET /blog_integration/getIntegrationData/?share_url=&share_ref=&og_title= HTTP/1.1" 301 313 "-" "Mozilla/5.0 (compatible; PaperLiBot/2.1; http://support.paper.li/entries/20023257-what-is-paper-li)"
50.18.135.144 - - [04/May/2012:16:17:47 -0700] "GET /blog/tech-blog/importing-contacts-with-cloudsponge-in-codeigniter?utm_source=rss&utm_medium=dlvr.it&utm_campaign=dlvr.it&https%3A%2F%2Fwww.hiretheworld.com%2F%3Futm_source=dlvr.it HTTP/1.1" 200 36634 "-" "Python-urllib/2.7"

16:38:16… 16:38:23… 16:17:47… 16:17?!? — the blog post that went back in time and killed the server! The terminator blog post ;)

It was actually apache taking forever to retire an old request (because of the swapping), but it certainly looked like the server went back in time.
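
Apache writes an access log line when a request finishes, but the timestamp it records is, by default, the time the request was received, which is why a slow request shows up late with an early time. A rough sketch of how one might flag such entries, assuming the default bracketed timestamp format seen above (the function name is made up for this example):

```ruby
require 'time'

# Hypothetical helper: returns the log lines whose timestamp is earlier
# than the latest timestamp seen on any line before them.
def out_of_order_entries(lines)
  latest = nil
  lines.select do |line|
    stamp = line[/\[(\d{2}\/\w{3}\/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]/, 1]
    next false if stamp.nil?
    time = Time.strptime(stamp, '%d/%b/%Y:%H:%M:%S %z')
    late = !latest.nil? && time < latest
    latest = time if latest.nil? || time > latest
    late
  end
end
```

Run over the access log lines above, this would pick out the 16:17:47 request that was logged after the 16:38 ones.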

The moral of the story? Don’t write blog posts about inviting your friends, because they just might invite crawlers to your site. But if you insist on doing that, at least make sure that there’s enough memory on the server to give it a fighting chance.

If you don’t, you might end up with an apparent causality loop, forcing the system to send in the Terminators:

[Fri May 04 16:41:03 2012] [warn] child process 27889 still did not exit, sending a SIGTERM
[Fri May 04 16:41:03 2012] [warn] child process 27870 still did not exit, sending a SIGTERM
[Fri May 04 16:41:03 2012] [warn] child process 27890 still did not exit, sending a SIGTERM
[Fri May 04 16:41:03 2012] [warn] child process 27872 still did not exit, sending a SIGTERM
[Fri May 04 16:41:03 2012] [warn] child process 27379 still did not exit, sending a SIGTERM
[Fri May 04 16:41:03 2012] [warn] child process 27873 still did not exit, sending a SIGTERM
[Fri May 04 16:41:03 2012] [warn] child process 27891 still did not exit, sending a SIGTERM
[Fri May 04 16:41:03 2012] [warn] child process 27874 still did not exit, sending a SIGTERM

Won’t someone think of the child processes?

# I want to live!

Posted in Tech Blog | Leave a comment