Genius Open Source Libraries

Some time ago, Genius Engineering decided to unify the manner in which we encode values that contain user input. We previously depended upon the PHP built-in htmlentities() and some simple wrappers around it for our encoding needs, but this function alone can’t safely sanitize tainted data in all contexts. Furthermore, we didn’t have a unified vision of whether encoding should happen immediately upon receipt of data from the user or when we display that data to the user. The ambiguity of our security arrangement and the lack of encoding functions appropriate for all contexts led the engineering team to look for better options in PHP security for the prevention of cross-site scripting (XSS) and SQL injection vulnerabilities. While there is plenty of information about these issues and what must be done to fix them, there is a distinct dearth of PHP libraries that properly encode strings for each of these contexts.

When the right tool for the job doesn’t exist, you build it. We came up with a set of functions to sanitize tainted data in any of the places that it is output to the user. The functions are very straightforward: give them a string and you get back one that is fully escaped. Output from the gosSanitizer functions can be safely used as a double-quoted string in an HTML attribute or JavaScript context, or as a single-quoted string in an SQL context.

// Output an unsafe string, presumably user input
$xss = '<script>alert(\'oh snap\');</script>';
echo 'If you entered your name as ' . $xss . ', we\'d be in trouble.<br />' . "\n";
 
// Sanitize that string, and output it safely
$htmlContentContext = gosSanitizer::sanitizeForHTMLContent($xss);
echo "But if we sanitize your name, " . $htmlContentContext . ", then all is well.<br />\n";
 
echo '<h2>HTML Attribute</h2>';
// We can also safely sanitize it for an HTML attribute context
$htmlAttributeContext = gosSanitizer::sanitizeForHTMLAttribute($xss);
echo 'Tainted strings can also be used in an
    <a href="http://google.com" title="' . $htmlAttributeContext . '">HTML attribute</a>
    context.<br />' . "\n";
 
echo '<h2>JavaScript string</h2>';
// And we can even make strings used in JavaScript safe
$jsString = '\';alert(1);var b =\'';
echo '<script type="text/javascript">
var a = \'' . $jsString . '\';
var aSafe = \'' . gosSanitizer::sanitizeForJS($jsString) . '\';
</script>';
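To make the idea of context-specific encoding concrete, here is a minimal sketch of what escaping for an HTML content context involves. It is written in Java purely for illustration and is not the gosSanitizer implementation; the class and method names (HtmlContentEscaper, escape) are hypothetical. Attribute, JavaScript, and SQL contexts each need their own rules, which is why a single function cannot cover them all.

```java
// Hypothetical illustration; not part of the Genius Sanitizer module.
final class HtmlContentEscaper {
    private HtmlContentEscaper() {}

    // Replace the characters that are significant in HTML text content
    // with their entity equivalents.
    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The same tainted string used in the PHP example above.
        System.out.println(escape("<script>alert('oh snap');</script>"));
    }
}
```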

We have created a project on Launchpad to host the Genius text sanitizing libraries. The project consists of three modules: Core and Utility which provide general purpose support functions, and Sanitizer, which holds the functions used above. In the case of Sanitizer, all of the functions are static, and can be accessed through the gosSanitizer class. To use the Genius Sanitizer, you’ll need all three modules: Core, Utility, and Sanitizer itself. All of the Genius modules are loaded using the autoloader defined in Core/gosConfig.inc.php, so including this file is all that is needed to use any of the Genius Open Source libraries.

// Include the Genius config file
require_once 'Core/gosConfig.inc.php';
// Use gos* classes & functions here

We plan to continue adding modules to the Genius Open Source libraries collection in the future. Keep an eye on this blog for announcements!

Edited 2010-08-30 to reflect prefix change from “sg” to “gos”

Releasing Every Fortnight

Genius.com’s successful adoption of agile practices has been covered at some length in earlier postings, including Presenting on Going Agile with Scrum and An Agile Fortnight. Building on this success, we have most recently reached the point where the completed user stories for any given sprint are not only ‘potentially shippable’ but are actually deployed to production. So, how did we get here and how long did it take?

Testing as the foundation

One of the key elements of our success in bi-weekly product releases is the commitment to increasing automated test coverage – both unit tests and functional automation tests.

With a rapid rate of change – and new features in every release – it is imperative that developers know immediately if their check-ins have caused a build to break. This is only possible with a concerted investment in unit testing and QA automation. In our case, we proceeded in phases, each taking approximately 4 months to implement:

  1. All check-ins must have associated unit tests. While we did not take the time to retrofit existing code, all new or modified code was required to have associated unit tests.
  2. All product builds must run the complete unit test suite. We use Hudson, integrated with JUnit, mbUnit, Test::Unit, jsUnity, and PHPUnit to execute all the unit tests with every build and to report on failures at any stage.
  3. Run builds on every check-in.
  4. All regression tests in TestRun (our test plan management tool) must be automated using Selenium and added to the nightly build. This took some time and had to be done incrementally. With an end-to-end test that required 3 days of manual testing by the entire QA team when we started, the impact of incremental investments in test automation began to pay off quickly. Automation of existing regression tests became a background task for the QA Engineers for each sprint. Developers also pitched in, writing helper functions to ease automation and writing automated tests themselves.
  5. All stories must have associated Selenium RC automated functional tests checked in and added to the nightly build test. In addition to the manual functional testing, every new story must have associated automated tests checked in and executing (via Hudson) nightly so that we were not adding to the regression debt.
  6. Run an acceptance test of functional tests on every check-in.

When is a story done?

We established a very rigorous definition of ‘done’ for stories to ensure a consistent quality level. We also adopted ‘story swarming’ (applying as many developers/QA/DB as possible to the story) to shorten times on individual stories and to avoid having many stories open at once.

For a story to be done:

  1. All phases completed (in our case ‘To Do’, ‘In Progress’, ‘Security Review’, ‘Ready for QA’, ‘In QA’, ‘Validated’)
  2. Unit testing complete
  3. Security reviewed (code reviewed for web application security vulnerabilities)
  4. Validated by QA
  5. Test cases documented in TestRun
  6. Automated QA testing complete
  7. Validated by Product Owner
  8. All Operational considerations have been addressed

Provided all these conditions have been met, the story will be demonstrated to the company at the Sprint Review on the second Friday of the two-week Sprint and released to customers the following Tuesday.

What else needs to be considered?

One of the things I often get asked about when moving so quickly is the coherency of the architecture and the user experience. At Genius, we employ several methods to ensure the architecture is appropriately scalable and maintainable and that the product is easy to use:

  1. NMI (needs more information) stories. For user stories that have a significant impact on user experience or the underlying architecture, the team will first complete an NMI. NMI stories are focused on a subset of the team determining user flow (with leadership from the Product Designer) and/or underlying architecture (with leadership from the Technical Leads and the Development Director). The input to an NMI story is a list of questions that need answering (such as “how will the Marketing user…?” or “How can we ensure continuous availability of this feature during system maintenance?”). The output of NMIs is a user flow or technical design, and a documented list of tasks for an upcoming sprint.
  2. Development framework. Ease of use is a key differentiator at Genius, as is performance. We evaluated several frameworks and determined that to achieve the level of user interactivity required (Ajax) we would need to build our own lightweight PHP framework. This framework is now the basis for all new functionality added to the product – not only speeding development, but further ensuring consistency in coding and usability.
  3. Designated ‘leads’ in each of the major technical components or code bases of the product, Technical Operations and User Experience, with primary responsibility for making the team productive and secondary responsibility for completing story tasks for the sprint.

Another concern with bi-weekly deployments is releasing partially complete features. As a SaaS provider, all the software we release to our production servers is immediately available to customers, so our goal is to complete at least a minimal feature set within each release. That said, we do make use of a beta flag (set by the provisioning team) to preview new features with customers or internally. This, combined with feature-based provisioning, can provide a lot of control over what an individual customer user can see or access. Of course, in the case that work on an existing feature is partially complete, we will typically roll back the code to the prior version (excluding it from the current sprint) to prevent user inconsistencies.

What’s up next?

The next step in our process evolution is to parallelize the nightly functional build tests (which currently contain over 600 Selenium scripts and run for over 3 hours) so they can be run with every build. We are taking a two-pronged approach to this:

  1. Virtualized Selenium servers in-house. These will be used to run functional tests against every build for a single browser.
  2. Sauce Labs Sauce On Demand for cross-browser Selenium testing of all the automated functional tests on a daily basis.

In the future we will provide updates on our experiences with Sauce Labs and any other process developments.

Serving RESTful URLs with mod_rewrite

We’ve been experimenting with an internal API to our app to facilitate development of UI tests by our QA team. After much discussion (likely to be described in a blog post some time down the line), the decision was made to provide the API in a RESTful style over HTTPS. Rather than make a separate PHP script at each location an API request could possibly land, it made more sense to have all requests routed to a single handler that delegated the requests to the appropriate class. Doing this in Apache required a little configuration and research.

Apache provides a module for rewriting incoming URLs, thus giving us the ability to route all requests for the API to the PHP handler.  Configuring this required a simple regular expression.

RewriteRule ^/api.*$ /api/handler.php [L]

That was simple enough. The handler needs no further information in the URL, because it has access to the originally requested resource through $_SERVER['REQUEST_URI']. This means that a request to /api/team/1 would actually be sent to /api/handler.php, which would then determine that a request for the details of the team with id 1 should be processed.

You may have noticed something odd about the rewrite above: there is no query string. Two things are at play here: (1) RewriteRule does not match against any text in the query string, and (2) the substitution controls what happens to it. Per the mod_rewrite documentation, a substitution with no query string of its own passes the original query string through unchanged, while a substitution that contains a ? replaces the original query string unless the QSA flag is given. This is simultaneously useful and troublesome. The mod_rewrite module provides several directives for working with the URL, one of which is named RewriteCond, which, in addition to the URL itself, lets you match against several other properties of the request (aka Server Variables). One of these is the query string. Matching against the query string (or anything, for that matter) in a RewriteCond lets you back-reference those matches in an ensuing RewriteRule directive using the %n syntax, in addition to the canonical $n syntax of the regular expression used for RewriteRule.

As useful as this is, in our situation we want to maintain the entire query string, which does not require the RewriteCond directive at all. We accomplish this with:

RewriteRule ^/api.*$ /api/handler.php?%{QUERY_STRING} [L]

For full details of the mod_rewrite module, check out the doc, http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html.  The list of Server Variables available for substitution can be found in the discussion of the RewriteCond directive.  For some examples of the RewriteCond directive, check out: http://fantomaster.com/faarticles/rewritingurls.txt.
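The dispatch step described earlier, where a single handler inspects the originally requested URI and decides which resource was asked for, can be sketched as follows. Our handler is PHP; this sketch is in Java only to illustrate the parsing logic, and the ApiRoute class and its fields are hypothetical names, not part of our API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of single-handler dispatch; names are assumptions.
final class ApiRoute {
    final String resource; // e.g. "team"
    final String id;       // e.g. "1", or null when no id segment is present

    private ApiRoute(String resource, String id) {
        this.resource = resource;
        this.id = id;
    }

    // Parse a request URI such as "/api/team/1?verbose=1" into its parts.
    static ApiRoute parse(String requestUri) {
        // Drop the query string; only the path decides the route.
        int q = requestUri.indexOf('?');
        String path = q >= 0 ? requestUri.substring(0, q) : requestUri;

        // For "/api/team/1", split yields ["", "api", "team", "1"].
        List<String> segments = Arrays.asList(path.split("/"));
        if (segments.size() < 3 || !segments.get(1).equals("api")) {
            throw new IllegalArgumentException("Not an API path: " + path);
        }
        String id = segments.size() > 3 ? segments.get(3) : null;
        return new ApiRoute(segments.get(2), id);
    }
}
```

A real handler would then look up a class for the resource name and pass it the id along with the HTTP method.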

Hibernate Resource Management with Callbacks

Hibernate is a popular ORM library that uses abstractions of SQL transactions and other DB concepts. Like anything that deals with resources that must be cleaned up (network sockets, file handles, DB connections, transactions, etc.), ensuring that these resources are cleaned up correctly can get pretty verbose. For details on how to structure resource cleanup code in general, see David M. Lloyd’s article on the subject.

As our use of Hibernate grew beyond a few simple DB interactions, the amount of mostly-duplicated boilerplate code became more and more irritating, so we created some helpers to cut down on the duplication. I suspect other people probably have the same concern, so I’ll show how we were able to simplify our Hibernate interactions (as well as reducing error-prone duplicate code).

Hibernate Basics

The Hibernate class—technically an interface—that typically starts a Hibernate “conversation” is SessionFactory. Most people probably start off with a simple HibernateUtil class just like the one described in the Hibernate tutorial. This is simply a way to easily access a single SessionFactory instance.

Here’s how you could use this to create an object in the database.

// Assume User is a persistent class (e.g. mapped with JPA or
// Hibernate annotations or a User.hbm.xml file)
User user = new User();
Integer id;
 
Session session = HibernateUtil.getSessionFactory().openSession();
Transaction tx = session.beginTransaction();
id = (Integer) session.save(user); // save() returns Serializable, so cast to the id type
tx.commit();
session.close();

This doesn’t do robust error handling, though. In fact, HibernateException can be thrown from every single one of these method calls, though you wouldn’t know it if you put this code in a Java editor, because HibernateException extends RuntimeException and is therefore unchecked. If tx.commit() threw HibernateException, the session would never be closed. This can leak the session and the database connection it holds. Here’s a better version of the code with more error handling.

final Session session = sessionFactory.openSession();
 
try {
    final Transaction tx = session.beginTransaction();
    try {
        id = (Integer) session.save(user);
        tx.commit();
        session.close();
    } finally {
        if (!tx.wasCommitted()) {
            try {
                tx.rollback();
            } catch (HibernateException e) {
                // log
            }
        }
    }
} finally {
    if (session.isOpen()) {
        try {
            session.close();
        } catch (HibernateException e) {
            // log
        }
    }
}

That’s an awful lot of code just to create a row in the DB! However, we can do better. First, move the contents of the finally blocks into their own methods. They can be static methods as they have no state, but you’ll want to have a logger available to record when resources can’t be closed. Here, I use SLF4J’s Logger interface in a static logger field.

public static void tolerantClose(Session session) {
    if (session.isOpen()) {
        try {
            session.close();
        } catch (HibernateException e) {
            logger.warn("An error occurred while closing the session.", e);
        }
    }
}
 
public static void tolerantDispose(Transaction tx) {
    // we're not in XA/JTA, so wasCommitted should be reliable. See the javadocs.
    if (!tx.wasCommitted()) {
        try {
            tx.rollback();
        } catch (HibernateException e) {
            logger.warn("Failed to rollback", e);
        }
    }
}
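The same tolerant pattern generalizes beyond Hibernate types. Below is a minimal sketch, assuming any AutoCloseable resource; TolerantCloser is a hypothetical name, and it uses java.util.logging instead of SLF4J only so the example has no external dependencies.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper; the DbResourceCloser in this post works on Hibernate types instead.
final class TolerantCloser {
    private static final Logger LOGGER = Logger.getLogger(TolerantCloser.class.getName());

    private TolerantCloser() {}

    // Close the resource, logging rather than propagating any failure.
    // Returns true when there was nothing to close or close() completed normally.
    static boolean tolerantClose(AutoCloseable resource) {
        if (resource == null) {
            return true;
        }
        try {
            resource.close();
            return true;
        } catch (Exception e) {
            LOGGER.log(Level.WARNING, "An error occurred while closing a resource.", e);
            return false;
        }
    }
}
```

Swallowing the exception is deliberate here: cleanup code that runs in a finally block should not mask the exception that caused the cleanup in the first place.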

Focusing on the real work

Using methods to close sessions and transactions will help, but it’s still quite verbose overall. Pretty much everything that’s done with Hibernate is done in the context of a Session and a Transaction, so what if we hide all the setup and teardown of Session and Transaction and focus on just the work that needs to be done? First, an interface to represent the work to do:

public interface HibernateCallback<T> {
    T execute(Session session, Transaction tx) throws DaoException;
}

I’m declaring the method to throw DaoException, a simple subclass of Exception. This is because for the way we use Hibernate it makes more sense to have Hibernate interactions throw checked exceptions than to throw unchecked exceptions, but if you like Hibernate’s HibernateException, feel free to remove uses of DaoException.

Now, code to run a HibernateCallback:

public <T> T runCallback(final HibernateCallback<T> callback) throws DaoException {
    T result;
 
    try {
        final Session session = this.sessionFactory.openSession();
 
        try {
            final Transaction tx = session.beginTransaction();
            try {
                result = callback.execute(session, tx);
                tx.commit();
                session.close();
            } finally {
                DbResourceCloser.tolerantDispose(tx);
            }
        } finally {
            DbResourceCloser.tolerantClose(session);
        }
    } catch (HibernateException e) {
        throw new DaoException("Could not execute hibernate callback", e);
    }
 
    return result;
}

There are several new things in this method:

  • The session factory is referenced as a field. I recommend creating a class that wraps a SessionFactory and exposes runCallback and other methods without exposing the SessionFactory itself. You may even be able to have HibernateUtil (or equivalent) only expose this wrapper and never expose SessionFactory at all.
  • The method is generic and has its own T generic parameter. The wrapper class itself need not be generic (and should not be). An example of how this method is used (below) should make this clear.
  • DbResourceCloser is simply a class containing the methods described above.
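To see the control flow of runCallback in isolation, here is a minimal sketch that runs without Hibernate. Everything in it is a stand-in: CallbackFlowSketch, its Callback interface, and the list of recorded steps are hypothetical, and a boolean flag plays the role of tx.wasCommitted().

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in types so the control flow can run without Hibernate; all names are hypothetical.
final class CallbackFlowSketch {
    interface Callback<T> {
        T execute(List<String> log);
    }

    // Mirrors the shape of runCallback: open, begin, run the work, commit,
    // roll back in the inner finally if the commit never happened,
    // and always close in the outer finally.
    static <T> T run(Callback<T> callback, List<String> log) {
        log.add("open");
        try {
            log.add("begin");
            boolean committed = false;
            try {
                T result = callback.execute(log);
                log.add("commit");
                committed = true;
                return result;
            } finally {
                if (!committed) {
                    log.add("rollback");
                }
            }
        } finally {
            log.add("close");
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Integer id = run(l -> 42, log);
        System.out.println(id + " " + log);
    }
}
```

When the callback throws, the recorded steps end with a rollback followed by a close, and the exception still reaches the caller.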

This is how the wrapper and callback can be used together.

SessionFactoryWrapper wrapper = new SessionFactoryWrapper(sessionFactory);
 
final User user = new User();
HibernateCallback<Integer> callback = new HibernateCallback<Integer>() {
    @Override
    public Integer execute(Session session, Transaction tx) {
        return (Integer) session.save(user);
    }
};
 
Integer id = wrapper.runCallback(callback);

That has a much better ratio of work done to code written. (Also, nothing in that code is specific to User; you could use it to save any persistent class. You may wish to put a method that does just that on your version of SessionFactoryWrapper, but note that changes that happen once a Hibernate Session has been closed will not be automatically tracked by Hibernate. This is fine if you have already set up all the data in the persistent class before you save it.) Now that we have this core abstraction done, a lot of other things become simpler. What if you want to use a Work object to do some raw JDBC commands? We can easily add that to the wrapper class:

public void runWork(final Work workCallback) throws DaoException {
    HibernateCallback<Void> hbCallback = new HibernateCallback<Void>() {
        @Override
        public Void execute(Session session, Transaction tx) {
            session.doWork(workCallback);
            return null;
        }
    };
 
    this.runCallback(hbCallback);
}

Now that it’s easy to do raw JDBC operations, let’s further illustrate the convenience of callbacks by making a way to simply operate on every result returned by a prepared statement. First, the callback interface:

public interface PreparedStatementCallback {
    String getQueryString();
 
    void configurePreparedStatement(PreparedStatement stmt) throws SQLException;
 
    void processRow(ResultSet resultSet) throws SQLException;
}

And the method that uses the callback:

public void runPreparedStatementCallback(
        final PreparedStatementCallback preparedStatementCallback)
        throws DaoException {
 
    final Work workCallback = new Work() {
        @Override
        public void execute(Connection connection) throws SQLException {
            String query = preparedStatementCallback.getQueryString();
            final PreparedStatement stmt = connection.prepareStatement(query);
            try {
                preparedStatementCallback.configurePreparedStatement(stmt);
 
                final ResultSet res = stmt.executeQuery();
                try {
                    connection.commit();
 
                    while (res.next()) {
                        preparedStatementCallback.processRow(res);
                    }
                } finally {
                    res.close();
                }
            } finally {
                stmt.close();
            }
        }
    };
 
    this.runWork(workCallback);
}

Now it’s easy to do simple operations with prepared statements:

PreparedStatementCallback callback = new PreparedStatementCallback() {
    @Override
    public String getQueryString() {
        return "SELECT u.id FROM user u WHERE u.id > ?";
    }
 
    @Override
    public void configurePreparedStatement(PreparedStatement stmt) throws SQLException {
        stmt.setInt(1, 471);
    }
 
    @Override
    public void processRow(ResultSet resultSet) throws SQLException {
        int userId = (resultSet.getInt("id"));
        System.out.println("Found a user id greater than 471: " + userId);
    }
};
 
wrapper.runPreparedStatementCallback(callback);

This is a Hibernate tutorial, after all, so how about another callback that’s the Hibernate-level equivalent of the prepared statement callback? This one is for easily performing read-only operations on the persistent entities returned from using a Criteria. This is the callback interface:

public interface DaoCriteriaReadOnlyCallback<T> {
    Criteria getCriteria(StatelessSession session);
 
    T cast(Object o);
 
    void delegate(T dao);
}

The cast() method is simply so that delegate() need not deal with casting the Objects returned by Hibernate to the appropriate persistent class. (You could also do it with clever usage of Class#cast().) The usage example below should make it clear how this is used, but first we need the method that runs the callback.

public <T> void runCriteriaCallback(DaoCriteriaReadOnlyCallback<T> callback) throws DaoException {
    try {
        // Read only session
        final StatelessSession statelessSession = sessionFactory.openStatelessSession();
        try {
            final Transaction tx = statelessSession.beginTransaction();
            try {
                Criteria crit = callback.getCriteria(statelessSession);
                final ScrollableResults cursor = crit.scroll(ScrollMode.FORWARD_ONLY);
 
                try {
                    while (cursor.next()) {
                        callback.delegate(callback.cast(cursor.get(0)));
                    }
 
                    cursor.close();
                    tx.commit();
                    // StatelessSession#close is not idempotent, called only in finally block
                } finally {
                    DbResourceCloser.tolerantClose(cursor);
                }
            } finally {
                DbResourceCloser.tolerantDispose(tx);
            }
        } finally {
            DbResourceCloser.tolerantClose(statelessSession);
        }
    } catch (HibernateException e) {
        throw new DaoException("Could not execute hibernate callback", e);
    }
}

Here’s how such a callback could be used.

DaoCriteriaReadOnlyCallback<User> callback = new DaoCriteriaReadOnlyCallback<User>() {
    @Override
    public Criteria getCriteria(StatelessSession session) {
        return session.createCriteria(User.class)
                .add(Restrictions.eq("status", UserStatus.ACTIVE));
    }
 
    @Override
    public User cast(Object o) {
        return (User) o;
    }
 
    @Override
    public void delegate(User user) {
        System.out.println("Got a user: " + user.getId());
    }
};
 
wrapper.runCriteriaCallback(callback);
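The Class#cast() alternative mentioned earlier might look something like the following. This is only a sketch with hypothetical names (TypedDelegator, forEach), using a plain Iterable in place of Hibernate’s ScrollableResults so that it runs on its own; passing the Class token means the callback no longer needs its own cast() method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch; TypedDelegator is not part of the wrapper class in this post.
final class TypedDelegator {
    private TypedDelegator() {}

    // Cast each untyped row with Class#cast and hand it to the typed consumer.
    // Class#cast throws ClassCastException if a row is not an instance of type.
    static <T> void forEach(Iterable<?> rows, Class<T> type, Consumer<? super T> delegate) {
        for (Object row : rows) {
            delegate.accept(type.cast(row));
        }
    }

    public static void main(String[] args) {
        List<Object> rows = Arrays.<Object>asList("alice", "bob");
        List<String> seen = new ArrayList<>();
        forEach(rows, String.class, seen::add);
        System.out.println(seen);
    }
}
```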

Using these tools, it’s easy to create methods that quickly and safely perform CRUD operations, as well as to execute more sophisticated logic like the criteria-based callback. This is far from the only way to organize Hibernate code, though, so feel free to comment if you have suggestions or improvements.

Xdebug Quickstart: Profiling in PHP

Preface

There are numerous ways to evaluate the performance of a PHP script. The simplest, our good friend microtime, allows you to do targeted benchmarking of certain sections of code:

$startTime = microtime(true);
functionCall();
$timeDiff = microtime(true) - $startTime;
 
echo 'functionCall() took '. $timeDiff .' seconds';

This works extremely well for quickly testing the performance of a specific piece of code. However, this approach has a few drawbacks:

  • Lack of granularity – You only get one number, which is the total amount of time taken between the start and stop points. It reveals very little about what the code is doing that causes it to take that much time.
  • Invasiveness – In order to instrument your application, you have to change the code by inserting timing statements in all the relevant places. With a large application and little knowledge about the location of the code causing performance issues, this can clutter your codebase.
  • Verbosity – This approach requires at least two lines of code for every section of the app that you want to time.

Thankfully, there are helper classes for this, like PEAR_Benchmark. With this class, you can easily set marks at critical points of your application and get finer-grained reporting on runtime results. As you can see in the supplied example, PEAR_Benchmark is simple to use. However, it still requires that you edit your code to set the timing points. This might need dozens of individual marks inserted, only to (presumably) be removed before the code is pushed to production. Alas…

But fear not, dear reader, for there is a better way! And it is called profiling.

The Joy of Profiling

What does profiling buy you? How about:

  • Total request time
  • Time spent in every function that was hit during the request (either in absolute time or as a percentage of the request)
  • Number of times each function was called over the course of the request
  • Rainbows bursting forth from your monitor!

Okay, maybe not exactly rainbows…but look at this colorful chart! It represents the percentage of the request time spent in each function as a portion of the total graph area:

[Image: KCachegrind visualization]

This chart makes it plainly obvious where your program is spending its time. And the best part? Profiling requires absolutely no code modification! That’s right, no more timing statements sprinkled liberally throughout the code to figure out where the slowdown is. So how do you get all this juicy informational goodness?

It’s Not Just A Debugger

Enter Xdebug, a PHP extension that allows you to (among several other very useful things) generate profiling reports on your code. Xdebug is a PECL extension, which means that installing it is easy as pie (note: I am assuming a *nix system here). Just run the following on the command line:


> pecl install xdebug

This should download and install the extension for you. From here, just configure the following settings in your php.ini file:

  • Use the Xdebug extension:
    Put the following line into your php.ini file (changing out the “/your/particular/path/to/” section with the location of your xdebug.so extension):

    zend_extension=/your/particular/path/to/xdebug.so
  • Turning on the profiler:
    To optionally generate a profiler report, put the following into your php.ini after the line to include Xdebug:

    xdebug.profiler_enable_trigger = 1

    You can now trigger the profiler for an individual script run. For web requests, you can turn the profiler on by passing in the query parameter XDEBUG_PROFILE=1. For example, http://www.example.com/testScript.php?XDEBUG_PROFILE=1 would create a profile for the testScript.php file.

    To generate a profiler report for a script running on the CLI, set the environment variable XDEBUG_CONFIG to the value “profiler_enable=1”. For convenience (and because I tend to forget the exact format required), I set up the following shell alias:

    alias phpx='XDEBUG_CONFIG="profiler_enable=1" php'

    Now I just run phpx <script name> and Xdebug will create a profiler report automatically. How convenient!

    You also have the option to turn on profiling for every single PHP script execution. I generally recommend against this, as profiler report files can be extremely large (on the order of gigabytes) and generating many of them can fill up your disk in short order. That said, to profile on every script execution, use the following instead of xdebug.profiler_enable_trigger:

    xdebug.profiler_enable = 1

  • Set up a location for your profiler reports:
    You can tell Xdebug where it should put the reports it generates. The default is /tmp—I recommend that you put it somewhere with a few gigabytes free, just in case.

    xdebug.profiler_output_dir = /tmp
  • Change the naming convention for the reports:
    The name of the files generated by Xdebug is created automatically based on the xdebug.profiler_output_name string, which allows some variables. The ones I find to be most interesting are:

    • %s = script path (_home_httpd_html_test_xdebug_test_php)
    • %u = timestamp (microseconds)…format: 1179434749_642382
    • %p = pid

    Xdebug ships with a default of cachegrind.out.%p, which I don’t really like. I use the following instead:

    xdebug.profiler_output_name = cachegrind.out.%s.%u

Once you have your settings arranged to your liking, give it a go and have a look at the report it generates. Sure is a lot of text in there, huh? Now, obviously, this isn’t particularly useful on its own. You need to install a program that can read these files and visualize the data for you. A few options:

  • KCachegrind – A free KDE application that provides, in addition to the standard performance data, interesting visualizations of how much time each function took relative to the overall script run time.
  • Webgrind – A free web-based report analyzer. Webgrind is fairly simple to set up and runs on any OS.
  • MacCallGrind – A commercial, Mac-native application.
  • WinCacheGrind – A free Windows application.

Now you’re all set! You can both generate profiler reports and read them. Enjoy!

An Agile Fortnight

After attending a talk on Agile recently, a couple of other Genius folks and I became the center of attention because we have implemented the Agile process quite successfully. Most of the interest seemed to be around our day-to-day process and logistics, something that seems to be glossed over in most discussions of Agile. What follows is a rundown of what each two-week sprint looks like in Genius Engineering.

Sizing meeting

As part of the continuous agile process, we have a sizing meeting every week where our product owner gives us a quick overview of user stories that are further down the backlog. We look at enough material to last us 2-3 sprints and give the team a long term view of our direction. These meetings are valuable for keeping the team thinking ahead about what might be coming in the future in addition to giving the product owner feedback about what details the team will need to complete the stories.

Sprint Planning

Our sprints start on Monday and stretch for two weeks until Friday of the next week. The first day begins with sprint planning, where we choose what stories we are going to do for the next fortnight. Our product owner presents a list of user stories, in priority order, along with detailed acceptance criteria. The vast majority of the stories aren’t new—they were previously presented to the team at the aforementioned sizing meeting. We go through each story in the list, spending a few minutes on each one reviewing the story and acceptance criteria that were defined previously.

When the team has reviewed enough stories to keep them busy for the entire sprint plus an extra stretch goal or two, we head back to the top of the list to commit to the stories we will do for the sprint. Usually, this involves choosing stories from the prioritized list until we have accumulated enough story points to keep us busy for the whole sprint; most likely, a number right around our current velocity. Sometimes, however, a story is sized large enough that we don’t feel ready to tackle it just yet, and we’ll skip over it, grabbing some smaller stories further down the list. Occasionally, the team also decides that there isn’t enough detail in the story, or there are too many questions about how the story will be implemented. In these cases, we will put an NMI (Need More Information) story into the sprint. NMIs usually boil down to a couple of meetings amongst those who have the most knowledge of what needs to be done (product owner, users) and how it will be done (experts in affected code areas or tools) to flesh out the story.
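The selection step described above can be roughly sketched in code. This is only an illustration: the function name, the story names, and the point values are all made up, and the real decision to skip a story is a team judgment call that the sketch approximates with a simple capacity check against velocity.

```php
<?php
// Hypothetical sketch: walk the prioritized backlog and commit to stories
// until the committed points are right around the team's velocity.
function chooseStories(array $backlog, $velocity)
{
    $committed = array();
    $points = 0;

    foreach ($backlog as $story) {
        // Skip a story that is too large for the remaining capacity;
        // a smaller story further down the list may still fit.
        if ($points + $story['points'] > $velocity) {
            continue;
        }
        $committed[] = $story['name'];
        $points += $story['points'];
    }

    return array('stories' => $committed, 'points' => $points);
}

// Example backlog in priority order (names and sizes are invented).
$backlog = array(
    array('name' => 'Export report', 'points' => 8),
    array('name' => 'Rework billing', 'points' => 13),
    array('name' => 'Fix login copy', 'points' => 2),
    array('name' => 'Add search filter', 'points' => 5),
);

$plan = chooseStories($backlog, 16);
```

With a velocity of 16, the 13-point story is passed over and the two smaller stories below it are pulled in instead, which mirrors the "skip over it, grabbing some smaller stories" behavior.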

After sprint planning, we take a break and have lunch; long meetings are pretty taxing. After lunch, it’s time for task breakdown.

Task Breakdown

By committing to a story, the team is saying that they know how to complete it. If we know how to complete a story, then we ought to be able to list (nearly) all of the tasks that will need to be performed to achieve it. For each story, we get everyone who might be involved in it (not just experts in the area) to gather around our task board. Those most familiar with the story lead the discussion about how the team should go about implementing the user story; someone else notes down each task on a square Post-it note and puts it in the To Do column on the board.

Astute observers will notice that I didn’t mention quality assurance throughout the entire planning process above. That is because at Genius, unlike a lot of other organizations, QA is part of the engineering team. Our QA engineers participate in all of our planning, from Meet & Greet to task breakdown. QA being involved in a story from the beginning gives a whole lot of insight into what customers expect and how they will use what is created.

Starting Work

Once the team has chosen stories and broken out all of the tasks, we begin the real work. Whenever we open a new story, the team leads get together to make sure we have enough resources to dedicate to working on it, and that we won’t be stretched too thin—we try to swarm on stories, so as to get each one through development as quickly as possible. It depends upon the nature of the stories, but we generally have one to two stories open for every four developers.

Day to Day

Every day of the sprint, we have a daily standup meeting that takes about 10 minutes. The team gets together around our scrum board and each person answers three questions:

  • What have you done since last standup?
  • What are you doing until next standup?
  • Have you had any impediments?

We usually don’t actually ask the questions—everyone knows what to do—except as a reminder if someone forgot to answer one of them. The first question is usually answered by describing what tasks you have completed, other team members you have been working with, or impediments you have resolved for others. Looking ahead usually means telling the team what tasks you expect to complete over the next day, or at least the stories you will be involved in. Impediments are hopefully rare, and usually include accidental breakage caused by other team members or waiting on external information.

Story Flow

Stories begin with all of their tasks in the To Do column on our board; a developer picks up a task and moves it to In Progress while they work on it. When the developer has done what is necessary for the task, including writing unit tests to exercise any changed or newly added code, they move the task to Security Review. Since any missed encodings can lead to exploitable holes, we have another developer review the committed changeset for vulnerabilities. When everything is deemed OK, the task moves to Ready for QA. From there, one of our QA folks grabs the task and moves it to In QA. QA validates that the task does what it should and fulfills the applicable acceptance criteria, and writes automated Selenium tests to be added to our application test suite. Once all of the tasks for a story have made it to the Validated column, the story is done! Or at least mostly so.
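For readers who think in code, the board can be pictured as a tiny state machine. This is a sketch rather than anything we actually run: the function names are hypothetical, and the step from In QA to Validated is inferred from the description above.

```php
<?php
// Column names as they appear on the task board, in left-to-right order.
$columns = array(
    'To Do',
    'In Progress',
    'Security Review',
    'Ready for QA',
    'In QA',
    'Validated',
);

// Hypothetical helper: move a task one column to the right.
// A task already in the last column stays where it is.
function advanceTask($column, array $columns)
{
    $index = array_search($column, $columns, true);
    if ($index === false) {
        throw new InvalidArgumentException("Unknown column: $column");
    }
    return $columns[min($index + 1, count($columns) - 1)];
}

// Hypothetical helper: a story is ready for the product owner's final
// check once every one of its tasks sits in the Validated column.
function storyIsDone(array $taskColumns)
{
    foreach ($taskColumns as $column) {
        if ($column !== 'Validated') {
            return false;
        }
    }
    return true;
}
```

Calling advanceTask('Security Review', $columns) returns 'Ready for QA', and storyIsDone() only returns true when no task is left in an earlier column.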

The last thing that happens to a story is validation by the product owner. We put the large story Post-it into the In QA column and let our product owner know that he needs to give it one final check. The product owner isn’t looking to do an exhaustive examination like QA does, but simply to ensure that the user story has been implemented in a fashion that he deems appropriate. At the next scrum, the product owner tells us that the story has been validated, and the team resizes the story.

Sprint Review

The sprint review is where the team shows off the results of their work for the sprint. We schedule ours at a time convenient for the entire company. We build this software for our own sales & marketing people to both use and sell, so we invite them to come so that we can give them a detailed look at new features we’ve implemented. To encourage attendance, someone often makes treats or we stop at Costco for a case of Mexican Coke and churros.

Retrospective

The very last part of our sprint is the retrospective. We gather the whole team in the conference room to discuss things that did and didn’t go well during the sprint. The retrospective happens in a fairly agile fashion—for the first 10 minutes, everyone comes up with issues, writing them on Post-it notes and placing them in similar groups on the wall. We spend about 5 minutes summarizing those groups and letting everyone vote for the two they think are the most important. The remainder of the time is used to discuss those issues in priority order based upon that voting. In the last 5 minutes of the meeting, we choose action items in the form of something awesome, a mystery, and lessons learned, assigning someone to act upon each of those throughout the next sprint.