Example Practice of Lean Startup at Enable Quiz (Startup)

This page presents a set of example hypotheses based on a fictional company, ‘Enable Quiz’. Enable Quiz is rolling out a lightweight technical quizzing solution; for companies that hire engineers, it allows them to better screen job candidates and assess their internal talent for skills development. For more on Enable Quiz, see the example Venture Concepts page.

Drafting a Positioning Statement

Here’s the positioning statement for Enable Quiz:

For [HR and hiring managers] who [need to evaluate technical talent], [Enable Quiz] is a [talent assessment system] that [allows for quick and easy assessment of topical understanding in key engineering topics]. Unlike [formal certifications or ad hoc questions], our product [allows for lightweight but consistent assessments of technical talent].

For more on what this is and why it’s important, see the preceding section ‘How Do I Get Started?’.

Chartering a Core Value Hypothesis

Based on what they know right now, they have this formulation of their root hypothesis around the problem scenario of hiring engineers:

‘HR and functional managers are in charge of technical hires, and they struggle to effectively screen for technical skill sets, making the hiring process slower and more labor intensive and producing worse outcomes than they should reasonably expect. Currently, they rely on a patchwork of calling references and asking a few probing questions.

If Enable Quiz offers an easy, affordable, lightweight technical quizzing solution, we can acquire, retain, and monetize customers.’

Drafting MVP Ideas (for Core Value Hypothesis)

1. Concierge MVP

Enable Quiz could find a few HR managers who want to participate and develop a quiz for them by hand to screen candidates for a specific position they have open for hiring. These quizzes might be on paper or on Google Forms, etc. In either case, the Enable Quiz team should probably grade the quizzes themselves: even though it’s more work, that’s closer to the real experience they’re considering (an online quiz that would obviously grade itself).

Outcome Metrics: First, are the HR managers able to use the quizzes, and what is that like? How hard is it to take a real job description and develop a suitable quiz? Second, do the HR managers actually use it? (Note: the team will need some log or indication of how many candidates each manager screens in order to know this.) Third, after quizzes for 1-2 positions, if the managers are given a low-pressure offer to have Enable Quiz do it again (but this time paid), do they still want it? Note: this should be low-pressure since the managers might feel somewhat obliged for the free quizzes, and if they buy on that basis it would not be valid to conclude anything about the market at large.

2. Wizard of Oz MVP

I see two possibilities here. First, they could go with a product demo video and test sign-ups against it (see Sales MVP below). Second, they could build a very basic site where users sign up and configure the quizzes they want. They’d let users know it will take a few hours and build the quizzes by hand in the background. A variation on that would be to have users just submit the job description, maybe with optional tags for skills, and build against that.

Outcome Metrics: For the first, see the Sales MVP. For the second, they might look at total sign-ups, but they’d also have an opportunity to look at the choice of topics and the submitted job descriptions. If they introduced topic tagging, they’d also be able to test whether the users (who we’re assuming are HR managers) can identify the specific tech topics relevant to a given job description.

For the demo-video option, they could fake up a demo of the service and put it on a site/landing page. If they want to test motivation relative to their Value Hypothesis, they’re probably best off pairing the video with a call to action: sign-ups to an email list for product updates.

3. Sales MVP

Let’s say their current business model assumes they’re selling online. In that case they’d probably focus on testing click-throughs against Google AdWords and, from there, conversions to trial and then sale on their site. If they have enough testable ideas about the proposition, they should just go for it; but if click-through rates stay below 3% after tuning their AdWords, hand-selling to companies might be a good fallback, since that will give them more perspective on what’s working.

Outcome Metrics: Basically, they’d be looking at the customer journey that concludes with regular usage (and paying!); there’s a quick tabulation sketch after this list:
– click-through rate from Google AdWords
– conversion from landing page to trial sign-up (or plan purchase)
– creation of a quiz
– use of a quiz
– repeated use of a quiz
– conversion to paid plan after trial
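
To make those measurable, here’s a minimal sketch (in Python) of how the team might tabulate step-to-step conversion down this funnel; the stage names and counts are entirely made up for illustration, not from any real analytics tool:

```python
# Hypothetical event counts for each step of the customer journey.
funnel = [
    ("ad impressions", 50_000),
    ("ad clicks", 1_250),
    ("trial sign-ups", 40),
    ("created a quiz", 18),
    ("used a quiz", 12),
    ("used a quiz 5+ times", 7),
    ("converted to paid", 3),
]

prev = None
for stage, count in funnel:
    if prev is None:
        print(f"{stage}: {count}")
    else:
        print(f"{stage}: {count} ({count / prev:.1%} of previous step)")
    prev = count
```

The point is simply to instrument every step so that the weakest link in the journey is visible at a glance.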

Storyboarding the Customer Journey

Below is an example of what this might look like for Enable Quiz’s customer.

[Storyboard image: the AIDA(OR) customer journey for Enable Quiz]

A few additional notes:
1: There isn’t just one storyboard/customer journey. Obviously, to a degree, every customer has a different journey, so don’t try to fit everyone into one of these. Just pick a journey you think is plausible and one you’d like to test and focus on making it vivid and testable.
2: This isn’t the only angle on your assumptions. Depending on the project, there may be any number of other areas that are worth exploring. This is just one angle on the assumptions that most projects have in common.

Unpacking Your Hypotheses

The table below describes Enable Quiz’s working hypotheses and plans for experimentation and validation (or invalidation). I’ve organized these decomposed hypotheses into the categories we reviewed above: Persona Hypothesis, Problem Hypothesis, Value Hypothesis, Customer Creation Hypothesis, and Usability Hypothesis.

Here are a few notes on the column headings–

Priority: This is a measure of the importance and context of the hypothesis. While it’s important to state all your hypotheses and keep an eye on them, focus is also important. You can use any scale you want, but I like this rating scheme:
0: This is a core/summary hypothesis.
1: This is a pivotal hypothesis: if it turns out to be untrue, we need a pivot. It probably needs to be decomposed further for manageability.
2: This is a child of a hypothesis of the type above. Individually, its disproving doesn’t mean a pivot, but if all its ‘siblings’ (the other related hypotheses that tie to something of priority ‘1’) prove untrue, then it does.
3: This may be pivotal; we’re not sure yet. It’s definitely important.
4-5: Lighter dispositions of ‘3’.
6-10: This is a tactical detail related to product development or growth/promotion. Take note if it’s convenient, but we need to validate (or invalidate) the related items above it before we worry about it; if the related items are invalidated, then pursuing it could be 100% waste.

Needs Proving? Not all notable hypotheses need proving, or you may have already proven them. This column is a place to log that, along with notes on why and where a hypothesis needs proving.

Experimentation: These are quick notes and possibly external references to how the hypothesis gets proven or disproven.

Example Persona Hypotheses

Enable Quiz’s general area of interest is improving team performance by making it ultra simple to measure technical skill sets. This could apply both to screening new hires and to reviewing existing staff more systematically to see who knows what (so peers and managers can be more effective in divvying up tasks and focusing professional development).

In both cases, they believe two personas are the primary customers: ‘Helen the HR Manager’ and ‘Frank the Functional Manager’. ‘Chris the Candidate’ (being interviewed) and ‘Steve the Staff Member’ are users, but probably fairly passive ones, so it’s really Helen and Frank who primarily interest Enable Quiz (see Personas Tutorial- Examples for more on these).

These personas are a starting point, but since they’re (we think) the pivotal personas, the team at Enable Quiz will almost certainly need to re-segment them. Every individual is vastly different, so, of course, there’s always an avenue to split personas into sub-types. But why and how in this case? First and foremost, re-segmenting the personas will allow the team to identify an actionable early market where they can focus and acquire their first few ‘beachhead’ customers.

The table below describes focal hypotheses around Enable Quiz’s persona hypotheses:

| Priority | Hypothesis | Needs Proving? | Experimentation |
| 1 | The HR Manager and Functional Manager personas exist in roughly the form we’ve described, and they are collectively responsible for technical recruiting and hiring. | Not really. There may be minor variations, but the arrangement is pretty standard; across dozens of companies they haven’t seen anything much different. | – |
| 1 | We understand these personas well and what they think about technical recruiting, hiring, and skills management (in the form of think-see-feel-do). | Yes, definitely. | – Observing consistent results in discovery interviews |
| 4 | Helen is likely to be our primary buyer, Frank a possible influencer and approver. | Yes, definitely. | – Observing consistent results in discovery interviews – Trying to generate pre-sales – Asking who and how they’ve purchased (roughly) comparable services, like online recruiting services |

The output of this section is a set of validated personas, including think-see-feel-do.

Example Problem Hypotheses

Enable Quiz is generally interested in the problem space of assessing technical talent, with an emphasis on lightweight tests for tactical decision making. Their goal with the problem hypothesis is to flesh out all the material problem scenarios and then validate (or invalidate) which ones substantially exist. They also want to clearly understand their personas’ interaction with current alternatives, so they can

a) validate that they really understand the problem scenarios (if you can’t identify a current alternative, you likely don’t understand the problem)
b) better connect with customers’ current perception of the problem when they start customer creation (aka selling)
c) provide a backdrop/baseline for their value hypothesis- how much better is their solution than the alternative?

Note: It’s useful to divide up these hypothesis areas for organizational and analytical clarity, but the different areas will and should commingle. For example, the team may (hopefully) refine their personas on the basis of which sub-types have the most acute problem scenarios. That’s a big win, allowing them to more reliably succeed in their early market.

The table below describes focal hypotheses around Enable Quiz’s problem hypotheses:

| Priority | Hypothesis | Needs Proving? | Experimentation |
| 1 | Screening technical hires for skill sets is difficult, and most companies wish they could do it more effectively vs. their current alternatives. This wish is on their ‘A list’ of problems. | Yes. Sure, the problem probably exists to some degree, but how important is it? To whom? What are they doing now? | – Discovery interviews: does it come up as a problem, unprompted? – Test pre-release promotion and sign-ups for more info |
| 2 | The HR manager perceives this as important. | Yes, definitely. | (ditto) |
| 2 | The hiring manager perceives this as important. | Yes, definitely. | (ditto) |
| 2 | This is frequently relevant in the areas of {IP networking, Linux sysadmin, Microsoft sysadmin, Java, PHP, Ruby, .NET, QA, devops, development management}. | Yes, definitely: it’s important to prioritize which areas are hottest and which (if any) are most frequently relevant to the early market. Some topics may also lend themselves to this type of test better than others. | – Discovery interviews: check topics – Google Trends & Keyword Planner (search trends, keyword value, monthly searches on ‘hire ruby developer’, etc.) |
| 1 | Quizzing existing staff to understand who currently knows what would be useful and actionable, and managers wish they could do it more effectively vs. alternatives. This is on their A-list. | Yes, definitely. | – Discovery interviews: does it come up as a problem, unprompted? – Test pre-release promotion and sign-ups for more info |
| 2 | The above is useful because it would help match team members and tasks more effectively. | Yes, definitely. | (ditto) |
| 2 | The above is useful because it would help with intra-team learning and professional development. | Yes, definitely. | (ditto) |
| 2 | The current staff would find such a quiz at worst a benign admin task and at best a fun, friendly, competitive diversion; they won’t find it too judge-y or top-down. | Yes, definitely. | – Ditto, but with technical staff (‘Steve the Staff Member’) more so than managers |

The output of this section is a set of validated problem scenarios, including a careful description of the current alternatives.

Example Value Hypotheses

So, does anybody want some?

Since Enable Quiz is a synthetic company, let’s take the liberty of supposing that in their customer discovery they found the problem scenario around screening potential new hires to be the most acute. The current alternatives were calling cagey references and asking a few probing questions while trying not to be a jerk or take up the whole interview.

Now the question is whether a lightweight quizzing solution would deliver on that problem scenario and exceed the alternative enough to fuel sales and reliable customer creation.

The tables below describe some of Enable Quiz’s more detailed hypotheses.

Attention, Interest, Desire

| Priority | Hypothesis | Needs Proving? | Experimentation |
| 3 | If we run a program of Google AdWords ads, we’ll find ads that have a click-through rate (CTR) >2%. | Yes. | Here is one example ad: Headline 1: ‘Ruby’, ‘Devops’, ‘Golang’…HUH? / Headline 2: Are you a technical recruiter? / Description: We make screening engineering candidates easier through a simple online tool. See [doc xyz or your Google AdWords account] for other ads. [No need to include this for your assignment; it’s just an example of how this template might relate to the rest of your project.] |
| 3 | If we get HR managers to a landing page with a demo, 10% will click through to another page for more information. | Yes. | We’ll probably have a few landing page variants for the different types of AdWords ads. |

Action, Onboarding, Retention

| Priority | Hypothesis | Needs Proving? | Experimentation |
| 3 | If visitors click through from the landing page to another page, then at least 2% of them will convert to a free trial. | Yes. | |
| 3 | If users are on a free trial, 40% of trial accounts will create at least one quiz. | Yes. | |
| 3 | If we successfully onboard the HR manager with a relevant quiz for an open position, they will use the quizzes for all the candidates they interview. | Yes. | Execute a manual ‘concierge’ experiment on the quiz process |
| 3 | If users create a new quiz, at least 30% of them will use their quiz at least five times. | Yes. | |
| 3 | If the HR manager uses the screening quiz with all candidates for a given position, [x]% fewer unqualified candidates will make their way to the functional manager. | Yes. | Execute a manual ‘concierge’ experiment on the quiz process |
| 3 | If users use their quiz at least five times, 20% of them will convert from a free trial to a paid plan. | Yes. | |
| 3 | If we offer the service at [x] price with [y] supplemental assistance, companies that hire a lot of engineers will pay [z]. | Yes. | Try pre-release sales; pair with the conclusion of the concierge test |
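
One useful sanity check on the two tables above is to multiply the hypothesized rates together and see what the end-to-end funnel implies. Here’s a minimal sketch, assuming each hypothesis holds at exactly its stated threshold (all rates come from the tables; nothing here is a measured result):

```python
# Hypothesized per-step conversion rates from the AIDA(OR) tables above.
stages = {
    "ad click-through (CTR)": 0.02,
    "landing page click-through": 0.10,
    "conversion to free trial": 0.02,
    "creates at least one quiz": 0.40,
    "uses their quiz 5+ times": 0.30,
    "converts to a paid plan": 0.20,
}

remaining = 1.0
for stage, rate in stages.items():
    remaining *= rate
    print(f"after {stage}: {remaining:.6%} of ad impressions remain")
```

At these thresholds, roughly one paying customer emerges per million ad impressions, which is exactly the kind of implication worth confronting before committing spend to the channel.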

Example Customer Creation Hypotheses

Once Enable Quiz knows who they’re selling to, what problem they’re (really) solving, and how to deliver value against that problem, they need a way to economically and repeatably connect with that demand.

And this may involve a couple of different recipes, particularly as they transition from an early market of enthusiasts to wider distribution in the larger population of more pragmatic buyers (followers).

The table below describes focal hypotheses around Enable Quiz’s customer creation hypotheses. Notice that they’re looking at two channels, 1) direct sales and 2) online advertising, and they’ve organized the hypotheses around those.

Each new channel begins with a priority 1 hypothesis with supplemental priority 2 items breaking it down across the AIDA framework.

| Priority | Hypothesis | Needs Proving? | Experimentation |
| 1 | Channel: Direct Sales. Enable Quiz can connect with demand economically through direct sales. | Yes. | – See below |
| 2 | Attention: If we give them a contact list, our salespeople can call on [x] qualified customers/day. | Yes. | – Test and measure a limited set of sales activity (probably starting with a founder/senior person) |
| 2 | Interest: If a salesperson calls on [x] qualified leads, they can schedule [y] meetings. | Yes. | – Ditto |
| 2 | Desire: If a salesperson gets a meeting, they will see follow-up from the customer [x]% of the time. | Yes. | – Ditto |
| 2 | Action: If we approach [x] qualified leads, [y] will close for a paid offer over $[z]. | Yes. | – Ditto |
| 1 | Channel: Online Advertising. If we market to HR managers through an AdWords campaign, we’ll achieve a cost per acquisition (of a free trial) of less than $[x]. | Yes. | – Run an initial AdWords test |
| 2 | Attention: If we run relevant AdWords ads, we’ll get the attention of HR managers. | Yes. | – Click-through rates of at least [x]% |
| 2 | Interest: If we get the customer to a landing page with a demo, we can capture their interest. | Yes. | – Achieve a bounce rate of less than [x]% on a winning variation of the landing page |
| 2 | Desire: If we are able to show the demo page to an HR manager, we’ll see desire. | Yes. | – (This one is tough to observe through this channel. You could use the adjacent items, landing page bounce rate and sign-up conversion rate, as a proxy. Other channels and interactions, like direct sales and social, are much better for this.) |
| 2 | Action: If we get HR managers to a landing page with a demo, [x]% will sign up for [our email product announcements, a free trial]. | Yes. | – Achieve [x]% conversion rate for the objective in question |
| 2 | Onboarding: If HR managers sign up for a free trial, at least [x]% will create a quiz. | Yes. | – Observe this in in-app analytics |
| 2 | Retention: Only [y] portion of customers will require a support call; the rest will onboard using the online help. | Yes. | – Ditto |
| 1 | Retention: [x] portion of customers will renew/re-purchase. | Yes. | – Observe this in in-app or external analytics |

The output of this section is a repeatable, economical recipe for customer creation.
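
For the priority-1 channel hypotheses above, the underlying arithmetic is simple but worth writing down. Here’s a hedged sketch: the spend figures, acquisition counts, and the $200 threshold are invented placeholders standing in for the [x]’s in the table:

```python
def cost_per_acquisition(spend: float, acquisitions: int) -> float:
    """CPA: total channel spend divided by customers (or trials) won."""
    return spend / acquisitions if acquisitions else float("inf")

# Placeholder figures for the two channels under test.
channels = {
    "Online Advertising (AdWords)": cost_per_acquisition(1_600.0, 12),
    "Direct Sales (salesperson time)": cost_per_acquisition(4_000.0, 5),
}

THRESHOLD = 200.0  # stand-in for the $[x] in the hypothesis
for name, cpa in channels.items():
    verdict = "validates" if cpa < THRESHOLD else "invalidates"
    print(f"{name}: CPA ${cpa:.2f} -> {verdict} the channel hypothesis")
```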

Example Usability Hypotheses

Let’s assume Enable Quiz validates their key hypotheses and decides to ‘persevere’, building a working minimum viable product. Just to over-communicate on this critical point: it’s not a foregone conclusion that they get to this point with the idea as it is, and they should definitely not move to this point before they

  1. validate their persona, problem, and value hypotheses
  2. in many cases, validate their customer creation hypothesis (especially if their business model involves selling over the web)
  3. exhaust productive opportunities to field-test the whole proposition with non-software/non-product MVPs

One other prefacing note for clarity: this material is more ‘lean’ in general than specifically Lean Startup. The practice of Lean Startup is about fundamental, strategic validation of new ideas. The material here is more tactical: while none of these items individually is likely to make the venture sink or float, it is highly productive to list out key hypotheses about how customers will interact with the product and then figure out how to validate (or invalidate) them as quickly, cheaply, and, above all, as early as possible.

The “MTP”

When, and only when, I’m building an actual product, I like to start with a ‘Minimum Testable Product’. The idea is to take everything you think is questionable about the user interface and artificially motivate users to test it before you put it in front of your real users. While you don’t want to paralyze yourself with analysis, done right this is relatively quick and easy, and it lets you unbundle and debug the risk that users won’t understand how to use your product/system from the risk that they aren’t fundamentally motivated to use it to solve their problem.

The ‘minimal’ part means that the application doesn’t actually have to be ‘working’; the responses can be static/hard-coded. That’s workable because you’re artificially motivating/directing the user to attempt something specific, so you can see whether they get how to do it in your product. In UI (user interface) talk, this is also known as a ‘Wizard of Oz’ prototype.
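
To make that concrete, here’s a minimal sketch of what such a stub might look like for Enable Quiz, using only Python’s standard library. The endpoints, quiz copy, and flow are invented for illustration, and the ‘grading’ is deliberately fake, with a human doing the real work offline:

```python
# Wizard of Oz stub: the pages render, but nothing behind them works.
from http.server import BaseHTTPRequestHandler, HTTPServer

QUIZ_PAGE = b"""<html><body>
<h1>Ruby Screening Quiz (demo)</h1>
<form action="/submit" method="post">
  <p>1. What does attr_accessor generate?</p>
  <input name="q1"><br>
  <button type="submit">Submit answers</button>
</form></body></html>"""

# Hard-coded response; in reality a founder grades submissions by hand.
RESULT_PAGE = b"""<html><body>
<p>Thanks! Your results will be emailed within a few hours.</p>
</body></html>"""

class StubHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self._respond(QUIZ_PAGE)

    def do_POST(self):
        # Read and discard the body; a human reviews real submissions later.
        self.rfile.read(int(self.headers.get("Content-Length", 0)))
        self._respond(RESULT_PAGE)

    def _respond(self, body):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), StubHandler).serve_forever()
```

A test user pointed at localhost:8000 experiences something close to the real flow while the team watches where they get stuck; whether users understand the interface gets debugged separately from whether they want the product at all.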

A specific example: I recently built brandlattice with some collaborators. It’s a web app that involves a lot of drag and drop, something most users still aren’t expecting from a web app, so we knew we had a substantial risk that users just wouldn’t get that part and would get stuck. So we tested it and, lo and behold, they got stuck. We had a toolbox of fixes to help them along, which we then tested in ascending order of disruptiveness to the overall experience. We tested a few and validated one that had a high success rate (and, fortunately, was minimally disruptive).

Without going into the details of Enable Quiz’s possible user interface, let’s say that they see the HR manager’s progression to creating an actual quiz from the available technical topics as a big possible hurdle. They’ve discussed (argued about?) a few approaches, and naturally they’d like to go with the simplest. So they build a simple front-end prototype and formulate the hypothesis & experiment something like this:

| Priority | Hypothesis | Needs Proving? | Experimentation |
| 1 | If we provide them a self-service interface, the HR manager will be able to create the quizzes based on the available job descriptions. | Yes. | – Usability test with interactive prototype |
| 2 | The HR manager persona will understand the quiz creation process as presented and be able to complete it at least 90% of the time. | Yes. | (see above) |

Again, only undertake this step if you’re good on items 1-3 above, and if you feel like you’re mired in user testing, just move on; the main goal is to validate or invalidate your fundamental proposition.

Concierge Experiment: Enable Quiz


What hypothesis will this test?
This MVP will test our high-level Value Hypothesis: if Enable Quiz offers companies that hire engineers lightweight technical quizzes that screen job candidates for engineering positions, then these companies will trial, use, adopt, and pay for such a service.

How will we test it?
We’ll start with custom-built quizzes on Google Forms to assess the basic value of the product to the HR manager. We have recruited five HR managers from our customer discovery work who have agreed to participate. Each has 1-2 open positions for which we have custom-designed screening quizzes based on the specifics of the position.

The quizzes have been made available to the HR managers, and we’ve finished day-0/usability testing to validate that they know how to administer the quizzes and find the scores (which we post to a Google Doc for them after grading by hand).

What is/are the pivotal metric(s)? What is the threshold for true (validated) vs. false (invalidated)?
Unpacking our high-level hypothesis, we’d like to test:

1: If we create position-specific quizzes for HR managers, they’ll use them ~100% of the time and, after two positions, be willing to pay. For this, our metric is [quizzes administered]/[candidates interviewed]. We’ll measure [quizzes administered] by the number of position-specific quiz forms we receive. We’ve added a checkbox for ‘this is a test’ to make it easier to discard junk forms, and there’s a name/initials field we use to correlate back to the interviews. We screened the HR managers to make sure they keep systematic calendars of the interviews they do, so even if they don’t track the count of candidates, we can work with them after the fact to check it. Our target threshold is 90%. Given all the hand-held setup we’re providing, if the quiz isn’t compelling enough that these HR managers use it for most job candidates, we’ll likely need a substantial pivot.

2: If the HR managers use the quiz, they’ll send through 25% fewer candidates. For this, our metric is a comparison of the portion of candidates screened out by the functional manager: baseline vs. with the quiz. This test will be an approximation. Based on interviews with both HR & functional managers, around ⅔ of candidates are screened out by the functional manager for some material deficit in skill set. We’ve provided a working Google Doc for HR managers to use in post-mortems where they don’t already have something like it. We’ll check in with them weekly to (gently) keep this form up to date, but we expect only moderate upkeep. We’d like to see the ratio of candidates screened out drop to roughly ⅓. This may be aggressive, particularly since we’ve erred on the side of ‘easier’ quizzes to avoid false positives (incorrectly screening out candidates with a possibly adequate skill set).

3: If we offer the service at [x] price with [y] supplemental assistance, companies that hire a lot of engineers will pay [z]. We will measure this by our ability to sell a package where we charge $100 for a subsequent custom-created quiz. We believe this is a better test than a pre-pay for the service, since a transaction of a few hundred dollars would likely be difficult/not adequately compelling for an HR manager to sell internally. We’d like to see at least 50% of the subjects opt for a subsequent quiz, assuming success on the above two tests.

What will you do next if the result is true? False?
If all three tests validate, we will proceed with a 1.0 of the Enable Quiz software, limited to just a few specific topics (see the experiment below for decision-making on that).

If only tests 1 & 2 pass, we will consider the circumstances and reasons for that and review price point, purchaser, and, likely, the actual value proposition itself.

If no tests pass, we will step back and consider a) whether a different take on the value proposition might be relevant and b) whether the problem is truly important.

How much time and money will it take to set up?
Based on the current 5 technical topics, we estimate that total setup for all 5 subjects will involve:
– 20 hours of work by our product lead to set up, user-test, and document (for users) the quiz infrastructure on Google Forms
– 40 hours of work by our technical lead to formulate and validate (with subjects) the quiz questions across the 5 topics

Roughly, what will it take for each individual test?
For each of the 5 subjects, we think it will take our product lead:
– 3 hours for initial Q&A and onboarding (including travel, etc.)
– 3 hours across the quizzing to answer misc. questions
– 3.5 hours to grade the quizzes (assuming 20 quizzes/position)
– plus 1 hour of misc. follow-up by our technical lead

Roughly, how long will it take for each test to run and produce definitive, actionable results?
The interview cycle runs 3-5 weeks, after which we expect to have a full set of results on hand.
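
Since all three tests reduce to simple ratios compared against thresholds, the evaluation itself is easy to script. Here’s a sketch with hypothetical counts (none of these figures are real results):

```python
# Test 1: quizzes administered per candidate interviewed (target >= 90%).
usage_rate = 46 / 50  # quiz forms received / candidates interviewed
print(f"Test 1: {usage_rate:.0%} usage -> {'pass' if usage_rate >= 0.90 else 'fail'}")

# Test 2: portion screened out by the functional manager should fall
# from the ~2/3 baseline to roughly 1/3 with the quiz in place.
screen_out_rate = 11 / 36  # screened out / sent through, with the quiz
print(f"Test 2: {screen_out_rate:.0%} screened out (baseline ~67%) -> "
      f"{'pass' if screen_out_rate <= 1 / 3 else 'fail'}")

# Test 3: at least 50% of the five subjects buy a $100 follow-on quiz.
purchase_rate = 3 / 5
print(f"Test 3: {purchase_rate:.0%} purchased -> "
      f"{'pass' if purchase_rate >= 0.50 else 'fail'}")
```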

AdWords Experiment: Enable Quiz

What hypothesis will this test?
This MVP will test our hypothesis about which technical topics are most promising for our hypothetical 1.0. There are many to choose from, and our intuition is that the right topics will a) be popular/in demand with employers, b) overlap with the market we can reach, and c) be affordable with regard to keyword phrases.

How will we test it?
We have assembled a list of popular topics and workable keyword phrases (‘hire [Ruby] developer’, etc.) and plan to run comparative Google AdWords campaigns to determine the 10 most promising topics.

What is/are the pivotal metric(s)? What is the threshold for true (validated) vs. false (invalidated)?
The pivotal metrics here are:

1: Absolute click-through rate (CTR). After a few iterations, we’d like to see a CTR of 2% on any topic we consider. Below this, we’re not sure our Customer Creation Hypothesis holds together. We’d like to see at least ~100 impressions on each iteration, with an estimate of 2 iterations/topic (this is a blend, since we’re planning to use similar patterns across topics).

2: Comparative CTR. Beyond the 2% floor, we’ll initially rank topics by CTR.

What will you do next if the result is true? False?
If true, we will pursue a 1.0 of the product with the top 10 topics.

If false, in that none of the CTRs are >2% after we feel we’ve tested a reasonable set of alternative keywords and ad + landing page combos, then we’ll a) revise our Customer Creation Hypothesis and consider alternative channels, and b) pursue an alternative assessment strategy (for example, looking at job postings from target customers).

How much time and money will it take to set up?
Setting up and tuning the campaign (including AdWords & landing page creation and iteration) will take:
– 20 hours by our product lead
– 20 hours by our ‘growth hacking’/marketing contractor, costing $1,600

Roughly, what will it take for each individual test?
The above includes both setup and our estimate of tuning; after that, we should have a usable set of results.

Roughly, how long will it take for each test to run and produce definitive, actionable results?
Based on the search frequency of our preliminary keywords and the need to iterate, we think we’ll need 10 days for each test to run.
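
The topic-ranking step described above is also easy to script once campaign data is in hand. Here’s a sketch with invented impression and click counts (a real run would cover many more topics):

```python
# Hypothetical per-topic campaign results: topic -> (impressions, clicks).
campaigns = {
    "Ruby": (2_400, 62),
    "devops": (1_900, 51),
    "Linux sysadmin": (2_100, 35),
    "PHP": (2_600, 41),
    "QA": (1_500, 19),
}

MIN_IMPRESSIONS = 100  # per the plan: ~100 impressions per iteration
CTR_FLOOR = 0.02       # the 2% threshold from the metrics above

ranked = sorted(
    ((clicks / imps, topic)
     for topic, (imps, clicks) in campaigns.items()
     if imps >= MIN_IMPRESSIONS),
    reverse=True,
)

for ctr, topic in ranked:
    status = "candidate for 1.0" if ctr >= CTR_FLOOR else "below 2% floor"
    print(f"{topic}: CTR {ctr:.2%} ({status})")
```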