Four Levels of Training Evaluation By James Kirkpatrick Book Summary

Kirkpatrick’s Four Levels of Training Evaluation, by James D. Kirkpatrick and Wendy Kayser Kirkpatrick


Award-winning authors and advisers Jim and Wendy Kirkpatrick honor Jim’s father, Don Kirkpatrick, whose work in the 1950s inspired a renowned method for training evaluation: the Kirkpatrick Four Levels. The authors address changes in workplace culture since the method’s inception, while explaining why the fundamentals – definitions of the levels and their application – remain the same. With examples and case studies, the authors emphasize the right way to utilize the model. They provide indispensable clarification for those already familiar with the method and the best place for beginners to start.


  • For decades, users have misunderstood and misapplied Kirkpatrick’s evaluation model.
  • Worldwide, organizational leaders express dissatisfaction with the results of training. 
  • Apply the Kirkpatrick Four Levels in reverse order.
  • Devise an evaluation strategy and plan.
  • Collect data and evaluate your program during and after its execution.
  • Define training success factors with your stakeholders. Never assume you know what stakeholders want and value most.
  • Design your measurement instruments and use the data to improve your programs.

Four Levels of Training Evaluation Book Summary

For decades, users have misunderstood and misapplied Kirkpatrick’s evaluation model.

In the late 1950s, PhD candidate Don Kirkpatrick devised and published four means of evaluating the returns on training. Those four levels of evaluation – reaction, learning, behavior and results – became the global training industry’s standard. Those using Kirkpatrick’s work have misunderstood and misapplied it for decades, however, resulting in the inability of most organizations to recognize the value of training and development programs.

“We have observed that many training professionals say they are ‘using Kirkpatrick,’ yet are following dated practices that are failing to create and demonstrate organizational value with their training.”

Don Kirkpatrick’s son and daughter-in-law, Jim and Wendy Kirkpatrick, devised the “New World Kirkpatrick Model (NWKM),” which honors Kirkpatrick’s intention that training evaluations focus on behavior change and results.

The NWKM urges you to start at Level 4, results, and work backward. It emphasizes starting measurement on day one and continuing to measure throughout the program’s execution. From the outset, your evaluation plan should specify the data you need, how you’ll collect it and how you’ll measure training effectiveness and results.

Worldwide, organizational leaders express dissatisfaction with the results of training. 

Leaders want to know whether participants liked the program or learned anything – Levels 1 and 2. But they need to know whether the program changed behavior – Level 3 – and improved crucial business results – Level 4.

“The degree to which a program is evaluated should match its importance or cost to the organization.”

Proper training evaluation produces insight into improving learning programs and extending the benefits to more learners; it also demonstrates returns on training costs. Conducting a four-level evaluation takes time, effort and money, but the process is clear and doable. Evaluate training relative to its importance to organizational success.

Apply the Kirkpatrick Four Levels in reverse order.

Apply each of the levels. Don’t, for example, attempt to evaluate at Level 4 if you have only collected data for Levels 1 and 2.

  • “Level 4: Results” – The identified business benefit derives from the training and related follow-up activities and support. Identify a mission-level objective, such as providing an industry-leading level of customer service. Then, work backward to determine the metrics that drive that high-level outcome – for example, customer satisfaction survey results, customer retention and the like.

“Agreement surrounding leading indicators at the beginning of a project eliminates the need to later attempt to prove the value of the initiative.”

  • “Level 3: Behavior” – Identify the crucial behaviors learners must adopt to achieve the measures that drive the outcomes in Level 4. Determine what learners need to exhibit those behaviors – such as supervisory support, incentives, coaching or equipment – and turn them into metrics. Post-training, learners require reinforcement of the concepts learned to apply them in their work. With reinforcement, expect that up to 85% of your learners will use what they learned on the job. Accountability and recognition for continuing to practice learning at work factor enormously into the ultimate success of training – driving Level 4 results.
  • “Level 2: Learning” – This level covers the extent to which participants learn what the training intends – including new knowledge, abilities, and the mind-set or determination to apply the new learning. Learners should finish the program knowing the subject matter and with the skills to apply it. They should believe in the program’s value and commit to using it at work. Knowledge and skills transfer rarely proves the main obstacle to application. More often, learners choose not to apply what they learn because they don’t believe the change will lead to improvements, or because their work environment might discourage or disincentivize behavior change.

“Level 1 is often overthought and overdone; it’s a waste of time and resources that could be put to more benefit at other levels.”

  • “Level 1: Reaction” – Participant satisfaction, engagement with the training and perception of its applicability all aid learning. Understandably, most organizations evaluate most of their programs at Level 1. Unfortunately, too many focus so heavily on participant satisfaction that they fail to measure whether participants use the learning, which matters more. Organizations evaluate only about one-third of live training sessions and less than 20% of online training to Level 3. Failure to measure Levels 3 and 4 from the start impedes organizations from adjusting and improving their training programs.

Devise an evaluation strategy and plan.

Stick with satisfaction surveys and, perhaps, Level 2 measures to evaluate non-mission critical training. Where a learning program aims to affect key organizational objectives, employ a full evaluation. Design your program to include the essential ingredients of Levels 3 and 4 success from the start.

Start your planning at Level 4. Determine the organizational objectives the training addresses. With your leaders and stakeholders, agree on what constitutes success – their “Return on Expectations (ROE)”. These expectations then become your critical success factors. In your planning, focus on building the links between Levels 2 and 3. If you achieve sound Level 1 results, expect to achieve learning – Level 2 – and if participants use the learning on the job to achieve the drivers you identified, positive Level 4 results will follow. For most organizations and L&D professionals, the Level 2 to Level 3 bridge proves hardest to build.

“Ongoing access to participants is a key item to negotiate prior to a program, and you need to insist on the importance of it for the success of the program.”

The more accurately you identify what participants need to do on the job – critical success factors – to achieve the Level 4 outcome, the better. Work closely with line managers, supervisors, high performers, and others, while garnering their much-needed support for reinforcement post-program. Consider whether training is likely to meet the challenges and objectives sought by stakeholders. Other initiatives – from revised incentives to town hall meetings – might suit the purpose better. Take the perspective of a “performance consultant” rather than an L&D professional when assessing the appropriateness of training interventions. Where training proves the best intervention, but obstacles to Level 3 application exist, insist on the removal of barriers prior to program commencement.

As you plan, and before your program begins, know the essential data you need and how to collect it, geared to each level. Strive to blend levels as well. For example, if you survey participants for satisfaction at Level 1, insert questions that inform other levels, such as by asking participants how they plan to apply their learning at work – Level 2; how they actually applied the learning – Level 3; and the results – Level 4. Aim to capture qualitative and quantitative data to adjust and improve the program.

“Training participants should arrive at the training event with a positive, clear message as to the purpose of the training and what they are expected to do as a result of participating.”

Ensure that your stakeholders – leaders and managers – prepare participants by communicating the importance of the training and their expectation that learners will apply their new knowledge and skills when they return. During the program, instructors should reinforce that message and continuously link learning to the work. After the training, managers must stay involved to reassert their enthusiasm and expectations for results. Instructors should follow up to see what learners use on the job, how they use the learning and the obstacles to application. Analysis of these results reveals whether the program is on target. With stakeholders and graduates, instructors share the results and determine adjustments.

Collect data and evaluate your program during and after its execution.

At Level 1, observe participants. If they seem disengaged, pause, discuss any issues, then make immediate changes. In addition to surveys to gauge post-program satisfaction – ideally distributed at the end of a class/course – wait a short period of time before talking to participants to capture deeper details.

“The New World approach to evaluating at Level 2 is to incorporate a variety of activities into the training that inherently test participant knowledge.”

Integrate Level 2 evaluation – knowledge, skill, attitude, confidence and commitment – into your program. Gauge the degree of proficiency participants exhibit during exercises. Insert quizzes, discussions, role plays or simulations to help you assess knowledge and skills gains. Ask participants to present the results of small group discussions or teach elements of the training itself. Have them devise plans for how they will use the learning once back on the job. Observe participant attitudes to discern whether they believe the learning is relevant, and seem likely to use the learning. Where appropriate and possible, use pre-program and post-program tests to assess knowledge gains. Make note of learning gains to report them to stakeholders.
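
As a rough illustration of the pre-program and post-program comparison mentioned above, the following Python sketch computes each participant’s score gain and the group average. The participant names, the scores and the function names are hypothetical, and the book does not prescribe this calculation; treat it as one simple way to summarize test results for stakeholders.

```python
from statistics import mean


def knowledge_gains(pre_scores, post_scores):
    """Return post-minus-pre gains for participants who took both tests."""
    shared = pre_scores.keys() & post_scores.keys()
    return {name: post_scores[name] - pre_scores[name] for name in sorted(shared)}


def average_gain(pre_scores, post_scores):
    """Return the mean gain across participants with both scores."""
    gains = knowledge_gains(pre_scores, post_scores)
    if not gains:
        raise ValueError("no participant has both a pre- and post-program score")
    return mean(gains.values())


# Hypothetical test scores out of 100 (illustrative data only).
pre = {"Ana": 55, "Ben": 70, "Chi": 60}
post = {"Ana": 80, "Ben": 85, "Chi": 75, "Dev": 90}  # Dev took no pre-test

gains = knowledge_gains(pre, post)   # Dev is skipped: no baseline to compare
avg = average_gain(pre, post)
print(gains)
print(round(avg, 1))
```

Participants without a baseline score are left out rather than guessed at, so the reported gain reflects only learners measured both times.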

“When managers support training and learners, it works. When they don’t, it does not.”

Pay most attention to Level 3. Work closely with stakeholders to select and define the measurable success factors that drive your Level 4 outcome, and to craft a plan for encouraging and monitoring them post-program. Avoid training jargon and earn support of managers at this stage; otherwise, you cannot effectively evaluate Level 3 and 4 results. 

Define training success factors with your stakeholders. Never assume you know what stakeholders want and value most.

Your success factors should lend themselves to clear measurement. Rather than stating, for example, “Participants will coach all of their team members,” define it further; for example, “Participants will conduct 30, 60 and 90 day, 30-minute coaching sessions with each of their reports following the training.” Ensure learners know exactly what you and their managers expect them to do back at work. Wait an appropriate amount of time for new behaviors to take hold before you measure.
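
To show why a precisely worded success factor is easier to measure, here is a minimal Python sketch that tracks the coaching example above: one session with each report at the 30-, 60- and 90-day checkpoints. The class name, field names and sample data are hypothetical, and this is only one possible way to record such a factor, not a method taken from the book.

```python
from dataclasses import dataclass

# Checkpoints taken from the example success factor in the text.
CHECKPOINT_DAYS = (30, 60, 90)


@dataclass
class CoachingLog:
    """Hypothetical record of one participant's post-training coaching."""
    participant: str
    reports: list          # names of the participant's direct reports
    completed: set         # (report name, checkpoint day) pairs already held

    def completion_rate(self) -> float:
        """Share of expected coaching sessions that were actually held."""
        expected = {(r, d) for r in self.reports for d in CHECKPOINT_DAYS}
        if not expected:
            return 0.0
        return len(self.completed & expected) / len(expected)


# Illustrative data: two reports, so six sessions are expected in total.
log = CoachingLog(
    participant="Ana",
    reports=["Raj", "Sue"],
    completed={("Raj", 30), ("Raj", 60), ("Sue", 30)},
)
rate = log.completion_rate()
print(f"{log.participant}: {rate:.0%} of expected coaching sessions held")
```

A vague factor such as “coach all team members” gives no denominator to count against; the specific version yields a clear number of expected sessions per participant.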

Though you should place the greatest investment of time in Level 3 evaluation, Level 4 provides the only reason you evaluate. Training only makes sense if it delivers results – a return on investment (ROI) and ROE. Progress against Level 3 success factors helps you gauge and report progress against Level 4 outcomes.

Design your measurement instruments and use the data to improve your programs.

Don’t overcomplicate your evaluations. In surveys, maximize your response rate and quality of responses by asking only questions that yield information crucial to your evaluation. Use well-labeled response scales appropriate to the questions. Review your surveys from the perspective of the busy learner and revise accordingly. Apply the same standard to all data by focusing on what matters – signal – and ignoring the data and measures you don’t need – noise.

“Continuous improvement requires continuous evaluation.”

Work with stakeholders to confirm whether progress made toward the outcome meets expectations. If not, make adjustments before the program finishes. Determine which learners – and to what extent – apply the learning on the job. If Level 3 success factors look good, learn why, so you can replicate that success.

On average, your training program will likely return mediocre results. Level 4 results rely on a myriad of factors beyond training, so you’ll never know for sure the contribution training made. Find the learners who achieved the best results, interview them and work to extend their success to the remaining learners and those who follow.

About the Authors

James D. Kirkpatrick

Award-winning authors and advisers Jim and Wendy Kirkpatrick lead Kirkpatrick Partners, where they consult to large corporate, government and military clients.

Wendy Kayser Kirkpatrick
