November 2017 SEGway

Research and Assessment News from SEG Measurement  

31 Pheasant Run
New Hope Pennsylvania 18938

Happy Thanksgiving from all of us at SEG!  

We have several interesting articles this month. We hope you will take some time to stop eating and escape the relatives to read them over this long holiday weekend. When you return from your holiday, please give us a call and find out how we can help you get the answers you need and increase your sales.

And, take a look at our website at, to see the latest developments in the field. Email me at with any questions about assessment or effectiveness research. I always look forward to hearing from you.

Scott Elliot
SEG Measurement 

Five Important Things to Look for When Reading Efficacy Research
How do you know the research is any good?
The educational community at large and specifically funding sources (e.g., government, foundations, ESSA) are insisting on scientifically-based research to support the products and services used in schools.  But, how do you know if the research examining the product or service you are considering is any good?  Ultimately, you need to rely on the review of others in the know through the peer review process.  But, as with most purchasing these days, it pays to be an educated consumer.  So, here are five important things to look for when reviewing effectiveness research.
Mistaking testimonials and case studies for efficacy research-
Educational publishers and technology providers often include testimonials from happy users, or case studies of successful implementations, in their product literature.  While feedback from colleagues who like the product and are successful with it is helpful, testimonials and case studies are not the same thing as scientific effectiveness research.  They are based on a single point of view or instance of use, and you are likely hearing only one side of the story. Only a properly designed and carefully executed study, with a sufficiently large number of participants, comparing users of the product or service to a similar or randomly assigned group of non-users, can provide the evidence you need to be comfortable that the product effectively achieves the desired outcomes.

Research should be conducted by an independent research organization-

It is difficult to judge your own work objectively; the same is true for product and service providers. To help guard against confirmation bias--the tendency to want to prove your own position--an effectiveness research study should be conducted by a credible, third-party research organization.


Inclusion of a control group-

Research that reports only on a group that used the product or service, without reference to a comparison group that did not, is very limited.  Without a comparison group, we cannot know whether any effect shown (e.g., academic growth) is really due to the use of the product or service.  Sure, the group using the product may have seen academic growth, but if that growth is no greater than the growth seen among non-users, the product may not actually be contributing anything.
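To make the point concrete, here is a minimal sketch in Python. The scores are invented purely for illustration, and the helper name `mean_gain` is our own; a real study would also need to show that the two groups were comparable at the start.

```python
from statistics import mean

def mean_gain(pre, post):
    """Average post-minus-pre gain for one group of students."""
    return mean(after - before for before, after in zip(pre, post))

# Hypothetical pre/post test scores (illustrative only, not real data).
users_pre, users_post = [52, 60, 47, 55, 63], [61, 70, 55, 66, 71]
nonusers_pre, nonusers_post = [51, 59, 48, 56, 62], [59, 68, 56, 65, 70]

users_gain = mean_gain(users_pre, users_post)
nonusers_gain = mean_gain(nonusers_pre, nonusers_post)

# The gain attributable to the product is the difference between the groups,
# not the raw growth of the users alone.
net_gain = users_gain - nonusers_gain

print(f"Users gained {users_gain:.1f} points")
print(f"Non-users gained {nonusers_gain:.1f} points")
print(f"Net gain over non-users: {net_gain:.1f} points")
```

In this made-up example the users grew by about nine points, which sounds impressive on its own, yet the non-users grew almost as much. Only the comparison reveals how little of the growth can be credited to the product.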

Inclusion of a sufficiently large and representative sample-
You want to make sure that the researchers have based their results on a sufficient number of study participants who are reasonably similar in makeup to the group with whom you plan to use the product or service.  It is hard to give an exact number, but small numbers of students in a limited number of classes or schools should give you pause. Also make sure that the demographic profile and other characteristics you think are important are consistent with your implementation.  For example, a study conducted in an urban high school may be less applicable to a planned implementation in a suburban middle school.


Mistaking significance for magnitude-

Most of us know to look for statistically significant results--to make sure that any relationships or differences found were unlikely to occur by chance alone.  That is all well and good, but it does not tell us how large the relationships or differences are.  The magnitude of a relationship or difference is often reported as an effect size.  Ways of reporting the effect size vary, but in most applications these effect sizes tend to be between 0 and 1, or are reported as a percentage ranging from 0 to 100. Larger values show a greater effect.  But, remember: There are many influences in education, and even small effects may be enough to make a product purchase worthwhile.
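The difference between significance and magnitude can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method of any particular study: the effect size used is Cohen's d (one common choice among several), the p-value uses a large-sample normal approximation, and the means, standard deviation and group sizes are invented.

```python
from math import sqrt
from statistics import NormalDist

def cohens_d(mean_users, mean_nonusers, pooled_sd):
    """Effect size: the group difference expressed in standard deviation units."""
    return (mean_users - mean_nonusers) / pooled_sd

def two_sided_p(mean_users, mean_nonusers, pooled_sd, n_per_group):
    """Approximate two-sided p-value using a large-sample z test."""
    standard_error = pooled_sd * sqrt(2 / n_per_group)
    z = (mean_users - mean_nonusers) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical summary statistics: a one-point difference on a test
# with a standard deviation of ten points.
d = cohens_d(71, 70, 10)
p_large_study = two_sided_p(71, 70, 10, n_per_group=5000)
p_small_study = two_sided_p(71, 70, 10, n_per_group=50)

print(f"Effect size d = {d:.2f}")
print(f"p-value with 5000 students per group: {p_large_study:.2g}")
print(f"p-value with 50 students per group: {p_small_study:.2g}")
```

The effect size is the same small value in both cases, but the large study is highly significant while the small study is not. Significance depends heavily on sample size; the effect size is what tells you how big the difference actually is.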


SEG has worked with many educational publishers and technology providers, from start-ups to the largest industry players, to design & implement efficacy research programs.
With nearly 40 years of experience in research, we know what it takes to conduct sound efficacy research.  Please contact us to discuss your needs.  Email us or call us at 800 254 7670. 

Psychometrician's Corner
Test Validation: Supporting test score claims & test-based decisions 
When a test is administered, the scores/results are used to make a decision, such as: Did the student master the skills I just taught? Does the student have the required skills to move on to the next grade? Does this employee have the requisite skills to perform a particular job? Should this person be assigned remedial instruction? What are the gaps in knowledge in a particular area? How do my students compare to other students who learned this material?
In other words, you are making claims about what it means to achieve a certain test score or receive a passing grade, or you are making a decision based on the test scores.  Test validation asks the question:  Can the claims you are making based on the test be supported by evidence?  Does the test score really mean what you claim it means?
This relates directly to the extent to which we can accurately make the decisions we want to make based on the assessment results. Tests are developed for a specific purpose or purposes. Unfortunately, many test users and test creators simply assume that the test can be used for the purpose intended.  Yet without evidence that the test can be used accurately for that purpose, we run the risk of making incorrect or inaccurate decisions. In short, it is critical to provide evidence that the test is actually living up to the claims you are making.  
Fortunately, there are many sources and ways to collect evidence supporting the validity of a test. Here are a few critical sources of evidence that we work with our clients to obtain:
  • Item quality - Does the item measure the skill, objective or standard you intend to measure?
  • Reliability (consistency of scores) - We cannot have a valid use of a test if the test does not produce consistent results. So, reliability is necessary for validity.
  • Face validity - When stakeholders view the test with an understanding of its purpose, they should get an immediate impression that the test is suited for its purpose. This is a quick sanity check that the test appears to be measuring what it is intended to measure.
  • Construct validity - The test should cover the content and follow the underlying theory of the skills and content to be assessed.
  • Concurrent validity - The test should produce similar results to similar assessments or observations of the same measured outcomes.  
  • Predictive validity - The test scores should accurately predict, or relate closely to, a related later performance.
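As one illustration of how reliability evidence can be quantified, the sketch below computes Cronbach's alpha, a widely used index of internal consistency. It is only one of several reliability measures, and the item responses here are invented; the function name `cronbach_alpha` is our own.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-student lists of item scores."""
    n_items = len(responses[0])
    # Transpose so that each tuple holds every student's score on one item.
    items = list(zip(*responses))
    sum_item_variances = sum(variance(item) for item in items)
    total_score_variance = variance([sum(student) for student in responses])
    return n_items / (n_items - 1) * (1 - sum_item_variances / total_score_variance)

# Hypothetical responses: 5 students x 4 items, scored 1 (correct) or 0 (incorrect).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Values closer to 1 indicate that the items are producing more consistent results. What counts as high enough depends on the stakes of the decision being made with the scores.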
Some of the validity concerns can be addressed during the development of the test, the creation of the scoring/scaling rules, and the design of the report and feedback. In addition, conducting a pilot or field test of the form will also help to gather the necessary information to investigate the support for the validity of the assessment to be used for a particular purpose. Some additional external data including scores on similar assessments, course grades, or observation ratings may also be needed.
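External data of this kind is often examined with a simple correlation. The sketch below is a minimal illustration, assuming invented field-test scores and course grades for six students; the helper `pearson_r` is written out by hand so that the example does not depend on any particular library version.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x, mean_y = mean(x), mean(y)
    covariance_sum = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_sq_x = sum((a - mean_x) ** 2 for a in x)
    sum_sq_y = sum((b - mean_y) ** 2 for b in y)
    return covariance_sum / sqrt(sum_sq_x * sum_sq_y)

# Hypothetical data for the same six students (illustrative only).
field_test_scores = [55, 62, 70, 48, 81, 66]
course_grades = [2.4, 2.9, 3.3, 2.0, 3.8, 3.0]

r = pearson_r(field_test_scores, course_grades)
print(f"Correlation between test scores and course grades: {r:.2f}")
```

A strong positive correlation with a trusted external measure is one piece of evidence that the test is measuring what it is intended to measure. With a sample this small, though, the result would be suggestive at best.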
While it is tempting to repurpose an assessment that has already been developed, it is important to ensure that its use is valid before using it for a new purpose. Otherwise, there are large risks that someone could be accepted or rejected inaccurately, retained or promoted inaccurately, or inaccurately classified in some other way. There is too much at stake to not take the time to ensure that assessments are valid for the intended purpose. We should all ensure we are upholding the Standards for Educational and Psychological Testing.

Leave your psychometric worries to us.  We offer a complete suite of psychometric services.  Please contact us to help you plan and execute this work at or 800-254-7670. 

SEG At Upcoming Conferences
Let's Meet!
We are looking forward to seeing our colleagues and meeting new friends at the upcoming conferences.  We are participating in several sessions & we invite you to join us.
 Look for us at these upcoming conferences:
  • SIIA Education Business Forum, December 5 - 6, Washington DC
  • FETC (Future of Education Technology Conference), January 23 - 26, Orlando, FL
  • Bett, January 24 - 27, London
We would love to meet with you and discuss how we can help you build strong assessments and get the proof of effectiveness you need for success.  
If you would like to meet with a representative from SEG Measurement to discuss how we might help you with your assessment and research needs, please contact us at

About SEG Measurement
Building Better Assessments and Evaluating Product Efficacy
SEG Measurement conducts technically sound product efficacy research for educational publishers, technology providers, government agencies and other educational organizations, and helps organizations build better assessments. We have been meeting the research and assessment needs of organizations since 1979. SEG Measurement is located in New Hope, Pennsylvania and can be accessed on the web at