Experiment Results: Is an evaluation distinguishable from a discussion?

Travis Dixon

While we can find differences if we look hard enough, I contend that life's easier if we treat discuss and evaluate the same.

After writing and sharing a recent post, I got some questions regarding the validity of my claim that a good evaluation and discussion are indistinguishable. So I decided to put it to the test by writing an example essay following the essay structure I advise for students and seeing if it was obvious which command term was being used. I gathered some data and I’m not surprised by the findings.

But I have a confession to make…

When I posted this essay in the FB group “IB Psychology Teachers Support Group” and asked if it was an evaluate or a discussion, it wasn’t the first time I had posted the essay to a group of teachers and posed this question. A week earlier I had posted an almost identical version of the essay (this one) in our other group (“ThemEd’s IB Psych Teachers”) and asked the same question.

My Hypothesis

I was predicting a fairly even split of votes between evaluate and discuss with essay version 1.0. But I must admit I thought discussion would have more votes, as the words “strengths and limitations” are not explicitly stated (although the essay definitely explains strengths and limitations).

The changes I made in version 2.0 to share with the larger group were to simply use the terms strengths and limitations explicitly. I thought this would result in a higher % of votes for evaluation. I was wrong.

First Results: (ThemEd’s FB Group)

In this first trial it was a pretty even split.

  • Evaluate = 6
  • Discussion = 5

NB: There were a few teachers who were on the fence and couldn’t decide.

Second Results: (IB Psych Support Group)

There was some awesome discussion generated and a few people were willing to hazard a guess:

  • Evaluate = 2
  • Discuss = 3


So one of my original hypotheses was clearly wrong: even after I modified the essay to explicitly use the terms “strengths and limitations,” there was still no clear trend in which command term teachers thought the essay followed.

Similarly, when we total the votes from both groups we have:

  • Discussion = 8 votes
  • Evaluation = 8 votes
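For anyone who wants to check the arithmetic, here is a minimal sketch in Python that pools the two sets of votes reported above. The function and variable names are made up for illustration, and the counts are simply the ones listed in this post (the undecided teachers are not included).

```python
# Vote counts as reported for the two Facebook group polls in this post.
# All names here are illustrative assumptions, not part of any official tool.
votes = {
    "ThemEd's IB Psych Teachers (essay v1.0)": {"evaluate": 6, "discuss": 5},
    "IB Psychology Teachers Support Group (essay v2.0)": {"evaluate": 2, "discuss": 3},
}


def pooled_totals(polls):
    """Add up the votes for each command term across all polls."""
    totals = {}
    for counts in polls.values():
        for term, n in counts.items():
            totals[term] = totals.get(term, 0) + n
    return totals


totals = pooled_totals(votes)

# Share of all pooled votes that went to "evaluate".
share_evaluate = totals["evaluate"] / sum(totals.values())

print(totals)
print(f"Share voting evaluate: {share_evaluate:.0%}")
```

With only 16 votes in total, this is a tally rather than a statistical test, so it should be read as a rough check on the split and nothing more.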

It seems (to me at least) that there isn’t an objectively observable difference between a discuss and an evaluate essay.

In the discussion it was obvious that there were teachers who argued strongly that these were clearly different terms and should be treated differently. However, when it came to identifying the command term the essay followed, they were on opposite sides of the fence. This suggests to me that we can find differences if we want to, but these will be subjective. And then I’d argue, why bother at all if it’s just going to create more confusion for students?

Why does this even matter?

The lack of difference between evaluate and discussion was an important discovery for me because understanding nuances between command terms is not a transferable life skill – it’s an IB exam preparation strategy. Therefore, I don’t want to teach it because I want to reduce the time and effort I spend in class focusing on things that students will never use beyond exam day.

My biggest goal for my teaching is that its impact is long-lasting and meaningful. This is why I want to reduce time and effort spent on teaching things that can only be used in the IB exam.

Here are the three things I want my students to be able to do in any essay in an IB Psych’ exam:

  1. Explain a central argument in response to the question

  2. Use evidence to support that argument 

  3. Critically reflect on their argument and/or evidence

That’s it. Simple. Whether this is a discuss, evaluate, to what extent, or contrast, I want my students to reach these three levels in their essays. Personally, I think it’s more important that students understand this basic framework so they can highlight their psychology knowledge, understanding and critical thinking skills the best they can in the limited time they have available in an exam. Time and energy spent on splitting hairs between command terms is detracting from valuable time we could spend developing more important skills (like how to do the three things listed above), in my opinion.

I think my three-level structure for essays can also be applied to other subjects and even beyond the IB. Thus, it’s worthwhile developing these transferable writing and thinking skills.

I’m not saying there can’t be a difference between evaluating and discussing. As I mentioned earlier, if we focus on nuanced connotations of the terms and write our own subjective interpretations then sure we can find differences. What I’m saying is that there doesn’t need to be a difference.

If this doesn’t ring true with you, that’s fine, of course. I’m not concerned with trying to change IB definitions of command terms or to get all IB Psych’ teachers to agree with my point-of-view. All I’ve done is share an observation I’ve made about assessment that I think has practical applications in my teaching and can reduce my stress and workload and also have a positive impact on kids’ learning and exam performance. I’m sure it will make sense and be helpful to some teachers, which is great. Likewise, I’m sure most people will think I’m full of sh*t and will ignore it. That’s fine, too.

I will continue to teach my students to treat these command terms as the same, just like I teach them to cross out “outline” and “describe” in SAQs and write “explain” before they write their answer (with one notable exception!)

I also wanted to share because, well, I just love discussing anything assessment related!


One reason I decided to write my own indistinguishable discuss/evaluate essay in the first place was to test my own theory. Often when I’ve thought I understood something about assessment, I’ve completed the task myself and then realized I was wrong. I’ve found that it’s easy to make these mistakes when we talk about assessment in the abstract. After writing the essay, I was convinced of my original theory (and the data was reassuring).

Like I say, I hope this is helpful to some people, as it definitely makes sense to me.

I’d welcome any thoughts, critiques or questions in the comments.

Later this week I’ll post an annotated version of the essay, with comments to highlight points I think students should take note of when writing essays.

Once again, thanks to Christos for sowing the seed of this idea about discuss/evaluate many years ago. It took a long time for the seed to take hold, but it was worth it. Cheers.

Comments 5

  1. Since you have framed this as an experiment, I feel the need to raise again the issue of whether the essay was a good example of discuss or evaluate. You suggested it is a good example of either. In the first place, if the material to be judged is not considered by its author to be either, and there is no ‘official’ judgement, I would argue that all you have shown is that your example is not definitely not convincingly focused on either command term, and you have therefore met your objective in writing it. This makes any further conclusion a bit difficult. By my logic, you could also have shown us a picture of a dead salmon and asked us whether it is ‘discuss’ or ‘evaluate’ – we can all have a go, but because we mostly agree that a dead salmon is not discuss or evaluate, we’d have to be very cautious about drawing further conclusions. But you suggest that, ‘It seems (to me at least) that there isn’t an objectively observable difference between an discuss and an evaluate essay.’ I don’t necessarily disagree with the other conclusions you’ve made, but I think unless you have an exemplar that validly demonstrates adherence to one of those command terms (e.g. one that has scored a 4 for organization in the last Guide) then you might as well be using a dead salmon: that teachers (not necessarily trained examiners) don’t agree could simply be a sign that the exemplar you provide is ambiguous. After all, was it written as either a discuss or evaluate? If anything, it was deliberately “indistinguishable”. I think you’d get different results with a more pure prototype, or even if you had asked for ratings on a 5-point scale, for example, of how ‘discuss’ it is and how ‘evaluate’ it is, but as it stands, I think you’ve set up an experiment that almost cannot fail to prove the point you had in mind…
    But I support many aspects of what you say here regardless!

    1. Post

      Some valid evaluations of my “experiment” based on researcher bias, Alan 🙂 I call this an “experiment” very, VERY carelessly!

      I understand (and love) your dead salmon analogy. However, here’s my problem with trying to write this essay in any other structure – I CAN’T! If I follow the framework that I set out for my students (the three levels), even if I TRY to be more discursive or evaluative, it would come out the same. In fact, I did try to make it more evaluative by adding “strengths” and “limitations” in Version 2.0, but still that didn’t seem to make a difference.

      Yeah, I could go beyond my essay framework and PERHAPS try to be more one than the other, but then that would be defeating the purpose because it’s not what I will be expecting of my students.

      Perhaps my point is not that they’re indistinguishable or it’s not that they CAN’T be written to be distinguished – my central point is that to write excellent essays they DON’T HAVE TO BE treated as different. If I can kill the salmon with one smack, why give it another?

      Make sense?

      PS. (I’m not yelling, but there’s no bold or italics option in these comments).

      1. Post

        An afterthought: if I did present a dead salmon and ask: “is this a discuss or an evaluate?” Wouldn’t the common response be “…it’s neither, it’s a dead salmon!?” Of the discussion in both groups, there wasn’t a single vote for – it’s neither, or that it’s a poor example of either.

        I’d come back to the question I also posted during both discussions: would it score lower marks if it was one or the other? I don’t think it would and I think it would score high marks.

        Ergo, distinguishing the requirements of these command terms is surplus to requirements for writing excellent essays.

        PS. I love discussing assessment with you, Alan. Is this something we get from kiwi teacher-training?

  2. There are numerous issues you’ve addressed with the exercise – I wouldn’t worry about teachers not saying ‘neither’ because they kindly followed your instruction, which was to guess which command term it was addressing. Nobody sensed how dastardly you are!

    I agree that they don’t have to be distinguished in order for the student to get marks, and, as I’ve said elsewhere, it’s not relevant to the criteria any more, except that, most likely in an introduction, the student needs to demonstrate that they understand the task/issue.

    The question of whether it would score well as another is an interesting one. I feel that whether the command term has been addressed or not is more like a continuous variable than a categorical one. Narratively, I think we are comfortable saying ‘command term is not addressed’ but all essays probably address all command terms to some extent, so the question of whether the essay would score better or worse if the command term were different is highly relevant. I would argue that it SHOULD score better or worse, and I’ll bet that it WILL, but that’s because the new criteria will be applied poorly or changed, because if they are adhered to strictly, then a ‘good essay’ should suffice for all questions. It will only take a couple of years for some exemplars to appear on the web and soon thereafter in written exams around the world…

    If there’s any kiwi spirit in this conversation, I suspect it’s more related to power distance and associated (dis)respect for authority.

