About the Authors:
Dr. Patrice Chrétien Raymer is an Internal Medicine Resident at the Université de Montréal, Montreal, Quebec.
Dr. Jean-Paul Makhzoum is an Assistant Professor, Hôpital Sacré-Coeur, Université de Montréal, Montreal, Quebec.
Robert Gagnon is a Psychometrician, Assessment Office, Faculty of Medicine, Université de Montréal, Montreal, Quebec.
Dr. Arielle Lévy is with the Department of Paediatrics, Université de Montréal, Montreal, Quebec.
Dr. Jean-Pascal Costa is an Assistant Professor, Centre Hospitalier de l'Université de Montréal, Montreal, Quebec.
Corresponding Author: Patrice.chretien.raymer@umontreal.ca
Submitted: February 22, 2018. Accepted: September 27, 2018. Published: November 9, 2018. DOI: 10.22374/cjgim.v13i4.280
A CanMEDS Competency-Based Assessment
Tool for High-Fidelity Simulation in Internal
Medicine: The Montreal Internal Medicine
Evaluation Scale (MIMES)
Patrice Chrétien Raymer MD, BSc; Jean-Paul Makhzoum MD, FRCPC; Robert Gagnon MPsy; Arielle Lévy MD, MMEd; Jean-Pascal Costa MD, MMEd, FRCPC
High-fidelity simulation is an efficient and holistic teaching method. However, assessing
simulation performances remains a challenge. We aimed to develop a CanMEDS competency-
based global rating scale for internal medicine trainees during simulated acute care scenarios.
Our scale was developed using a formal Delphi process. Validity was tested using 6 videotaped
scenarios of 2 residents managing unstable atrial fibrillation, rated by 6 experts. Psychometric
properties were determined using a G-study and a satisfaction questionnaire.
Most evaluators favourably rated the usability of our scale, and attested that the tool fully
covered CanMEDS competencies. The scale showed low to intermediate generalization validity.
This study provides initial validity arguments for our scale. The best-assessed aspect of performance was communication; further studies are planned to gather additional validity arguments and to compare the assessment of teamwork and communication in scenarios with multiple versus single residents.
Canadian Journal of General Internal Medicine
10 Volume 13, Issue 4, 2018
Teaching and Learning
High-fidelity simulation is an efficient and holistic teaching method. However, assessing simulation performance remains a challenge. We sought to develop a CanMEDS competency-based global rating scale for internal medicine residents in simulated acute care scenarios.
Our scale was developed using a Delphi process. Its validity was tested using 6 videotaped scenarios of 2 residents managing a case of unstable atrial fibrillation, rated by 6 experts. The scale's psychometric properties were determined using a G-study and a satisfaction questionnaire.
Most evaluators judged the use of our scale favourably and confirmed that the tool fully covered the CanMEDS competencies. The scale showed low to intermediate generalization validity.
This study demonstrated some validity arguments for our scale. The best-assessed aspect was communication. Further studies are planned to provide additional validity arguments for our scale, and to compare the scale's ability to assess teamwork and communication in scenarios with one versus several residents.
High-fidelity simulation is a teaching and assessment method that has been rapidly introduced into many medical residency programs. It is considered by many to be an innovative complement to direct clinical observation and punctual end-of-rotation evaluations. Simulation has proven to be an efficient and enjoyable teaching method, allowing for immediate and targeted feedback. Simulation training has been associated with better subsequent performance in both technical and non-technical skills, the latter being often deficient in struggling learners and challenging to teach and evaluate.
To address these issues, global rating scales (GRSs) have been developed based on recognized crisis resource management behavioural markers, to overcome the limitations of checklist-based evaluation. Studies have also reported GRSs to be especially useful in structuring direct observation and providing appropriate and organized feedback. To be fully useful, such rating scales must be adapted to the specific medical specialty and complementary to the competency framework of the trainee's program.
The CanMEDS framework describes the competencies required by Canadian physicians to properly care for their patients. CanMEDS competencies have been clearly identified as overlapping substantially with non-technical skills (NTS); however, to date, only one GRS (the GIOSAT) has been based on CanMEDS, and it is not adapted to internal medicine.
In this study, we aimed to develop a GRS based on CanMEDS to assess internal medicine residents participating in acute care high-fidelity simulation scenarios. We also aimed to study this scale's validity and usability, based on Kane's validity framework.
Development of the rating scale: We first performed a literature search for existing NTS GRSs, selecting all articles whose principal subject was the evaluation of trainees in a simulation setting. Our search revealed several GRSs of interest. The Anesthetists' Nontechnical Skills (ANTS) scale was the first GRS developed for anesthesiologists and is one of the most studied, including in Canada. We built an anchored 5-level Likert scale based on the ANTS, other GRSs, and the CanMEDS competency framework.
We further refined the scale using a 2-step Delphi method, for which 4 recognized simulation-based education experts were recruited. They were provided with a video of a pulmonary edema scenario managed by a junior resident and used our scale to assess the performance. Testers completed a formal questionnaire at each Delphi step and provided informal comments, and the scale was modified at each step to incorporate them. The final version of our scale is in French; its English version can be found in Figure 1.
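As a rough illustration of the resulting instrument's structure, the sketch below models a category-based GRS scored on an anchored 5-point Likert scale. The category names and anchor labels are illustrative assumptions, not the published MIMES items.

```python
# Hypothetical sketch of a category-based GRS: each NTS category is scored on an
# anchored 5-point Likert scale. Names and anchors are assumptions for illustration.
from statistics import mean

ANCHORS = {1: "poor", 2: "marginal", 3: "acceptable", 4: "good", 5: "excellent"}
CATEGORIES = [  # six fixed NTS categories, mirroring the study's fixed facet (n = 6)
    "communication", "collaboration", "leadership",
    "situational awareness", "decision making", "resource utilization",
]

def score_performance(ratings: dict[str, int]) -> float:
    """Validate one rater's category scores and return the overall mean."""
    for cat in CATEGORIES:
        r = ratings[cat]                      # KeyError if a category is missing
        if r not in ANCHORS:
            raise ValueError(f"{cat}: rating {r} outside the 1-5 anchored scale")
    return mean(ratings[cat] for cat in CATEGORIES)

example = dict(zip(CATEGORIES, [4, 3, 3, 4, 3, 4]))
print(score_performance(example))  # mean of the six category ratings -> 3.5
```

Keeping the categories fixed and few, with behaviourally anchored levels, is what distinguishes a GRS of this kind from a long task checklist.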
Preliminary Evaluation Study Design: To evaluate the MIMES, we designed a retrospective observational study involving internal medicine residents in a simulated acute care setting. Our protocol was evaluated and improved by our institution's Committee for Simulation-Based Medical Education and approved by our institution's Multifaculty Committee for Research Ethics. We used 6 anonymized videos of different teams of 2 PGY-1 residents from our internal medicine program managing a case of unstable atrial fibrillation. The simulated patient, portrayed by a high-fidelity mannequin, spoke neither French nor English and had to be addressed through his wife, who was scripted to be very anxious about the patient's state and inquisitive about the team's management plan.
Ratings: Six experienced acute care physicians and simulation-trained instructors were recruited to evaluate student performances. No evaluator training or calibration was provided for this tool. Each evaluator independently rated the 2 students in each video using our GRS, after which they completed an online questionnaire on the scale's validity and usability.
Figure 1. The Montreal Internal Medicine Evaluation Scale (MIMES)
Statistical Analysis: We evaluated our scale using a generalizability theory model (G-study). Facets for our G-study were students (S), evaluators (E), and NTS categories (C). The design was fully crossed (S × E × C); the student and evaluator facets were random, while categories was a fixed facet (n = 6). We also performed a decision (D) study for the number of evaluators. The online questionnaire completed by our evaluators was analyzed using descriptive statistics.
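The core of such a G-study can be sketched numerically. The example below estimates variance components for a fully crossed students-by-evaluators design from mean squares, then computes the relative G coefficient for a 6-rater panel. The ratings matrix is an illustrative assumption, not the study's data, and categories are collapsed for simplicity.

```python
import numpy as np

# Hypothetical ratings: rows = students, columns = evaluators (fully crossed s x e).
X = np.array([
    [3, 4, 3, 4, 3, 4],
    [2, 3, 2, 3, 3, 2],
    [4, 5, 4, 4, 5, 4],
    [3, 3, 2, 3, 3, 3],
], dtype=float)
n_s, n_e = X.shape

grand = X.mean()
ms_s = n_e * ((X.mean(axis=1) - grand) ** 2).sum() / (n_s - 1)   # students mean square
ms_e = n_s * ((X.mean(axis=0) - grand) ** 2).sum() / (n_e - 1)   # evaluators mean square
resid = X - X.mean(axis=1, keepdims=True) - X.mean(axis=0, keepdims=True) + grand
ms_se = (resid ** 2).sum() / ((n_s - 1) * (n_e - 1))             # interaction/residual

# Solve the expected-mean-square equations for the variance components.
var_se = ms_se
var_s = max((ms_s - ms_se) / n_e, 0.0)
var_e = max((ms_e - ms_se) / n_s, 0.0)

# Relative G coefficient when averaging scores over the 6 evaluators.
G6 = var_s / (var_s + var_se / n_e)
print(f"variance components: s={var_s:.3f}, e={var_e:.3f}, s*e={var_se:.3f}")
print(f"G (6 evaluators) = {G6:.2f}")
```

The key idea is that only variance attributable to students counts as "signal"; the student-by-evaluator interaction is error that shrinks as more raters are averaged.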
G and D studies: The results of our G-study are presented in Table 1; the reliability of the overall scale was 0.64 with ratings pooled from 6 evaluators. A G coefficient of 0.6 is generally considered the lower limit of acceptable reliability, and values above 0.8 represent near-perfect reliability. Assessment of communication yielded the highest reliability (0.71).
Table 2 presents the variance components of our G-study; the largest contributor to variance was the evaluator-student interaction (49%). Results of the D studies are shown in Table 3.
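As a rough illustration of what a D study computes, the projection below varies the number of evaluators under assumed variance components. The component values are hypothetical, chosen only so that the projection at n = 6 lands near the reported overall coefficient of 0.64.

```python
# D-study sketch: given variance-component estimates for students (s) and the
# student-by-evaluator interaction (s*e), project the relative G coefficient for
# alternative numbers of evaluators. Values are illustrative assumptions chosen
# so that G at n = 6 is close to the reported 0.64; they are not the study's estimates.
var_s, var_se = 0.20, 0.675

def projected_g(n_evaluators: int) -> float:
    """Relative G coefficient when averaging over n_evaluators raters."""
    return var_s / (var_s + var_se / n_evaluators)

for n in (1, 2, 4, 6, 10):
    print(f"n = {n:2d} evaluators -> G = {projected_g(n):.2f}")
```

Such a projection shows how many raters a program would need to reach a target reliability before committing to a panel size.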
Online questionnaire: Results from the survey can be found in Table 4. All evaluators (100%) concluded that the MIMES contains no superfluous elements, is appropriate in length, and is simple to use. 83% felt that the scale could be useful for providing immediate feedback to students, and 66% felt that it could be used to plan structured teaching. 66% agreed that the MIMES is not adapted for use as a summative assessment tool. All surveyed internal medicine specialists (100%) agreed that the MIMES contains every element necessary to assess the performance of internal medicine residents during an acute care situation.
In this study, we set out to create and evaluate an educational tool for internal medicine residents performing acute care simulation scenarios. Our results provide generalization validity arguments for the MIMES, and most evaluators reported that the tool assessed all appropriate components of the CanMEDS framework, a recognized and comprehensive set of skills necessary for medical practice, without assessing superfluous categories. The formal Delphi method used to create the MIMES, and an elaboration process based on recognized GRSs and well-established competency frameworks, also add to its scoring validity argument.
Table 1. Estimated G-coefficients for the overall MIMES and NTS categories