Other diversions
University Affairs, May 2008 - “Upper-year Exam Malady”

So yer plannin’ on being “sick” to get that A+? It won’t help…
There is likely a proportion of students who submit a medical exemption note for missing a SC/BIOL 1010 midterm or exam who are not in fact suffering from any illness. Perhaps these students had fallen behind in the course readings or in attending lectures, or felt that too many exams were scheduled close together (unfairly so, in their view, even though it didn’t constitute an “official” exam conflict), or had other responsibilities (e.g. work or family). The end result is that they felt compelled to grab some extra studying time via a medical exemption, as the make-up midterm would be written in January or February, and the deferred exam would be written sometime in the summer.
Ah, good strategy. But was it? Um, NO.
If we
assume that students who were suffering from an “illness” and were absent for a
mid-term or exam actually gained some substantive advantage over “healthy”
students from the extra studying time, then we’d expect grades of Illness
Students to be on average greater than those of Healthy Students who wrote the
mid-term or exam on its original date.
It turns out that students who wrote the make-up midterm had significantly lower marks than those who wrote the December midterm (average 50.4% versus 57.1%; z-test, z = 2.5, n1 = 27, n2 = 931, p = 0.01); the same pattern was observed for the deferred final exam versus the April final exam (average 52.3% versus 58.9%; z-test, z = 2.0, n1 = 23, n2 = 822, p = 0.04).
So, it would appear that Illness
Students gained no advantage over Healthy Students. It is important to note that there are legitimately ill students during
exams, but in my opinion, truly valid bouts of illness are associated with only
a small proportion of medical exemptions submitted, and the majority of “ill”
students can be more accurately described as “ill-prepared”, and are trying to
buy some extra studying time. What this analysis does not show, and what cannot be determined from our dataset, is whether ill-prepared students managed to avoid the truly disastrous mark (e.g. 15%) they would have received on the original exam, in exchange for the barely passing grade they received on the make-up or deferred exam. The vast majority of honest students can rest assured that “ill” students abusing the system of medical notes are not gaining an unfair advantage: they are not depriving honest students of places in upper-year courses with limited enrolment, nor outranking them in competitions for undergraduate scholarships or later admission to professional or post-graduate studies. There were some truly dismal marks amongst students writing the make-up or deferred exam, and only 10% of make-up or deferred exams received a mark above 70% (5 of 50 exams: 2 were borderline B’s, 1 was a mid B (not B+), and the last 2 were borderline A’s), versus 24% of original mid-terms or final exams (419 of 1753 exams).
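The reported statistics above can be sanity-checked with a few lines of Python. This is only a sketch: it assumes the z-tests were two-sided (the text does not say), and because the reported z values are rounded, the recomputed p-values should only roughly match the reported ones. The function names are illustrative.

```python
from statistics import NormalDist


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic (assumed two-sided)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def percent(count: int, total: int) -> float:
    """Share of exams, as a percentage."""
    return 100.0 * count / total


# z statistics as reported in the text (rounded to one decimal place there)
print(f"midterm, z = 2.5: p = {two_sided_p(2.5):.3f}")
print(f"final,   z = 2.0: p = {two_sided_p(2.0):.3f}")

# Exams scoring above 70%, using the counts given in the text
print(f"make-up/deferred: {percent(5, 50):.1f}%")
print(f"original sitting: {percent(419, 1753):.1f}%")
```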
Some additional notes about these analyses
For the
Fall-Winter 2006-07 offering of SC/BIOL 1010 I tried to ensure that the make-up
midterm and deferred final exam were very similar (e.g. only about 10-15% of
questions changed) and of equal difficulty compared to the original offering,
as testimonials of course directors of other BIOL courses suggested that I’d be
mired in student-complaint hell if I made the exams very different, even if I
could justify that they were of the same difficulty. Consequently, any differences in marks when comparing Illness vs Healthy students should reliably reflect academic ability rather than a difference in difficulty between the original and make-up or deferred evaluations.
The moral of the story:
If you are
ill-prepared for the original test date, you’ll most likely still be
ill-prepared for the make-up or deferred date, and you’re still either going to
fail or just barely pass the test. My
advice – stay on top of course work and readings all the way through term, so
that exam-period studying will be more of a review of material you already
know.
What were students thinking?!?

Just out of plain ole’ curiosity I one day decided to see if academic performance by SC/BIOL 1010 students in FW 06-07 differed for students who took advantage of every bonus mark offered for filling out survey questions, compared to students who didn’t fill out a single survey (and hence received no bonus marks). The results were not totally surprising. Students who completed every survey and received every bonus mark did significantly better on the final exam compared to students who didn’t fill out a single survey (average multiple choice score 64% versus 52%; t = 9.4, n1 = 338, n2 = 167, p < 9.0 × 10^-19). I am assuming that the students who filled out every single survey were the ones who came to every class, arrived on time, and paid attention when I announced the surveys at the start of class, or who regularly used the course WebCT site where survey announcements were also posted. I am also assuming that these students invested more in the course throughout its duration, as opposed to cramming it all into the last few days before the exam. What is the moral of the story? Continually keep on top of things and you’ll do substantially better in the course.
average score ± 1 standard deviation
The evolution of my driving experiences to York University
I live outside of the
GTA. I commute a long way to work. It is not for the faint-hearted.
Accumulated mileage
This graph illustrates the mileage on a used car I purchased
in February 2005, as my spouse finished her maternity leave and required our
only car to return to work. I’m amazed
that, despite managing to carpool to

Odometer readings were compiled via car maintenance receipts retained since November 2000
This graph illustrates the monthly fees I have paid for
using the Hwy 407 toll highway since I started at York University in July
2005. This graph best illustrates the
evolution of my commuting strategy to get to

[Graph of monthly Hwy 407 toll fees, with commuting phases labelled (i) to (xi)]
Initially, I lived within a 5 minute walk of a GO Station, so I commuted mostly via GO Transit bus service (i), only occasionally driving in to York during the summer term. By August 2004 the 2+ hr commute (in one direction) via GO bus compelled me to carpool with another York employee (ii). We came to the arrangement that, in return for picking me up at 7 am (rather than 6 am), we would drive via the 407 to York in the morning, and I would foot the bill. It seemed a good idea at the time, as I was a new father with a new job, and I wished to retain every minute of sleep I could get my hands on.
As the term progressed (iii), traffic from
Personal consumption habits and fiddling with time series
Out of curiosity (again), I wondered how my family’s rate of natural gas consumption (we have a natural gas furnace) varied with climate e.g. cold winters versus warm summers, particularly during abnormally warm or cold winter months. So, first off I plotted the average daily consumption of natural gas (provided each month on my gas bill, easy to access as I’ve retained about every monthly bill I’ve ever received since about 1996, tucked away in labeled file folders arranged alphabetically in a filing cabinet in my home office) against mean monthly air temperature (obtained from the Canadian Meteorological Service’s website for the closest monitoring station to where I live). The following plot (Figure 1) is produced:

This plot seems simple enough: when temperature goes up, natural gas consumption goes down. This makes sense, as when it is warm out why would the furnace be chugging away like a locomotive to heat the house?
Figure 1. Mean monthly temperature (°C; hollow triangles) versus daily natural gas consumption (m³/day; black dots).
However, I was dissatisfied with this plot; I wanted something that really showed the strong relationship between monthly temperature and consumption. So, I flipped the y-axis for the consumption data, producing the next plot (Figure 2).

Ah, much better. Okay, now this graph really shows the relationship between the two. It even shows the little ‘blip’ in both time series for January 2006, which was an anomalously warm month. This plot seems to show that natural gas consumption is always cycling at a higher value compared to the monthly temperature…but wait a minute, that can’t be right. That perception is flawed: it comes from graphing two different variables, with different units and different scaling, on the same graph.
Figure 2. Same as Figure 1, gas consumption axis inverted.
So, my next task was to standardize each time series so that both would be in the same units and scaled to equal ranges, so that a change in one variable and a change of similar relative magnitude in the other would show up as similarly sized shifts in the time series values. A good transformation for standardizing data is to convert raw data to z-scores, using the following equation: $Z_i = (n_i - \mathrm{Average}_n) / \mathrm{Stdev}_n$, where $Z_i$ is the resultant z-score for data point i, $n_i$ is the raw value of data point i, $\mathrm{Average}_n$ is the average of all data points in the dataset, and $\mathrm{Stdev}_n$ is the standard deviation of the whole dataset.
The way to interpret z-scores is as follows: if a data point has a
z-score of +0.6, it has a value of 0.6 standard deviation units greater than
the average of the whole dataset. If a
data point has a z-score of -1.2, it has a value of 1.2 standard deviation
units less than the average of the whole dataset. Confused?
Let’s look at this again using raw numbers. If a dataset has an average of 20 units, and
a standard deviation of 6 units, then a data point with a value of 15 units
would have a below-average value, and as it is within 5 units of the average,
it is within one standard deviation (6 units) of the average. This data point of 15 units expressed as a
z-score would be = (15-20)/6 = -0.83.
Follow me? Okay, so, by
standardizing the data points in Figure 2 to z-scores, I get the following plot
(Figure 3).

Now we’re getting somewhere! This plot really shows how a change in mean monthly temperature produces a change of similar relative magnitude in natural gas consumption. The coldest month in the whole series, February 2007, is also the month with the highest rate of natural gas consumption; for both data sets the value for this month is almost 2.5 standard deviation units away from the average for the whole time series.
Figure 3. Symbols as in Figure 1, with mean monthly temperature and daily natural gas consumption converted to Z-scores. Gas consumption axis inverted as in Figure 2.
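The z-score transformation described above can be sketched in a few lines of Python. This is a minimal illustration, not the calculation actually used for Figure 3: the text does not say whether a sample or population standard deviation was used, so the sample version is assumed here, and the temperature values are invented for demonstration.

```python
from statistics import mean, stdev


def z_score(value, average, std_dev):
    """Z-score of one data point, given the dataset average and standard deviation."""
    return (value - average) / std_dev


def z_scores(values):
    """Standardize a whole series (sample standard deviation assumed)."""
    avg = mean(values)
    sd = stdev(values)
    return [z_score(v, avg, sd) for v in values]


# Worked example from the text: average 20 units, standard deviation 6 units
print(round(z_score(15, 20, 6), 2))

# Hypothetical monthly mean temperatures (°C), not the author's data
temps = [-5.0, -8.5, 1.0, 9.5, 16.0, 21.5]
print([round(z, 2) for z in z_scores(temps)])
```

Once both series are standardized this way they are unitless, which is what allows temperature and gas consumption to share one axis.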
Okay, now what I was really curious about was whether or not
there was a substantial difference in gas consumption rates during periods when
my wife was home on maternity leave (within this time series, from Nov 2004 to
Feb 2005, and Jul 2006 to Jul 2007).
Normally, when we go to work we turn down the heat (during the winter)
or shut off the air conditioning (during the summer). When my wife was on leave she was typically
home all day, with the occasional venture outside (e.g. to the park, or
shopping) to maintain her sanity; this would mean the furnace was on the whole
day, either heating the house in the winter or cooling it in the summer. So, in this case, I was interested in the
differences in z-scores for each time series – if a winter month was unusually
warm, yet gas consumption was not correspondingly low, then in Figure 3 you
should see the white triangle ‘above’ a black dot. With this in mind, if you squint
your eyes a bit, the winters of 2004-05 and 2006-07 (winters when my wife was
home during the day) appear to have differences between the two data sets that
suggest higher gas consumption relative
to the temperature, compared to the winter of 2005-06. As I was a bit constrained in terms of months
for 2004-05, I compared the z-scores for temperature vs
gas consumption for the Nov – Feb period of each winter (2004-05, 2005-06, 2006-07). As I had 3
‘treatments’ (first-maternity-leave, no-maternity-leave,
second-maternity-leave), I ran a one-way ANOVA.
This test was not statistically significant (F = 1.9, df = 2,8, p = 0.2), indicating that the differences in z-scores between the two time series followed a similar pattern in the 2 maternity-leave winters and the no-maternity-leave winter. I thought that maybe the low number of data points (n = 4) in each treatment was resulting in low statistical power (i.e. even if there was a real difference, I had a low chance of detecting it). So, I pursued a different way of comparing datasets: this time I compared the whole period my wife was on her 2nd maternity leave (Jul 2006 – Jun 2007) with the whole previous year (Jul 2005 – Jun 2006), when my wife was not on leave.
Was there a statistically significant difference in z-scores when
comparing the two data sets in the ‘Leave’ year versus the ‘No Leave’
year? Yes, there was (paired t-test, t = 2.1, df =
11, p = 0.03).
The pattern of difference was such that my suspicions were confirmed: there was relatively higher gas consumption during the maternity-leave year compared to the no-leave year (Figure 4). This is manifested by a greater positive difference when subtracting the gas z-score from the temperature z-score. (Note: the sign of the gas z-score was inverted (+ to -, and vice versa) to make the comparison analogous to that in Figure 3, so that a warm month had a positive z-score and a month of low gas consumption also had a positive z-score.) So, if a month was unusually warm (high z-score) and gas consumption was correspondingly unusually low (high z-score; the sign was inverted, remember?), then subtracting one z-score from the other should produce a value close to zero. If a month was unusually warm (high z-score) yet gas consumption was not correspondingly low, then subtracting the gas z-score from the temperature z-score would produce a positive value.
Figure 4. Difference when subtracting an inverted gas z-score from a temperature z-score, for a 12-month ‘No Leave’ period versus a 12-month ‘Maternity Leave’ period.
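The quantity plotted in Figure 4 can be sketched as follows. This is an illustration under stated assumptions only: the function name is hypothetical and the z-scores are invented to show the three cases discussed above, not taken from the actual dataset.

```python
def temp_minus_inverted_gas(temp_z, gas_z):
    """Per-month difference: temperature z-score minus the sign-inverted gas z-score."""
    return [t - (-g) for t, g in zip(temp_z, gas_z)]


# Hypothetical months (z-scores are made up for illustration):
#   1. warm month, gas use correspondingly low -> difference near zero
#   2. warm month, gas use not as low as expected -> positive difference
#   3. cold month, gas use even higher than expected -> positive difference
temp_z = [1.0, 1.0, -2.0]
gas_z = [-1.0, -0.4, 2.5]

for diff in temp_minus_inverted_gas(temp_z, gas_z):
    print(round(diff, 2))
```

A positive value flags a month where more gas was burned than the temperature alone would suggest, which is the pattern reported for the maternity-leave year.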
So, in Figure 4,
negative values mean that gas consumption is low relative to the temperature
conditions (relatively low consumption in cold winter months and hot summer
months) = savings in $$$. The moral of the story?
When figuring out the finances of having a child, here in southern
Ontario where we have both cold winters and humid hot summers, one should also
take into account increased heating and air conditioning costs, if the pre-baby
pattern was that both partners worked and the house sat empty (and un-heated or
un-air-conditioned) during the day. No
book ever mentioned that in their Finances section…
Disclaimer:
Materials presented in the personal section of Roberto
Quinlan’s webpage solely reflect his personal opinions and do not convey the
viewpoints or opinions of