Thoughts on Psychology

Studying memory for words (and confounding variables generally)

Studies investigating memory for word lists are amongst the most popular to do for research methods assignments. This is for good reason, since such studies are usually straightforward both conceptually and methodologically. However, problems arise due to people’s choice of words. These problems are important to address in their own right, but also teach us something about doing research in general.

When people do studies of memory for word lists, it’s common for them to make up lists of words off the top of their head. This is a Bad Thing. As an example, consider the study I described in the last post, comparing memory for long words with memory for short words. The logic of this study is that people are given a list of long words, a list of short words, or both, to remember. Recall for each kind of list is measured, and the scores are then compared using an appropriate analysis, e.g. a t-test. If a significant difference is found, then we conclude that the length of words affects how well we can remember them, and in particular that the phonological loop has a time-limited capacity.
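As a rough illustration of the comparison step, here is a minimal Python sketch of an independent-samples t statistic (Welch's version, which doesn't assume equal variances) for a between design. The recall scores are invented purely for illustration, and the function name `welch_t` is my own. In practice you'd use a statistics package, which would also give you the p-value.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Sample variance (n - 1 denominator) divided by sample size
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical number of words recalled per participant (made-up data)
short_recall = [7, 8, 6, 9, 7, 8]
long_recall = [5, 6, 4, 6, 5, 7]

t, df = welch_t(short_recall, long_recall)
print(f"t = {t:.2f}, df = {df:.1f}")
```

A large t only tells you the two groups differ. It says nothing about why they differ, which is where the choice of words comes in.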

The logic above is sound, but only as long as we can discount any alternative explanations. If there is another plausible explanation for why memory for the two lists differs, then we can’t confidently conclude that word length has an effect. Take the following two lists as an example:
List one    List two
cat         xylophone
hat         phenomenon
mat         evolution
bat         conundrum
Clearly, one list consists of short words, and one list consists of longer words. However, that’s not the only difference between the two lists. The lists also differ in that:
* all the words in list one are familiar; the words in list two are less so. Familiarity affects memory.
* the words in list one rhyme with each other; the words in list two don’t. Rhyming affects memory.
* the words in list one all refer to concrete objects; the words in list two are more abstract. Concreteness affects memory.
If we find a difference between people’s memory for list one and their memory for list two, we’d want to conclude that it’s because of the word length. Actually, there are at least four possible reasons for such a difference, because there are at least four kinds of difference between the lists: word length; rhyming; familiarity; and concreteness.

(Actually, we probably wouldn’t find a difference, because even the four long words fit into the 2-second phonological loop. Almost everyone would score 4 out of 4 on each list. We need more words in each list to detect a difference, which illustrates the importance of measures being fine-grained enough to measure what we want.)

So, the study described above is clearly flawed because there are alternative explanations for the results – the study is potentially confounded. That’s the general lesson about doing research that I mentioned at the start.

The specific issue is as follows. If you’re doing a study of memory for words, you need to think carefully about the words you use. If you’re doing a between (unrelated) design where the same list of words can be used, e.g. recall with and without interference, then you can relax a little – there’s only one list of words, so no need to worry about differences between those lists. If you’re doing a between design where there are separate lists though, e.g. long and short, you need to worry.

If you’re doing a within (related) design, then you almost always need to choose word lists carefully, because participants will be remembering more than one list of words. If you’re doing a within design where you’re testing memory with and without interference, then you can’t use the same list of words because of practice effects. You need two (or more) lists of words, but you also need to make sure that the lists are equally difficult to remember, so that the only explanation for any difference you find is that for one list there was interference.

In general, when you’re doing a study of memory for word lists where you’re using more than one list of words, then you need to design two matched word lists that you can show are equivalent on any possible confounding variables. Of course, the words will differ on the one criterion you’re interested in as an independent variable, if any. So, if you’re looking at the effects of interference in a within design, you need to ensure that the words in each list are of equal length; equal familiarity; equal concreteness; etc. If you’re doing a within design looking at the effect of word length, then you need to ensure that the words in each list are of equal familiarity; equal concreteness; etc.; but different in terms of word length.
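To make the idea of "matched" lists a little more concrete, here is a minimal Python sketch that compares two lists on each variable. The word names and rating values are placeholders I've made up, not real norms, and the function names are my own. The variable names NSYL, FAM and CONC follow the database fields described further down.

```python
from statistics import mean


def list_means(words, ratings, variables):
    """Mean of each variable across the words in one list."""
    return {v: mean(ratings[w][v] for w in words) for v in variables}


def compare_lists(list_a, list_b, ratings, variables):
    """Difference in means (list_a minus list_b) for each variable."""
    means_a = list_means(list_a, ratings, variables)
    means_b = list_means(list_b, ratings, variables)
    return {v: means_a[v] - means_b[v] for v in variables}


# Placeholder words and ratings, invented for illustration only
ratings = {
    "short_1": {"NSYL": 1, "FAM": 560, "CONC": 600},
    "short_2": {"NSYL": 1, "FAM": 540, "CONC": 580},
    "long_1": {"NSYL": 3, "FAM": 550, "CONC": 590},
    "long_2": {"NSYL": 3, "FAM": 550, "CONC": 590},
}

diffs = compare_lists(
    ["short_1", "short_2"],
    ["long_1", "long_2"],
    ratings,
    ["NSYL", "FAM", "CONC"],
)
print(diffs)
```

For a word length study you'd want the NSYL difference to be large and the FAM and CONC differences to be close to zero. Comparing means is only a first check; with real lists you might also compare the spread, or test the differences formally.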

(A quick note: the word length effect arises because of the time-based capacity of the phonological loop. Length in this context refers to articulatory length – how long it takes to say a word – not the number of letters in the word. The number of syllables is a rough guide to articulatory length, and certainly a better one than the number of letters.)

So, how do you get these magical matched word lists? Luckily, some kind souls have developed a publicly available database of words marked up with various psycholinguistic characteristics, including articulatory length, familiarity, concreteness, etc. The database allows you to select words according to whichever of these characteristics you want to focus on. Use the database to generate words according to whatever criteria you choose, then randomly choose the number of words you need for each list. You can then write about this in your materials section, to show how much care you’ve taken to eliminate confounding variables. You can access the database at the following address:

Use the “Dict Utility Interface” link to access the old, web-searchable version of the interface.

I’d recommend using sections 2 & 3 of the interface to select required values of NSYL, the number of syllables; FAM, the familiarity, where 100=not familiar, 700=very familiar; CONC, for concreteness, 100=not concrete, 700=very concrete; and PDWTYPE, part of speech, choosing INClude N, for nouns. Adjust these, then click the GO button to generate a list of words. If anyone wants help, give me a shout or leave a comment.
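If you've saved the database output and want to do the "filter, then randomly choose" step yourself, it might look something like the Python sketch below. This is an assumption-laden illustration rather than anything the database provides: the record layout, the `WORD` field, the function name `select_words` and the placeholder records are all hypothetical. Only NSYL, FAM, CONC and PDWTYPE are names taken from the interface.

```python
import random


def select_words(records, nsyl, fam_min, conc_min, n_words, seed=None):
    """Filter candidate nouns by the criteria, then randomly pick n_words."""
    candidates = [
        r["WORD"]
        for r in records
        if r["NSYL"] == nsyl
        and r["FAM"] >= fam_min
        and r["CONC"] >= conc_min
        and r["PDWTYPE"] == "N"  # nouns only
    ]
    if len(candidates) < n_words:
        raise ValueError("Not enough words meet the criteria")
    # A fixed seed makes the random choice reproducible for your write-up
    rng = random.Random(seed)
    return rng.sample(candidates, n_words)


# Placeholder records standing in for database output (not real entries)
records = [
    {
        "WORD": f"w{i}",
        "NSYL": 1 + i % 3,
        "FAM": 400 + 20 * i,
        "CONC": 450 + 15 * i,
        "PDWTYPE": "N" if i % 4 else "V",
    }
    for i in range(30)
]

chosen = select_words(records, nsyl=1, fam_min=500, conc_min=500, n_words=4, seed=1)
print(chosen)
```

Recording the criteria and the seed you used gives you something precise to report in your materials section.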


Written by daijones

September 11, 2010 at 6:54 pm

Posted in Full post, Research methods
