The lab experience

Traditionally, most behavioural research is done in laboratory settings. As a PhD student or research assistant, you spend time designing flyers to hang up in your department, or use your university’s participant pool to recruit people to take part in your experiment. You then painstakingly schedule individual sessions for each participant and invite them to your university building to do some tasks on a computer for an hour or so, quite possibly in a tiny cubicle in a dingy basement.

You also have to spend that same time in that same dingy basement, usually not doing anything more productive than engaging in small talk about the weather, typing in participant IDs, and perhaps starting up a new experiment programme every now and then. And that’s if you’re lucky and your participant actually shows up.

I’ve done my fair share of lab-based data collection and I have to say, it really is exhausting. It’s a massive time commitment from the researcher; you can easily spend months on end going through the recruitment-scheduling-testing cycle.

Apart from the researcher’s time, lab-based data collection is also resource-intensive for your university. It needs to provide enough rooms and equipment for all the active researchers to run their experiments, and – as an experimenter – you often find yourself competing for those spaces with your colleagues. This competition gets particularly tiresome during prime ‘testing season’, when, sadly, it is often the senior researchers and prestigious labs that win.

Lab-based research has another big problem: its participants. What population do you think is the easiest to recruit for experiments run in the dingy psychology department basement? That’s right: people who are already physically in the department. It’s no surprise that much of the classic literature in psychology is based on samples of university students. They are there, they are willing, you can bribe them with course credits – it’s just so convenient!

The problem with this approach is that the typical undergraduate student tends to be quite different from the population that we as psychologists usually want to make inferences about. For starters, they can be particularly homogeneous when it comes to socioeconomic status, and often also race and gender. It’s commendable when researchers try to widen their participant pool to community samples, e.g. by putting up their flyers somewhere other than the student union. However, when we recruit people for lab-based research, we automatically restrict our sample to those who are both physically able and willing to come to the university.

With the replication crisis looming over all our heads, the field has quite rightly recognised that we need to run our experiments on bigger and more diverse samples. Sadly, this is often not realistic with lab-based research – especially if you’re the sole experimenter on your project. There are only so many hours in the day for testing!

So, what can be done about this?

Online research

The last few years have seen a change in where research and data collection happen. Internet browsers and people’s own electronic devices are now powerful enough to display complex stimuli and measure response times remotely with a great degree of accuracy. These technological advances have allowed researchers in the behavioural sciences to shift from lab-based to web-based research.

Recruitment platforms like MTurk and Prolific Academic allow researchers to advertise their experiments online and attract participants far away from the university campus, who can then do the tasks on their own device and in their own time.

It really seems like a no-brainer:

  • Running experiments online saves the researcher valuable time: there is no need to schedule individual sessions, you can easily test multiple people at the same time, and you can spend the time it takes to collect the data in more productive ways.
  • Online research saves university resources, as there is no need to provide as many rooms and computers specifically for data collection purposes. More room for coffee machines in psychology departments!
  • Recruiting participants online allows us to diversify our participant pool and collect larger samples, increasing the power of our analyses.

The challenges of online research

It is worth noting that nothing is perfect. Like lab-based research, online research has its challenges. When I first started thinking about running my own experiments online, I admit that I had doubts. However, I have since found that many of my initial concerns either applied equally to research conducted in the lab or were avoidable by making sensible decisions at the experiment design stage.

Does the absence of an experimenter lead to lower-quality data?

By its nature, online research does not offer the same control over the experiment environment as the lab does. Without an experimenter there to check, you may ask yourself whether your participants are in a reasonably quiet environment, and whether they really are doing the task by themselves. Are they listening to distracting music or podcasts while they’re supposed to be concentrating on your task? Are they checking their email? Are they having a conversation with someone else in the room at the same time? How can we be sure that online participants take the tasks seriously, and don’t just, say, press buttons at random? It is good to think about these issues before starting an online experiment. Luckily, many of these worries can be resolved, and there are various ways of addressing these questions.

Can I really trust my participants?

Are my “participants” even real people? The infiltration of Amazon’s Mechanical Turk participant database with bots recently led to a big scandal; see Max Hui Bai’s blog post about the issue.

So how can we tell real people from bots? Can we trust that online participants truly fit our participation criteria? What if people lied about, e.g., their age or other demographic details in order to be able to participate in our study? These are legitimate concerns and important questions that need to be considered by anyone wanting to collect their data online.

The truth is, you can never be 100% sure that your online participants took your task as seriously as you would have liked them to. However, similar concerns apply to your participants in the lab. Can you always be 100% sure the participants who step into your dingy testing basement are as old as they say they are, or that they fulfil your demographic inclusion criteria? Even with you watching over them, can you always be sure that they’re concentrating on the task, rather than being lost in their own thoughts?

In one of my eye-tracking studies I used a remote tracker without a headrest, so I made sure to remind participants multiple times to remain in approximately the same position in front of the computer screen and not to move their head too much during the task. One participant went on to grab her water bottle from under the table and had a drink THREE TIMES during her session, even after I had reminded her of the instructions and re-calibrated the tracker twice. After the second time, I knew her data was going to be useless, so I just waited for the task to finish and sent her home. The point is: sometimes you have to deal with shoddy data, whether that’s in the lab or online.

There are many ways to maximise the likelihood that your remotely collected data will be of good quality. Firstly, I’d recommend making good use of tools that will make your life much, much easier. Secondly, by making sensible adjustments to your tasks you can optimise your data quality and increase your chances of catching any bots in your sample.

The ultimate proof is in the quality of the online behavioural research that is being published. Many scientists are taking their research online, collecting data faster and accelerating their research. You can read about some examples here.

10 tips to optimise the quality of the data you collect online:

1. Adapt your paradigm

While many paradigms can be taken online with no changes, some – particularly those with a memory assessment – might need some tweaking. When we collect our data remotely, we cannot monitor what the participant does during the task. This doesn’t mean that data collection is impossible; it just means that we need to think carefully about how we can adapt the tasks we use.

For example, one of my experiments included a short-term memory paradigm. In the lab, I would have been able to prevent participants from writing down the information that they were supposed to hold in memory – this was of course impossible when I ran the task online. So instead, the encoding phase included a task element where participants needed to use their mouse to click on images on the screen. I could then check in my data whether people made those clicks. If they did use the mouse, I inferred that they couldn’t have taken notes with their hand during the encoding phase. This is of course quite a crude method, but it illustrates that we need to think creatively to make our tasks suitable for online use, if the task demands it.
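This kind of sanity check is easy to script once the data are in. Below is a minimal sketch in Python, assuming a hypothetical trial log with made-up column names (`participant`, `encoding_clicks`); the idea is simply to flag anyone who never clicked during encoding:

```python
import pandas as pd

# Hypothetical trial log: one row per encoding trial, with a count of
# how many times the participant clicked the on-screen images.
trials = pd.DataFrame({
    "participant": ["p1", "p1", "p2", "p2", "p3", "p3"],
    "encoding_clicks": [4, 5, 0, 0, 5, 4],
})

# A participant who never clicked during encoding either ignored the task
# element or may have had a hand free to take notes -- flag them.
clicks_per_participant = trials.groupby("participant")["encoding_clicks"].sum()
flagged = clicks_per_participant[clicks_per_participant == 0].index.tolist()
print(flagged)  # ['p2']
```

The column names and structure here are assumptions for illustration; the real export format will depend on your experiment platform.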

2. Use a good experiment platform

If you’re like me and started your PhD without any programming experience, and are, frankly, a bit scared of learning how to deal with variables, functions and loops, then you’ll save yourself a ton of time and frustration by finding a good experiment platform.

I started using Gorilla, which was super intuitive and allowed me to easily set up my experiments and tinker with them – all mostly without having to ask for help. This gave me more time for thinking about appropriate ways to adapt my tasks for online data collection.

3. Build in checks for data quality

One of the main concerns about online research is that your experiment will be completed by participants who aren’t really paying attention. However, you shouldn’t worry too much: there are a variety of ways to trip up bots and inattentive participants during your experiment, so that you can later discard their data. For example:

  • You can set a maximum or minimum amount of time for individual tasks or questionnaires if you’ve got a reasonable idea of how long people should be spending on them. With Gorilla, for example, you can check the time your participants spent reading your instruction screen and exclude data from anyone who clearly didn’t read the instructions properly. You could also check participants’ overall time on a task and discard those who spent an unreasonably long time completing a section, on the assumption that they took a tea break they weren’t supposed to. Similarly, if you’ve got a good idea of how long participants should be spending on individual trials, you can use average response times as exclusion criteria.
  • You can include specific ‘attention check’ trials within your experiment. These might be particularly easy questions, and you could exclude anyone who got them wrong.
  • You could include some auditory filler trials to ensure people wear their headphones for the duration of the task, if you want them to be in a quiet environment. Instruct them to put on their headphones at the beginning and have them respond in some way to the audio to check that they were in fact listening.
  • If you want to make sure that participants’ demographic data are legitimate, you can ask the same question twice and check for any inconsistencies in the responses.

Again, be creative and use the type of ‘bot catch’ that will work best for your particular paradigm.
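Once the data are collected, checks like these boil down to simple filters. Here is a sketch with illustrative thresholds and assumed column names – the real cut-offs should come from your pilot data, not from this example:

```python
import pandas as pd

# Hypothetical per-participant summary (column names are assumptions):
# seconds spent on the instruction screen, mean response time per trial,
# and whether the easy 'attention check' trial was answered correctly.
data = pd.DataFrame({
    "participant": ["p1", "p2", "p3", "p4"],
    "instruction_secs": [25.0, 1.2, 30.0, 18.0],
    "mean_rt_ms": [850, 900, 4000, 700],
    "attention_check_ok": [True, True, True, False],
})

# Thresholds below are illustrative only -- pilot first, then decide.
keep = (
    (data["instruction_secs"] >= 5)            # actually read the instructions
    & (data["mean_rt_ms"].between(300, 3000))  # not mashing keys or idling
    & data["attention_check_ok"]               # passed the easy catch trial
)
clean = data[keep]
print(clean["participant"].tolist())  # ['p1']
```

Ideally, you would pre-register these exclusion criteria so the filtering is decided before you ever look at the data.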

4. Make your experiment exciting

To ensure that your participants happily complete your whole experiment, I strongly recommend making your task as short and as fun as possible. Part of this is making your experiment look nice and professional, which is pretty much a given if you’re using Gorilla as your experiment platform.

I have found that participants really appreciate feedback about their performance; e.g. you could gamify your task by letting participants collect points for correct answers, or give them their overall score at the end. With Gorilla, for example, you can easily set up feedback zones that you can individualise with your own graphics: click here to have a look at an example.

5. Pilot, pilot, pilot!

Figuring all of this out relies on trial and error. You will need to find the adjustments to your task that still ensure you get the kind of data you want, and you will need to test whether your ‘attention checks’ actually work. A colleague of mine found that even legitimate participants tended to fail her check trials, for reasons that were never clear. The way to iron out those kinks is to try out your experiment first.

Invest time in experiment design and do a lot – and I mean a lot! – of piloting before you go ‘live’ with your experiment!

6. Set up your data analysis

Once you’ve piloted your study with a small sample, set up your data analysis. This will make sure that you’ve got all the metadata set up correctly for easy analysis. Excel Pivot Tables are super powerful and tremendously useful in many careers – make them your friend.
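If you would rather script your summaries than build them by hand, pandas offers an equivalent of a Pivot Table via `pivot_table`. A small sketch with made-up pilot data:

```python
import pandas as pd

# Hypothetical pilot data: one row per trial.
pilot = pd.DataFrame({
    "participant": ["p1", "p1", "p2", "p2"],
    "condition": ["A", "B", "A", "B"],
    "rt_ms": [640, 720, 580, 810],
})

# Mean response time per participant and condition -- the scripted
# equivalent of an Excel Pivot Table.
summary = pilot.pivot_table(index="participant", columns="condition",
                            values="rt_ms", aggfunc="mean")
print(summary)
```

Setting this up on pilot data means that, by the time the full dataset arrives, the analysis is a matter of re-running a script rather than rebuilding a spreadsheet.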

7. Choose a fair and reliable platform for participant recruitment

During my PhD, I linked my Gorilla experiments to the Prolific Academic platform for participant recruitment. So far, Prolific has been spared any scandals about bots in its database, and my personal experience suggests that the people who are signed up as members are genuine and generally quite eager to perform experimental tasks diligently.

Members of Prolific provide their demographic information when they first sign up to the platform, so I was able to directly target only those who were eligible for my experiment, without having to worry about them lying just to be able to take part in my study.

Prolific’s large database meant that I could collect data from over 100 people within a day.

8. Pay your participants fairly

It’s important to ensure that your participants are rewarded appropriately for their time – not just for the obvious ethical reasons, but also because you are much more likely to get good-quality data from people who are satisfied and feel like their time is valuable to you.

I have paid my online participants at the same hourly rate that my university recommends for lab-based participants.

9. Run your experiment in batches

This is a major, major tip for avoiding data loss. Rather than setting your recruitment target to the maximum immediately, I recommend recruiting your participants in batches of, say, about 20 at a time. If something turns out to be wrong with your experiment, only a single batch is affected rather than your whole sample.

It’s also sensible to do a brief data quality check after each batch (without peeking at statistics, of course!) so that you have a better overview of how many more datasets you still need to collect.
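The batch bookkeeping itself can be a few lines of code. A sketch, assuming each participant record carries a hypothetical `passed_quality_checks` flag from whatever screening you use:

```python
# Target sample size and a running tally of usable datasets across batches.
target_n = 100
usable_so_far = 0

def usable_in_batch(batch):
    """Count participants in a batch who passed the quality screening."""
    return sum(1 for p in batch if p["passed_quality_checks"])

# Hypothetical first batch of 20: 17 usable, 3 failed the checks.
batch_1 = [{"passed_quality_checks": ok} for ok in [True] * 17 + [False] * 3]
usable_so_far += usable_in_batch(batch_1)

remaining = target_n - usable_so_far
print(remaining)  # 83 more usable datasets still needed
```

Note that the check only counts pass/fail outcomes; keeping condition means and hypothesis tests out of this step is what makes it safe to run between batches.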

10. Final thoughts

I am by no means an expert in online research, but I hope that these tips will be helpful for anyone planning their first (or even their 100th) online study. For more information about all things online research, you can check out Jenni Rodd’s fantastic article, and watch videos from the BeOnline conference.

By running three of the five experiments for my thesis online, I saved a lot of time and learnt a lot about appropriate experimental design. It also meant that I was able to run more experiments than I had originally planned and investigate some interesting but tangential research questions. Finally, I could be involved in the supervision of undergraduate students, who were able to easily set up their own experiments and collect their own data within the time frame of a typical undergraduate research project.


The future of behavioural research?

The future of behavioural science may well lie online. Online research gives us the ability to reach more diverse participants, including groups that may previously have been particularly difficult to recruit, and to collect larger samples.

Asking people to do experiments in the comfort of their own home, on their own devices, gives us the opportunity to collect data that is not influenced by experimenter-participant interactions. In fact, one could even argue that this setting is more “natural” than the lab.

For online research to be successful, however, we need to be flexible and creative – our tasks will inevitably need to be adapted. It is important that we as a field find ways to adapt our standard paradigms and standardised testing batteries for use with online populations.

To ensure high data quality, we need to invest time and effort into experimental design, as well as show appreciation to our participants by paying them fairly.