Embracing Online Research During Your PhD

The lab experience

Traditionally, most behavioural research is done in laboratory settings. As a PhD student or a research assistant, you spend time designing flyers to hang up in your department, or using your university’s participant pool to recruit people to take part in your experiment. You then painstakingly schedule individual experiment sessions for each of your participants and invite them to your university building to do some tasks on a computer for an hour or so, quite possibly in some tiny cubicle in a dingy basement.

You also have to spend that same time in that same dingy basement, usually doing nothing more productive than making small talk about the weather, typing in your participants’ IDs, and perhaps starting up a new experiment programme every now and then. And that’s if you’re lucky and your participant actually shows up.

I’ve done my fair share of lab-based data collection and I have to say, it really is exhausting. It’s a massive time commitment for the researcher; you can easily spend months on end going through the recruitment-scheduling-testing cycle.

Apart from the researcher’s time, lab-based data collection is also resource-intensive for your university. They need to provide enough rooms and equipment for all the active researchers to be able to run their experiments, and – as an experimenter – you often find yourself competing for those spaces with your colleagues. This competition can get particularly tiresome during prime ‘testing season’, when, sadly, it is often the senior researchers and prestigious labs that win.

Lab-based research has another big problem: its participants. What population do you think is the easiest to recruit for experiments that are run in the dingy psychology department basement? That’s right, people who are already physically in the department. It’s no surprise that much of the classic literature in psychology is based on samples of university students. They are there, they are willing, you can bribe them with course credits – it’s just so convenient!

The problem with this approach is clearly that the typical undergraduate student tends to be quite different to the population that we as psychologists usually want to make inferences about. For starters, students might be particularly homogeneous when it comes to socioeconomic status, and often also race and gender. It’s commendable when researchers try to widen their participant pool to community samples, e.g. by putting up their flyers somewhere other than the student union. However, when we recruit people for lab-based research we automatically restrict our sample to those who are both physically able and willing to come to the university.

With the replication crisis looming over all our heads, the field has quite rightly recognised that we need to run our experiments on bigger and more diverse samples. Sadly, this is often not realistic with lab-based research – especially if you’re the sole experimenter on your project. There are only so many hours in the day for testing!

So, what can be done about this?

Online research

The last few years have seen a change in where research and data collection happen. Internet browsers and people’s own electronic devices are now powerful enough to display complex stimuli and measure response times remotely with a great degree of accuracy. These technological advances have allowed researchers in the behavioural sciences to shift from lab-based to web-based research.

Recruitment platforms like MTurk and Prolific Academic allow researchers to advertise their experiments online and attract participants far away from the university campus, who can then do the tasks on their own devices and in their own time.

It really seems like a no-brainer:

  • Running experiments online saves the researcher valuable time: there is no need to schedule individual sessions, you can easily test multiple people at the same time, and you can spend the data-collection period in more productive ways.
  • Online research saves university resources, as there is no need to provide as many rooms and computers specifically for data collection purposes. More room for coffee machines in psychology departments!
  • Recruiting participants online allows us to diversify our participant pool and collect larger sample sizes to increase the power of our analyses.

The challenges of online research

It is worth noting that nothing is perfect. Like lab-based research, online research has its challenges. When I first started thinking about running my own experiments online, I admit that I had doubts. However, I have since found that many of my initial concerns either applied equally to research conducted in the lab or were avoidable by making sensible decisions at the experiment design stage.

Does the absence of an experimenter lead to lower-quality data?

By its nature, online research does not offer the same control over the experiment environment as the lab does. Without an experimenter there to check, you may ask yourself whether your participants are in a reasonably quiet environment, and whether they really are doing the task by themselves. Are they listening to distracting music or podcasts while they’re supposed to be concentrating on your task? Are they checking their email? Are they having a conversation with someone else in the room at the same time? How can we be sure that online participants take the tasks seriously, and don’t just, say, press buttons at random? It is good to think about these issues before starting an online experiment. Luckily, there are various ways of addressing these questions.

Can I really trust my participants?

Are my “participants” real people? The infiltration of Amazon’s Mechanical Turk participant database with bots recently caused a big scandal; see Max Hui Bai’s blog post about the issue.

So how can we tell real people from bots? Can we trust that online participants truly fit our participation criteria? What if people lied about, say, their age or other demographic details in order to be able to participate in our study? These are legitimate concerns, and they are important questions that anyone wanting to collect data online needs to consider.

The truth is, you can never be 100% sure that your online participants took your task as seriously as you would have liked them to. However, similar concerns apply to your participants in the lab. Can you always be 100% sure that the participants who step into your dingy testing basement are as old as they say they are, or that they fulfil your demographic inclusion criteria? Even with you watching over them, can you always be sure that they’re concentrating on the task, rather than being lost in their own thoughts?

In one of my eye-tracking studies I used a remote tracker without a headrest, so I made sure to remind participants multiple times to remain in approximately the same position in front of the computer screen and not to move their head too much during the task. One participant went on to grab her water bottle from under the table and had a drink THREE TIMES during her session, even after I had reminded her of the instructions and re-calibrated the tracker twice. After the second time, I knew her data was going to be useless, so I just waited for the task to finish and sent her home. The point is: sometimes you have to deal with shoddy data, whether that’s in the lab or online.

There are many ways to maximise the likelihood that your remotely collected data will be of good quality. Firstly, I’d recommend making good use of tools that will make your life much, much easier. Secondly, by making sensible adjustments to your tasks you can optimise your data quality and increase your chances of catching any bots in your sample.

The ultimate proof is in the quality of the online behavioural research that is being published. Many scientists are taking their research online, collecting data faster and accelerating their research. You can read about some examples here.

10 tips to optimise the quality of the data you collect online:

1. Adapt your paradigm

While many paradigms can be taken online with no changes, some – particularly those with a memory assessment – might need some tweaking. When we collect our data remotely, we cannot monitor what the participant does during the task. This doesn’t mean that data collection is impossible; it just means that we need to think carefully about how we can adapt the tasks we use.

For example, one of my experiments included a short-term memory paradigm. In the lab, I would have been able to prevent participants from writing down the information that they were supposed to hold in memory – this was of course impossible when I ran the task online. So instead, the encoding phase included a task element where participants needed to use their mouse to click on images on the screen. I could then check in my data whether people used the mouse to click; if they did, I inferred that they couldn’t have taken notes by hand during the encoding phase. This is of course quite a crude method, but it illustrates that you need to think creatively to make your tasks suitable for online use, if your paradigm demands it.
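As a rough illustration, here is a minimal sketch of that kind of check in Python. The file name and the column names (‘participant_id’, ‘phase’, ‘mouse_clicks’) are hypothetical stand-ins for whatever your experiment platform actually exports:

```python
import pandas as pd

# Load the (hypothetical) trial-level export from your experiment platform
trials = pd.read_csv("experiment_data.csv")

# Keep only the encoding-phase trials, where clicking was required
encoding = trials[trials["phase"] == "encoding"]

# Flag participants with any click-free encoding trial: a trial without
# clicks suggests the hand may have been free to take notes
min_clicks = encoding.groupby("participant_id")["mouse_clicks"].min()
suspects = min_clicks[min_clicks < 1].index.tolist()

print(f"Participants flagged for missing clicks: {suspects}")
```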

2. Use a good experiment platform

If you’re like me and started your PhD without any programming experience, and are, frankly, a bit scared of learning how to deal with variables, functions and loops, then you’ll save yourself a ton of time and frustration by finding a good experiment platform.

I started using Gorilla, which was super intuitive and allowed me to easily set up my experiments and tinker with them – all mostly without having to ask for help. This gave me more time for thinking about appropriate ways to adapt my tasks for online data collection.

3. Build in checks for data quality

One of the main concerns about online research is that your study will be completed by participants who aren’t really paying attention. However, you shouldn’t worry too much: there are a variety of ways to trip up bots and inattentive participants during your experiment, so that you can later discard their data. For example:

  • You can set a maximum or minimum amount of time for individual tasks or questionnaires if you’ve got a reasonable idea of how long people should be spending on them. With Gorilla, you can check the time your participants spent reading your instruction screen, for example; this way, you can exclude data from participants who clearly didn’t read the instructions properly. Or you could check your participants’ overall time spent on a task, and discard those who spent an unreasonably long time completing a section, under the assumption that maybe they took a tea break that they weren’t supposed to. Similarly, if you’ve got a good idea of how long participants should be spending on individual trials, you can use average response times as exclusion criteria (see the sketch after this list).
  • You can also include specific ‘attention check’ trials within your experiment. These might be particularly easy questions, and you could use these trials to exclude anyone who got them wrong.
  • You could also include some auditory filler trials to ensure people wear their headphones for the duration of the task if you want them to be in a quiet environment. Instruct them to put on their headphones at the beginning and have them respond in some way to the audio to check that they were in fact listening.
  • If you want to make sure that participants’ demographic data are legitimate, you can ask the same question twice and check for any inconsistencies in responses.

Again, be creative and use the type of ‘bot catch’ that will work best for your particular paradigm.
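To make the timing-based checks concrete, here is a minimal sketch in Python. It assumes a hypothetical per-participant summary file with ‘participant_id’, ‘instruction_time_s’, ‘total_time_s’ and ‘mean_rt_ms’ columns, and the thresholds are placeholders you would calibrate against your own pilot data:

```python
import pandas as pd

# Load the (hypothetical) per-participant summary
summary = pd.read_csv("participant_summary.csv")

# Placeholder thresholds; calibrate these against your pilot data
MIN_INSTRUCTION_S = 10   # under this, the instructions were likely skimmed
MAX_TOTAL_S = 45 * 60    # over this, perhaps an unscheduled tea break
MAX_MEAN_RT_MS = 5000    # implausibly slow average responses

too_fast = summary["instruction_time_s"] < MIN_INSTRUCTION_S
too_long = summary["total_time_s"] > MAX_TOTAL_S
too_slow = summary["mean_rt_ms"] > MAX_MEAN_RT_MS

keep = ~(too_fast | too_long | too_slow)
print(f"Excluding {(~keep).sum()} of {len(summary)} participants")
clean = summary[keep]
```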

4. Make your experiment exciting

To ensure that your participants happily complete your whole experiment, I strongly recommend making your task as short as possible, and as fun as possible. Part of this is making your experiment look nice and professional, which is pretty much a given if you’re using Gorilla as your experiment platform.

I have found that participants really appreciate feedback about their performance, e.g. you could gamify your task by letting participants collect points for correct answers or give them their overall score at the end. With Gorilla, for example, you can easily set up feedback zones that you can individualise with your own graphics: click here to have a look at an example.

5. Pilot, pilot, pilot!

Figuring all of this out relies on trial and error. You will need to find the appropriate adjustments to your task that still ensure you get the kind of data that you want, and you will need to test whether your ‘attention checks’ actually work. A colleague of mine found that even legitimate participants tended to fail her check trials, for no obvious reason. The way to iron out those kinks is to try out your experiment first.

Invest time in experiment design and do a lot – and I mean a lot! – of piloting before you go ‘live’ with your experiment!

6. Data analysis

Once you’ve piloted your study with a small sample, set up your data analysis. This will make sure that you’ve got all the metadata set up correctly for easy analysis. Excel Pivot Tables are super powerful and tremendously useful in many careers – make them your friend.
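If you’d rather script this step, pandas offers an equivalent of a pivot table. Below is a minimal sketch under the same assumptions as before: a hypothetical trial-level file with ‘participant_id’, ‘condition’ and ‘rt_ms’ columns:

```python
import pandas as pd

trials = pd.read_csv("experiment_data.csv")

# Mean response time per participant and condition: the pandas
# equivalent of an Excel Pivot Table with participants as rows
# and conditions as columns
pivot = trials.pivot_table(
    index="participant_id",
    columns="condition",
    values="rt_ms",
    aggfunc="mean",
)
print(pivot.head())
```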

7. Choose a fair and reliable platform for participant recruitment

During my PhD, I linked my Gorilla experiments to the Prolific Academic platform for participant recruitment. So far, Prolific has been spared any scandals about bots in its database, and my personal experience suggests that the people who are signed up as members are genuine and generally quite eager to perform experimental tasks diligently.

Members of Prolific provide their demographic information when they first sign up to the platform, so I was able to directly target only those who were eligible for my experiment, without having to worry about them lying just to be able to take part in my study.

Prolific’s large database meant that I could collect data from over 100 people within a day.

8. Pay your participants fairly

It’s important to ensure that your participants are rewarded appropriately for their time – not just for the obvious ethical reasons, but also because you are much more likely to get good-quality data from people who are satisfied and feel like their time is valuable to you.

I have paid my online participants at the same hourly rate that my university recommends for lab-based participants.

9. Run your experiment in batches

This is a major, major tip for avoiding data loss. Rather than setting your recruitment target to the maximum immediately, I recommend recruiting your participants in batches of, say, about 20 at a time.

It’s also sensible to do a brief data quality check once you’ve run each batch (without peeking at the statistics, of course!) so that you have a better overview of how many more datasets you still need to collect.
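Here is a minimal sketch of such a between-batch tally, assuming the same hypothetical summary file as above plus ‘batch’ and boolean ‘passed_quality_checks’ columns; it only counts usable datasets, so there is no peeking at the statistics:

```python
import pandas as pd

summary = pd.read_csv("participant_summary.csv")

TARGET_N = 100  # hypothetical final sample size

# Count usable datasets per batch, then work out what is still missing
usable = summary[summary["passed_quality_checks"]]
print(usable.groupby("batch").size())
print(f"Datasets still needed: {max(TARGET_N - len(usable), 0)}")
```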

10. Final thoughts

I am by no means an expert in online research, but I hope that these tips will be helpful for anyone planning their first (or even their 100th) online study. For more information about all things online research, you can check out Jenni Rodd’s fantastic article, and watch videos from the BeOnline conference.

By running three of the five experiments for my thesis online, I saved a lot of time and learnt a lot about appropriate experimental design. It meant that I was able to run more experiments than I had originally planned and investigate some interesting but tangential research questions. It also meant that I could be involved in the supervision of undergraduate students, who were able to easily set up their own experiments and collect their own data within the time frame of a typical undergraduate research project.

The future of behavioural research?

The future of behavioural science may well lie online. Online research gives us the ability to reach more diverse participants, including groups that may previously have been particularly difficult to recruit, and to collect larger samples.

Asking people to do experiments in the comfort of their own home, on their own devices, gives us the opportunity to collect data that is not influenced by experimenter-participant interactions. In fact, one could even argue that this setting is more “natural” than the lab.

For online research to be successful, however, we need to be flexible and creative – our tasks will inevitably need to be adapted. It is important that we as a field find ways to adapt our standard paradigms and standardised testing batteries for use with online populations.

To ensure high data quality, we need to invest time and effort into experimental design, as well as show appreciation to our participants by paying them fairly.