Interrater Reliability Testing for Team Projects


Viewing 9 posts - 1 through 9 (of 9 total)
  • Author
    Posts
  • #2552

    We got an enquiry today on the “shoutbox”.

    Anonymous: how does a coding team determine interrater reliability?

    Thought I’d post my reply here as well:

    benmeehan: You would first get the team to code the same transcript or media file. Then, for an initial rough comparison, you could turn on the coding stripes filtered by user rather than theme (node). This process will raise a discussion amongst the team about labels (node names) and rules for inclusion (node definitions). Then, a second coding by everyone of the same transcripts should yield better results. You could then use a “coding comparison query” for more detailed scrutiny, and you could set a benchmark or threshold (say 80%, for example) which the team feel makes the coding reliable. For a more in-depth answer, contact support@qdatraining.eu . This is a free service.

    See also: http://www.qdatraining.eu/plan
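    The 80% benchmark mentioned above can be illustrated with a simple percent-agreement calculation. This is a minimal Python sketch, not NVivo's own code; the 0/1 coding units and scores below are invented:

```python
# Illustrative only: percent agreement between two coders over the same
# coding units (1 = coded at the node, 0 = not coded). NVivo's coding
# comparison query reports a comparable "percentage agreement" per source.

def percent_agreement(coder_a, coder_b):
    """Return percent agreement for two equal-length 0/1 coding sequences."""
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)

# Invented first-round results for two coders on ten units
round1_a = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
round1_b = [1, 1, 1, 0, 0, 1, 0, 1, 1, 1]

score = percent_agreement(round1_a, round1_b)
print(f"{score:.0f}% agreement,", "meets" if score >= 80 else "below", "the 80% benchmark")
```

    After the team discussion, a second round of coding would be scored the same way, and the comparison repeated until the benchmark is met.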

    #2840

     

    Hi,
     
    I’m not sure of your name from the user name you registered so forgive me.
     
    Is the issue more conceptual, in that you are not sure of the process, or is it that you understand the process but not how to do it in NVivo 8? Let me briefly set out a short answer to each, and you can come back to me for clarity or an on-line demonstration if it is purely a technical matter.
     
    Conceptual:
    The process is that you get the coders together and everyone codes the same primary source. Then you measure levels of agreement between coders. These agreements/disagreements may be very simple things like coders choosing different labels for what is essentially the same theme or category, which is easily fixed through a bit of dialogue, or they could be based on more fundamental disagreements regarding interpretation, which hopefully can be ironed out through dialogue but may require the principal investigator adjudicating on these matters. Then, the same transcript (usually it’s transcripts and not audio) is coded again and a benchmark is set for levels of agreement. Eventually, after two or three runs, the desired level of agreement is reached and coders are better armed to go off and code on their own.
     
    Application of concept in NVivo 8
    You would first get the team to code the same transcript or media file. Then, for an initial rough comparison, you could turn on the coding stripes filtered by user rather than theme (node). This process will raise a discussion amongst the team about labels (node names) and rules for inclusion (node definitions). Then, a second coding by everyone of the same transcripts should yield better results. You could then use a “coding comparison query” for more detailed scrutiny, and you could set a benchmark or threshold for agreement (say 80%, for example) which the team feel makes the coding reliable. The coding comparison query uses Cohen’s kappa coefficient http://en.wikipedia.org/wiki/Cohen’s_kappa to measure agreement.
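    For reference, Cohen’s kappa can be computed by hand for two coders. The sketch below assumes binary unit-level coding decisions (coded or not coded at one node), which is a simplification; NVivo’s coding comparison query computes kappa over coded characters per source:

```python
# Illustrative sketch of Cohen's kappa (not NVivo's internal calculation).
# Kappa corrects observed agreement for the agreement expected by chance,
# based on each coder's marginal coding rates.

def cohens_kappa(coder_a, coder_b):
    """Return Cohen's kappa for two equal-length sequences of 0/1 codes."""
    assert len(coder_a) == len(coder_b) and coder_a
    n = len(coder_a)
    observed = sum(1 for a, b in zip(coder_a, coder_b) if a == b) / n
    # Chance agreement from each coder's marginal rate of coding '1'
    p_a1 = sum(coder_a) / n
    p_b1 = sum(coder_b) / n
    expected = p_a1 * p_b1 + (1 - p_a1) * (1 - p_b1)
    if expected == 1:
        return 1.0  # degenerate case: both coders constant and identical
    return (observed - expected) / (1 - expected)

# Invented example: coders disagree on one of eight units
a = [1, 1, 0, 0, 1, 0, 1, 1]
b = [1, 0, 0, 0, 1, 0, 1, 1]
print(round(cohens_kappa(a, b), 3))  # -> 0.75
```

    A kappa of 1 means perfect agreement, 0 means no better than chance; conventions vary, but values above roughly 0.75 are often treated as good agreement.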
      
    The mechanics of doing this are more easily demonstrated than trying to set out in a forum. If you would like me to show you an example, just let me know and we can arrange this.
      
    Hope this helps!
    Kind regards,

     

    #2842
    Anonymous
    Guest

    Thank you for your reply! Right now, what the team members in the lab have planned is that we will have three different coders coding the same media/transcriptions
    into the nodes. Then, we’d like to compare the three different versions of the same thing, and calculate a percentage to see whether the reliability is low or high.
    The problem is, this is my first time using Nvivo 8, and I have no idea how, in our case, I should compare, calculate and analyze the reliability of our work.
    I searched online and found out about the Merge function in Nvivo, but I have also seen people discussing certain flaws in Merge which have led some
    experimenters to go back to the most basic comparison — printing out the codes, comparing all of them one by one, and then calculating the reliability that way.
     
    I was hoping to find a way to calculate our ICR without having to go through it the traditional way. Is Merge actually designed to calculate ICR? Or are there
    other ways that you can guide me through to make this ICR possible?
    I hope my question and concern are making sense. Thank you very much for your help.

    #2843

     

    The problem of coders’ work reappearing is a training issue, I’m afraid. NVivo merges; it does not synchronise, and sometimes people do not understand the difference. I can give you a detailed explanation, but it is probably unnecessary unless you request it.
     
    Follow these steps and the problem will not occur:
     
    I am going to assume, Rose, that you are competent in NVivo, because it sounds like it from your e-mail. However, if you need more detail on the mechanics of this, just let me know and I will provide a step-by-step guide.
     
    To conduct the test:
    1. Make a master copy of the project file (there should always be a backed-up master file in any case in a team project)
    2. Copy the master file to each coder and to yourself (you are not working on the master)
    3. Conduct the test and merge the coders’ files into yours
    4. Analyse the test results and have your team discussion before commencing the second round of coding
    5. Delete all the old project files and re-issue the master
    6. Set a benchmark for acceptable levels of agreement
    7. Repeat steps 2, 3 & 4 as often as needed to reach acceptable levels of agreement
    No uncoded content will reappear if you follow these steps. Version 9 has a networked release that allows people to simultaneously work on a live project file which would also obviate the problem you identified about uncoded data ‘reappearing’.
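    The test cycle in steps 2–7 amounts to a loop that repeats until the benchmark is reached. Here is a toy Python sketch with invented agreement scores (the real rounds are manual coding sessions in NVivo, not code):

```python
# Purely illustrative: which test round first meets the agreement benchmark.
# The scores stand in for the per-round results of a coding comparison query.

def first_passing_round(scores, benchmark=80.0):
    """Return the 1-based round at which agreement first meets the benchmark,
    or None if the benchmark is never reached."""
    for round_no, score in enumerate(scores, start=1):
        if score >= benchmark:
            return round_no
    return None

# Invented example: agreement improves across three test rounds
rounds = [62.0, 74.5, 86.0]
print(first_passing_round(rounds))  # -> 3
```

    In practice, two or three rounds of coding and discussion are usually enough to reach the agreed benchmark.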
     
    If you need clarity on any of this, just let me know.
    Kind regards,

     

    #2844
    Anonymous
    Guest

    Thank you very much for your speedy reply. I am not sure if I completely understand #3 from your previous reply, “Conduct the test and merge the coders’ files into yours”. Does this mean that there is an option in Nvivo 8 that allows me to combine work done by different individuals? If it is mergeable, will the data look messy in the end, or will it cause confusion? Also, how should this merge be done? Would you please guide me through the steps to merge the files from different coders,
    as I have not yet experienced that part of the program? Another question I have is that you mentioned “Delete all the old project files and re-issue the master”. Could you please explain this a little further? What consequences would result if I don’t re-issue the master file?
    My apologies for the series of questions; this is my first time using Nvivo and I really would like to understand it more.
    Thank you.

    #2845

     

    Yes, you can merge the coders’ work into a single file, which will not look messy. To demonstrate this, I have uploaded a short animated tutorial on the website. It was based on an inter-rater reliability test I conducted myself in Tanzania two weeks ago using six coders. Forgive the editing error just after the start, as I did not have time this evening to fix it. You will need to log in to see it, but you can find it here:
     
    This will not address all of your questions, but at least it will give you a sense of how it works. If you watch the clip and make a note of your questions, I can deal with them immediately afterwards. I suggest you go to the support page, read the simple instructions and download the requisite file. This will allow me to demonstrate the process live on your desktop while we have a conversation. It might be easier and more efficient than working through such detail by e-mail. I will leave you to decide, but the support page may be found here:
    http://www.qdatraining.eu/support or, simply reply by e-mail if you prefer. If you decide to go for an interactive demonstration, then we will need to arrange a mutually suitable time to do this.
     
    You can see a blog I wrote some time ago about team projects here:

     

    #2846

     

    Hi,
     
    If you have looked at the screencast by now, you should have some idea about the steps for the testing process itself in NVivo 8. However, the clip does not include the file management aspect, so I have set it out below as requested in your last posting:
     
    Appoint a database administrator for the project who will be responsible for database management (probably you)
    1. Set up your project and import your data; create your case nodes and link them to your demographics
    2. Save a copy (file/create copy) to a secure location and call it the master file (filename.master.nvp)
    3. Ensure the master is backed up
    4. Take copies of the master file (renaming each), keep one yourself and give the coders one each
    5. Agree the common data for each coder to code and get them to code it
    6. Make sure that each coder not only labels his/her node but creates a description for each node, or ‘rule for inclusion’. For example, label or node name = motivation & description = “contains references to being motivated or not”. This information goes into the description properties of the node itself.
    7. On your copy of the project file – go to nodes (free or tree depending what your coders coded to) and select ‘view/list view/customize current view’
    8. In the dialogue box that appears, add ‘description’ by moving it from the left pane to the right pane (click on it and use the arrow in the middle to move it to the right pane). Then, move it up beside the node name (again, click on it and use the up arrow on the right to move it beside name).
    9. You can now see the node or theme label AND the node definition side by side (you will see that was done in the clip)
    10. Import each of their project files into yours as shown in the clip
    11. Where labels are different but the definition is the same, merge these nodes (cut and merge, not cut and paste)
    12. And evaluate the coding agreement as shown in the clip
    13. Assuming you need a further test, delete all NVivo files on all machines and redistribute the master as in steps 1 & 2 (if you don’t do this, your coding from the first round may reappear, because coder A may ‘uncode’ his/her coding in his/her copy but it will still exist in the others, so when you merge (not synchronise), coder A’s coding will reappear)
    14. When you are satisfied with the levels of agreement, merge your copy into the master and your quality testing is now retained for your methodology chapter (use screenshots of the coding stripes and coding query as evidence of your inter-rater reliability testing)
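    Step 11 above, merging nodes whose labels differ but whose definitions match, can be illustrated with a small sketch. The node names and descriptions below are invented examples, and the real merge is a manual cut-and-merge operation in NVivo:

```python
# Illustrative only: group node labels by their description ("rule for
# inclusion") to spot merge candidates, i.e. different labels that two
# coders gave to what is essentially the same theme.
from collections import defaultdict

# Invented (label, description) pairs as entered in the node properties
nodes = [
    ("motivation", "contains references to being motivated or not"),
    ("drive", "contains references to being motivated or not"),
    ("barriers", "references to obstacles that prevent participation"),
]

by_definition = defaultdict(list)
for label, description in nodes:
    by_definition[description.strip().lower()].append(label)

for description, labels in by_definition.items():
    if len(labels) > 1:
        print("merge candidates:", labels)  # -> merge candidates: ['motivation', 'drive']
```

    Displaying name and description side by side (steps 7–9) is what makes these candidates easy to spot in the list view.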
    I hope between this and the on-line demonstration, I have addressed most of the questions in your last posting but do let me know if you need further help.
     
    Kind regards, 

     

    #2853

    See above, or create an account (it's free) and we can give you direct support.

    Kind regards,

    #2872
    Jo43426988
    Member

    Thank you for this valuable explanation. I have another, related question.
    I am trying to run a matrix query which picks up the number of instances each node (rows) was cross-coded to Case A or Case B (columns). I have then limited the query to:
    – a selected source (a transcript)
    – a selected user (Sara)

    I would have thought that the result would accurately display the number of times Sara cross-coded the transcript at each node to each case (the results are set to coding references). However, that isn’t the case: some of the figures are correct and others aren’t. A careful check of the coding shows that there weren’t any problems with the coding itself.

    Do you have any suggestions?

    Many thanks
