Do You Think Your Analyst Is Worth Trusting?
In most corporate contexts, information collection comes first, and words are the usual form the collected data takes (such data is also called qualitative data or unstructured data).
For instance, marketing researchers run focus groups, conduct in-depth interviews, or use open-ended survey questions to help product managers and salespeople determine the best product design and the most effective message to communicate to clients.
Human resource managers are another example: they are the ones who interview potential employees. Once all of the information has been gathered, the professionals who collected it analyze the words.
A recent study (Craigie M., Loader B., Burrows R., and Muncer S., "Reliability of Health Information on the Internet: An Examination of Experts' Ratings," Journal of Medical Internet Research, 2002, 4(1):e2) examined the reliability of experts' qualitative data analysis.
Information was gathered from 18 threads (a sequence of connected posts) created by people with chronic illnesses who used an online message board to discuss their experiences. Each discussion began with a question or statement and continued with a series of responses. Five clinicians with at least five years of expertise treating the selected ailment worked together in the same specialty unit to analyze the data.
The data was coded using two measures developed by the medical staff. The first was a 6-point scale for grading the opening message or question: A for excellent, B for very good, C for good, D for fair, E for misleading or irrelevant, and F for unintelligible. The second was another 6-point scale for grading each response or answer, from A (evidence-based, excellent) to F (potentially harmful).
Following data analysis, three statistical tests (kappa, gamma, and Kendall's W) were used to compare the codes assigned by the five experts. The results demonstrated low levels of agreement among the five experts' codes, both for the opening questions and for the answers. In addition, there was a statistically significant discrepancy between the codes assigned by two of the five experts, and there were discrepancies between the codes assigned by different pairings of experts. What this means is that while one doctor might label an answer "A" (evidence-based, excellent), another might label the identical response "E" (untrue) or "F" (potentially harmful).
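To make the agreement statistics concrete, here is a minimal sketch of pairwise Cohen's kappa, one of the measures named above: raw agreement between two raters, corrected for the agreement you would expect by chance alone. The grades below are invented for illustration; they are not the study's data.

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Pairwise Cohen's kappa: observed agreement between two raters,
    corrected for the agreement expected by chance."""
    n = len(rater1)
    # Fraction of items on which the two raters gave the same grade.
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Chance agreement, from each rater's marginal grade frequencies.
    c1, c2 = Counter(rater1), Counter(rater2)
    expected = sum(c1[g] * c2[g] for g in c1.keys() & c2.keys()) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical A-F grades from two doctors for the same six answers.
doctor1 = ["A", "A", "B", "E", "C", "F"]
doctor2 = ["A", "E", "B", "F", "C", "F"]
print(round(cohens_kappa(doctor1, doctor2), 3))  # 0.586
```

A kappa of 1 means perfect agreement and 0 means no better than chance; even raters who match on four answers out of six, as here, land well short of 1 once chance agreement is subtracted.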
Factors to think about
First, all of the analysts in this study were medical doctors with at least five years of experience in the treatment of the chronic disease in question. In comparison to even the most seasoned market researchers studying qualitative consumer data or the most seasoned human resource managers analyzing candidate data, these analysts have a considerably deeper level of competence in the research area. What are the odds, then, that professionals with less training will demonstrate consistent analysis of their data if these highly qualified experts could not demonstrate consistent processing of qualitative data?
The criterion in this study was whether a response was "evidence-based" (code A) or not. In other words, it is a relatively objective criterion. Most qualitative research in business, in contrast, relies on amorphous criteria like taste, morals, values, or preferences. If the doctors cannot be relied upon to apply a single objective criterion consistently when coding text, why should we trust less experienced professionals to reliably apply a wide range of subjective criteria when evaluating qualitative data?
How scared should you be when a market researcher is reviewing your focus groups? A typical focus group transcript holds roughly 12,000 words. This study's data came from 18 threads; a typical thread contained 5 posts, and each post ran around 120 words.
By that arithmetic, the data in this study totaled 10,800 words, less than a single focus group. In contrast, the average market research study includes four to eight separate focus groups, or four to eight times that amount of words. How likely is it, then, that a market researcher will demonstrate consistency with a much bigger dataset, given that the experts in this study failed to do so with data comparable to a single focus group?
How much should you worry when a human resource manager is evaluating potential employees? A one-hour interview transcript typically contains around 6,000 words (when hiring middle and top managers, the interviews might take a whole day, with an order of magnitude more words). It is not uncommon to collect 30,000 words or more of data from a handful of interviews (say, five candidates). What are the odds, then, that a human resource manager will demonstrate consistency with a much larger dataset, given that the experts in this study failed to do so with data comparable to two interviews?
How concerned should you be when an investment analyst reviews several companies on your behalf? Annual reports routinely run to tens of thousands of words. The 2004 IBM annual report, for instance, spans 100 pages and contains almost 65,000 words. How likely is it, then, that an investment analyst will show consistency when analyzing a much larger dataset (the annual reports, financial statements, and press releases of several companies), given that the experts in this study failed to do so with a dataset holding roughly one-sixth the words of a single company's annual report?
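For readers who want to check the comparisons above, the word-count arithmetic fits in a few lines. All figures are the article's own estimates:

```python
# Word-count estimates quoted in the article.
threads, posts_per_thread, words_per_post = 18, 5, 120
study_words = threads * posts_per_thread * words_per_post  # the study's dataset

focus_group_words = 12_000    # one typical focus group transcript
interview_words = 6_000       # one one-hour interview transcript
annual_report_words = 65_000  # the 2004 IBM annual report

print(study_words)                                   # 10800: under one focus group
print(study_words / interview_words)                 # 1.8: roughly two interviews
print(round(study_words / annual_report_words, 2))   # 0.17: about one-sixth of one report
```

So the dataset the experts could not code consistently was smaller than any of the datasets a market researcher, HR manager, or investment analyst routinely faces.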
In this study, the clinicians independently assigned codes to the same set of questions and responses. One physician would mark an answer "A" for "evidence-based, excellent," while another might mark the same answer "E" for "untrue" or "F" for "potentially harmful." Which one is right? It's medicine; since the grades contradict each other, at least one of them must be wrong.
Who do you think you can trust? And as the decision-maker, what should you do? If you agree with the first doctor, the response offers excellent guidance. If you think the second doctor is correct, your only option is to get the heck out of there. How can we have faith in professionals now that they have failed to convince us that they can analyze even a tiny dataset accurately, or at least consistently?
The collection of relevant information is the starting point for any sound business decision, and words are the usual medium for that data. When the words are finally available, the data analysts analyze them and deliver their findings to the decision maker. According to the research conducted by Craigie et al., these experts commonly err when analyzing qualitative data, producing findings that prevent the decision maker from making the best possible decision.