• Do people’s responses to constructs reflect their levels of honesty and/or integrity?
  • How might a person’s desire to be seen in a certain way affect the validity of interpretations of their responses?

Factors of response biases:

  • Some response biases are affected by factors such as:
    o The testing content: the actual topic (personal or impersonal)
    o The testing context: learning about yourself or applying for a job?
    o The testing format: open-ended or forced-choice?
    o Conscious efforts to distort ‘reality’
    o Unconscious efforts to distort ‘reality’

Types of response biases:

  • Acquiescence bias
  • Extreme and moderate responding
  • Social desirability
  • Malingering
  • Random responding
  • Guessing

Acquiescence bias:

  • Has been a known problem for a long time (80+ years)
  • Occurs when an individual agrees with statements without regard for the meaning of those statements
  • Acquiescence is a problem for tests such as personality inventories, attitude questionnaires, interest inventories, clinical inventories and marketing surveys
    o That is, self-report inventories
  • E.g. an inventory may have items such as ‘I enjoy my job’ and ‘I dislike my job’
  • In its extreme form, people who engage in acquiescence will respond ‘strongly agree’ to both of these items, even though they are polar opposites of the same construct
  • Recall that acquiescence implies the person is responding independently of the content
    o These are people who will agree or say ‘yes’ to just about anything
  • Factors that increase acquiescence:
    o Ambiguous items: people can’t be bothered spending time to figure out the item
    o Long items: people can’t be bothered reading the whole question
    o Large number of items: people get bored or tired along the way and start to acquiesce in order to finish

Spurious correlations:

  • Suppose a researcher was interested in estimating the association between job satisfaction and perceived prestige of the job
    o Creates two questionnaires (one for job satisfaction and one for perceived prestige)
    o Two composite scores are formed based on the results of each questionnaire on a 7-point Likert scale
  • It is perfectly conceivable that two people might obtain the same composite score, but not really have the same level of job satisfaction
  • Instead, one person really does enjoy their job, while the other person is indifferent about their job but has engaged in acquiescence
  • The same can be said about job prestige: we would not be able to distinguish between people who genuinely believe their job is prestigious and people who are acquiescing
  • Impossible to know how much of a correlation is due to acquiescence rather than truly shared construct variance
  • The same can occur (although rarely) for nay-sayers, those who disagree with just about everything

Item phrasing:

  • The item phrasing is important: across all items, agreeing with or endorsing the item implies that you have more of the construct
    o In psychometrics this is known as positively keyed
  • The fact that all items (using this example) are positively keyed renders the questionnaires susceptible to acquiescent responding
  • There are three aspects to acquiescence:
    o Lulling people into engaging in it
    o Personality types that are acquiescent
    o Not being able to gauge the level of acquiescence occurring
  • When all items are positively keyed, people get into a habit of responding in a certain way (and stop reading the items)

Method of matched pairs:

  • Developed to measure acquiescence
  • Consists of embedding several pairs of items which are polar opposites
    o ‘Prescription drugs do more harm than good’, ‘Prescription drugs are mostly helpful’
  • If you have enough matched pairs (say 12), you could calculate an acquiescence index
  • Each time a person responds with agree or strongly agree to both items of a pair, they would get a score of 1
    o People would score between 0 and 12
  • Then decide to omit people from the study if they scored higher than a given number (e.g. 2)
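The scoring rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: responses are coded 1 (strongly disagree) to 5 (strongly agree), a pair counts towards the index only when both of its opposite items are agreed with, and the names `acquiescence_index` and `flag_acquiescent` and the example data are hypothetical.

```python
from typing import Sequence, Tuple

# Assumed coding: 1 = strongly disagree ... 5 = strongly agree,
# so a response of 4 or 5 counts as "agree or strongly agree".
AGREE_CUTOFF = 4


def acquiescence_index(pairs: Sequence[Tuple[int, int]],
                       agree_cutoff: int = AGREE_CUTOFF) -> int:
    """Count the matched pairs in which BOTH opposite items were agreed with."""
    return sum(1 for first, second in pairs
               if first >= agree_cutoff and second >= agree_cutoff)


def flag_acquiescent(pairs: Sequence[Tuple[int, int]],
                     max_index: int = 2) -> bool:
    """True if the respondent's index exceeds the chosen exclusion threshold."""
    return acquiescence_index(pairs) > max_index


# Hypothetical respondents, six matched pairs each (item, opposite item).
consistent = [(5, 1), (4, 2), (2, 4), (1, 5), (4, 1), (5, 2)]
yea_sayer = [(5, 5), (4, 5), (5, 4), (4, 4), (3, 5), (5, 1)]

print(acquiescence_index(consistent), flag_acquiescent(consistent))
print(acquiescence_index(yea_sayer), flag_acquiescent(yea_sayer))
```

The threshold is a judgment call by the researcher; the default of 2 simply mirrors the example number in the notes.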

Extreme and moderate responding:

  • The problem of extreme and moderate response biases refers to differences in the tendency to use or avoid extreme response options
  • Some people are simply more likely to endorse the extremes of a response scale
    o Even though they do not possess the attribute to that extreme degree
  • Conversely, some people will choose a response somewhere in the middle to avoid making a strong claim
    o This is known as moderate responding
  • Extreme bias can generate artificial differences among respondents’ test scores
    o Alternatively it can mask true differences
  • E.g. two people have the same true level of anxiety
    o Because person 1 is an extreme responder, they had an observed total score from their questionnaire of 16, compared to person 2 with an observed total score of 12
    o This effect in the data confounds the interpretation of any statistical analyses; the correlation between true and observed scores is distorted

Acquiescence vs. extreme responding:

  • People engaging in acquiescence will tend to affirm opposite qualities
  • By contrast, people who engage in extreme responding will not affirm opposite qualities
  • Instead they will use both extremes of the scale (strongly disagree and strongly agree)

Extreme responding- no solutions:

  • Research in the area tends to be based on an examination of group differences
  • However, there is no valid way to measure extreme responding, because people may in fact hold extreme views

Social desirability (SDR):

  • Tendency for people to respond in a way that seems socially appealing, regardless of their true characteristics
  • If you wanted the job, you would likely respond in a way that appeals to the employer (scores are compromised with respect to valid interpretations)
  • Can diminish the validity of the measurement process, but not so much its reliability (everyone may react in a socially desirable way)
  • Sources of SDR:
    o Test content
    o Test context
    o Personality of respondent

Test content:

  • Some items are more susceptible to SDR, because they are generally viewed as appealing characteristics
  • E.g. being dependable is probably viewed as desirable by everyone
  • Items with high validity are more susceptible to SDR
    o E.g. measuring impulse control (it should be obvious to everyone in health how to respond to this item)

Test context:

  • Some contexts lend themselves to greater or lesser amounts of socially desirable responding
  • If the testing context is anonymous, people may respond more honestly
    o E.g. drug surveys
  • If the testing context has important implications, people may respond in a favourable way
  • Over-claiming: ‘catching’ people who claim to have undertaken actions that cannot exist

Personality:

  • Some people have a personality type that disposes them to engaging in socially desirable responding
  • For example, individual differences in the ‘need for autonomy’ are known to correlate negatively with socially desirable responding
    o Some people have a personality type that leads them to be honest (possibly because they are not affected by social disapproval)

SDR and spurious correlations:

  • The phenomenon of SDR can have a similar effect on correlations between scores as was demonstrated for the case of acquiescence

  • People who tend to engage in SDR will report that they have higher levels of positive affect (PA) than they actually do
  • Similarly, they will also rate the quality of their relationships (RQ) as better than it in fact is
  • This ‘causes’ the correlation between the two scores (PA and RQ) to be higher than it actually is
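A small simulation can make the inflation of the PA and RQ correlation concrete. It is a sketch under strong assumptions that are not in the notes: true positive affect and true relationship quality are generated as independent standard normal scores, each person has a single SDR tendency that is added equally to both self-reports, and all names (`pearson`, `simulate`) are hypothetical.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=5000, seed=0):
    """Return (true correlation, observed correlation) for simulated PA and RQ."""
    rng = random.Random(seed)
    # Assumption: the two true attributes are unrelated by construction.
    true_pa = [rng.gauss(0, 1) for _ in range(n)]
    true_rq = [rng.gauss(0, 1) for _ in range(n)]
    # Assumption: one SDR tendency per person, inflating both self-reports.
    sdr = [rng.gauss(0, 1) for _ in range(n)]
    observed_pa = [t + s for t, s in zip(true_pa, sdr)]
    observed_rq = [t + s for t, s in zip(true_rq, sdr)]
    return pearson(true_pa, true_rq), pearson(observed_pa, observed_rq)


true_r, observed_r = simulate()
print(f"true r = {true_r:.2f}, observed r = {observed_r:.2f}")
```

In this setup the true correlation is close to zero, while the observed correlation is clearly positive because both observed scores share the same SDR component. The same logic applies to acquiescence on positively keyed scales.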

The nature of SDR:

  • The Marlowe-Crowne scale was the most popular method to measure SDR
  • The M-C scale was considered limited because it conceptualised SDR as a unitary process
  • Research now indicates that there are at least two moderately correlated dimensions
    o Impression management
    o Self-deceptive enhancement

Impression management:

  • This process is considered more conscious
    o Test takers intentionally attempt to appear socially desirable
  • It’s relevant to whether people will admit that they sometimes engage in peccadilloes (‘small sins’)
  • Typically the items used to measure impression management tend to be verifiable and observable
    o ‘I never swear’
    o Other people may verify whether this is true

Self-deceptive enhancement (SDE):

  • This process is considered less conscious
  • People who engage in self-deceptive enhancement tend to actually believe their own ‘exaggerations’
  • SDE is correlated positively with narcissism
  • The concept is related to people having inflated impressions of their own abilities
  • Typically the items used to measure self-deceptive enhancement tend to be difficult for others to verify
    o ‘My first impressions of people usually turn out to be right’

IM and SDE:

  • It is argued that impression management (IM) is more of a state-like attribute, which can fluctuate across time and circumstances
  • By contrast, self-deceptive enhancement (SDE) is more trait-like: it is consistent across situations and contexts

Malingering:

  • Affects psychologists when respondents attempt to exaggerate their psychological problems
  • Also known as ‘faking bad’
  • It is prevalent in situations where the respondent perceives some sort of benefit for appearing more injured or distressed than they really are
  • E.g. disability evaluations, workers’ compensation claims, criminal competence hearings
  • If people purposely perform poorly on a test, the validity of the interpretation of the scores will be compromised
  • Estimated to occur in as many as 27% of general psychological evaluations and 45% of forensic investigations

Careless or random responding:

  • Careless responding is a threat to the valid interpretation of scores
  • People will simply respond the same way for all questions irrespective of content
  • It’s not necessarily acquiescence: they don’t even read the questions and respond semi-randomly

Guessing:

  • Occurs more often in multiple-choice type tests, where there is only a specified number of alternatives from which to choose
  • Someone gets to a question, doesn’t know the answer and simply guesses; if they guess correctly, their score may be higher than it should be
  • Lowers internal consistency
    o Someone with low ability may guess a hard item correctly, reducing the correlation between items
  • Difficult to estimate how large this effect is, although it is present

Coping with response biases:

  • Three general strategies:
    o Manage the testing context
    o Manage the test content and/or scoring
    o Use specially designed ‘bias’ tests
  • Additionally, there are at least three goals that these strategies are intended to accomplish:
    o Minimise the existence of response biases
    o Minimise the effects of response biases
    o Detect biased responses and intervene in some way

Managing the test context:

  • Best way to cope with response biases is to prevent them occurring in the first place
  • This can be done by trying to reduce situational factors that may elicit the response bias
  • E.g. people are less likely to engage in SDR if they are convinced they are responding anonymously
  • It is possible, however, that anonymous testing may also increase random responding
  • Create a testing situation that minimises respondent fatigue, stress, distraction or frustration
    o When people are tired, they often increase biased responding
    o Testing should be limited to a maximum of one hour
  • Bogus pipeline technique: inform participants that faking good or lying can be detected by the test being administered
    o Many respondents believe this

Managing test content:

  • Write clear, concise and unambiguous items
  • Write items that are endorsable (e.g. include words like ‘sometimes’ or ‘often’)
  • Forced-choice formats:
    • Require the respondent to choose one alternative from among one or more other equally attractive alternatives
    • E.g. two dimensions are pitted against each other: the alternative a person picks as most attractive indicates which dimension they see in themselves
    • Problems:
      • It is difficult (if not impossible) to calculate internal consistency reliability
      • It is not valid to compare scores between people (can only interpret personal scores)
      • It is actually difficult to meet the main criteria of test construction (to pit items of equal social desirability against each other and to pit items that are uncorrelated with each other)

Manage test content or scoring:

  • In reality, it is very difficult to eliminate the existence of response bias
  • Instead, test developers typically try to minimise the effects of response bias
  • A very common method used to help minimise the effects of acquiescence is to create ‘balanced scales’
  • These consist of an equal number of positively and negatively keyed items

Balanced inventories- practicalities:

  • Before you can do any analyses, you must reverse score each of the negatively keyed items
    o You want high scores to be indicative of the attribute of interest
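Reverse scoring can be sketched as follows. The sketch assumes a 7-point Likert scale coded 1 to 7, a hypothetical four-item balanced job satisfaction scale, and hypothetical helper names (`reverse_score`, `composite_score`); none of these specifics come from the notes.

```python
def reverse_score(response, scale_min=1, scale_max=7):
    """Flip a response so that high values indicate more of the attribute."""
    return scale_min + scale_max - response


def composite_score(responses, negatively_keyed, scale_min=1, scale_max=7):
    """Sum the item scores after reverse scoring the negatively keyed items.

    negatively_keyed is a set of 0-based item positions.
    """
    total = 0
    for position, response in enumerate(responses):
        if position in negatively_keyed:
            response = reverse_score(response, scale_min, scale_max)
        total += response
    return total


# Hypothetical balanced scale: items at positions 1 and 3 are negatively keyed
# (e.g. 'I dislike my job').
NEGATIVE_ITEMS = {1, 3}

satisfied = [7, 1, 6, 2]      # agrees with positive items, disagrees with negative ones
acquiescent = [7, 7, 7, 7]    # strongly agrees with everything

print(composite_score(satisfied, NEGATIVE_ITEMS))    # 26
print(composite_score(acquiescent, NEGATIVE_ITEMS))  # 16
```

On a balanced scale, the respondent who agrees with everything ends up at the scale midpoint (16 out of a possible 28 here) instead of at the top, which is how balancing limits the effect of acquiescence on the composite score.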

Manage guessing:

  • One method of managing guessing is to inform people that they will be penalised for guessing incorrectly
  • E.g. SAT: 0.25 of a mark deducted for each incorrect answer
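A penalty of this kind can be illustrated with the conventional correction-for-guessing formula, score = correct − incorrect / (options − 1). The notes give only the 0.25 penalty; the general formula, the assumption of five answer options per item (which yields 1/4 = 0.25) and the function name `formula_score` are additions for illustration.

```python
def formula_score(n_correct, n_incorrect, n_options):
    """Correction-for-guessing score: each wrong answer costs 1 / (n_options - 1).

    Omitted items are neither rewarded nor penalised.
    """
    return n_correct - n_incorrect / (n_options - 1)


# Hypothetical test taker: 40 correct, 12 incorrect, the rest omitted,
# on items with 5 answer options (penalty of 0.25 per incorrect answer).
print(formula_score(40, 12, 5))  # 37.0

# Blind guessing on 10 five-option items gives about 2 correct and 8 incorrect
# on average, which nets out to zero.
print(formula_score(2, 8, 5))    # 0.0
```

The penalty is chosen so that pure guessing gains nothing on average, which removes the incentive to guess blindly without punishing informed answers.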

Detect bias and intervene:

  • Some inventories incorporate validity scales
  • Validity scales are designed to specifically measure the degree to which people may be engaging in a particular type of response bias
  • The most famous self-report inventory is the MMPI
    o Used to measure personality, including the psychopathological end of the spectrum
  • To help determine whether the respondent’s scores are interpretable, consult the validity scales
  • The three main validity scales within the MMPI are the L, F and K scales
    o The L (lie) scale: assesses naïve or unsophisticated attempts by people to present themselves in an overly favourable light
    o The F (infrequency) scale: represents a deviant form of responding that is consistent with malingering, acquiescence, or serious psychopathology
    o The K scale: attempts to assess more subtle distortion of responses, particularly clinically defensive responses
  • The MMPI gives instructions on what to do if someone has elevated scores on the validity scales
    o Might totally exclude interpretations of scores
    o Might interpret them cautiously
    o Use statistical procedures to ‘correct’ the original scores into more valid scores

Specialised tests:

  • Marlowe-Crowne Social Desirability Scale
  • Balanced Inventory of Desirable Responding
  • These exist independently, so they can be used in a variety of contexts