Employee testing has reached a global scale as more and more multinational companies adopt a standard set of tests worldwide for their HR decisions. Typically, these tests are developed in the headquarters country (most often in North America), translated into different languages, and administered in subsidiaries around the world. When a test doesn’t work well in a foreign country, people often blame the translation. While it is true that poor translation can lead to ambiguities and misinterpretations, cultural differences often cause more problems than linguistic idiosyncrasies.
Many of you may wonder: as long as we adjust norms to the local population and get a good estimate of individuals’ standings relative to their peers, why should we care about culture? It’s a fair question, but things are more complicated than that. Imagine that a personality item is culturally sensitive, so that agreeing with it means disregarding what is generally accepted in that society. How much variability would you anticipate in the responses? If 95% of the population answers “strongly agree” to an item, the item barely distinguishes one test taker from another, so what value does keeping it in the test add? This is just one example of how culture can affect the functioning of a test item. Culture is rooted in every social human being, and it has a profound influence on people’s values, beliefs, thinking styles, behavior patterns, and even the information they receive and internalize.
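The variability question can be checked directly on local response data. Below is a minimal sketch, assuming answers are coded 1 to 5 with 5 meaning “strongly agree”. The function name `item_summary`, the 90% cutoff, and the sample data are all hypothetical, chosen only to mirror the 95% example above; an appropriate cutoff is a judgment call for each test.

```python
from collections import Counter
from statistics import pstdev


def item_summary(responses, dominance_cutoff=0.90):
    """Summarize one Likert item for a single country sample.

    dominance_cutoff is an illustrative assumption, not an established standard.
    """
    counts = Counter(responses)
    top_category, top_count = counts.most_common(1)[0]
    top_share = top_count / len(responses)
    return {
        "top_category": top_category,   # most frequently chosen answer
        "top_share": top_share,         # share of respondents choosing it
        "sd": pstdev(responses),        # spread of the responses
        "low_variability": top_share >= dominance_cutoff,
    }


# Hypothetical sample: 95 of 100 respondents answer 5 ("strongly agree").
sample = [5] * 95 + [4] * 3 + [3] * 2
summary = item_summary(sample)
print(summary)
```

An item flagged this way in one country but not in another is a candidate for cultural review rather than automatic removal, since the same item may still work well elsewhere.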
Cultural differences take various forms. Below are a few you will commonly encounter when developing and delivering employee assessments globally.
1. Construct differences. The psychological constructs you intend to capture in a selection test, whether a personality trait or a cognitive ability, may be defined differently in another culture or may not exist there at all. The concept of “face”, for example, is indigenous to Chinese culture and means that one should protect people’s public self-image to maintain harmonious interpersonal relationships. In China, saving face is a critical dimension of social intelligence that predicts career success, yet it is rarely built into Western selection tests. Cultural values also determine which behaviors are considered appropriate. Taking subordinates out for drinks after work is regarded as desirable leader behavior in Japan, but that is generally not the case in other countries.
2. Stimulus familiarity differences. When it comes to test content, people from different countries may have unequal exposure to the materials in the test, depending on the customs and conventions of their society. For example, a quantitative problem-solving question that uses inches as the unit of measurement will likely place test takers from metric-system countries at a disadvantage. As another example, a study in the 1970s found that Arab students scored surprisingly low on a figural inductive reasoning test. Researchers later realized that because Arabic is written from right to left, these students found it harder to discover and apply rules that ran from left to right in the test figures.
3. Test administration differences. Even if you have developed culture-invariant content and constructs, the test may be delivered through very different procedures and modes across countries. For example, people in some cultures prefer greater social distance, so leaving enough space between test takers in a group setting is important for reducing test anxiety and unease.
4. Response style differences. Three types of response styles often occur (and vary by culture) in self-report measures such as personality and integrity tests. The first is socially desirable responding, in which respondents endorse items that are socially favorable. Respondents in Mexico and some Asian countries show more of it, because people in these countries tend to have a stronger need for social approval and feel obligated to conform to societal norms. The second is acquiescent responding, which is often seen among South Asians and Latinos. Because they are more tolerant of conflicting ideas, respondents from these groups tend to agree with items regardless of their content. The third is extreme responding (e.g., endorsing 1 or 5 on a 5-point scale), which is more common in individualistic and masculine cultures, whereas respondents in East Asian countries such as China, Japan, and South Korea use the middle points of a scale more frequently. This difference echoes opposing cultural values: the former encourage people to be unique and outstanding, whereas the latter advise people not to stand out from their social group and to avoid extreme actions.
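Two of these styles, extreme and midpoint responding, can be counted directly from raw answers, and acquiescence can be roughly approximated by the share of agreeing answers. The sketch below is illustrative only: it assumes a 1-to-5 scale, the function name `response_style_indices` and the two respondents are hypothetical, and the agreement share is a meaningful acquiescence signal only when the items cover varied content, ideally including reverse-keyed items. Socially desirable responding is not covered here because it cannot be read off raw answers in the same way.

```python
def response_style_indices(responses, scale_min=1, scale_max=5):
    """Rough per-respondent response-style proportions (illustrative sketch)."""
    n = len(responses)
    midpoint = (scale_min + scale_max) / 2
    return {
        # Share of answers at either end of the scale (1 or 5 by default).
        "extreme": sum(r in (scale_min, scale_max) for r in responses) / n,
        # Share of answers exactly at the scale midpoint (3 by default).
        "midpoint": sum(r == midpoint for r in responses) / n,
        # Share of answers on the "agree" side, a crude acquiescence proxy.
        "agreement": sum(r > midpoint for r in responses) / n,
    }


# Hypothetical respondents answering the same eight items.
respondent_a = [1, 5, 5, 1, 5, 1, 5, 5]   # uses only the scale endpoints
respondent_b = [3, 3, 4, 3, 2, 3, 3, 4]   # stays close to the middle

indices_a = response_style_indices(respondent_a)
indices_b = response_style_indices(respondent_b)
print(indices_a)
print(indices_b)
```

Averaging such indices by country gives a first indication of whether score differences reflect the trait being measured or simply the way people use the scale.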
Culture is complicated because it is abstract and intangible yet affects every aspect of human behavior. Rather than searching in vain for a universal solution that removes culture from global testing, we would do better to learn what is unique about each culture we test in and factor it in at every stage, from design to delivery.