HAN Huamei

State-level Chinese language proficiency tests, Hànyǔ shuǐpíng kǎoshì (HSK), have been implemented since the mid-1980s to provide gatekeeping functions in the academic world (to set standards for university admissions) and in the business sector (to facilitate hiring and promotion decisions). The test takers most targeted include foreigners, overseas Chinese, and Chinese ethnic minorities. The status and longevity of HSK tests depend on their valid assessment of actual communicative skills and on the politics involved among institutions that develop, test, and administer them.

The abbreviation HSK stands for Hànyǔ shuǐpíng kǎoshì 汉语水平考试 (Chinese proficiency test). Several specialized tests that measure a person’s proficiency in the language supplement the main test, however, and the main test itself has two official versions, each with three formats. Hereafter “HSK tests” is used as the collective term, while “the HSK” refers solely to the first version of the main test.

HSK tests are designed to assess the Chinese proficiency of non-native speakers, including foreigners, overseas Chinese, and students from Chinese national minority backgrounds. Even though large numbers of test takers from Chinese national minorities have taken the HSK and have been included in the aggregated statistics, academic research and professional discussions of HSK tests have primarily focused on non-Chinese citizens learning Chinese as a foreign language. This article follows this common practice because of the lack of available HSK statistics and literature on Chinese ethnic minority test takers, and therefore omits the topic of HSK for Chinese National/Ethnic Minorities, which deserves separate treatment.

As standardized language proficiency tests, HSK tests resemble the Test of English as a Foreign Language (TOEFL) in testing principles (Liu 1994), designs, and improvements over time, as well as in their actual and projected gatekeeping functions in academic and business settings. As state-level tests, HSK tests are high stakes. With China’s continued economic growth, interest in and the influence of HSK tests will likely increase accordingly. As of 2012 the HSK consists of three formats: basic (HSK Basic), elementary and intermediate (HSK E&I), and advanced (HSK Advanced).

Inception and Growth 1984–2005

Development of the HSK began with the HSK E&I in 1984 at the Beijing Language Institute 北京语言学院 (BLU), the only postsecondary institution with an explicit focus on teaching Chinese as a foreign language (TCFL) to international students. The HSK E&I passed experts’ appraisal and had its first official test in 1990. The HSK Advanced was initiated in 1989, passed experts’ appraisal in 1993, and had its first official test in Singapore the same year. The HSK Basic was initiated in 1995, passed experts’ appraisal in 1997, and was instituted in 1998. By then, a three-grade, eleven-level HSK series had been established: HSK Basic Levels 1 to 3; HSK E&I Levels 3 to 8, including Elementary Levels 3 to 5 and Intermediate Levels 6 to 8; and HSK Advanced Levels 9 to 11. Among the three formats, the HSK E&I has been the most common.

The HSK is held regularly each year in China and abroad, and certificates of various levels are issued to those who achieve the required scores. The first official HSK test took place on 15 June 1990, with 391 international students participating simultaneously in Beijing, Tianjin, Shanghai, and Dalian (Ren 2001). Since then, the number and locations of HSK test takers have grown exponentially both in China and abroad. The Chinese Ministry of Education (MOE) officially instituted the HSK as a state-level test in 1992, and in 1995 it further required relevant HSK scores as prerequisites for all foreign students applying for admission to degree programs in Chinese universities. By the end of 2000, thirty-eight HSK testing centers (including some for Chinese national/ethnic minorities) had been set up in twenty-six cities in China, and forty-seven centers had been established in nineteen foreign countries (Ren 2001). By December 2005 “about a million examinees (including students of ethnic minorities in China) from more than 120 countries” had taken the HSK (HSK n.d.). In addition to a growing number of testing centers in China, over 150 HSK testing centers had been set up in thirty-four foreign countries (Chen 2005).

Two institutions, BLU and the National Office for Chinese as a Foreign Language, played important roles in the rapid development and wide reach of the HSK. Several BLU scholars, influenced and inspired by their Western counterparts in the early 1980s, formed the HSK Design Team in 1984. In 1989, the original team expanded to become the HSK Center of BLU, which contributed to the development, implementation, and administration of the tests.

The National Office for Chinese as a Foreign Language (Hànbàn 汉办) was established in Beijing in 1987 as the highest state organization governing TCFL. Responding to the increasing popularity of the HSK, in 1997 Hanban set up the China National Committee for Chinese Proficiency Test (CNCCPT) to oversee daily operations related to testing. Hanban, now officially called the Office of Chinese Language Council International / Confucius Institute Headquarters, is a public institution affiliated with the MOE, and focuses on promoting Chinese language and culture among non-Chinese citizens, with specific missions such as developing multiculturalism and building Chinese language and cultural teaching resources for foreign learners of Chinese worldwide.

Function, Design, and Related Research

The HSK’s main function has been to provide a basis for granting non-native speakers admission to degree programs in Chinese universities. HSK tests have been expanded to serve other functions in hiring and promotion decisions in the workplace.

HSK tests mainly measure linguistic competence (knowledge of standard Mandarin as a bounded linguistic system comprised of vocabulary, Chinese characters, and grammar), plus some sociolinguistic knowledge (cultural knowledge embedded in the production and interpretation of linguistic products). HSK test papers are organized into a vocabulary and Chinese character section and a grammar section. The HSK Basic and HSK E&I mainly consist of listening comprehension, grammatical structures, and reading comprehension (all in multiple-choice format), with an additional cloze (sentence completion, using Chinese characters) segment in the HSK E&I. These objective sections typically test grammar points in isolated sentences and test comprehension based on listening to or reading isolated sentences, short dialogues, or short paragraphs. Only the HSK Advanced includes “subjective items” that test productive skills beyond word and sentence levels: a writing segment requires a composition of 400–600 words, and a speaking segment using audio recording requires reading aloud and speaking about given topics.

Therefore, the HSK, particularly the HSK Basic and HSK E&I, has been challenged as a valid measurement of proficiency (McNamara 1996 and 2000), that is, for mainly testing linguistic knowledge instead of actual communicative ability. In addition, the three HSK grades, with inequitable scores across their levels, violate a common statistical principle, a flaw that weakened the HSK’s overall validity (Zhang 2004). Furthermore, the validity of some particular testing formats, such as the use of reading aloud in the HSK Advanced speaking test (Shen 2009), was also questioned and discussed. The old TOEFL, before its revision in 2005, received similar criticism, except that TOEFL was not a graded test.

As a state-level standardized test, the HSK must maintain a reputation for being objective and scientific. HSK research has focused on validity, reliability, and fairness (e.g., privileging some test takers over others based on gender, age, location, cultural background, and so on) (see Ren 2002; Tian 2007; Xie 2005). By 2005, several volumes published on TCFL had all included sections on HSK (Chen 2005; Xu and Wu 2005), and a series of five volumes focusing specifically on HSK were published, including one volume of HSK research articles (Xie 2005). In this sense, the years from 1984 to 2005 reveal exponential growth in HSK implementation and major progress in test development and research. The technical aspects of the HSK have been maturing gradually.

While the HSK Center of BLU kept working on revising, improving, and developing new HSK tests, Hanban commissioned various universities to develop specialized tests starting in 2003 and started setting up Confucius Institutes abroad in 2004. All newer tests showed an increased awareness of, and stress on, the testing of communicative ability. But a conflict of interest developed between the two institutions over implementing the HSK and developing the new tests. From 2006 to 2010, each institution launched various HSK tests.

Diversification 2006–2010

From 2006 to 2010, Hanban marketed three state-level tests that it commissioned various institutions to develop: the Youth Chinese Test (YCT) and the Business Chinese Test (BCT) are two specialized tests; the New HSK is a new version of the main test. Each includes an independent speaking test in an audio-recording format, and all tests adopt levels corresponding to the Common European Framework of Reference (CEFR), which addresses the grading issue faced by the HSK and makes HSK levels comparable with internationally recognized foreign language testing programs.

Launched in Singapore in 2006 and in Beijing in 2007, YCT targets learners under age fifteen. The written test is divided into four levels, while the speaking test includes two levels. YCT is tailored to youngsters’ interests in terms of topics, presentation, and length. Statistics and research about YCT are difficult to find.

Hanban commissioned Peking University to develop BCT, which was launched in Singapore in 2006, to assess Chinese proficiency in business communication and facilitate decisions about hiring, promotion, and training in the workplace. The listening-reading test has five levels and the speaking-writing test has three levels. Peking University’s BCT R&D Office studied the results of three BCT tests in Singapore from 2006 to 2007 and compared them with the results of HSK tests that took place in China around the same time. Their technical reports indicate that BCT scores were highly comparable with the HSK scores, thus indicating high reliability. The Singapore Workforce Development Agency has adopted the BCT as its tool for testing Chinese proficiency.

Hanban launched the New HSK in November 2009. The written test is composed of three grades/formats and six levels, and the oral test has three levels. Hanban has not released the identities of the experts who developed the test, and it is unclear whether relevant statistics, technical reports, and research articles from the development and trial phase were compiled, analyzed, and released to the public, or to TCFL professionals and researchers based in China and abroad.

On 28 June 2010, the MOE announced that the passing score for the New HSK levels 4, 5, and 6 was 180, the same as for the corresponding HSK certificate, and that the two would be in concurrent use for a period. While this announcement left individual universities to decide whether they would require the HSK or the New HSK score, it suggests that the New HSK is poised to replace the HSK when the overlapping period ends.

Three Products from the HSK Center of BLU

From 2006 to 2010, the HSK Center of BLU marketed three tests to improve or complement the HSK: the Revised HSK; the HSK Threshold; and the Test of Practical Chinese, known as C. Test. All three were implemented between 2006 and early 2007.

The C. Test focused on communicative ability in social and everyday life in international environments. In addition to the main test, which resembles the HSK E&I, the C. Test has an independent speaking component that utilizes the interview format, generally recognized as having higher validity in testing actual communicative ability than an audio-recording format. The C. Test, whose speaking component is similar to the IELTS (International English Language Testing System) speaking test, is the only test developed in China that uses an interview format. The HSK Center of BLU implemented the C. Test in China and Japan in 2006, and introduced it to South Korea in 2007. C. Test certificates are issued by the HSK Center of BLU.

In March 2007, the HSK Center implemented the Revised HSK with notable improvements: (1) grading was improved by having elementary, intermediate, and advanced tests with vertically equitable scores; (2) structurally, an independent writing test and an independent speaking test using audio recording were added to all three formats; and (3) at the micro level, the validity of test formats and test items was improved. In 2007, the HSK Center of BLU administered the Revised HSK Intermediate four times, and the Basic and the Advanced twice. In July 2007, however, Hanban explicitly announced on its website (in Chinese only) that the Revised HSK as well as two other tests developed by BLU were illegitimate. The HSK Center of BLU continued administering the Revised HSK tests regularly for three more years. But after an MOE announcement of June 2010 stated that the New HSK transcript and the HSK certificate would be used in tandem only for a period of time, the Revised HSK disappeared from the HSK Center’s 2011 testing schedule posted online in November 2010.

The HSK Center developed the HSK Threshold to cater to adult beginning learners who were below the HSK Basic levels but wanted to assess their progress in language learning. No statistics or reports indicate when and where it was implemented. But the “HSK Threshold” button remained on the HSK Center’s official website as of March 2012.

Products, Market Shares, and Politics

The status of the various HSK tests discussed so far has been affected by how effectively the content and skills tested could measure progress and be relevant to targeted demographics and intended markets, as well as by politics. Although the targeted demographics of the BCT and the C. Test overlap, for instance, the BCT emphasizes business content and has won the endorsement of the Singaporean government, while the C. Test distinguishes itself with its speaking test and has gained a moderate market share in China, Japan, and South Korea. These two tests may develop in tandem in the foreseeable future.

The status of the YCT and the HSK Threshold seems to depend on targeted demographics and markets as well as on the economic and political resources the test implementers could mobilize. The YCT targets young learners, including those with relatively low proficiency levels, particularly in overseas contexts. This small HSK testing market has potential to grow given the increasing number of children and youths of both Chinese and non-Chinese backgrounds learning Mandarin Chinese outside of China. The explicit Hanban mission of promoting TCFL overseas, backed by financial resources from the Chinese state, is likely to aid this growth. In fact, since 2004, Hanban has been controversially setting up Confucius Institutes / Classrooms abroad, and has been offering outstanding students from these institutes, as well as those who obtained excellent HSK scores, opportunities to participate in Chinese language and composition competitions and summer camps, along with half-year to full scholarships in degree programs in China. The HSK Threshold targeted a small testing demographic that potentially overlapped with those of the HSK Basic and the YCT, and targeted a largely overseas market where BLU as a university had limited reach. Under these circumstances, the political and socioeconomic resources Hanban enjoys easily eclipsed those of the HSK Center of BLU, and the HSK Threshold left little mark in HSK testing.

The trajectories and status of the HSK, the New HSK, and the Revised HSK seem to have hinged on the material and symbolic resources the testing institutions could mobilize. The three tests targeted essentially the same testing demographics, which were also the largest, and the most profitable testing markets both in China and abroad, partly attributable to the “state-level” test status. Institutions and individuals involved in implementing the main test, particularly the elementary and intermediate formats, directly benefit financially. Under these circumstances, the political and economic power of Hanban as a state-level institution seems to have played an important role in the rise of the New HSK, the termination of the Revised HSK, and the possible fading of the HSK. The merit of the respective tests seems to have played a secondary role in their respective outcomes, particularly in regard to the New HSK and the Revised HSK.

The New HSK has merits in terms of levels corresponding to the CEFR but has a lot to learn from the Revised HSK in terms of formats and test items. For instance, the Revised HSK replaced most of the isolated grammar and vocabulary items with listening and reading comprehension items based on longer stretches of audio and textual materials that provide better context, which increased its validity in testing actual comprehension of contextualized materials that resemble real-life communications. In similar segments, the New HSK tends to use short conversations with less contextualization. Furthermore, the Revised HSK removed the read-aloud segment from the HSK Advanced speaking test and shifted the focus to actual speaking. This again increased the test’s validity, since the ability to read aloud does not necessarily equal the ability to communicate, nor is it an essential skill for most of the Advanced test takers unless they intend to teach Chinese or become news anchors in Chinese (Shen 2009). The New HSK speaking test has retained the read-aloud format. Additionally, for the extended speaking segment of the speaking test, each level of the Revised HSK includes one narrative and one opinion/debate topic, and thus tests two types of communicative abilities in two different genres: the more personal and specific narrative and the more abstract argumentative genre. By contrast, in its online sample tests, the New HSK has one narrative and one opinion/debate topic in the elementary test, two opinion/debate topics in the intermediate test, and two narrative topics in the advanced test.

The role of politics in the status and outcome of some HSK tests is understandable given that language testing is far from a purely linguistic matter in any context, and language testing worldwide is an industry and business in which politics, profits, and the quality and reputation of specific tests interact in complicated ways. All gatekeeping tests are concerned with legitimizing the distribution of scarce or limited symbolic and/or material resources to some while denying them to others, and language testing serves as one such means of legitimization. In this case, those who earn 180 in the New HSK Level 3 are deemed worthy of admission to degree programs in Chinese universities because they have sufficient linguistic skills to ensure successful completion of their studies, while those who earn 179 are deemed inadequate and thus ineligible for admission. Of course, the rationale for setting this cut-off point is as much about how many seats are available in the universities as about an individual applicant’s linguistic skills, among other factors. As of 2012, reaching this cut-off point is non-negotiable for foreign students on Chinese government scholarships and Chinese national minorities, but serves as a reference only for self-sponsored foreign students. Because other support factors play major roles in academic success or failure, the quality of a test matters, but it is not the only, or even the most important, factor in students’ future success, and it is not necessarily the make-or-break factor in a test’s life.

Indeed, considering the current socioeconomic and political stage of China as a developing country learning from the West, it is particularly interesting that the HSK Center of BLU contested and competed with Hanban for several years, if not more than a decade, over the financial profits and ownership of the Center’s intellectual products, which may have partially motivated or forced Hanban to develop the New HSK. The struggle between the two institutions and the uncertainty it created may partially explain why few statistics seem to be available from either side from 2006 to 2010.

Challenges and Opportunities

HSK tests, like other language proficiency tests, have been challenged to reconsider what constitutes successful communication, what the tests intend to measure and how to better measure it, and with what consequences for individuals and groups of diverse backgrounds. Critics of language testing believe that conceptualizing communicative competence as individual capacity and highlighting linguistic competence at the expense of other factors ignores the co-constructed nature of real-life interactions, and ignores the issue of legitimacy in determining who has the right to speak and be heard. HSK tests have a very short history and relatively limited reach in their gatekeeping functions compared to other large-scale standardized language proficiency tests, such as TOEFL. But the HSK tests’ reach and impact are likely to increase. Therefore, increased test validity and ethical responsibility of HSK tests would be welcome in the increasingly plurilingual world today.

I thank SHEN Mengmeng for gathering, scanning and sending research papers for me in Shanghai, without which this entry would have been impossible.

Further Reading

Source: Han Huamei. (2012). HSK (Chinese proficiency test). In Zha Qiang (Ed.), Education in China: Educational history, models, and initiatives. Great Barrington, MA: Berkshire Publishing.