

How Language Shapes Password Security

03 Oct 2019
Despite differences in language and culture, Chinese- and English-language Internet users find common ground in using easily guessable password variants of “123456.” But a recent study comparing password patterns in the two languages also found notable and unique features in Chinese passwords that have big implications for Internet security beyond China.
 
The password habits of Chinese-language users have been surprisingly understudied given that they make up more than 20 percent of all Internet users worldwide. More than 854 million people use the Internet in China alone, more than double the entire population of the United States. That's why a group of Chinese and U.S. researchers set out to test how password security among both Chinese- and English-language users stands up against the best cracking algorithms.
 
“Our work may be among the first studies to examine the passwords of different languages,” says Ding Wang, an information security researcher at Peking University, in Beijing.
 
Wang and his colleagues analyzed 106 million real passwords from nine Web services (73 million passwords from six Chinese-language services and 33 million passwords from three English-language services) that were exposed by hackers and leaked online between 2009 and 2012. They were careful to directly compare the security of passwords only from similar Web service counterparts among the mix of social forums, gaming services, e-commerce websites, and programmer forums, plus the Yahoo Internet portal on the English-language side of the data set. Their results appear in a paper presented at the 28th USENIX Security Symposium held in Santa Clara, Calif., from 14 to 16 August.
 
What may seem like a strong password based on English-language assumptions could actually be quite weak and easy to guess from a Chinese-language perspective. Yet many of the world’s popular Web services, including some homegrown Chinese services, approach password security from an English-language perspective.  
 
The researchers pointed to the example of the popular Chinese password “woaini1314,” which is currently rated “strong” by password strength meters used by AOL, Google, and even the well-known Chinese social network Sina Weibo (and by IEEE Spectrum’s parent organization, IEEE). And yet speakers of Mandarin Chinese, the most widely spoken variety of Chinese, can very quickly guess the “woaini1314” password because “woaini” in pinyin (the romanization system for Chinese) means “I love you,” and “1314” sounds like “forever” in Chinese.
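To make the gap concrete, here is a minimal Python sketch of why a meter built on English-language assumptions can misjudge such a password. Both functions, the thresholds, and the tiny phrase lists are hypothetical illustrations; they are not the meters used by any of the services named above.

```python
import string

# Hypothetical phrase lists for illustration only; a real checker would need
# far larger dictionaries of pinyin phrases and number homophones.
PINYIN_PHRASES = {"woaini"}               # "I love you"
NUMBER_HOMOPHONES = {"1314", "5201314"}   # "forever", "I love you forever and ever"


def naive_strength(password: str) -> str:
    """Rate a password by length and character classes only (assumed rules)."""
    classes = sum([
        any(c in string.ascii_lowercase for c in password),
        any(c in string.ascii_uppercase for c in password),
        any(c in string.digits for c in password),
        any(c in string.punctuation for c in password),
    ])
    if len(password) >= 10 and classes >= 2:
        return "strong"
    if len(password) >= 8:
        return "medium"
    return "weak"


def language_aware_strength(password: str) -> str:
    """Rate as weak any password made up entirely of known phrases."""
    remainder = password.lower()
    # Remove longer tokens first so "5201314" is matched before "1314".
    for token in sorted(PINYIN_PHRASES | NUMBER_HOMOPHONES, key=len, reverse=True):
        remainder = remainder.replace(token, "")
    if not remainder:
        return "weak"
    return naive_strength(password)


print(naive_strength("woaini1314"))           # strong
print(language_aware_strength("woaini1314"))  # weak
```

The second function changes nothing about length or character classes; it only adds knowledge of what a Mandarin speaker would guess first.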
 
One key difference between Chinese-language and English-language passwords is that many Chinese-language users favor passwords consisting entirely of digits. Beyond the infamous “123456” password, other popular passwords among Chinese-language users include “111111,” “123123,” and “123321.” Playing on the love theme, “5201314” is used because it sounds like the phrase “I love you forever and ever” in Chinese. Some popular passwords add a single letter to a string of digits, such as “a12345” and “12345a.”
 
Chinese-language users also frequently use their mobile phone numbers or certain dates (perhaps their birthdays) in passwords — something that English-language users don’t do as often. Instead, English-language users usually compose passwords made solely of letters and lean toward certain words or phrases such as the easily guessable “password,” “letmein,” “sunshine,” and “princess.” Some of the most popular passwords include “abcdef” and “abc123” alongside “123456.”
 
Passwords that use only digits are easier to crack than passwords of the same length made only of letters, because each position has just 10 possible digits as opposed to 26 letters in the modern English alphabet. But Chinese-language users often demonstrated remarkably complex and creative passwords: Some members of the Chinese Software Developer Network (CSDN) service combined programming language commands with traditional Chinese poems.
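The arithmetic behind the digits-versus-letters comparison is simple to check. The Python sketch below counts the possible passwords of a given length; it assumes every character is chosen uniformly at random, which real users rarely do, so the numbers are upper bounds rather than measured strengths.

```python
import math


def keyspace(alphabet_size: int, length: int) -> int:
    """Number of distinct strings of the given length over the alphabet."""
    return alphabet_size ** length


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Bits of entropy if every character is chosen uniformly at random."""
    return length * math.log2(alphabet_size)


print(keyspace(10, 6))   # 1000000 six-digit passwords
print(keyspace(26, 6))   # 308915776 six-letter (single-case) passwords
print(round(entropy_bits(10, 6), 1), round(entropy_bits(26, 6), 1))  # 19.9 28.2
```

At six characters, the letters-only space is roughly 300 times as large as the digits-only space.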
 
“Chinese users can be really creative with combinations of letters and digits,” says Yuan Tian, a computer scientist at the University of Virginia in Charlottesville, Va., and coauthor on the study. 
 
The password files used by the researchers contained hashes of leaked or stolen passwords, not plain-text versions of the passwords themselves. The researchers attempted to crack both Chinese-language and English-language passwords using two state-of-the-art algorithms for cracking passwords. They tested the Markov-chain model, which assigns probabilities to password characters based on the characters that precede them, and the probabilistic context-free grammars (PCFG) model, which parses passwords into letter segments, digit segments, and symbol segments before estimating the order of the most likely combinations.
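As a rough illustration of the Markov-chain idea, the Python sketch below trains an order-1 (bigram) character model on a handful of passwords and scores candidates by log-probability. The order, the add-one smoothing, the sentinel characters, and the assumed alphabet size of 96 are simplifications chosen for this example, not the configuration used in the study.

```python
import math
from collections import Counter, defaultdict

# Assumed sentinels marking the start and end of a password.
START, END = "^", "$"


def train_bigram_model(passwords):
    """Count how often each character follows each other character."""
    counts = defaultdict(Counter)
    for pw in passwords:
        chars = [START, *pw, END]
        for prev, cur in zip(chars, chars[1:]):
            counts[prev][cur] += 1
    return counts


def log_probability(model, password, alphabet_size=96):
    """Log-probability under the bigram model with add-one smoothing.

    alphabet_size is an assumption: 95 printable ASCII characters plus END.
    """
    chars = [START, *password, END]
    total = 0.0
    for prev, cur in zip(chars, chars[1:]):
        seen = model.get(prev, Counter())
        total += math.log((seen[cur] + 1) / (sum(seen.values()) + alphabet_size))
    return total


model = train_bigram_model(["123456", "123123", "111111", "woaini1314"])
candidates = ["9x!Qz7", "123456", "123321"]
# A cracker would try the most probable candidates first.
ranked = sorted(candidates, key=lambda pw: log_probability(model, pw), reverse=True)
```

Candidates that reuse common transitions, such as “1” followed by “2,” score higher and would be guessed earlier than strings the model has never seen.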
 
The team also upgraded the PCFG approach by customizing it to account for certain password patterns more common to Chinese-language users. For example, they added number segments in popular date formats and Chinese names as written in the romanized pinyin system. They also gave their PCFG-based algorithm the capability to process the interleaving patterns (strings of alternating digits and letters) found in so many Chinese passwords.
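The segment parsing at the heart of PCFG, and the kind of extension described above, can be sketched in a few lines of Python. The tag names, the YYYYMMDD-only date pattern, and the interleaving threshold are assumptions made for this illustration; the study's actual grammar is more elaborate.

```python
import re
import string
from collections import Counter

# Assumed date shape: eight digits in YYYYMMDD form, years 1900 to 2099.
DATE_RE = re.compile(r"(?:19|20)\d{2}(?:0[1-9]|1[0-2])(?:0[1-9]|[12]\d|3[01])", re.ASCII)
RUN_RE = re.compile(r"[A-Za-z]+|\d+|[^A-Za-z\d]+", re.ASCII)


def segment(password):
    """Split into (tag, text) runs: L = letters, D = digits, S = everything else."""
    segments = []
    for match in RUN_RE.finditer(password):
        text = match.group()
        if text[0] in string.ascii_letters:
            tag = "L"
        elif text[0] in string.digits:
            tag = "D"
        else:
            tag = "S"
        segments.append((tag, text))
    return segments


def base_structure(password, tag_dates=False):
    """Return a structure such as 'L6 D4'; optionally tag date-like digit runs."""
    parts = []
    for tag, text in segment(password):
        if tag_dates and tag == "D" and DATE_RE.fullmatch(text):
            parts.append("DATE")
        else:
            parts.append(f"{tag}{len(text)}")
    return " ".join(parts)


def is_interleaved(password, min_switches=3):
    """True if letters and digits alternate at least min_switches times."""
    segs = segment(password)
    only_letters_digits = all(tag in ("L", "D") for tag, _ in segs)
    return only_letters_digits and len(segs) - 1 >= min_switches


leaked = ["woaini1314", "abc19900512", "a1b2c3", "123456"]
# Counting structures gives the frequencies a PCFG cracker uses to order guesses.
structure_counts = Counter(base_structure(pw, tag_dates=True) for pw in leaked)
```

With `tag_dates=True`, “abc19900512” is parsed as `L3 DATE` rather than `L3 D8`, so the cracker can fill that slot from a short list of plausible dates instead of all 100 million eight-digit strings.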
 
Together, those efforts boosted the modified PCFG-based algorithm’s performance against the Chinese password data sets: It cracked between 98 percent and 188 percent more passwords than the standard version of the algorithm.
 
The results also highlighted key strengths and weaknesses of Chinese-language passwords in comparison with English-language passwords. When limited to 10,000 or fewer guess attempts, both types of algorithms cracked more Chinese passwords than English passwords. But the remaining Chinese passwords proved stronger than their English counterparts as the number of guesses grew beyond 10,000 attempts.
 
The number of guesses matters because some Web services limit the number of online guesses before temporarily locking a user’s account. Leaked or stolen password storage files allow hackers to make a theoretically unlimited number of offline guesses because they do not have to worry about being locked out of a Web service. But even offline guessing attacks are still limited by the cost-effectiveness of spending computing time and resources on numerous guess attempts.
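A minimal Python sketch of the kind of online guess limit described above follows. The class name, the thresholds, and the in-memory storage are hypothetical; production services typically add per-IP limits and persistent state.

```python
import time


class LoginThrottle:
    """Lock an account for a while after too many consecutive failed logins."""

    def __init__(self, max_failures=5, lockout_seconds=300.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.lockout_seconds = lockout_seconds
        self.clock = clock
        self._failures = {}      # user -> consecutive failed attempts
        self._locked_until = {}  # user -> clock time when the lock expires

    def is_locked(self, user):
        return self.clock() < self._locked_until.get(user, float("-inf"))

    def record_attempt(self, user, success):
        """Return True if the attempt was processed, False if the account is locked."""
        if self.is_locked(user):
            return False
        if success:
            self._failures.pop(user, None)
            return True
        self._failures[user] = self._failures.get(user, 0) + 1
        if self._failures[user] >= self.max_failures:
            # Start the lockout and reset the counter for the next window.
            self._locked_until[user] = self.clock() + self.lockout_seconds
            self._failures.pop(user, None)
        return True
```

With five failures allowed per five-minute lockout, an online attacker gets at most about 1,440 guesses per account per day, which is why the study's comparison at 10,000 or fewer guesses is the relevant one for online attacks. An attacker holding a leaked hash file faces no such limit.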
 
It’s also clear that individual Chinese-language speakers can do themselves a favor by avoiding predictable digit patterns such as “123456” and “111111” for their passwords, along with the predictable letter and letter/digit hybrid patterns based on romantic themes of eternal love. (The same goes for English-language speakers still using “123456” and “abcdef”: Just stop!)
 
The complexities of language’s influence on passwords may run even deeper within the Chinese-language community itself. Chinese-language users commonly rely on the same set of Chinese characters for reading and writing, but spoken Chinese has many regional dialects that differ in pronunciation. For example, the pronunciation of “I love you” in Mandarin Chinese, considered mainland China’s official national language, sounds different from the pronunciation of the same phrase in Cantonese, the branch of Chinese spoken by many people living in or originating from places such as Hong Kong, Macau, and Guangdong.
 
Those regional distinctions in spoken Chinese were beyond the scope of this particular study. But Tian observed that there may be differences in password patterns if speakers of Cantonese, Hokkien, Shanghainese, or other regional variants of Chinese make passwords based on pronunciation.
 
For a deeper dive, the researchers hope to continue evaluating Chinese-language password patterns by using surveys to better understand what Chinese Internet users are thinking when creating their passwords. They also raised the possibility of continuing their comparative studies of passwords in different languages beyond just Chinese and English. “For our future work, we want to cover passwords around the world beyond China,” Wang says.
 
