China has data problems. Even for the government, good information about everything from the economy to where people live can be hard to come by. So how does it plan to systematically track the political views of its university students? The answer is big data.
One reason China’s data are so bad is simply the difficulty of collecting data on so vast a country. This is especially true for rural areas, where there is little technological infrastructure. But a bigger problem is that many of the people responsible for reporting data have incentives to massage them. The performance of local leaders is often assessed by looking at GDP growth, which leads many of them to exaggerate that figure. Big data is a powerful alternative because it relies on the raw, unstructured data of millions of individuals, meaning it is not subject to such manipulative middlemen.
Consider the example of “ghost cities,” built-up urban areas with huge numbers of empty housing units. Finding them has long been all but impossible, because China’s data on vacancy rates are unhelpfully thin. But Baidu, the country’s largest search engine by far, was able to circumvent these problems by drawing on its database of hundreds of millions of users, accurately identifying places with lots of housing but no people.
This approach also allows the government to see not just where people are and what they are doing, but how they think.
A new report from Studies in Ideological Education, a publication put out by China’s education ministry, suggests using big data to track the political views of individual university students. The report, written by a member of the propaganda department committee at the University of Electronic Science and Technology in Chengdu, advocates creating a “political ideology database” that pulls data from library records, surveys, social media, and other sources to collect “quantifiable, accurate, and personalized information” and “improve the effectiveness of ideological education.”
“Collecting and analyzing data on ideological behavior,” the report continues, “can reveal trends in students’ thinking and values, as well as the social issues they are paying attention to.” The report specifically calls out the problems of the pre-big data era, which it says relied on the “flawed logic of experience and intuition.”
The proposal is still just a suggestion, and the report does not say what exactly the government should do with such a database. Still, its tone throughout suggests that the country is rather behind the times for not having this already. “There is a severe lack of big data know-how,” it says.
The database could of course be used for Big Brother-esque surveillance: by tracking critical posts on social media or controversial library book checkouts, it could help to systematically identify students whose opinions appear to oppose the government. It could also be used more innocently, as a way to quantify public sentiment that is more scientific than crude polling. Or it could do both of those things.
The report adds that data collection should be done “within the scope of the law,” but what that means will probably be for the Communist Party to decide.
It is certain, however, that the Chinese government thinks big data will be an important tool going forward, both for measurement and control. Officials told Aeon magazine that they regretted shutting down the internet following protests in the restive western region of Xinjiang, where ethnic tensions are always high, because it meant they weren’t able to collect data on citizens there.
There is also now a “social credit system” that collects Chinese citizens’ financial and personal information, creating a sort of credit score for life.