After the Kinu River’s 10 September flood, Japan’s Self-Defense Forces personnel examine a still-inundated area near the site where the river overflowed its banks. Credit: Yomiuri Shimbun via AP Images

Twitter’s 140 characters may be the perfect venue for offhand comments, but the terse notes can speak volumes when it comes to mapping flood hazards.

That is the main message of a new study that compared two maps of a flood near Tokyo. One map was created with water level estimates derived from Twitter. The other was drawn from the measured extent of the flood. Researchers found that the maps were remarkably similar.

The Kinu River flood of 10 September 2015 killed two people and injured at least 30 more. The water level began rising at 6:00 a.m. By noon, broken levees gushed water that reached up to 8 kilometers beyond the breach, carrying homes and cars with it.

The flood prompted Yongxue Shi from the Graduate School of Engineering at Kyoto University to ask how people use social media during floods. Can Twitter be used to reliably create hazard maps in real time?

“It is impossible to put measuring devices everywhere,” Shi explained. “So I collect information via social media.”

From Tweet to Map

Shi’s technique borrows from a method frequently used by scientists who want to reconstruct historical events from times when precision instruments weren’t available. Without precise data, those scientists turn to written descriptions in historical documents. In a similar fashion, Shi wondered whether she could map flood events from information documented in posts and tweets.

Seeking answers, Shi sifted through hundreds of tweets by inputting keywords into a text-mining algorithm, then hand-selecting those of good quality. In all, she found 109 tweets about the flood that passed her screens. Tweets that included photos or location information such as street names or local landmarks were valued more highly than vague ones.

Map of the 10 September 2015 Kinu River Flood, overlain with information from Twitter. The map shows the maximum water depth of the flood, with dark blue representing deep water and yellow representing shallow water. The red outline encircles the flooded area at 6:00 p.m. on 10 September 2015, according to data collected by the Geospatial Information Authority of Japan. Shi derived flood data from Twitter for locations represented by red dots. Credit: Yongxue Shi

For example, a tweet saying “Apita Supermarket suffered severe damage, it looks like this” paired with a photo provided useful flood data. Of the 109 tweets, almost half were posted with photos, half identified specific locations, and one-third were posted in real time.
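The screening described above (keyword matching, then favoring tweets with photos or place names) can be sketched roughly as follows. This is only an illustration, not Shi's actual procedure: the keyword list, the one-point-per-feature scoring, the `Tweet` fields, and the `min_score` threshold are all assumptions made for the example, and in the study the final selection was done by hand.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical keyword stems; the study's real keyword list is not given in the article.
FLOOD_KEYWORDS = ("flood", "inundat", "levee", "overflow", "damage")


@dataclass
class Tweet:
    text: str
    has_photo: bool = False
    place_name: Optional[str] = None  # street name or landmark, if one is mentioned


def mentions_flood(tweet: Tweet) -> bool:
    """Keyword screen: keep tweets whose text contains any keyword stem."""
    text = tweet.text.lower()
    return any(keyword in text for keyword in FLOOD_KEYWORDS)


def quality_score(tweet: Tweet) -> int:
    """Assumed scoring: one point for a photo, one for a named place."""
    return int(tweet.has_photo) + int(bool(tweet.place_name))


def screen(tweets: List[Tweet], min_score: int = 1) -> List[Tweet]:
    """Return flood-related tweets at or above min_score, best first."""
    kept = [t for t in tweets if mentions_flood(t) and quality_score(t) >= min_score]
    return sorted(kept, key=quality_score, reverse=True)
```

With this sketch, the Apita Supermarket tweet quoted above (photo plus landmark) would score 2 and be kept, while a tweet that merely says a flood is frightening would score 0 and be dropped.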

Shi plotted each Twitter-driven data point on a map, which she presented with her research Tuesday at the American Geophysical Union’s (AGU) Fall Meeting in San Francisco, Calif. She then compared that map with actual flood inundation extent data, measured via helicopter by the Geospatial Information Authority of Japan.

The comparison showed strong agreement between the Twitter-informed flood data and actual flood inundation data, leading her to conclude that social media can be relied on to project the extent of flood inundation.
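One simple way to put a number on that kind of agreement is to count how many tweet-derived points fall inside the surveyed flood outline. The sketch below does this with a standard ray-casting point-in-polygon test. The article does not say which measure Shi used, so the metric, the function names, and the flat x/y coordinates are assumptions for illustration only.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count how many polygon edges a ray going right from (x, y) crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def fraction_inside(points, polygon):
    """Share of tweet-derived points that lie within the surveyed flood outline."""
    if not points:
        return 0.0
    hits = sum(point_in_polygon(x, y, polygon) for x, y in points)
    return hits / len(points)
```

A fraction close to 1 would mean that nearly every tweet reporting floodwater came from inside the measured inundation area. Real coordinates would need a map projection before a planar test like this is meaningful.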

“It is trustable. I will use social media for my research,” said Shi, who studies flood disaster prevention.

Text Mining and Data Quality

Now that Shi has established that relevant tweets can provide information for mapping floods, the next step is to refine how relevant tweets are identified and interpreted. Shi’s highly involved methods required that she read hundreds of tweets, culling the best by hand before plotting the data on a map to analyze for trustworthiness.

However, other methods circumvent human involvement by using only a text-mining program to select tweets. Such methods have the potential to reduce the labor needed to map floods.

In one such study, also presented at AGU’s Fall Meeting, Christopher Scheele from the Department of Geography at the University of Wisconsin–Madison assessed the accuracy of using an enhanced text-mining program to extract disaster-related data from social media. The text-mining program finds relevant tweets using keywords or machine-learning algorithms.

Right now, however, computers make mistakes. Scheele says that the biggest hurdle is that some tweets get misclassified.

For example, a tweet that says, “Check out this high water,” could pass the search for flood-related tweets, but it carries little usable information. The tweet could be describing a local flood just as easily as it could be describing a flood on television or even in the bathroom.
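The problem is easy to reproduce with a toy filter. In the sketch below, a plain keyword match accepts the ambiguous tweet, and a second, equally simple check for location wording flags it for review instead. The keyword and cue lists and the three labels are invented for this example and do not come from Scheele's program.

```python
# Hypothetical word lists, chosen only to illustrate the ambiguity.
KEYWORDS = ("flood", "high water", "levee")
LOCATION_CUES = (" at ", " near ", " in front of ", " street", " station", " bridge")


def keyword_match(text: str) -> bool:
    """Naive relevance test: does the tweet contain any flood keyword?"""
    lowered = text.lower()
    return any(keyword in lowered for keyword in KEYWORDS)


def has_location_cue(text: str) -> bool:
    """Rough check for wording that ties the tweet to a place."""
    padded = f" {text.lower()} "
    return any(cue in padded for cue in LOCATION_CUES)


def classify(text: str) -> str:
    """Label a tweet as 'irrelevant', 'ambiguous', or 'candidate'."""
    if not keyword_match(text):
        return "irrelevant"
    if has_location_cue(text):
        return "candidate"
    return "ambiguous"
```

Here `classify("Check out this high water")` returns "ambiguous" because the tweet matches a keyword but gives no hint of where the water is. A rule this crude would still be fooled by a tweet about a flood seen on television "near" somewhere, which is why human checking or better models remain necessary.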

“We need to be skeptical with data from social media,” says Scheele, whose concern is that text-mining applications misinterpret vague messages. “But not a lot of people are looking at the data as detailed as she is. That is how her study is so useful.”

Shi’s results suggest that the tweets are accurate in describing the flood. The trick then, Scheele explained, is to ensure that the text-mining algorithms pull the right tweets.

Enhancing Disaster Relief with Social Media

Shi surmises that mapping data from social media could give early warning to people living in downstream areas if the map can be created quickly enough. This information could also target relief efforts to speed up response time.

Her next step is to combine knowledge gathered through social media with hazard simulation programs. Hazard simulation programs are used to predict and respond to future flooding. By entering known flood details, these computer programs estimate the potential impacts, pinpoint dangerous regions, or project the geographical extent of the flood.

Using information gleaned from social media to inform simulations could make the resulting projections more accurate and timely, explained Shi, who once escaped a flood herself by fleeing to a rooftop. “If we can guide people in the proper way, warn schools, residents, then many lives can be saved.”

—Teresa Leigh Carey (email: teresacarey003@gmail.com; @teresa_carey), Science Communication Program Graduate Student, University of California, Santa Cruz


Carey, T. L. (2016), Can data extracted from Twitter help map flood hazards?, Eos, 97, https://doi.org/10.1029/2016EO065183. Published on 16 December 2016.

Text © 2016. The authors. CC BY-NC-ND 3.0