TY - GEN
T1 - VizByWiki
T2 - 27th International World Wide Web, WWW 2018
AU - Lin, Allen Yilun
AU - Ford, Joshua
AU - Adar, Eytan
AU - Hecht, Brent
N1 - Funding Information:
This work was funded in part by the U.S. National Science Foundation (IIS-1702440, IIS-1707319, CAREER IIS-1707296, and IIS-1421438).
Publisher Copyright:
© 2018 IW3C2 (International World Wide Web Conference Committee), published under Creative Commons CC BY 4.0 License.
PY - 2018/4/10
Y1 - 2018/4/10
N2 - Data visualizations in news articles (e.g., maps, line graphs, bar charts) greatly enrich the content of news articles and result in well-established improvements to reader comprehension. However, existing systems that generate news data visualizations either require substantial manual effort or are limited to very specific types of data visualizations, thereby greatly restricting the number of news articles that can be enhanced. To address this issue, we define a new problem: given a news article, retrieve relevant visualizations that already exist on the web. We show that this problem is tractable through a new system, VizByWiki, that mines contextually relevant data visualizations from Wikimedia Commons, the central file repository for Wikipedia. Using a novel ground truth dataset, we show that VizByWiki can successfully augment as many as 48% of popular online news articles with news visualizations. We also demonstrate that VizByWiki can automatically rank visualizations according to their usefulness with reasonable accuracy (nDCG@5 of 0.82). To facilitate further advances on our "news visualization retrieval problem", we release our ground truth dataset and make our system and its source code publicly available.
AB - Data visualizations in news articles (e.g., maps, line graphs, bar charts) greatly enrich the content of news articles and result in well-established improvements to reader comprehension. However, existing systems that generate news data visualizations either require substantial manual effort or are limited to very specific types of data visualizations, thereby greatly restricting the number of news articles that can be enhanced. To address this issue, we define a new problem: given a news article, retrieve relevant visualizations that already exist on the web. We show that this problem is tractable through a new system, VizByWiki, that mines contextually relevant data visualizations from Wikimedia Commons, the central file repository for Wikipedia. Using a novel ground truth dataset, we show that VizByWiki can successfully augment as many as 48% of popular online news articles with news visualizations. We also demonstrate that VizByWiki can automatically rank visualizations according to their usefulness with reasonable accuracy (nDCG@5 of 0.82). To facilitate further advances on our "news visualization retrieval problem", we release our ground truth dataset and make our system and its source code publicly available.
KW - Data visualizations
KW - News articles
KW - Peer production
KW - User-generated content
KW - Wikimedia Commons
KW - Wikipedia
UR - http://www.scopus.com/inward/record.url?scp=85076996444&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85076996444&partnerID=8YFLogxK
U2 - 10.1145/3178876.3186135
DO - 10.1145/3178876.3186135
M3 - Conference contribution
AN - SCOPUS:85076996444
T3 - The Web Conference 2018 - Proceedings of the World Wide Web Conference, WWW 2018
SP - 873
EP - 882
BT - The Web Conference 2018 - Proceedings of the World Wide Web Conference, WWW 2018
PB - Association for Computing Machinery, Inc
Y2 - 23 April 2018 through 27 April 2018
ER -